Published on in Vol 12 (2024)

Forecasting Hospital Room and Ward Occupancy Using Static and Dynamic Information Concurrently: Retrospective Single-Center Cohort Study

Original Paper

1Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center & University of Ulsan College of Medicine, Seoul, Republic of Korea

2Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea

3Division of Cardiology, Asan Medical Center, Seoul, Republic of Korea

4Department of Digital Innovation, Asan Medical Center, Seoul, Republic of Korea

5Big Data Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul, Republic of Korea

6Division of Cardiology, Department of Information Medicine, Asan Medical Center & University of Ulsan College of Medicine, Seoul, Republic of Korea

*these authors contributed equally

Corresponding Author:

Young-Hak Kim, MD, PhD

Division of Cardiology

Department of Information Medicine

Asan Medical Center & University of Ulsan College of Medicine

88, Olympic-ro 43-gil


Seoul, 05505

Republic of Korea

Phone: 82 2 3010 0955


Background: Predicting the bed occupancy rate (BOR) is essential for efficient hospital resource management, long-term budget planning, and patient care planning. Although macro-level BOR prediction for the entire hospital is crucial, predicting occupancy at a detailed level, such as specific wards and rooms, is more practical and useful for hospital scheduling.

Objective: The aim of this study was to develop a web-based support tool that allows hospital administrators to grasp the BOR for each ward and room according to different time periods.

Methods: We trained time-series models based on long short-term memory (LSTM) using individual bed data aggregated hourly each day to predict the BOR for each ward and room in the hospital. Ward training involved 2 models with 7- and 30-day time windows, and room training involved models with 3- and 7-day time windows for shorter-term planning. To further improve prediction performance, we added 2 models trained by concatenating dynamic data with static data representing room-specific details.

Results: We evaluated a total of 12 models built on LSTM and bidirectional long short-term memory (Bi-LSTM), and the Bi-LSTM-based models performed better. The ward-level prediction model had a mean absolute error (MAE) of 0.067, mean square error (MSE) of 0.009, root mean square error (RMSE) of 0.094, and R² score of 0.544. Among the room-level prediction models, the model that combined static data exhibited superior performance, with an MAE of 0.129, MSE of 0.050, RMSE of 0.227, and R² score of 0.600. Model results can be displayed on an electronic dashboard for easy access via the web.

Conclusions: We have proposed predictive BOR models for individual wards and rooms that demonstrate high performance. The results can be visualized through a web-based dashboard, aiding hospital administrators in bed operation planning. This contributes to resource optimization and the reduction of hospital resource use.

JMIR Med Inform 2024;12:e53400

Introduction




The global health care market continues to grow, but the burden of health care costs on governments and individuals is reaching its limits. Consequently, there is increasing interest in the efficient use of limited resources in health care systems, and hospitals must develop approaches to maximize medical effectiveness within budgetary constraints [1,2]. One approach to this is optimizing the use of medical resources. Medical resources can be broadly categorized into 3 categories: human resources, physical capital, and consumables. The appropriate and optimized use of these resources is critical for improving health care quality and providing care to a larger number of patients [3,4].

Among the 3 medical resources, hospital beds are a form of physical capital that hospitals provide to patients. Beds are allocated for various purposes, such as rest, hospitalization, and postsurgical recovery, and they are one of the factors that directly influence patient satisfaction within the hospital. However, owing to limited space, hospitals often have a restricted number of beds. Moreover, the number and functionality of beds are often fixed owing to budgetary or environmental constraints, making changes difficult. Nonetheless, if hospital administrators can evaluate bed occupancy rates (BORs) according to different time periods, they can predict the need for health care professionals and resources. On the basis of this information, hospitals can plan resources efficiently, reduce operational costs, and achieve economic objectives [5]. In addition, excessive BORs can negatively affect the health of staff members and increase the risk of exposure to infection. Hence, focusing only on maintaining a high BOR may not necessarily lead to favorable outcomes for the hospital [6,7]. For these reasons, BOR prediction plays a vital role in hospitals and is widely recognized as necessary for resource optimization in the competitive medical field.

In the medical field, optimizing resources is crucial in the face of limited bed capacity and intense competition. Therefore, bed planning is a vital consideration aimed at minimizing hospital costs [8]. To achieve this, hospitals need to plan staffing and vacations weeks or months in advance [9]. The use of machine learning (ML) technology for BOR prediction is necessary to address fluctuations in patient numbers due to seasonal variations or infectious diseases, ensuring continuous hospital operations. In the Netherlands, hospitals have already implemented ML-based BOR prediction [10], and Johns Hopkins Hospital uses various metrics to effectively manage bed capacity for optimization. Predicting BORs based on quantitative data contributes to validating the clinical quality and cost-effectiveness of treatments. This, in turn, enhances overall accountability throughout the wards and contributes to improving hospital efficiency [11].

Prior Work

Hospital BOR prediction has been investigated using various approaches recently. From studies predicting bed demand using mathematical statistics or regression equation models based on given data [12-15], the focus has shifted toward modeling approaches using time-series analysis. This approach observes recorded data over time to predict future values.

A previous study has taken an innovative approach using time-series analysis alongside the commonly used regression analysis for bed demand prediction, and the study demonstrated that using time-series prediction for bed occupancy yielded higher performance results than using a simple trend fitting approach [16]. Another study used the autoregressive integrated moving average (ARIMA) model for univariate data and a time-series model for multivariate data to predict BORs [17]. With the advancement of deep learning (DL) models that possess strong long-term memory capabilities, such as recurrent neural network (RNN) and long short-term memory (LSTM), there has been an increase in studies applying these models to time-series data for prediction purposes. For instance, in the study by Kutafina et al [9], hospital BORs were predicted based on dates and public holiday data from government agencies and schools, without involving the personal information of patients. The study used a nonlinear autoregressive exogenous model to predict a short-term period of 60 days, with an aim to contribute to the planning of hospital staff. The model demonstrated good performance, with an average mean absolute percentage error of 6.24%. In emergency situations, such as the recent global COVID-19 pandemic, the sudden influx of infected patients can disrupt the hospitalization plans for patients with pre-existing conditions [18]. Studies have been conducted using DL architectures to design models for predicting the BOR of patients with COVID-19 on a country-by-country basis. Some studies incorporated additional inputs, such as vaccination rate and median age, to train the models [19]. Studies have also been conducted to focus on the short-term prediction of BORs during the COVID-19 period [20,21]. Prior studies are summarized in Table 1.

Although previous research has contributed to BOR prediction and operational planning at the hospital level, more detailed and systematic predictions are necessary for practical application in real-world operations. To address this issue, some studies have developed their own computer simulations of hospital systems, which not only predict bed occupancy but also schedule admissions and surgeries to enhance resource utilization [22-24]. Nevertheless, existing studies are limited by their sole focus on the overall BOR of the hospital. Building on these studies, we propose a strategy for predicting the BOR at the level of each ward and room using various variables in a time-series manner. To our knowledge, this is the first study to apply DL to predict ward- and room-specific occupancy rates using time-series analysis.

Table 1. Summary of prior studies.
Study | Year | Data set | Method | Prediction target
Mackay and Lee [12] | 2007 | Deidentified data, the date and time of patient admission and discharge between 1998 and 2000 | Comparison of 2 compartment models through cross-validation | Entire hospital bed occupancy (annual average)
Littig and Isken [13] | 2007 | Historical and real-time data warehouse and hospital information systems (emergency department, financial, surgical scheduling, and inpatient tracking systems) | Computerized model of MLRa and LRb | Entire hospital short-term occupancy (24 h or 72 h) based on LOSc
Kumar and Mo [14] | 2010 | Bed management between June 1, 2006, and June 1, 2007; Information: (1) In each class based on length of stay and admission data; (2) Historical previous year’s same week admission data; (3) Relationship between identified variables to aid bed managers | The 3 methods are: (1) Poisson bed occupancy model; (2) Simulation model; and (3) Regression model | The 3 prediction targets are: (1) Estimation of bed occupancy and optimal bed requirements in each class; (2) Bed occupancy levels for every class for the following week; and (3) Weekly average number of occupied beds
Seematter-Bagnoud et al [15] | 2015 | Inpatient stay data in 2010 (acute somatic care inpatients and outpatients) | Three models of hypothesis-based statistical forecasting of future trends | The 3 targets are: (1) Number of hospital stays; (2) Hospital inpatient days; and (3) Beds for medical stay
Farmer and Emami [16] | 1990 | Inpatient stay data for general surgery in the age group of 15-44 years between 1969 and 1982 | The 2 methods are: (1) Forecasting from a structural model and (2) The time-series or Box-Jenkins method | Entire hospital short-term daily bed requirements
Kim et al [17] | 2014 | Data warehouse between January 2009 and June 2012 | The 2 methods are: (1) The ARIMAd model for univariate data and (2) The time-series model for multivariate data | Entire hospital bed occupancy (1 day and 1 week)
Kutafina et al [9] | 2019 | Inpatient stay data between October 14, 2002, and December 31, 2015 (patient identifier, time of admission, discharge, and name of the clinic the patient was admitted to; no personal information on the patients or staff was provided) | NARXe model, a type of RNNf | Entire hospital mid-term bed occupancy (60 days, bed pool in units of 30 beds)
Bouhamed et al [19] | 2022 | COVID-19 hospital occupancy data in 15 countries between December 2021 and early January 2022 | The 3 models are: LSTMg, GRUh, and SRNNi. Incorporate vaccination percentage and median age of the population to improve performance | Entire hospital bed occupancy
Bekker et al [20] | 2021 | Historical data publicly available until mid-October 2020 | The 2 methods are: (1) Using linear programming to predict admissions and (2) Fitting the remaining LOS and using results from the queuing theory to predict occupancy | The 2 targets are: (1) Patient admission and (2) Entire hospital short-term bed occupancy
Farcomeni et al [21] | 2021 | Patients admitted to the intensive care unit between January and June 2020 | The 2 methods are: (1) Generalized linear mixed regression model and (2) Area-specific nonstationary integer autoregressive methodology | Entire hospital short-term intensive care bed occupancy

aMLR: multinomial logistic regression.

bLR: linear regression.

cLOS: length of stay.

dARIMA: autoregressive integrated moving average.

eNARX: nonlinear autoregressive exogenous.

fRNN: recurrent neural network.

gLSTM: long short-term memory.

hGRU: gated recurrent unit.

iSRNN: simple recurrent neural network.

Goal of This Study

The aim of this study was to predict the BORs of hospital wards and rooms using time-series data from individual beds. Although overall bed occupancy prediction is useful for macro-level resource management in hospitals, resource allocation based on the prediction of occupancy rates for each ward and room is required for specific hospital scheduling and practicality. Through this approach, we aim to contribute to the efficient operational cost optimization of the hospital and ensure the availability of resources required for patient care.

We have developed time-series prediction models based on deep neural networks (DNNs), 1 of which combines data representing room-specific features (static data) with dynamic data to enhance the prediction performance for room bed occupancy rates (RBORs). The RBOR prediction model based on bidirectional long short-term memory (Bi-LSTM) showed the highest performance among all RBOR models, with a mean absolute error (MAE) of 0.049, a mean square error (MSE) of 0.042, a root mean square error (RMSE) of 0.007, and an R² score of 0.291.
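For readers less familiar with the 4 reported metrics, the following minimal sketch shows how MAE, MSE, RMSE, and the R² score are commonly computed from observed and predicted occupancy rates. The function name and the toy values are hypothetical and are not taken from the study data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Toy occupancy rates (hypothetical values, not study data).
observed = [0.50, 0.75, 1.00, 0.25]
predicted = [0.40, 0.75, 0.90, 0.35]
metrics = regression_metrics(observed, predicted)
```

Because occupancy rates lie between 0 and 1, all 3 error metrics are on the same bounded scale, which makes the ward and room models easier to compare.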

We developed 6 types of BOR prediction models, of which 2 types were used for predicting ward bed occupancy rates (WBORs), and the other 4 types focused on predicting RBORs. These models use LSTM and Bi-LSTM architectures with strong long-term memory capabilities as their basic structure. We created 6 models for each architecture, resulting in a total of 12 models. The WBOR models were used for predicting weekly and monthly occupancy rates, serving long-term hospital administrative planning purposes. Conversely, the RBOR models were designed for immediate and rapid occupancy planning and were trained with 3- and 7-day intervals. Each RBOR model was enhanced by combining static data, which represent room-specific features, to generate more sophisticated prediction models.

Figure 1 shows the potential application of our model as a form of web software in a hospital setting. Through an online dashboard, it can provide timely information regarding bed availability, enabling intelligent management of patient movements related to admission and discharge. It facilitates shared responsibilities within the hospital and simplifies future resource planning [25].

In the Introduction section, we explored the importance of this research and investigated relevant previous studies, providing a general overview of the direction of our research. In the Methods section, we provide descriptions of the data set used and the structure of the DNN algorithm used, and explain the model architecture and performance. In the Results section, we present the performance and outcomes of this study. Finally, in the Discussion section, we discuss the contributions, limitations, and potential avenues for improvement of the research.

Figure 1. Virtual dashboard of the status and forecast of the ward bed occupancy rate (WBOR) and room bed occupancy rate (RBOR). The first screen presents the overall bed occupancy rate of the hospital, along with the number of beds in use and available. Moreover, a predictive graph displays the anticipated WBOR for selected dates. The second screen presents the WBOR for individual beds, indicating their statuses, such as “in use,” “reserved,” “empty,” and “cleaning.” Detailed information about each room is also displayed.

Methods


We intended to predict the BORs of individual hospital wards and rooms based on the information accumulated in individual bed–level data on an hourly basis, aggregated on a daily basis. For this purpose, we developed 12 time-series models. As the base models, we applied LSTM and Bi-LSTM, which are suitable for sequence data. These models address the limitation of long-term memory loss in traditional RNNs and were chosen because of their suitability for training bed data represented as sequence data.

Based on the model architecture, there were 2 WBOR prediction model types, which were trained at 7- and 30-day intervals to predict the occupancy rate for the next day. Moreover, there were 2 RBOR prediction model types, similar to the ward models, which were trained at 3- and 7-day intervals. Furthermore, as another approach, each RBOR prediction model was augmented with static data, and 2 DL algorithms were proposed for the final comparison of their performances in predicting RBORs.

Ethical Considerations

The study was approved by the Asan Medical Center (AMC) Institutional Review Board (IRB 2021-0321) and was conducted in accordance with the 2008 Declaration of Helsinki.


Study Setting

This was a retrospective single-center cohort study. Data were collected from AMC, with information on the occupancy status of each bed recorded at hourly intervals between May 27, 2020, and November 21, 2022. The data set comprised a total of 54,632,684 records. This study used ethically preapproved data. Deidentified data used in the study were extracted from ABLE, the AMC clinical research data warehouse.

A total of 57 wards, encompassing specialized wards; 1411 rooms, including private and shared rooms; and 4990 beds were included in this study. Wards and rooms with specific characteristics, such as intensive care unit, newborn room, and nuclear medicine treatment room, were excluded from the analysis as their occupancy prediction using simple and general variables did not align with the direction of this study.

Supporting Data

Supporting data for public holidays were added in our data set. We considered that holidays have both a recurring pattern with specific dates each year and a distinctive characteristic of being nonworking days, which could affect occupancy rates. Based on Korean public holidays, which include Chuseok, Hangeul Proclamation Day, Children’s Day, National Liberation Day, Memorial Day, Buddha’s Birthday, Independence Movement Day, and Constitution Day, there were 27 days that corresponded to public holidays during the period covered by the data set. We denoted these dates with a value of “1” if they were public holidays and “0” if they were not, based on the reference date.

Preprocessing and Description of Variables

Among the variables representing individual beds, the reference date, ward and room information, patient occupancy status, bed cleanliness status, and detailed room information were available. Based on the recorded date of bed status, we derived additional variables, such as the reference year, reference month, reference week (week of the year), reference day, and reference day of the week.
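As an illustration of how the date-derived variables and the holiday flag might be built from a reference date, consider the following minimal sketch. It assumes ISO week numbering for the week of the year, and the `HOLIDAYS` set is a hypothetical 2-date stand-in for the 27 public holidays used in the study.

```python
from datetime import date

# Hypothetical subset for illustration (Independence Movement Day and
# Children's Day in 2021); the study flagged 27 Korean public holidays.
HOLIDAYS = {date(2021, 3, 1), date(2021, 5, 5)}


def date_features(ref: date) -> dict:
    """Derive calendar variables and the holiday flag from a reference date."""
    _iso_year, iso_week, iso_weekday = ref.isocalendar()
    return {
        "year": ref.year,
        "month": ref.month,
        "week": iso_week,        # assumption: ISO week of the year (1-53)
        "day": ref.day,
        "weekday": iso_weekday,  # 1 = Monday ... 7 = Sunday
        "holiday": 1 if ref in HOLIDAYS else 0,
    }


features = date_features(date(2021, 5, 5))
```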

Room data were derived from the input information representing the cleanliness status of beds. This variable had 2 possible states, namely, “admittable” and “discharge.” If neither of these states was indicated, it implied that a patient was currently hospitalized in the bed. As the status of hospitalized patients was indicated by missing values, we replaced them with the number “1” to indicate the presence of a patient in the bed and “0” otherwise. The sum of all “1” values represented the current number of hospitalized patients. The count of beds in each room indicated the capacity of each room. The target variable BOR was calculated by dividing the number of patients in the room by the room capacity, resulting in a room-specific patient occupancy rate variable. The ward data were subjected to a similar process as that of the room data, with the difference being that we generated ward-specific variables, such as ward capacity and WBOR, using the same approach. The static room data consisted of 14 variables, including the title of the room and the detailed information specific to each room.
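The occupancy calculation described above can be sketched as follows. The record layout, the room labels, and the use of `None` to stand for the missing cleanliness status are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical bed-level records: (room, bed, cleanliness status).
# A status of None stands for the missing value that marks an occupied bed.
records = [
    ("R101", "A", None),
    ("R101", "B", "admittable"),
    ("R101", "C", None),
    ("R101", "D", "discharge"),
    ("R102", "A", None),
    ("R102", "B", None),
]


def room_occupancy(rows):
    """Return the occupancy rate per room: occupied beds / room capacity."""
    occupied = defaultdict(int)
    capacity = defaultdict(int)
    for room, _bed, status in rows:
        capacity[room] += 1                           # each bed adds to capacity
        occupied[room] += 1 if status is None else 0  # missing status = patient
    return {room: occupied[room] / capacity[room] for room in capacity}


rbor = room_occupancy(records)
```

Grouping by ward instead of room in the same function would yield the ward capacity and the WBOR.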

For the variables in the ward and room data, we disregarded the units of the features and converted them into numerical values for easy comparison, after which we performed normalization. Regarding the variables representing detailed room information, we converted them to numerical values where “yes” was represented as “1” and “no” was represented as “0.”
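The text does not state which normalization was applied. The sketch below assumes min-max scaling to the range 0 to 1, alongside the yes/no encoding of the detailed room variables. Both helper names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def encode_yes_no(flag: str) -> int:
    """Map 'Y'/'yes' to 1 and anything else (eg, 'N'/'no') to 0."""
    return 1 if flag.strip().upper() in {"Y", "YES"} else 0


scaled_capacity = min_max_normalize([2, 4, 6])
isolation_flags = [encode_yes_no(f) for f in ["Y", "N", "yes", "no"]]
```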

The final set of variables used in this study was categorized into date, ward, room, and detailed room information. Table 2 provides the detailed descriptions of the variables used in our training, including all the administrative data related to beds that are readily available in the hospital.

The explanation of the classification for generating the data sets for training each model is provided in Table 3. The static features of the detailed room information were combined with the room data set, which has sequence characteristics, to generate a separate data set termed Room+Static.

Table 2. Description of variables by category.

Variable | Type | Description
Date
Year | 3 categories | Reference year for bed status
Month | 12 categories | Reference month for bed status
Week | 53 categories | Reference week for bed status
Day | 31 categories | Reference day for bed status
Weekday | 7 categories | Reference day of the week for bed status
Holiday | 2 categories | Holiday status
Ward
Ward abbreviation | 57 categories | Abbreviations for entire ward names
Ward capacity | Numeric | Number of available ward beds
Ward bed capacity | Numeric | Number of patients currently admitted to the ward
Ward occupancy rate | Numeric | Ward bed capacity divided by ward capacity
Room
Room abbreviation | 1411 categories | Abbreviations for entire room names
Room capacity | Numeric | Number of available room beds
Room bed capacity | Numeric | Number of patients currently admitted to the room
Room occupancy rate | Numeric | Room bed capacity divided by room capacity
Room static feature
Room code | 34 categories | Room grade code
Nuclear | 2 categories (Na/Yb) | Nuclear medicine room availability
Sterile | 2 categories (N/Y) | Sterile room availability
Isolation | 2 categories (N/Y) | Isolation room availability
EEGc testing | 2 categories (N/Y) | EEG testing room availability
Observation | 2 categories (N/Y) | Observation room availability
Kidney | 2 categories (N/Y) | Kidney transplant room availability
Liver | 2 categories (N/Y) | Liver transplant room availability
Sub-ICUd | 2 categories (N/Y) | Sub-ICU room availability
Special | 2 categories (N/Y) | Special room availability
Small single | 2 categories (N/Y) | Small single room availability
Short-term | 2 categories (N/Y) | Short-term room availability
Psy-double | 2 categories (N/Y) | Psychiatry department double room availability
Psy-open | 2 categories (N/Y) | Psychiatry department open room availability

aN: No.

bY: Yes.

cEEG: electroencephalogram.

dICU: intensive care unit.

Table 3. Data set classification and included variables.
Data set | Variables
Ward data set | Ward abbreviation, year, month, week, day, weekday, holiday, ward capacity, ward bed capacity, and ward occupancy rate
Room data set | Room abbreviation, year, month, week, day, weekday, holiday, room capacity, room bed capacity, and room occupancy rate
Static data set | 14 static variables related to detailed room information
Room+Static data set | Room abbreviation, year, month, week, day, weekday, holiday, room capacity, room bed capacity, 14 static variables related to detailed room information, and room occupancy rate

Each data set was split into training, validation, and test sets for training and evaluation of the model. The training set consisted of 32,153 rows (67.8%), with data from May 27, 2020, to December 2021. The validation set, used for parameter tuning, included 7085 rows (15.0%), with data from January to June 2022. Finally, the test set comprised 8208 rows (17.2%), with data from July 2022 to November 21, 2022.
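A chronological split of this kind can be sketched as below. The exact boundary days (December 31, 2021, and June 30, 2022) are assumptions inferred from the months stated above, and the example counts calendar days rather than data set rows.

```python
from datetime import date, timedelta

# Assumed boundaries: last day of the training and validation periods.
TRAIN_END = date(2021, 12, 31)
VAL_END = date(2022, 6, 30)


def assign_split(ref: date) -> str:
    """Assign a reference date to the training, validation, or test period."""
    if ref <= TRAIN_END:
        return "train"
    if ref <= VAL_END:
        return "validation"
    return "test"


start, end = date(2020, 5, 27), date(2022, 11, 21)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
counts = {"train": 0, "validation": 0, "test": 0}
for d in days:
    counts[assign_split(d)] += 1
```

Splitting by date rather than at random keeps every validation and test observation later in time than the training observations.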

DL Algorithms

We used various DL algorithms for in-depth learning. In the following subsections, we will provide explanations for each model algorithm used in our research.

LSTM Network

RNN [26] is a simple algorithm that passes information from previous steps to the current step, allowing it to iterate and process sequential data. However, it encounters difficulties in handling long-term dependencies, such as those found in time-series data, owing to the vanishing gradient problem. To address this issue, LSTM [27] was developed. LSTM excels in handling sequence data and is commonly used in natural language processing, machine translation, and time-series data analysis. An LSTM unit consists of an input gate, an output gate, and a forget gate. These gates control the “cell state,” determining whether the memory should be retained or forgotten for the next time step.
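The gating described above is usually written as the following standard LSTM update. The notation is the common textbook form rather than notation taken from this study: x_t is the input at time step t, h_t the hidden state, c_t the cell state, σ the logistic sigmoid, and ⊙ the elementwise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

The additive form of the cell state update is what lets gradients pass across many time steps without vanishing as quickly as in a simple RNN.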

Bi-LSTM Network

Although RNN and LSTM possess the ability to remember previous data, they have a limitation in that their results are primarily based on immediate past patterns because the input is processed in a sequential order. This limitation can be overcome through a network architecture known as Bi-LSTM [28]. Bi-LSTM allows end-to-end learning, minimizing the loss on the output and simultaneously training all parameters. It also has the advantage of performing well even with long data sequences. Because of its suitability for models that require knowledge of dependencies from both the past and future, such as LSTM-based time-series prediction, we additionally selected Bi-LSTM as the base model.

Attention Mechanism

Attention mechanism [29,30] refers to the process of incorporating the encoder’s outputs into the decoder at each time step of predicting the output sequence. Rather than considering the entire input sequence, it focuses more on the relevant components that are related to the predicted output, allowing the model to focus on important areas. This mechanism helps minimize information loss in data sets with long sequences, enabling better learning and improving the model’s performance. It has been widely used in areas such as text translation and speech recognition. Nevertheless, as it is still based on RNN models, it has the drawbacks of slower speed and not being completely free from information loss issues.

Combining Static and Dynamic Features

Data can exhibit different characteristics even at the same time. For instance, in data collected at 1-hour intervals for each hospital bed, we can distinguish between “dynamic data,” which include features that change over time, such as the bed condition, date, and patient occupancy, and “static data,” which consist of information that remains constant, such as the ward and room number.

DL allows us to use all the available information for prediction. Therefore, for predicting the RBOR, we investigated an approach that combines dynamic and static data using an LSTM-based method [31]. This approach demonstrated better performance than LSTM alone [32]. Our approach involves adding a layer that incorporates static data as an input to the existing room occupancy prediction model.

Model Architecture

Base Model

Our objective was to predict the intermediate-term occupancy rates of wards and rooms within the hospital to contribute to hospital operation planning. Bi-LSTM was chosen as the base model owing to its improved predictive performance compared with the traditional LSTM model. However, to quantitatively compare these models, we conducted a comparison of the results for each model (6 for each, with a total of 12 models).

A typical LSTM model processes data sequentially, considering only the information from the past up to the current time step. In contrast, Bi-LSTM processes the input sequence in both forward and backward directions, so that each time step within the input window can draw on both earlier and later observations. This bidirectionality helps the model learn temporal dependencies and intricate patterns, which is beneficial for handling complex patterns in long time-series data [28]. The trade-off is that Bi-LSTM doubles the number of recurrent-layer parameters, increasing the computational cost of training and prediction. In addition, while a more complex model can better adapt to the training data, it carries an increased risk of overfitting, especially with small data sets. We nevertheless chose Bi-LSTM for tasks involving time-series data, such as predicting BORs in hospitals, because of its ability to harness bidirectional information.

Moreover, we have enhanced the performance of our models by adding an attention layer to Bi-LSTM. The attention layer assigns higher weights to features that exert a significant impact on the prediction, allowing the model to focus on relevant information and gather necessary input features. This helps improve the accuracy of the prediction. Furthermore, the attention layer reduces the amount of information processed, resulting in improved computational efficiency. Ultimately, this contributes toward enhancing the overall performance of the model.
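The weighting idea behind an attention layer can be illustrated with a simplified attention pooling step: each time step's hidden state is scored against a context vector, the scores are normalized with a softmax, and the hidden states are averaged using those weights. This is a sketch of the general technique, not the layer used in the study, and the hidden states and context vector below are hypothetical values rather than learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weight each time step by its similarity to a context vector."""
    # Score: dot product between each hidden state and the context vector.
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum over time steps yields one fixed-size summary vector.
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)]
    return weights, pooled


# Hypothetical hidden states for a 3-step window (2 units each).
hidden = [[0.1, 0.0], [0.5, 0.2], [0.9, 0.4]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden, context)
```

Time steps whose hidden states align more closely with the context vector receive larger weights and therefore contribute more to the pooled representation.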

The window length of the input sequence was set to 1 of 3 intervals, namely, 3, 7, or 30 days. The WBOR model was trained on sequences with a window length of 7 and 30 days, whereas the RBOR model was trained on sequences with a window length of 3 and 7 days. The first layer of our model consisted of Bi-LSTM, which was followed by the leaky rectified linear unit (LeakyReLU) activation function. LeakyReLU is a piecewise linear function that, unlike ReLU, keeps a small nonzero gradient for negative input values, which helps the model converge faster. After this Bi-LSTM and LeakyReLU block was applied once more, the AttentionWithContext layer was applied, which focuses on important components of the input sequence data and transforms the outputs obtained from the previous layer. After applying the activation function again, a dense layer with 1 neuron was added for generating the final output. The sigmoid function was used to limit the output values between 0 and 1. Finally, our model was compiled using the MSE loss function, Adam optimizer, and MAE metric. The parameters for each layer were selected based on accumulated experience through research. Figure 2 visually represents the above-described structure.
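To make the windowing concrete, the following minimal sketch turns a daily occupancy series into input windows and next-day targets. The function name and the toy series are hypothetical, and a real input would carry several variables per day rather than the single occupancy value shown here.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive days with the following day."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])  # days t .. t+window-1
        targets.append(series[start + window])       # next-day occupancy
    return inputs, targets


# Toy daily occupancy rates (hypothetical values, not study data).
daily_bor = [0.60, 0.62, 0.58, 0.70, 0.75, 0.73, 0.68, 0.66, 0.71, 0.69]
x3, y3 = make_windows(daily_bor, window=3)
x7, y7 = make_windows(daily_bor, window=7)
```

A longer window gives the model more history per sample but yields fewer training samples from the same series.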

Figure 2. Base bidirectional long short-term memory (Bi-LSTM) model architecture. LeakyReLU: leaky rectified linear unit; LSTM: long short-term memory.
Combining Dynamic and Static Data Using the DL Model

The accumulated bed data, which were collected on a time basis, were divided into dynamic and static data of the rooms, which were then inputted separately. To improve the performance of the BOR prediction model, we designed different DL architectures for the characteristics of these 2 types of data.

We first used the base model, built on LSTM or Bi-LSTM, to learn the time-series (dynamic) data, and we used dense layers to process the fixed-size static inputs. To prevent overfitting, we applied dropout to randomly deactivate neurons in the 2 dense layers.

Finally, the hidden states of the 2 networks were combined, and the combined result was passed to a single layer to effectively integrate the dynamic and static data. This allowed us to use the information from both the dynamic and static data for BOR prediction. This architecture is illustrated in Figure 3.

Figure 3. Bidirectional long short-term memory (Bi-LSTM) model architecture combining static and dynamic variables. LeakyReLU: leaky rectified linear unit; LSTM: long short-term memory.
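
The fusion step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the hidden vectors, weights, and bias are hypothetical values, and the single combining layer is assumed to be a dense layer with a sigmoid output. It is not the study's Keras implementation.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Squash a real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_and_predict(dynamic_hidden: List[float], static_hidden: List[float],
                     weights: List[float], bias: float) -> float:
    """Concatenate both hidden vectors and apply one dense unit with a sigmoid."""
    combined = dynamic_hidden + static_hidden  # list concatenation
    if len(combined) != len(weights):
        raise ValueError("weights must match the concatenated vector length")
    z = sum(w * h for w, h in zip(weights, combined)) + bias
    return sigmoid(z)


# Hypothetical hidden states from the recurrent branch and the dense (static) branch.
dynamic_hidden = [0.4, -0.2, 0.7]
static_hidden = [1.0, 0.0]
weights = [0.5, 0.5, 0.5, 0.2, 0.2]

prediction = fuse_and_predict(dynamic_hidden, static_hidden, weights, bias=0.0)
```

Because the output passes through a sigmoid, the fused prediction always lies between 0 and 1, matching the range of an occupancy rate.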

Hyperparameter Tuning

One of the fundamental methods to enhance the performance of artificial intelligence (AI) learning models is the use of hyperparameter tuning. Hyperparameters are parameters passed to the model to modify or adjust the learning process. While hyperparameter tuning may rely on the experience of researchers, there are also functionalities that automatically search for hyperparameters, taking into account the diversity of model structures.

Various methods for search optimization have been proposed [33,34]. Because our models were implemented using the Keras library, we used Keras Tuner to automatically search for the optimal combination of units and learning rate for each model, which contributed to improved performance.
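
The idea of searching over units and learning rates can be sketched without any DL library. The example below is a plain exhaustive grid search used as a stand-in for Keras Tuner, not the Keras Tuner API itself. The candidate values and the scoring function `toy_validation_mae` are hypothetical placeholders for training a model and measuring its validation MAE.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple


def grid_search(units_options: List[int], lr_options: List[float],
                evaluate: Callable[[int, float], float]) -> Tuple[Dict[str, float], float]:
    """Try every (units, learning rate) pair and keep the one with the lowest score."""
    best_config, best_score = None, float("inf")
    for units, lr in product(units_options, lr_options):
        score = evaluate(units, lr)
        if score < best_score:
            best_config, best_score = {"units": units, "learning_rate": lr}, score
    return best_config, best_score


def toy_validation_mae(units: int, learning_rate: float) -> float:
    """Placeholder objective; a real run would train and validate a model here."""
    return abs(units - 64) / 1000 + abs(learning_rate - 1e-3) * 10


best_config, best_score = grid_search([32, 64, 128], [1e-2, 1e-3, 1e-4],
                                      toy_validation_mae)
```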

Time Series Cross-Validation

Time-series data exhibit temporal dependencies between data points, making it crucial to consider these characteristics when validating a model. The commonly used K-fold cross-validation is effective for evaluating models on general data sets [35]; by dividing the data into multiple subsets, it helps prevent overfitting and enhances generalizability [36,37]. However, for time-series data, shuffling the data randomly is not appropriate owing to the inherent sequential dependency of the observations.

Time series cross-validation is a method that preserves this temporal dependence while dividing the data [38]. It involves splitting the entire hospital bed data set into 5 periods, conducting training and validation for each period, and repeating this process as the periods shift. This approach is particularly effective when observations in the dynamic data set, such as hospital bed data recorded at 1-hour intervals, play a crucial role in predicting future values based on past observations.

Shuffling data randomly using K-fold may disrupt the temporal continuity, leading to inadequate reflection of past and future observations. Therefore, time series cross-validation sequentially partitions the data, ensuring the temporal flow is maintained, and proves to be more effective in evaluating the model’s performance. This method enables the model to make more accurate predictions of future occupancy based on past trends.
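
A sequential partition of this kind can be sketched as follows. The sketch assumes an expanding-window (forward-chaining) scheme in which each fold trains on all observations before its validation block; the exact period boundaries used in the study are not specified, so the function `time_series_splits` and its block sizing are illustrative assumptions.

```python
from typing import List, Tuple


def time_series_splits(n_samples: int, n_splits: int = 5) -> List[Tuple[range, range]]:
    """Return (train, validation) index ranges that never look into the future."""
    block = n_samples // (n_splits + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested number of splits")
    splits = []
    for fold in range(1, n_splits + 1):
        train_end = fold * block
        # The last fold absorbs any remainder so that no observation is dropped.
        val_end = train_end + block if fold < n_splits else n_samples
        splits.append((range(0, train_end), range(train_end, val_end)))
    return splits


splits = time_series_splits(n_samples=12, n_splits=5)
```

In every fold, the largest training index is smaller than the smallest validation index, so the temporal order of the observations is preserved.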


We selected several metrics to evaluate the performance of time-series data predictions. Among them, MAE represents the mean absolute difference between the model’s predicted values and the actual BOR. We also considered MSE, which is sensitive to outliers because the errors are squared. Moreover, to express the error on the same scale as the BOR while retaining a penalty for large errors, we opted for RMSE. We also used the R2 score to measure the proportion of variance in the actual values that is explained by the predicted values.

MAE is a commonly used metric to evaluate the performance of time-series prediction models. MAE is intuitive and easy to calculate, making it widely used in practice. Because MAE uses absolute values, it is less sensitive to outliers in the occupancy rate values for specific dates. MAE is calculated using the following formula:
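
In standard notation, where $y_i$ is the actual BOR, $\hat{y}_i$ is the predicted BOR, and $n$ is the number of observations (these symbols are assumed here for illustration), MAE can be written as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$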

MSE is a metric that evaluates the magnitude of errors by squaring the differences between the predicted and actual values and then taking the average. It is calculated using the following formula:
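
In standard notation, where $y_i$ is the actual BOR, $\hat{y}_i$ is the predicted BOR, and $n$ is the number of observations (these symbols are assumed here for illustration), MSE can be written as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$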

RMSE is the square root of MSE. It addresses the limitation of MSE, in which the error scales as a square, and provides a more intuitive understanding of the error magnitude between the predicted and actual values. Like MSE, it penalizes large errors and therefore remains sensitive to outliers. RMSE is calculated using the following formula:
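
In standard notation, where $y_i$ is the actual BOR, $\hat{y}_i$ is the predicted BOR, and $n$ is the number of observations (these symbols are assumed here for illustration), RMSE can be written as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$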

The R2 score is used to measure the explanatory potential of the prediction model, and it is calculated using the following formula:
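
In standard notation, where $y_i$ is the actual BOR, $\hat{y}_i$ is the predicted BOR, $\bar{y}$ is the mean of the actual values, and $n$ is the number of observations (these symbols are assumed here for illustration), the R2 score can be written as:

$$R^2 = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}}, \qquad \mathrm{SSR} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad \mathrm{SST} = \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$$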

Here, SSR represents the sum of squared differences between the predicted and actual values, and SST represents the sum of squared differences between the actual values and the mean value of actual values. Figure 4 shows the prediction method and overall flow in this study.

Figure 4. Overall flow in this study. Bi-LSTM: bidirectional long short-term memory; LSTM: long short-term memory; MAE: mean absolute error; MSE: mean square error; RMSE: root mean square error.
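
The 4 evaluation metrics can be computed together, as in the following minimal sketch. It follows the standard definitions of MAE, MSE, RMSE, and the R2 score; the function name `regression_metrics` and the sample values are hypothetical and do not come from the study's data.

```python
import math
from typing import Dict, Sequence


def regression_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, MSE, RMSE, and R2 for paired actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ssr = sum(e * e for e in errors)                     # squared prediction errors
    sst = sum((a - mean_actual) ** 2 for a in actual)    # squared deviations from the mean
    if sst == 0:
        raise ValueError("R2 is undefined when all actual values are identical")
    r2 = 1.0 - ssr / sst
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical occupancy rates (actual vs predicted).
actual = [0.5, 0.75, 1.0, 0.75]
predicted = [0.5, 0.5, 1.0, 1.0]
metrics = regression_metrics(actual, predicted)
```

An R2 score of 0 means the predictions explain no more variance than always predicting the mean of the actual values, and negative scores are possible when the predictions are worse than that baseline.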

We used 2 DL architectures, LSTM and Bi-LSTM, and compared the performance of 12 different prediction models (6 configurations for each architecture). The configurations are denoted as ward 7 days (W7D), ward 30 days (W30D), room 3 days (R3D), room 7 days (R7D), room static 3 days (RS3D), and room static 7 days (RS7D). Using Keras Tuner, we adjusted the hyperparameters of the models and subsequently validated the models through a 5-fold time series cross-validation.

The prediction performances of the models for WBOR and RBOR were compared, which showed that the models were more accurate at predicting WBOR, with MAE values of 0.06 to 0.07. The W7D model based on Bi-LSTM, which used 7 days of ward data to predict the next day’s ward occupancy, had an MAE value of 0.067, MSE value of 0.009, and RMSE value of 0.094, showing high accuracy. Its R2 score was 0.544, which was 0.240 higher than that of the W30D model (0.304), indicating that the variables in that model explained occupancy reasonably well.

We next compared the performances of the 8 models for RBOR prediction. Among them, the RS7D model based on Bi-LSTM, which was trained on a 7-day time step by integrating static and dynamic data, showed the best performance. It achieved an MAE value of 0.129, MSE value of 0.050, RMSE value of 0.227, and R2 score of 0.260. In particular, its R2 score was 0.014 higher than that of the R3D model. These data are summarized in Table 4. Regarding the WBOR prediction model, the model with the shorter training unit, W7D, demonstrated better performance. However, regarding the RBOR prediction model, the model with the longer training unit of 7 days, which incorporated detailed room-specific information, exhibited slightly higher performance than the model with the shorter training unit of 3 days. The model with the added room-specific information still demonstrated superior performance overall.

We visualized the predicted and actual occupancy for the Bi-LSTM models and investigated the occupancy trends since July 2022 on our test data set. First, we selected a specific ward in W7D to demonstrate the change in the WBOR over 2 months (left panel of Figure 5). The right panel of Figure 5 shows the WBOR change over 5 months from July 2022 in W30D. The blue line represents the actual occupancy value, and the red line represents the occupancy value predicted by the model. This provides an at-a-glance view of the overall predicted occupancy level for each month and allows hospital staff to observe trends to obtain a rough understanding of the WBOR.

Figure 6 shows graphs of occupancy rate values for a randomly selected room, displaying the predicted and actual values for the 4 RBOR prediction models, with 2 graphs for each model. The left graph shows the occupancy rate change over 5 months from July to November 2022, and the right graph shows the occupancy rate for the months of July and August, providing a detailed view of the RBOR. By examining the trends of the predicted and actual values for the 4 models in this period for a specific room, we can observe that the models maintain a similar trend to the actual occupancy rate.

Table 4. Performances of the occupancy prediction models.
Model and fold | MAE^a | MSE^b | RMSE^c | R2 score

aMAE: mean absolute error.

bMSE: mean square error.

cRMSE: root mean square error.

dLSTM: long short-term memory.

eBi-LSTM: bidirectional long short-term memory.

fW30D: ward 30 days.

gW7D: ward 7 days.

hR7D: room 7 days.

iR3D: room 3 days.

jRS7D: room static 7 days.

kRS3D: room static 3 days.

Figure 5. Examples of the predicted and actual bed occupancy rates for the 2-month period from July to August 2022 for ward 7 days and the 5-month period from July to November 2022 for ward 30 days.
Figure 6. Examples of the predicted and actual bed occupancy rates for the 2-month period from July to August 2022 and the 5-month period from July to November 2022.

Principal Findings

The entire data set of this study consisted of administrative data collected at AMC at hourly intervals for each ward from May 27, 2020, to November 21, 2022. To address the hospital’s bed management challenges, we developed a model to predict the occupancy rate of wards and rooms. Our aim was to contribute toward administrative and financial planning for bed management within the hospital.

For the specified period, we compared the results of using DL models to predict the overall BOR for each ward and for individual rooms. In the case of WBOR prediction, the MAE of the 7-day window model based on Bi-LSTM was approximately 0.067, differing by approximately 0.035 from that of the 30-day window model based on LSTM and indicating predictions remarkably close to the actual occupancy. Furthermore, the MSE and RMSE were 0.009 and 0.094, respectively, indicating high accuracy in the predictions. Moreover, the R2 score of 0.544 indicated that the model had better explanatory potential than simply predicting the average occupancy. For the individual RBOR prediction, among the 8 models, the RS7D model based on Bi-LSTM performed the best, exhibiting an MAE of approximately 0.129, which was remarkably lower than that of the other models. Moreover, the MSE and RMSE were lower than those of the other RBOR models, with differences of 0.042 and 0.07, respectively. The R2 score of 0.260 indicated that it had higher explanatory potential than the RS3D model based on LSTM, with the value being higher by 0.291.

Finally, we visualized the predicted and actual values on a graph for a specific period and observed that each model captured the trend of the actual BOR quite well. Although the models were less accurate in predicting low occupancy periods, they followed the general trend closely. Overall, these findings demonstrate that our DL models effectively predicted BORs for both wards and individual rooms, with certain models demonstrating superior performance in different scenarios.

Strengths and Limitations

Although the models in this study followed the trends of BORs well, this research had several limitations. First, there were limitations in the data. Although we used administrative data and detailed room information available from the hospital to enable the models to capture occupancy trends, the relationship between the variables and the model’s explanatory potential showed room for improvement, as indicated by the R2 score. To achieve higher prediction accuracy, it would be beneficial to incorporate diverse data sources and real-time updated information.

Second, there was variability in external factors. Hospital BORs are heavily influenced by external environmental factors. Sudden events, such as environmental changes and outbreaks of infectious diseases like COVID-19, can render accurate prediction of bed occupancy challenging [18,32]. Furthermore, seasonal effects and accidents can increase the number of patients. Accounting for these external factors would require sufficient collection of long-term data, and such uncertainties can reduce the accuracy of predictions.

Despite these limitations, our study demonstrated a significant level of adherence to trends in the prediction of individual ward and room occupancy. More detailed variables and a longer period of data accumulation would be required to predict the specific number of beds.


We presented models that can predict the occupancy rates of wards and individual hospital rooms using artificial neural networks based on time-series data. The predicted results of these models demonstrated a high level of accuracy in capturing the future trends of the BOR. In particular, we presented 8 RBOR models with structure and window changes to compare their performance and found that the RS7D model showed the best performance. Our results can be implemented as a web application on hospital online dashboards, as depicted in Figure 1 [25]. In fact, Johns Hopkins University has been applying these methods in their command center to monitor hospital capacity and achieve effectiveness in patient management planning [39].

Furthermore, predicting BORs supports patient admission and discharge planning, helping to alleviate overcrowding in emergency departments and reduce patient waiting times. By understanding the BOR, staff members can effectively schedule patient admission and discharge, minimize waiting times, and provide urgent treatment to emergency patients. Moreover, providing appropriate information to patients waiting in the emergency department can increase patient satisfaction and facilitate efficient transition to hospital admission [40,41]. By combining BOR prediction, which contributes toward reducing emergency department waiting times, with AI models for individual patient admission and discharge prediction, hospitals can achieve resource optimization and cost savings, resulting in improved patient satisfaction.


This work was supported by a Korea Medical Device Development Fund grant funded by the Korean government (the Ministry of Science and ICT; the Ministry of Trade, Industry and Energy; the Ministry of Health & Welfare, Republic of Korea; the Ministry of Food and Drug Safety) (project number: 1711195603, RS-2020-KD000097, 50%) and by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HR20C0026).

Conflicts of Interest

None declared.

  1. Reuben DB, Cassel CK. Physician stewardship of health care in an era of finite resources. JAMA. Jul 27, 2011;306(4):430-431.
  2. National Health Expenditure Projections 2011-2021. Centers for Medicare and Medicaid Services. URL: [accessed 2024-02-21]
  3. The world health report 2000 - Health systems: improving performance. World Health Organisation. URL: https:/​/cdn.​​media/​docs/​default-source/​health-financing/​whr-2000.​pdf?sfvrsn=95d8b803_1&download=true [accessed 2024-02-21]
  4. Kabene SM, Orchard C, Howard JM, Soriano MA, Leduc R. The importance of human resources management in health care: a global context. Hum Resour Health. Jul 27, 2006;4(1):20.
  5. Page K, Barnett AG, Graves N. What is a hospital bed day worth? A contingent valuation study of hospital Chief Executive Officers. BMC Health Serv Res. Feb 14, 2017;17(1):137.
  6. Keegan AD. Hospital bed occupancy: more than queuing for a bed. Med J Aust. Sep 06, 2010;193(5):291-293.
  7. Kaier K, Mutters N, Frank U. Bed occupancy rates and hospital-acquired infections--should beds be kept empty? Clin Microbiol Infect. Oct 2012;18(10):941-945.
  8. Anderson D. The impact of resource management on hospital efficiency and quality of care. University of Maryland. 2013. URL: [accessed 2024-02-21]
  9. Kutafina E, Bechtold I, Kabino K, Jonas SM. Recursive neural networks in hospital bed occupancy forecasting. BMC Med Inform Decis Mak. Mar 07, 2019;19(1):39.
  10. Baas S, Dijkstra S, Braaksma A, van Rooij P, Snijders FJ, Tiemessen L, et al. Real-time forecasting of COVID-19 bed occupancy in wards and Intensive Care Units. Health Care Manag Sci. Jun 25, 2021;24(2):402-419.
  11. Esteban C, Staeck O, Baier S, Yang Y, Tresp V. Predicting Clinical Events by Combining Static and Dynamic Information Using Recurrent Neural Networks. 2016. Presented at: 2016 IEEE International Conference on Healthcare Informatics (ICHI); October 4-7, 2016;93-101; Chicago, IL.
  12. Mackay M, Lee M. Using Compartmental Models to Predict Hospital Bed Occupancy. Semantic Scholar. URL: https:/​/www.​​paper/​Using-Compartmental-Models-to-Predict-Hospital-Bed-Mackay-Lee/​f2b32e60df7dd80bd48 e8ccd0af920134d1452c5?p2df [accessed 2024-02-21]
  13. Littig SJ, Isken MW. Short term hospital occupancy prediction. Health Care Manag Sci. Feb 28, 2007;10(1):47-66.
  14. Kumar A, Mo J. Models for Bed Occupancy Management of a Hospital in Singapore. In: Proceedings of the 2010 International Conference on Industrial Engineering and Operations Management. 2010. Presented at: 2010 International Conference on Industrial Engineering and Operations Management; January 9-10, 2010; Dhaka, Bangladesh.
  15. Seematter-Bagnoud L, Fustinoni S, Dung D, Santos-Eggimann B, Koehn V, Bize R, et al. Comparison of different methods to forecast hospital bed needs. European Geriatric Medicine. Jun 2015;6(3):262-266.
  16. Farmer RD, Emami J. Models for forecasting hospital bed requirements in the acute sector. J Epidemiol Community Health. Dec 01, 1990;44(4):307-312.
  17. Kim K, Lee C, O’Leary KJ, Rosenauer S, Mehrotra S. Predicting Patient Volumes in Hospital Medicine: A Comparative Study of Different Time Series Forecasting Methods. Northwestern University. URL: [accessed 2024-02-21]
  18. Rosenbaum L. Facing Covid-19 in Italy - Ethics, Logistics, and Therapeutics on the Epidemic's Front Line. N Engl J Med. May 14, 2020;382(20):1873-1875.
  19. Bouhamed H, Hamdi M, Gargouri R. Covid-19 Patients’ Hospital Occupancy Prediction During the Recent Omicron Wave via some Recurrent Deep Learning Architectures. Int. J. Comput. Commun. Control. Mar 14, 2022;17(3):4697.
  20. Bekker R, Uit Het Broek M, Koole G. Modeling COVID-19 hospital admissions and occupancy in the Netherlands. Eur J Oper Res. Jan 01, 2023;304(1):207-218.
  21. Farcomeni A, Maruotti A, Divino F, Jona-Lasinio G, Lovison G. An ensemble approach to short-term forecast of COVID-19 intensive care occupancy in Italian regions. Biom J. Mar 30, 2021;63(3):503-513.
  22. Caro JJ, Möller J, Santhirapala V, Gill H, Johnston J, El-Boghdadly K, et al. Predicting Hospital Resource Use During COVID-19 Surges: A Simple but Flexible Discretely Integrated Condition Event Simulation of Individual Patient-Hospital Trajectories. Value Health. Nov 2021;24(11):1570-1577.
  23. Schmidt R, Geisler S, Spreckelsen C. Decision support for hospital bed management using adaptable individual length of stay estimations and shared resources. BMC Med Inform Decis Mak. Jan 07, 2013;13:3.
  24. Hancock WM, Walter PF. The use of computer simulation to develop hospital systems. SIGSIM Simul. Dig. Jul 1979;10(4):28-32.
  25. Shahpori R, Gibney N, Guebert N, Hatcher C, Zygun D. An on-line dashboard to facilitate monitoring of provincial ICU bed occupancy in Alberta, Canada. JHA. Oct 10, 2013;3(1):47.
  26. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. Oct 9, 1986;323(6088):533-536.
  27. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. Nov 15, 1997;9(8):1735-1780.
  28. Schuster M, Paliwal K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997;45(11):2673-2681.
  29. Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv. 2014. URL: [accessed 2024-02-21]
  30. Luong MT, Pham H, Manning CD. Effective Approaches to Attention-based Neural Machine Translation. arXiv. 2015. URL: [accessed 2024-02-21]
  31. Leontjeva A, Kuzovkin I. Combining Static and Dynamic Features for Multivariate Sequence Classification. 2016. Presented at: 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (DSAA); October 17-19, 2016;21-30; Montreal, QC.
  32. Vincent J, Creteur J. Ethical aspects of the COVID-19 crisis: How to deal with an overwhelming shortage of acute beds. Eur Heart J Acute Cardiovasc Care. Apr 29, 2020;9(3):248-252.
  33. Vakharia V, Shah M, Nair P, Borade H, Sahlot P, Wankhede V. Estimation of Lithium-ion Battery Discharge Capacity by Integrating Optimized Explainable-AI and Stacked LSTM Model. Batteries. Feb 09, 2023;9(2):125.
  34. Joshi S, Owens JA, Shah S, Munasinghe T. Analysis of Preprocessing Techniques, Keras Tuner, and Transfer Learning on Cloud Street image data. 2021. Presented at: IEEE International Conference on Big Data (Big Data); December 15-18, 2021; Orlando, FL.
  35. Jung Y. Multiple predicting K-fold cross-validation for model selection. Journal of Nonparametric Statistics. Nov 21, 2017;30(1):197-215.
  36. Nair P, Vakharia V, Borade H, Shah M, Wankhede V. Predicting Li-Ion Battery Remaining Useful Life: An XDFM-Driven Approach with Explainable AI. Energies. Jul 31, 2023;16(15):5725.
  37. Seo H, Ahn I, Gwon H, Kang HJ, Kim Y, Cho HN, et al. Prediction of hospitalization and waiting time within 24 hours of emergency department patients with unstructured text data. Health Care Manag Sci. Nov 03, 2023.:09660-5.
  38. Deng A. Time series cross validation: A theoretical result and finite sample performance. Economics Letters. Dec 2023;233:111369.
  39. Martinez DA, Kane EM, Jalalpour M, Scheulen J, Rupani H, Toteja R, et al. An Electronic Dashboard to Monitor Patient Flow at the Johns Hopkins Hospital: Communication of Key Performance Indicators Using the Donabedian Model. J Med Syst. Jun 18, 2018;42(8):133.
  40. Gartner D, Padman R. Machine learning for healthcare behavioural OR: Addressing waiting time perceptions in emergency care. Journal of the Operational Research Society. Apr 15, 2019;71(7):1087-1101.
  41. Welch SJ. Twenty years of patient satisfaction research applied to the emergency department: a qualitative review. Am J Med Qual. Dec 04, 2010;25(1):64-72.

AI: artificial intelligence
AMC: Asan Medical Center
Bi-LSTM: bidirectional long short-term memory
BOR: bed occupancy rate
DL: deep learning
DNN: deep neural network
LeakyReLU: leaky rectified linear unit
LSTM: long short-term memory
MAE: mean absolute error
ML: machine learning
MSE: mean square error
R3D: room 3 days
R7D: room 7 days
RBOR: room bed occupancy rate
RMSE: root mean square error
RNN: recurrent neural network
RS3D: room static 3 days
RS7D: room static 7 days
W7D: ward 7 days
W30D: ward 30 days
WBOR: ward bed occupancy rate

Edited by C Lovis; submitted 05.10.23; peer-reviewed by V Vakharia, T Leili; comments to author 10.11.23; revised version received 20.12.23; accepted 16.02.24; published 21.03.24.


©Hyeram Seo, Imjin Ahn, Hansle Gwon, Heejun Kang, Yunha Kim, Heejung Choi, Minkyoung Kim, Jiye Han, Gaeun Kee, Seohyun Park, Soyoung Ko, HyoJe Jung, Byeolhee Kim, Jungsik Oh, Tae Joon Jun, Young-Hak Kim. Originally published in JMIR Medical Informatics, 21.03.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.