Published in Vol 7, No 1 (2019): Jan-Mar

The Connected Intensive Care Unit Patient: Exploratory Analyses and Cohort Discovery From a Critical Care Telemedicine Database

Original Paper

1College of Engineering, The University of Arizona, Tucson, AZ, United States

2College of Medicine - Tucson, The University of Arizona, Tucson, AZ, United States

3Division of Pulmonary, Allergy, Critical Care, and Sleep, Department of Medicine, The University of Arizona, Tucson, AZ, United States

4Department of Emergency Medicine, The University of Arizona, Tucson, AZ, United States

5Department of Systems and Industrial Engineering, The University of Arizona, Tucson, AZ, United States

6Department of Biomedical Engineering, The University of Arizona, Tucson, AZ, United States

Corresponding Author:

Patrick Essay, MS

College of Engineering

The University of Arizona

1127 East James E Rogers Way

Tucson, AZ, 85721-0020

United States

Phone: 1 4024305524


Background: Many intensive care units (ICUs) utilize telemedicine in response to an expanding critical care patient population, off-hours coverage, and intensivist shortages, particularly in rural facilities. Advances in digital health technologies, among other reasons, have led to the integration of active, well-networked critical care telemedicine (tele-ICU) systems across the United States, which in turn, provide the ability to generate large-scale remote monitoring data from critically ill patients.

Objective: The objective of this study was to explore opportunities and challenges of utilizing multisite, multimodal data acquired through critical care telemedicine. Using a publicly available tele-ICU, or electronic ICU (eICU), database, we illustrated the quality and potential uses of remote monitoring data, including cohort discovery for secondary research.

Methods: Exploratory analyses were performed on the eICU Collaborative Research Database that includes deidentified clinical data collected from adult patients admitted to ICUs between 2014 and 2015. Patient and ICU characteristics, top admission diagnoses, and predictions from clinical scoring systems were extracted and analyzed. Additionally, a case study on respiratory failure patients was conducted to demonstrate research prospects using tele-ICU data.

Results: The eICU database spans more than 200 hospitals and over 139,000 ICU patients across the United States with wide-ranging clinical data and diagnoses. Although mixed medical-surgical ICU was the most common critical care setting, patients with cardiovascular conditions accounted for more than 20% of ICU stays, and those with neurological or respiratory illness accounted for nearly 15% of ICU stays. The case study on respiratory failure patients showed that cohort discovery using the eICU database can be highly specific, albeit potentially limiting in terms of data provenance and sparsity for certain types of clinical questions.

Conclusions: Large-scale remote monitoring data sources, such as the eICU database, have a strong potential to advance the role of critical care telemedicine by serving as a testbed for secondary research as well as for developing and testing tools, including predictive and prescriptive analytical solutions and decision support systems. The resulting tools will also inform coordination of care for critically ill patients, intensivist coverage, and the overall process of critical care telemedicine.

JMIR Med Inform 2019;7(1):e13006

Introduction



Critical care telemedicine, or tele-ICU, is broadly defined as a collaborative, interprofessional care model for critically ill patients where the bedside intensive care unit (ICU) team and patient are networked to a centralized and often remotely located critical care team using telecommunication and computer systems [1,2]. Applications of tele-ICU include quality improvement, continuous monitoring of patients for early warning of deterioration, and varying degrees of clinical decision support, interventions, and consultations [3,4]. Although there exist several tele-ICU models [5,6], we refer to tele-ICU in the context of continuous patient monitoring and subsequent data generation from application of telemedicine in intensive care settings as opposed to more active models involving computer-generated alerts or those with interventions such as audio and video consultations.

Advances in data management infrastructure, biomedical sensors and devices, and computational methods, coupled with the current trend of consolidation of hospitals into large health care delivery systems, provide unique opportunities for not only enhancing tele-ICU capabilities to improve patient, physician, and system-level outcomes but also leveraging tele-ICU data for research and evaluation purposes. The full benefit of the influx of tele-ICU data, however, has yet to be realized.

The objective of this study was to explore opportunities and challenges of using multisite, multimodal data acquired through critical care telemedicine. Using a publicly available tele-ICU database (eICU Collaborative Research Database), we illustrate the quality and potential uses of remote monitoring data [7]. In addition, we present a case study on extraction of multiple respiratory failure patient cohorts to illustrate various strengths and limitations of the database. Specifically, we present 3 patient cohorts—endotracheal intubation patients, patients requiring other noninvasive ventilation therapy, and patients with both invasive and noninvasive treatments in the same visit—and attempt to generate relevant questions for further research.

Methods

The electronic ICU (eICU) database consists of deidentified data collected from patients admitted to adult ICUs between 2014 and 2015. It consists of a wide array of data from admission diagnosis, patient severity scores, standard and custom lab values, nurse charting, physiological data, and treatment records through discharge status. Clinical scores in the database include the Acute Physiology Score (APS) III and the Acute Physiology and Chronic Health Evaluation (APACHE) IV and IVa, both of which are examples of existing instruments that have been widely used in critical care settings for assessment of disease severity and outcome prediction [8].

Hospital data were extracted along with patient demographics, diagnoses, length of stay and mortality outcomes, and treatment records. APACHE IVa severity scores and prediction values were also extracted. Development of respiratory failure cohorts utilized multiple record types in the database that contain respiratory chart and treatment data. The specific cohorts were created using intubation and ventilator-type records. We then attempted to verify patients that required endotracheal intubation or noninvasive respiratory therapy with a redundant record within the database to validate that patients actually required ventilation. For example, one can confidently say that a patient with a record of endotracheal tube and a treatment record of endotracheal tube insertion was in fact intubated during their ICU stay compared with a patient who has ventilator setting records but no other indication of airway type or noninvasive respiratory therapy.
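The redundancy check described above can be illustrated with a minimal sketch. It is not the extraction code used for this study: the field names (`stay_id`, `treatment`, `airway_type`), the matching strings, and the toy rows are all assumptions made for illustration, and plain Python is used instead of Pandas so that the example is self-contained.

```python
# Toy rows standing in for two record types; field names and values are
# illustrative assumptions, not the actual eICU schema.
treatment_records = [
    {"stay_id": 1, "treatment": "endotracheal tube insertion"},
    {"stay_id": 2, "treatment": "mechanical ventilation"},
    {"stay_id": 3, "treatment": "endotracheal tube insertion"},
]
airway_records = [
    {"stay_id": 1, "airway_type": "endotracheal tube"},
    {"stay_id": 2, "airway_type": None},  # ventilator settings only
    {"stay_id": 4, "airway_type": "endotracheal tube"},
]


def stays_matching(records, field, keyword):
    """Return the set of stay ids whose `field` text contains `keyword`."""
    return {
        r["stay_id"]
        for r in records
        if r.get(field) and keyword in r[field].lower()
    }


treated = stays_matching(treatment_records, "treatment", "endotracheal tube insertion")
charted = stays_matching(airway_records, "airway_type", "endotracheal tube")

# Stays supported by both record types versus stays seen in only one source.
confirmed = treated & charted
unconfirmed = (treated | charted) - confirmed

print(sorted(confirmed))
print(sorted(unconfirmed))
```

Keeping the single-source stays in a separate set, rather than discarding them, makes it possible to report how much the cohort shrinks when redundant evidence is required.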

Data from the eICU database were extracted and preprocessed in Python version 2.7.14 using the Pandas [9] and Seaborn libraries [10], versions 0.23.4 and 0.9.0, respectively. A complete evaluation of all data tables in the eICU database is available [11].

Results

Participant Characteristics

The eICU database consists of 200,859 adult ICU stays at 208 hospitals, representing 139,367 unique patients with nearly equal numbers of male and female patients. The majority of patients are white. A high-level overview of the database is shown in Figure 1, and additional patient characteristics are presented in Table 1.

Figure 1. Infographic overview of the eICU Collaborative Research Database. ICU: intensive care unit.

Table 1. Basic patient characteristics in the critical care telemedicine (tele-ICU) database.

| Characteristic | Value |
|---|---|
| Unique patients, n^a | 139,367 |
| Distinct ICU^b admissions, n | 200,859 |
| Age, years, mean (SD) | 62.1 (16.7) |
| Gender, n (%) | |
| Male | 108,379 (53.96) |
| Female | 92,303 (45.95) |
| Other | 134 (0.09) |
| Race, n (%) | |
| White | 155,285 (77.31) |
| African American | 21,308 (10.61) |
| Hispanic | 7464 (3.72) |
| Asian | 3270 (1.63) |
| Native American | 1700 (0.85) |
| Unknown or unspecified | 11,832 (5.90) |
| ICU length of stay in days, mean (IQR^c) | 3.00 (2.31) |
| ICU mortality, % of ICU admissions | 5.79 |
| Hospital length of stay in days, mean (IQR) | 8.06 (7.04) |
| Hospital mortality, % of admissions | 9.24 |

^a Not applicable.

^b ICU: intensive care unit.

^c IQR: interquartile range.

The ICU types covered in the database are wide ranging, with mixed medical-surgical ICU as the most common critical care setting (Figure 2). This is likely because of the configuration and workflow of ICUs within each hospital. Most hospitals in the eICU database are nonteaching hospitals, and they are located across most of the United States (Figure 3).

There were 431 admission diagnoses with several additional diagnosis records in the database that provide context and higher granularity to the reasons for admission. Patients with cardiovascular conditions accounted for more than 20% of ICU stays, and those with neurological or respiratory illness accounted for nearly 15% of ICU stays. Table 2 shows further details on the most frequent admission diagnoses by number of ICU stays and the associated percentage of the total visits in the database, with corresponding mortality rates and average ICU length of stay.
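A per-diagnosis summary of this kind (number of stays, share of all stays, average length of stay, and mortality) can be computed with a simple grouping step. The sketch below is illustrative only: the field names and the four toy stays are invented, and the mortality denominator (all stays in the group) is an assumption rather than the definition used for Table 2.

```python
from collections import defaultdict

# Toy ICU stays; field names and values are made up for illustration.
stays = [
    {"diagnosis": "Cardiovascular", "icu_los_days": 3.0, "died": False},
    {"diagnosis": "Cardiovascular", "icu_los_days": 2.0, "died": True},
    {"diagnosis": "Respiratory", "icu_los_days": 4.0, "died": False},
    {"diagnosis": "Neurologic", "icu_los_days": 1.0, "died": False},
]


def summarize_by_diagnosis(stays):
    """Group stays by admission diagnosis and compute summary statistics."""
    groups = defaultdict(list)
    for stay in stays:
        groups[stay["diagnosis"]].append(stay)

    total = len(stays)
    rows = []
    for name, members in groups.items():
        n = len(members)
        deaths = sum(1 for m in members if m["died"])
        rows.append({
            "diagnosis": name,
            "n_stays": n,
            "pct_of_stays": 100.0 * n / total,
            "mean_los_days": sum(m["icu_los_days"] for m in members) / n,
            "deaths": deaths,
            # Assumption: mortality is reported relative to all stays in the group.
            "mortality_pct": 100.0 * deaths / n,
        })
    # Most frequent diagnoses first.
    rows.sort(key=lambda r: r["n_stays"], reverse=True)
    return rows


summary = summarize_by_diagnosis(stays)
for row in summary:
    print(row)
```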

Figure 2. Frequency of admission to each intensive care unit type within the eICU Collaborative Research Database. ICU: intensive care unit; Med-Surg ICU: medical surgical ICU; CTICU: cardiothoracic ICU; SICU: surgical ICU; CCU-CTICU: coronary care/CTICU ICU; MICU: medical ICU; Neuro ICU: neurological ICU; Cardiac ICU: cardiological ICU; CSICU: cardiac surgery ICU.
Figure 3. (a) Hospital distribution by size and associated teaching status (b) hospital distribution by United States region and associated teaching status.

Table 2. Most frequent admission diagnosis categories with corresponding intensive care unit (ICU) mortality rate and average ICU length of stay.

| Admission diagnosis name | ICU stays, n (%) | Average length of stay, days | Mortality, n (%) |
|---|---|---|---|
| Cardiovascular | 79,560 (20.6) | 2.97 | 4861 (7.33) |
| Neurologic | 31,113 (8.07) | 2.83 | 949 (3.64) |
| Respiratory | 25,813 (6.69) | 3.68 | 1408 (7.00) |
| Gastrointestinal | 17,726 (4.60) | 2.95 | 681 (4.63) |
| Sepsis, pulmonary | 8862 (2.30) | 4.31 | 904 (12.26) |
| Metabolic or endocrine | 8025 (2.08) | 1.88 | 72 (1.06) |
| Infarction, acute myocardial | 7228 (1.87) | 2.09 | 180 (2.93) |
| Trauma | 7136 (1.85) | 3.59 | 303 (5.01) |
| Cerebrovascular accident or stroke | 6647 (1.72) | 2.77 | 290 (5.20) |
| Congestive heart failure | 6617 (1.72) | 3.13 | 302 (5.67) |

Table 3. Overview of Acute Physiology Score and Acute Physiology and Chronic Health Evaluation (APACHE) scores in the tele-ICU (critical care telemedicine) database with APACHE IVa predictions.

| Characteristic | Predicted^a | Actual |
|---|---|---|
| Acute Physiology Score, mean (IQR^b) | 43.63 (27.00) | N/A^c |
| APACHE Score, mean (IQR) | 55.49 (31.00) | |
| Intensive care unit (ICU) length of stay in days, mean (IQR) | 3.87 (3.02) | 3.00 (2.31) |
| ICU mortality, % of ICU admissions^d | 5.49 | 5.79 |
| Hospital length of stay in days, mean (IQR) | 9.44 (5.88) | 8.06 (7.04) |
| Hospital mortality, % of admissions^d | 3.84 | 9.24 |

^a Prediction values taken from APACHE version IVa.

^b IQR: interquartile range.

^c N/A: not applicable.

^d Predicted ICU and hospital mortality values are the averages of percent chance of dying of all patients.

Severity and Predictive Scoring Systems

Severity of illness and prognosis are captured in the eICU database as a function of the APACHE IVa score, with 288,090 entries. The APACHE evaluation also provides predictions of patient outcomes soon after ICU admission, including probability of mortality, length of stay, and ventilation days, and is used in conjunction with APS. An overview of APS and APACHE scores is presented in Table 3. The distribution of patient severity within the eICU database as a function of APACHE IVa, stratified by discharge status of alive or expired, is shown in Figure 4.

The APACHE mortality prediction distributions were normalized and segregated by discharge status as shown in Figure 5. This illustrates existing model deficiencies where predicted mortality is not reliable at higher severities [12,13]. Although the predictions for the survivors are reasonably accurate, the predictions for nonsurvivors are not. We include this to illustrate that although predictive models are useful in certain situations, they may not perform well in others because of the dynamics involved or issues with source data [14]. These results are consistent with evaluations of earlier versions of APACHE predictions [15] and are an area of improvement for tele-ICU to provide the best possible decision support for the fast-paced ICU environment.
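As a rough illustration of this kind of stratified comparison, the sketch below splits predicted mortality by actual outcome, reports the mean prediction per group, and evaluates a hand-written Gaussian kernel density estimate for each group. The seven data points, the bandwidth of 0.1, and the function name `gaussian_kde` are assumptions chosen for the example; the figures in this paper were produced with Seaborn, not with this code.

```python
import math
from statistics import mean

# Toy (predicted hospital mortality, actual outcome) pairs; values are invented.
patients = [
    (0.02, "alive"), (0.05, "alive"), (0.10, "alive"), (0.20, "alive"),
    (0.15, "expired"), (0.40, "expired"), (0.70, "expired"),
]


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        ) / norm

    return density


# Split predictions by the observed discharge status.
by_outcome = {}
for predicted, outcome in patients:
    by_outcome.setdefault(outcome, []).append(predicted)

grid = [i / 100 for i in range(101)]  # predicted probabilities 0.00 to 1.00
curves = {}
for outcome, preds in by_outcome.items():
    kde = gaussian_kde(preds, bandwidth=0.1)  # bandwidth chosen arbitrarily
    curves[outcome] = [kde(x) for x in grid]
    print(outcome, "mean predicted mortality:", round(mean(preds), 3))
```

Because each density integrates to 1 over the real line, the two curves are comparable even when the groups differ greatly in size, which is the situation for survivors and nonsurvivors.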

Figure 4. Kernel density estimate (KDE) of Acute Physiology and Chronic Health Evaluation (APACHE) IVa scores within the eICU Collaborative Research Database stratified by actual intensive care unit mortality outcome.
Figure 5. Kernel density estimate (KDE) of predicted hospital mortality in the eICU Collaborative Research Database stratified by actual intensive care unit (ICU) mortality outcome.

Case Study on Respiratory Failure Patients

The selected respiratory failure patient cohorts and corresponding number of patients within each group developed from treatment records are shown in Figure 6. Possible noninvasive ventilation therapy failure was determined using treatment timestamps. Many endotracheal intubation records correspond to continuous positive airway pressure (CPAP) and positive end expiratory pressure (PEEP) records at the same time. However, it is possible that patients with intubation treatment recorded after CPAP or PEEP treatment required intubation after failure of noninvasive respiratory therapy.
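The timestamp comparison described above can be sketched as follows. The treatment labels, the convention that offsets are minutes from ICU admission, and the category names are assumptions made for this example; equal timestamps are treated as concurrent records rather than as failure of noninvasive therapy.

```python
# Toy treatment rows: (stay_id, minutes from ICU admission, treatment label).
# Labels and the offset convention are illustrative assumptions.
treatments = [
    (1, 30, "CPAP"),
    (1, 240, "endotracheal tube insertion"),
    (2, 60, "endotracheal tube insertion"),
    (2, 60, "PEEP"),
    (3, 15, "CPAP"),
]

CPAP_PEEP = {"CPAP", "PEEP"}
INTUBATION = "endotracheal tube insertion"


def first_offsets(rows):
    """Earliest CPAP/PEEP offset and earliest intubation offset per stay."""
    first = {}
    for stay_id, offset, label in rows:
        if label in CPAP_PEEP:
            key = "cpap_peep"
        elif label == INTUBATION:
            key = "intubation"
        else:
            continue  # unrelated treatment
        entry = first.setdefault(stay_id, {"cpap_peep": None, "intubation": None})
        if entry[key] is None or offset < entry[key]:
            entry[key] = offset
    return first


def classify(entry):
    """Label a stay from the order of its first CPAP/PEEP and intubation records."""
    niv, tube = entry["cpap_peep"], entry["intubation"]
    if niv is not None and tube is not None:
        # Intubation strictly after CPAP/PEEP suggests possible failure of
        # noninvasive therapy; it does not prove it.
        return "possible NIV failure" if tube > niv else "concurrent or intubated first"
    if tube is not None:
        return "intubation only"
    return "noninvasive only"


labels = {stay: classify(entry) for stay, entry in first_offsets(treatments).items()}
print(labels)
```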

As a demonstration of database coverage and specificity within a particular patient cohort, we selected the 1004 patients who have definitive records of both endotracheal tube insertion and removal. Using the associated treatment timestamps for tube insertion and removal, we estimated intubation times and examined the distribution of admission diagnoses across the same cohort (Figures 7 and 8).
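One way to estimate intubation time from paired insertion and removal records is sketched below. The event labels, the offsets in minutes from ICU admission, and the rule of pairing the first insertion with the first later removal are assumptions for illustration; stays lacking either record are excluded, which mirrors the restriction to patients with both records.

```python
# Toy treatment rows: (stay_id, offset in minutes from ICU admission, event).
events = [
    (10, 120, "insertion"),
    (10, 3000, "removal"),
    (11, 45, "insertion"),   # no removal record: excluded from the estimate
    (12, 500, "removal"),    # no insertion record: excluded
    (13, 60, "insertion"),
    (13, 780, "removal"),
]


def intubation_hours(events):
    """Hours between the first insertion and the first later removal, per stay."""
    insertions, removals = {}, {}
    for stay_id, offset, kind in events:
        target = insertions if kind == "insertion" else removals
        target.setdefault(stay_id, []).append(offset)

    durations = {}
    for stay_id, inserted in insertions.items():
        start = min(inserted)
        later = [r for r in removals.get(stay_id, []) if r > start]
        if later:
            durations[stay_id] = (min(later) - start) / 60.0
    return durations


durations = intubation_hours(events)
print(durations)
```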

Figure 6. Number of patients with particular respiratory-type treatment records in the eICU database. CPAP: continuous positive airway pressure; PEEP: positive end expiratory pressure.
Figure 7. Kernel density estimate (KDE) of intubation times for patients with endotracheal tube insertion and removal.
Figure 8. Fifteen most frequent admission diagnoses for patients with record of endotracheal tube insertion and tube removal. AMI: acute myocardial infarction; CABG: coronary artery bypass grafting; CHF: congestive heart failure.

Discussion

Principal Findings

Investigation of the eICU Collaborative Research Database shows a wide range of illnesses from a large number of hospitals that span the continental United States. Organized as a relational database, it is highly versatile for narrowing research focus to specific critical care patient populations, and it allows for robust and generalizable analysis and modeling across multiple institutions and regions. The case study on respiratory failure patients illustrates the potential for cohort discovery and analysis of specific patient subgroups (see Figure 6) using unique identifiers across the database, coupled with the ability to query multiple record types such as treatment records, respiratory, medication, or laboratory data. For example, if using treatment records, one would find 8565 unique patients that required endotracheal intubation. If searching for patients with distinct records of both endotracheal tube insertion and removal treatments, the available cohort is limited to 1004 patients. Any combination of these data with other record types may limit or extend cohort size further.

Although the eICU database provides real-world critical care data from a diverse sample of hospitals and practice settings to evaluate interventions, there are some limitations to consider. First, the granularity of the data can be limiting, given the nature of data collection. For example, despite continuous collection of hemodynamic data, interventions such as tracheal intubation may be recorded with a margin of error because of a requirement for manual entry of events into the electronic medical record. Narrowing the window between when the intubation was performed and when the event was recorded could potentially be accomplished by using drugs associated with intubation. Regardless, this limitation makes studying peri-intubation complications difficult, as one does not know whether a hemodynamic decompensation occurred before or after the intubation procedure. In addition, manually entered data could have deviations based on hospital-specific practices and protocol variations.
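The idea of using intubation-associated drugs to narrow the timing window could look like the following sketch. It is a hypothetical illustration, not a validated method: the drug list (agents commonly given for rapid sequence intubation), the 120-minute tolerance, and the function name `refine_intubation_time` are all assumptions and are not taken from the eICU database or from this study.

```python
# Hypothetical list of drugs commonly given around intubation; an assumption,
# not a list taken from the article or the database.
INTUBATION_DRUGS = {"etomidate", "succinylcholine", "rocuronium"}


def refine_intubation_time(recorded_offset, medications, max_gap_minutes=120):
    """Return the administration time of the closest intubation-associated drug
    within `max_gap_minutes` of the recorded event, else the recorded time.

    `medications` is a list of (offset in minutes, drug name) pairs.
    """
    candidates = [
        offset
        for offset, drug in medications
        if drug.lower() in INTUBATION_DRUGS
        and abs(offset - recorded_offset) <= max_gap_minutes
    ]
    if not candidates:
        return recorded_offset
    return min(candidates, key=lambda t: abs(t - recorded_offset))


# Intubation was charted at minute 150, but induction drugs were given earlier.
meds = [(95, "Etomidate"), (96, "Rocuronium"), (400, "propofol")]
refined = refine_intubation_time(150, meds)
print(refined)
```

Falling back to the recorded time when no matching drug is found keeps every stay in the cohort, at the cost of mixing refined and unrefined timestamps; a stricter analysis might drop such stays instead.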

Second, though the eICU database is considered tele-ICU data, the mode of data collection and the origins of data are not well defined. Specifically, it is not clear which data are generated at the bedside versus the remote unit and by whom. Third, terminology variations across institutions and health information systems pose an additional hurdle. A study of a previous version of eICU data showed discrepancies in standards for laboratory and microbiology data for patients with primary cardiovascular diagnosis [16]. This suggests that cohort discovery on eICU data may also need to be reconfigured based on specific research questions.

Finally, a major caveat to the eICU database is that the absence of a record does not mean an event did not occur. This is true of other similar databases; however, missing records are exacerbated in the eICU database because of data being from many different hospitals, and not all participating hospitals have interfaces in place to record all data types. Although there are methods for handling data sparsity and missing data [17,18], large quantities of missing data could negate the overall benefit of having a large number of hospitals in the database.
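Before relying on a given record type, it can help to check how many hospitals actually contribute it. The sketch below computes, for each record type, the fraction of hospitals with at least one record and the fraction of stays that those hospitals represent. The hospital identifiers, record type names, and counts are invented for illustration.

```python
# Toy data: stays per hospital, and record counts per hospital for each type.
hospital_stays = {"H1": 120, "H2": 80, "H3": 200}
records_per_type = {
    "treatment": {"H1": 300, "H3": 450},
    "respiratory_charting": {"H1": 900, "H2": 700, "H3": 1500},
    "medication": {"H2": 400},
}


def coverage(hospital_stays, records_per_type):
    """Share of hospitals and of stays covered by each record type."""
    total_hospitals = len(hospital_stays)
    total_stays = sum(hospital_stays.values())
    report = {}
    for record_type, counts in records_per_type.items():
        contributing = [h for h in hospital_stays if counts.get(h, 0) > 0]
        report[record_type] = {
            "hospital_fraction": len(contributing) / total_hospitals,
            "stay_fraction": sum(hospital_stays[h] for h in contributing) / total_stays,
        }
    return report


report = coverage(hospital_stays, records_per_type)
for record_type, stats in report.items():
    print(record_type, stats)
```

A low hospital fraction for a record type suggests that missing records reflect absent interfaces rather than absent events, which is the caveat raised above.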

Despite these limitations, the most critical component of future tele-ICU operations and the eICU Collaborative Research Database is that of advanced analytics and clinical decision support. For example, cardiovascular complications arising from traumatic brain injury are common and are linked to increased morbidity and mortality [19]. Generally, monitoring the vital signs of the patient and controlling primary intracranial pathology are effective for proactive prevention of complications. Tele-ICU not only offers continuous display of vitals for remote monitoring but can also serve as a platform to (1) develop, implement, and test clinical and subclinical markers of patient decompensation and other adverse events and (2) further define the role of tele-ICU in improving the precision of electronic alerts [20]. For example, as alarm desensitization creates additional risk, much of the monitoring and resolving of alerts can be shifted to the tele-ICU [21].

Other large, publicly available databases, such as the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) database, have been widely used for secondary research purposes. MIMIC includes inpatient critical care data spanning over 10 years from a single institution. An evaluation of the MIMIC database highlights the successful role of the database in risk assessment, medical personnel performance evaluation, and supporting development of clinical decision support systems [22]. The eICU database has strong potential to advance the role of critical care telemedicine, much as MIMIC has done for bedside, inpatient critical care.

Conclusions


This work, to our knowledge, is the first of its kind to demonstrate the potential and versatility of a publicly available, large, critical care telemedicine database. The ability to extract and analyze wide-ranging patient subgroups from remote monitoring data for secondary research is one of the key strengths of such a resource. As highlighted through the case study on respiratory patients, there are some limitations such as data provenance and sparsity, which are typical of such resources. Nonetheless, tele-ICU data are particularly useful to catalyze efforts around developing robust clinical decision support systems for critical care that can be distributed between bedside and remote care teams, as well as identifying specific patient populations and associated clinical events that would be appropriate for such distributed care. Secondary insults, in particular, stand to benefit from remote monitoring and advanced analytical support because they are highly time sensitive and potentially reversible in critically ill patients, if mitigated promptly. The eICU database and the resulting tools will also inform coordination of care, intensivist coverage, and the overall process of critical care telemedicine.

Acknowledgments


This work was supported in part by the Office of Research, Discovery, Innovation at the University of Arizona, the National Science Foundation under grant #1838745, and the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number 5T32HL007955.

Conflicts of Interest

None declared.

References

  1. Goran SF. A second set of eyes: an introduction to tele-ICU. Crit Care Nurse 2010 Aug;30(4):46-55.
  2. Davis TM, Barden C, Dean S, Gavish A, Goliash I, Goran S, et al. American Telemedicine Association guidelines for teleICU operations. Telemed J E Health 2016 Dec;22(12):971-980.
  3. Kahn JM. The use and misuse of ICU telemedicine. J Am Med Assoc 2011 Jun 01;305(21):2227-2228.
  4. Perednia DA, Allen A. Telemedicine technology and clinical applications. J Am Med Assoc 1995 Feb 08;273(6):483-488.
  5. Ramnath VR, Khazeni N. Centralized monitoring and virtual consultant models of tele-ICU care: a side-by-side review. Telemed J E Health 2014 Oct;20(10):962-971.
  6. Wilcox ME, Adhikari NK. The effect of telemedicine in critically ill patients: systematic review and meta-analysis. Crit Care 2012 Jul 18;16(4):R127.
  7. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 2000 Jun 13;101(23):E215-E220.
  8. Vincent JL, Moreno R. Clinical review: scoring systems in the critically ill. Crit Care 2010;14(2):207.
  9. van der Walt S, Millman J. SciPy. 2007 May. Proceedings of the 9th Python in Science Conference [accessed 2019-01-12]
  10. Waskom M, Botvinnik O, Hobson P, Cole JB, Halchenko Y, Hoyer S, et al. Zenodo. 2018 Jul 16. seaborn: v0.5.0 (November 2014) [accessed 2019-01-10]
  11. Pollard TJ, Johnson AE, Raffa JD, Celi LA, Mark RG, Badawi O. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci Data 2018 Dec 11;5:180178.
  12. Wong LS, Young JD. A comparison of ICU mortality prediction using the APACHE II scoring system and artificial neural networks. Anaesthesia 1999 Nov;54(11):1048-1054.
  13. Einav L, Finkelstein A, Mullainathan S, Obermeyer Z. Predictive modeling of US health care spending in late life. Science 2018 Dec 29;360(6396):1462-1465.
  14. Balkan B, Essay P, Subbian V. Evaluating ICU clinical severity scoring systems and machine learning applications: APACHE IV/IVa case study. In: Proc IEEE Eng Med Biol Soc. 2018 Jul Presented at: IEEE Eng Med Biol Soc; July 18-21, 2018; Honolulu, HI, USA p. 4073-4076.
  15. Goldhill DR, Sumner A. APACHE II, data accuracy and outcome prediction. Anaesthesia 1998 Oct;53(10):937-943.
  16. Chronaki C, Shahin A, Mark R. Designing reliable cohorts of cardiac patients across MIMIC and eICU. Comput Cardiol 2015;42:189-192.
  17. Waljee AK, Mukherjee A, Singal AG, Zhang Y, Warren J, Balis U, et al. Comparison of imputation methods for missing laboratory data in medicine. BMJ Open 2013 Aug 01;3(8).
  18. Vesin A, Azoulay E, Ruckly S, Vignoud L, Rusinovà K, Benoit D, et al. Reporting and handling missing values in clinical studies in intensive care units. Intensive Care Med 2013 Aug;39(8):1396-1404.
  19. Manley G, Knudson MM, Morabito D, Damron S, Erickson V, Pitts L. Hypotension, hypoxia, and head injury: frequency, duration, and consequences. Arch Surg 2001 Oct;136(10):1118-1123.
  20. Celi LA, Hassan E, Marquardt C, Breslow M, Rosenfeld B. The eICU: it's not just telemedicine. Crit Care Med 2001 Aug;29(8 Suppl):N183-N189.
  21. Johnson KR, Hagadorn JI, Sink DW. Alarm safety and alarm fatigue. Clin Perinatol 2017 Sep;44(3):713-728.
  22. Saeed M, Villarroel M, Reisner AT, Clifford G, Lehman LW, Moody G, et al. Multiparameter Intelligent Monitoring in Intensive Care II: a public-access intensive care unit database. Crit Care Med 2011 May;39(5):952-960.

Abbreviations

APACHE: Acute Physiology and Chronic Health Evaluation
APS: Acute Physiology Score
CPAP: continuous positive airway pressure
eICU: electronic intensive care unit
ICU: intensive care unit
IQR: interquartile range
MIMIC: Multiparameter Intelligent Monitoring in Intensive Care
PEEP: positive end expiratory pressure
tele-ICU: critical care telemedicine

Edited by G Eysenbach; submitted 02.12.18; peer-reviewed by T Abdulai, T Aslanidis; comments to author 26.12.18; revised version received 29.12.18; accepted 29.12.18; published 24.01.19


©Patrick Essay, Tala B Shahin, Baran Balkan, Jarrod Mosier, Vignesh Subbian. Originally published in JMIR Medical Informatics, 24.01.2019.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.