This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
Sepsis is one of the leading causes of mortality in hospitalized patients. Despite this fact, a reliable means of predicting sepsis onset remains elusive. Early and accurate sepsis onset predictions could allow more aggressive and targeted therapy while maintaining antimicrobial stewardship. Existing detection methods suffer from low performance and often require time-consuming laboratory test results.
To study and validate a sepsis prediction method,
We apply
In a test dataset with 11.3% sepsis prevalence,
Despite using little more than vitals,
Sepsis and its associated syndromes are among the leading causes of worldwide morbidity and mortality [
A new bedside scoring system to be used outside the ICU, “qSOFA” (for “quick SOFA”), has been proposed as a screening mechanism to prompt the clinician to further investigate for sepsis or to transfer to a higher level of care [
The purpose of this study is to validate the
This work uses the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC)-III version 1.3 dataset [
We collect a variety of data from the MIMIC-III dataset to define sepsis onset and calculate the
We follow the sepsis definition promulgated by Singer et al [
To identify an acute change in SOFA score, we adhere to the definition proposed by Seymour et al. Taking the initial time of the earliest culture draw or antibiotic administration as the time of suspicion of infection, we define a window of up to 48 hours before this time (limited by time of data availability) and 24 hours after this time (limited by time of departure from the ICU). The SOFA score at the beginning of this window is compared with its hourly value throughout this window; if this hourly value is ≥ 2 points higher than the value at the start of the window, we define the first such hour as the onset of sepsis and designate the patient as septic (class 1). If a patient fails to have such an event, we classify them as nonseptic (class 0). If the data required to calculate one of the SOFA subscores is not present in the imputed data, that subscore is given the value 0 (ie, “normal”). We also use a modified version of the SOFA respiration score [
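The onset rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code: it assumes the hourly SOFA subscores over the suspicion window (up to 48 hours before and 24 hours after the time of suspicion) are already available, and the names `total_sofa`, `sepsis_onset_hour`, and `label_stay` are hypothetical.

```python
from typing import Optional, Sequence


def total_sofa(subscores: Sequence[Optional[int]]) -> int:
    """Sum SOFA subscores, scoring any missing subscore (None) as 0 ("normal")."""
    return sum(0 if s is None else s for s in subscores)


def sepsis_onset_hour(hourly_sofa: Sequence[int], threshold: int = 2) -> Optional[int]:
    """Return the index of the first hour whose SOFA score is at least
    `threshold` points above the score at the start of the window,
    or None if no such hour exists."""
    if not hourly_sofa:
        return None
    baseline = hourly_sofa[0]  # SOFA score at the beginning of the window
    for hour, score in enumerate(hourly_sofa):
        if score - baseline >= threshold:
            return hour
    return None


def label_stay(hourly_sofa: Sequence[int]) -> int:
    """Class 1 (septic) if an onset hour exists, otherwise class 0 (nonseptic)."""
    return 1 if sepsis_onset_hour(hourly_sofa) is not None else 0
```

For a window with hourly totals `[1, 1, 2, 3, 4]`, the first hour that is 2 or more points above the starting value of 1 is index 3, so that hour would be taken as onset.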
Windows of suspected infection, as defined by the presence of a culture and antibiotic administration, following Seymour et al [
First event | Window in which second event must occur |
Antibiotics administered | Culture taken in the following 24 hours |
Culture taken | Antibiotics administered in the following 72 hours |
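The paired-event rule in the table above can be expressed as a small search over event times. The sketch below is illustrative only: the function name `suspicion_time` and its parameters are hypothetical, and the two window lengths are passed in explicitly rather than hard-coded, so the caller supplies the values from the table.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def suspicion_time(
    antibiotic_times: Sequence[datetime],
    culture_times: Sequence[datetime],
    culture_after_abx: timedelta,
    abx_after_culture: timedelta,
) -> Optional[datetime]:
    """Return the earliest first-event time of any qualifying
    antibiotic/culture pair, or None if no pair qualifies."""
    candidates = []
    for abx in antibiotic_times:
        for cul in culture_times:
            if abx <= cul <= abx + culture_after_abx:
                # Antibiotics came first and a culture followed in time.
                candidates.append(abx)
            elif cul <= abx <= cul + abx_after_culture:
                # Culture came first and antibiotics followed in time.
                candidates.append(cul)
    return min(candidates) if candidates else None
```

Called with the two window lengths from the table as `timedelta` values, the function returns the time of the earlier event in the first qualifying pair, which corresponds to the time of suspicion of infection used to anchor the SOFA window.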
The learning method employed by
The requirement that sepsis onset in an included patient occur at least 7 hours into their ICU stay is for clarity of presentation. In operation,
The use of only Metavision patients deserves special discussion. For ICU stays logged using the CareVue system, data about procedures performed (ie, cultures being taken) does not appear in the MIMIC-III database in as detailed and comprehensive a fashion as for ICU stays logged using Metavision. Further, while the MIMIC-III version 1.3 dataset includes information from the BIDMC microbiology lab, reporting positive cultures and the results thereof for all patients, negative cultures are not reported consistently. The combination of these facts means that negative cultures are underreported for CareVue patients. This in turn implies that suspicion of infection, as defined by the cooccurrence of culture and antibiotics, is systematically underrepresented in these ICU stays, resulting in a sepsis prevalence of 3.5% for CareVue patients versus 11.3% for Metavision. In light of this disparity, we chose to exclude CareVue patients from our analyses.
We performed an auxiliary analysis to eliminate patients who received antibiotics prior to the start of their ICU stay (4078 of the 23,906 Metavision ICU stays). This was intended to be a highly sensitive, albeit nonspecific, way of removing pre-ICU sepsis cases. Since the exact timestamp of the start of an ICU stay was not available, we approximated it as 60 minutes prior to the initial measurement of any of the extended vital signs from the list in the Clinical Measurements section. Although the 60-minute approximation is discussed here, we also examined various other time windows, and the set of excluded patients was not strongly sensitive to the cutoff time used. With the pre-ICU antibiotic removal, the remaining 19,828 ICU stays were screened in the same way as previously described, leaving a set of 1840 septic ICU stays and 17,214 nonseptic ICU stays (9.66% sepsis prevalence).
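The exclusion step above can be sketched as follows. This is a hedged illustration rather than the study's implementation: the function names are hypothetical, and the inputs are assumed to be plain lists of timestamps for one ICU stay.

```python
from datetime import datetime, timedelta
from typing import Sequence


def approximate_icu_start(
    measurement_times: Sequence[datetime],
    offset: timedelta = timedelta(minutes=60),
) -> datetime:
    """Approximate the ICU start as `offset` before the first vital sign measurement."""
    return min(measurement_times) - offset


def received_pre_icu_antibiotics(
    antibiotic_times: Sequence[datetime],
    measurement_times: Sequence[datetime],
    offset: timedelta = timedelta(minutes=60),
) -> bool:
    """True if any antibiotic administration precedes the approximate ICU start."""
    start = approximate_icu_start(measurement_times, offset)
    return any(t < start for t in antibiotic_times)


# Prevalence after the exclusion, from the counts reported above.
septic, nonseptic = 1840, 17214
prevalence = septic / (septic + nonseptic)  # about 0.0966
```

Varying `offset` is one way to repeat the sensitivity check on the cutoff time that the text mentions.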
Demographics of the included Multiparameter Intelligent Monitoring in Intensive Care version III (MIMIC-III) intensive care unit stays. All stays correspond to patients aged 15 years or more (21,173 hospital admissions).
Demographic characteristic | Number of ICU Stays n (%) | |
medical intensive care unit | 9460 (41.89) | |
cardiac surgery recovery unit | 3345 (14.81) | |
surgical intensive care unit | 4293 (19.01) | |
coronary care unit | 2726 (12.07) | |
trauma-surgical intensive care unit | 2759 (12.22) | |
Female | 9902 (43.85) | |
Male | 12,681 (56.15) | |
15-17 | 25 (0.1) | |
18-29 | 982 (4.3) | |
30-39 | 1132 (5.01) | |
40-49 | 2176 (9.64) | |
50-59 | 4038 (17.88) | |
60-69 | 5159 (22.84) | |
70+ | 9071 (40.17) | |
0-2 | 15,178 (67.21) | |
3-5 | 4267 (18.89) | |
6-8 | 1340 (5.93) | |
9-11 | 649 (2.9) | |
12+ | 1149 (5.09) | |
Yes | 1569 (6.95) | |
No | 21,014 (93.05) |
aIQR: interquartile range.
Inclusion diagram. All intensive care unit (ICU) stays meeting the sequential inclusion criteria outlined above are included in the training and testing sets. The final dataset has a sepsis prevalence of 11.3%. MIMIC-III: Multiparameter Intelligent Monitoring in Intensive Care version III.
The training and testing process for the
ξ = [x1, x2, ... P(s=1 | x1i) ..., ... P(s=1 | Δxi) ..., ... P(s=1 | Δxi, Δxj) ..., ... P(s=1 | Δxi, Δxj, Δxk)... ]
In our first experiment, we assess how performance changes as we use
In our second experiment, we test the performance of the
Per-hour observation frequencies among included ICU stays (n=22,583). Three ICU stays were shorter than 60 minutes and were discarded from these calculations.
Measurement | Mean (SD) (h⁻¹) | Median (IQRa) (h⁻¹) | Fraction of ICU stays (Fb) |
GCSc | 0.29 (0.16) | 0.25 (0.21-0.29) | 1 |
Heart rate | 1.31 (3.32) | 1.07 (1.01-1.16) | 1 |
Respiration rate | 1.30 (3.26) | 1.06 (1.00-1.16) | 1 |
SpO2d | 1.27 (3.01) | 1.06 (0.99-1.17) | 1 |
Temperature | 0.31 (0.21) | 0.27 (0.23-0.31) | 1 |
NIDiasABPe | 0.76 (0.39) | 0.88 (0.46-1.02) | 0.99 |
NISysABPf | 0.76 (0.39) | 0.88 (0.46-1.02) | 0.99 |
SysABPg | 0.41 (1.55) | 0 (0-0.76) | 0.43 |
DiasABPh | 0.41 (1.55) | 0 (0-0.76) | 0.43 |
aIQR: interquartile range.
bF: the fraction of these ICU stays with at least one measurement of the given type.
cGCS: Glasgow Coma Score.
dSpO2: peripheral capillary oxygen saturation.
eNIDiasABP: noninvasive diastolic arterial blood pressure.
fNISysABP: noninvasive systolic arterial blood pressure.
gSysABP: invasive systolic arterial blood pressure.
hDiasABP: invasive diastolic arterial blood pressure.
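The quantities in the table above can be computed per measurement type along the following lines. This is a sketch under assumptions: each stay is represented as a hypothetical `(n_observations, length_of_stay_hours)` pair, and stays shorter than 1 hour are dropped, as described in the table caption.

```python
from statistics import mean, median
from typing import List, Sequence, Tuple

Stay = Tuple[int, float]  # (number of observations, stay length in hours)


def per_hour_frequencies(stays: Sequence[Stay]) -> List[float]:
    """Observations per hour for each stay lasting at least 60 minutes."""
    return [n / hours for n, hours in stays if hours >= 1.0]


def fraction_with_measurement(stays: Sequence[Stay]) -> float:
    """Fraction of retained stays with at least one observation (column F)."""
    kept = [n for n, hours in stays if hours >= 1.0]
    return sum(1 for n in kept if n > 0) / len(kept)


# Toy input: the third stay is shorter than 1 hour and is discarded.
example = [(10, 10.0), (0, 5.0), (3, 0.5)]
freqs = per_hour_frequencies(example)
summary = (mean(freqs), median(freqs), fraction_with_measurement(example))
```

A median of 0 with a nonzero mean, as reported for the invasive blood pressure rows, is what this calculation yields when fewer than half of the stays have any measurement of that type.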
Training and testing procedure. The innermost steps in the process (rightmost) are repeated for each partitioning of the data into cross-validation folds (4 partitionings), for each test cross-validation fold in each partition (4 folds), and each time horizon (5 time horizons). ICU: intensive care unit.
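The nesting described in this caption (4 partitionings, 4 test folds per partitioning, 5 time horizons) amounts to 80 train/test runs. The sketch below enumerates them using only the standard library. The helper names, the seeding scheme, and the use of integer horizon indices are assumptions made for illustration; the actual horizons and model fitting are not shown.

```python
import random
from itertools import product
from typing import Iterable, Iterator, List, Tuple


def make_folds(stay_ids: Iterable[int], n_folds: int, seed: int) -> List[List[int]]:
    """Shuffle the stay IDs reproducibly and deal them into n_folds folds."""
    ids = list(stay_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::n_folds] for i in range(n_folds)]


def enumerate_runs(
    stay_ids: Iterable[int],
    n_partitionings: int = 4,
    n_folds: int = 4,
    n_horizons: int = 5,
) -> Iterator[Tuple[int, int, int, List[int], List[int]]]:
    """Yield (partitioning, test_fold, horizon_index, train_ids, test_ids)."""
    ids = list(stay_ids)
    for p in range(n_partitionings):
        folds = make_folds(ids, n_folds, seed=p)  # new random partitioning
        for test_idx, horizon in product(range(n_folds), range(n_horizons)):
            test = folds[test_idx]
            train = [s for i, fold in enumerate(folds) if i != test_idx for s in fold]
            yield p, test_idx, horizon, train, test
```

Each yielded tuple corresponds to one pass through the innermost steps of the figure, where a model would be trained on `train_ids` and evaluated on `test_ids` for the given horizon.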
The comparison of
The ROC curves of
We performed an auxiliary analysis where we eliminated patients who received antibiotics prior to the start of their ICU stay, and the resulting AUROC and model performance metrics were not found to be significantly different from those reported in
We computed the performance of the
Detailed performance measures for
SIRSa | quick SOFA | MEWSb | SAPS IIc | SOFAd | |||
AUROCe | 0.88 (SD 0.006) | 0.74 (SD 0.010) | 0.61 | 0.77 | 0.80 | 0.70 | 0.73 |
APRf | 0.60 (SD 0.016) | 0.28 (SD 0.013) | 0.16 | 0.28 | 0.33 | 0.23 | 0.28 |
Sensitivity | 0.80 | 0.80 | 0.72 | 0.56 | 0.70 | 0.75 | 0.80 |
Specificity | 0.80 | 0.54 | 0.44 | 0.84 | 0.77 | 0.52 | 0.48 |
F1g | 0.47 | 0.30 | 0.24 | 0.39 | 0.40 | 0.27 | 0.27 |
DORh | 15.51 | 4.75 | 2.06 | 6.33 | 7.85 | 3.26 | 3.71 |
LR+i | 3.90 | 1.75 | 1.30 | 3.37 | 3.05 | 1.57 | 1.55 |
LR-j | 0.25 | 0.37 | 0.63 | 0.53 | 0.39 | 0.48 | 0.42 |
Accuracy | 0.80 | 0.57 | 0.47 | 0.80 | 0.76 | 0.55 | 0.52 |
aSIRS: systemic inflammatory response syndrome.
bMEWS: Modified Early Warning Score.
cSAPS II: Simplified Acute Physiology Score II.
dSOFA: Sequential (Sepsis-Related) Organ Failure Assessment.
eAUROC: area under the receiver operating characteristic curve.
fAPR: area under the precision-recall curve.
gF1: harmonic mean of precision and recall.
hDOR: diagnostic odds ratio.
iLR+: positive likelihood ratio.
jLR-: negative likelihood ratio.
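Several rows of the table above are linked by standard definitions: once sensitivity, specificity, and prevalence are fixed, accuracy, F1, the likelihood ratios, and the diagnostic odds ratio follow. The sketch below shows these relationships; the function name `derived_metrics` is hypothetical, and the inputs are treated as population fractions rather than counts.

```python
from typing import Dict


def derived_metrics(sensitivity: float, specificity: float, prevalence: float) -> Dict[str, float]:
    """Derive summary metrics from an operating point and a class prevalence."""
    # Cells of the confusion matrix, expressed as fractions of all stays.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)

    precision = tp / (tp + fp)
    lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity  # negative likelihood ratio
    return {
        "accuracy": tp + tn,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "precision": precision,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,  # diagnostic odds ratio
    }


first_column = derived_metrics(sensitivity=0.80, specificity=0.80, prevalence=0.113)
```

With the rounded values from the first column (sensitivity 0.80, specificity 0.80) and the 11.3% prevalence, this gives an accuracy of 0.80 and an F1 of about 0.47, in line with the table. The likelihood ratios and DOR come out as 4.0, 0.25, and 16, slightly different from the tabulated 3.90, 0.25, and 15.51, presumably because the table was computed from unrounded operating points.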
Detailed performance measures of
AUROCa | 0.89 (SD 0.010) | 0.87 (SD 0.006) | 0.84 (SD 0.011) | 0.83 (SD 0.012) | 0.78 (SD 0.013) | 0.75 (SD 0.008) | 0.73 (SD 0.010) |
APRb | 0.60 (SD 0.022) | 0.57 (SD 0.015) | 0.54 (SD 0.022) | 0.49 (SD 0.021) | 0.40 (SD 0.015) | 0.27 (SD 0.012) | 0.27 (SD 0.009) |
Sensitivity | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 | 0.80 |
Specificity | 0.82 | 0.78 | 0.72 | 0.68 | 0.59 | 0.55 | 0.52 |
F1c | 0.49 | 0.45 | 0.40 | 0.37 | 0.32 | 0.30 | 0.29 |
DORd | 17.90 | 14.14 | 10.23 | 8.31 | 5.76 | 4.95 | 4.38 |
LR+e | 4.37 | 3.62 | 2.85 | 2.46 | 1.95 | 1.79 | 1.67 |
LR-f | 0.24 | 0.26 | 0.28 | 0.30 | 0.34 | 0.36 | 0.38 |
Accuracy | 0.82 | 0.78 | 0.73 | 0.69 | 0.61 | 0.58 | 0.55 |
aAUROC: area under the receiver operating characteristic curve.
bAPR: area under the precision-recall curve.
cF1: harmonic mean of precision and recall.
dDOR: diagnostic odds ratio.
eLR+: positive likelihood ratio.
fLR-: negative likelihood ratio.
Receiver operating characteristic curves for
Test set area under receiver operating characteristic curves for
Test set area under precision-recall curves for
Receiver operating characteristic curves for
Area under the receiver operating characteristic curve (AUROC) for
Area under the precision-recall curve (APR) for
We tested and validated
The detailed numerical results in
To improve performance over current scoring systems,
These experiments show
While this is a retrospective study, we are planning future prospective studies through EHR integration of the
Many scoring systems are used for predicting patient outcomes or treatment guidance, despite not being developed for these purposes (eg, SOFA). We present a purpose-built alternative to these systems, based on ubiquitously available vital sign data, for predicting sepsis onset in ICU patients. In this study,
There are several practical limitations in this study. First, it is not designed to “discover” a set of rules that could create a manual scoring system.
We have also chosen to use only a subset of patients in the MIMIC-III (v1.3) database. Because the currently available version of MIMIC-III under-reports cultures, particularly for patients recorded using the CareVue system, we have chosen to work only with patients recorded using the alternative Metavision system to get a more complete picture of suspected infection at various sites. Future work will address these limitations.
An additional limitation is that this study was performed exclusively on ICU data and at a single center, which may limit generalization of our results to other hospitals and hospital systems. While
Sepsis prediction is a challenging problem and remains so despite many years of research and development efforts because its manifestation is often unclear until later stages.
APR: area under the precision-recall curve
AUROC: area under the receiver operating characteristic curve
BIDMC: Beth Israel Deaconess Medical Center
CCU: coronary care unit
CSRU: cardiac surgery recovery unit
DiasABP: invasive diastolic arterial blood pressure
DOR: diagnostic odds ratio
EHR: electronic health records
F1: harmonic mean of precision and recall
GCS: Glasgow Coma Score
HIPAA: Health Insurance Portability and Accountability Act
ICU: intensive care unit
IQR: interquartile range
LR+: positive likelihood ratio
LR-: negative likelihood ratio
MEWS: Modified Early Warning Score
MICU: medical intensive care unit
MIMIC-III: Multiparameter Intelligent Monitoring in Intensive Care version III
NIDiasABP: noninvasive diastolic arterial blood pressure
NISysABP: noninvasive systolic arterial blood pressure
NPV: negative predictive value
PPV: positive predictive value
qSOFA: quick SOFA
ROC: receiver operating characteristic
SAPS II: Simplified Acute Physiology Score II
SICU: surgical intensive care unit
SIRS: systemic inflammatory response syndrome
SOFA: Sequential (Sepsis-Related) Organ Failure Assessment
SpO2: peripheral capillary oxygen saturation
SysABP: invasive systolic arterial blood pressure
TSICU: trauma-surgical intensive care unit
This material is based upon work supported by the National Science Foundation under Grant No. 1549867. The funder had no role in the conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, and approval of the manuscript; and decision to submit the manuscript for publication.
We gratefully acknowledge the assistance of Dr. Angela J. Rogers, Samson Mataraso, Nima Shajarian, Jasmine Jan, Adrian Gunawan, Allen Chen, and Lauren Song in the preparation of this manuscript. We acknowledge Qingqing Mao and Hamid Mohamadlou for significant contributions to the development and application of the machine learning algorithm,
All authors who have affiliations listed with Dascena (Hayward, CA, USA) are employees of Dascena.