This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
Most of the mortality resulting from COVID-19 has been associated with severe disease. Effective treatment of severe cases remains a challenge due to the lack of early detection of the infection.
This study aimed to develop an effective prediction model for COVID-19 severity by combining radiological outcome with clinical biochemical indexes.
A total of 46 patients with COVID-19 (10 severe, 36 nonsevere) were examined. To build the prediction model, a set of 27 severe and 151 nonsevere clinical laboratory records and computerized tomography (CT) records were collected from these patients. We extracted specific features from the patients’ CT images by using a recently published convolutional neural network. We then trained a machine learning model combining these features with clinical laboratory results.
We present a prediction model combining patients’ radiological outcomes with their clinical biochemical indexes to identify severe COVID-19 cases. The prediction model yielded a cross-validated area under the receiver operating characteristic (AUROC) score of 0.93 and an F1 score of 0.89, which showed a 6% and 15% improvement, respectively, compared to the models based on laboratory test features only. In addition, we developed a statistical model for forecasting COVID-19 severity based on the results of patients’ laboratory tests performed before they were classified as severe cases; this model yielded an AUROC score of 0.81.
To our knowledge, this is the first report predicting the clinical progression of COVID-19, as well as forecasting severity, based on a combined analysis using laboratory tests and CT images.
In December 2019, an epidemic of pneumonia caused by a newly identified coronavirus (SARS-CoV-2) emerged in China and has been spreading worldwide ever since [
The clinical features of COVID-19 are atypical, ranging from mild systemic symptoms, including intermittent fever (83%) and lower respiratory tract reactions such as cough (61%), to less common ones such as shortness of breath (14.5%), muscle ache (18.6%), headache (11.8%), and diarrhea (6.1%) [
SARS-CoV-2 is highly infectious and can be primarily transmitted through direct or indirect contact, droplets, and aerosol. Diagnosis of COVID-19 usually involves a combination of the patient’s travel history, clinical symptoms, and radiological and biochemical findings. Patchy ipsilateral pulmonary consolidations are visible on a computerized tomography (CT) scan initially, during the early course of COVID-19. As the infection progresses, the consolidations are reduced and appear as bilateral ground-glass opacities, marking the prominent radiological features of COVID-19 [
Antiviral medication and glucocorticoids are most commonly used for the clinical treatment of COVID-19, with antibacterial medication sought when bacterial co-infection is detected [
Our study aimed to address this challenge: we developed a prediction model for COVID-19 clinical progression, by combining radiological outcome based on CT scans with biochemical indexes. To extract essential features from CT scans, we segmented the lungs from the CT volumetric images by using a deep convolutional neural network (CNN). Finally, we also developed a model to forecast COVID-19 severity based on the results of the patients’ laboratory tests before the patients were classified as severe cases. To our knowledge, this is the first study to report a prediction model for assessing COVID-19 severity by combining radiological outcomes with clinical biochemical indexes. We believe that our prediction model will shed light on predicting disease severity for all patients with COVID-19.
We collected samples from 46 patients who visited People’s Hospital of Yicheng City between January 16, 2020, and March 4, 2020, and were diagnosed with COVID-19 according to the Chinese Government Diagnosis and Treatment Guideline (Trial 5th version; Medicine, 2020). For a confirmed diagnosis of COVID-19, nucleic acid was extracted from sputum or throat swab samples using a nucleic acid extractor (EX3600, Shanghai Zhijiang Biotechnology Co.) and a nucleic acid extraction reagent (No. P20200201, Shanghai Zhijiang Biotechnology Co.).
Fluorescence-based quantitative polymerase chain reaction (PCR; ABI7500) and SARS-CoV-2 nucleic acid detection kit (triple fluorescence PCR, No. P20200203, Shanghai Zhijiang Biotechnology Co.) were used for nucleic acid detection. This kit uses a one-step reverse transcription–PCR combined with Taqman technology to detect RNA-dependent RNA polymerase (
Approval for studies on CT screening and clinical test results was obtained from the Medical Ethics Committee of The People’s Hospital of Yicheng City, China (2020Yc002).
We collected and reviewed clinical information of 46 patients with COVID-19 after admission, including clinical signs and symptoms, comorbidities, travel history, laboratory tests, and CT scans. To consolidate all patients’ records into a single table, missing records for a given day were noted as “NA” (not available). In all, we obtained 178 records (27 severe and 151 nonsevere cases) from 105 different laboratory tests and chest CT images. Note that throughout the clinical course, each patient had more than one record variably classified as severe or nonsevere. Patients with at least one severe record were classified as severe cases.
We identified 44 laboratory tests that had more than 50% missing values (NA), and we then imputed the NAs with the mean values. Related laboratory tests were identified based on the criterion that the
Prediction models were developed to predict patient severity based on laboratory and CT signatures collected on corresponding dates. Each patient record was treated as one sample, so 178 samples were evaluated. Before model fitting, we compared random forest importance score, mutual information, and fold change as approaches for selecting important model features while avoiding potential overfitting, and found mutual information to be the most robust. We considered several candidate machine learning models: random forest classifier, gradient boosting classifier, XGB classifier, logistic regression classifier, and support vector machine. Random forest was found to be the best classifier, and its parameters were optimized using a genetic algorithm (Tree-Based Pipeline Optimization Tool). Because the dataset is imbalanced, the area under the receiver operating characteristic curve (AUROC) and F1 scores were used to evaluate model performance. All models were trained with 5-fold cross-validation using stratified train-test splits that preserve the percentage of samples in the severe and nonsevere groups. All cross-validated results were averaged over 20 runs.
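As a rough illustration of the evaluation loop described above, the sketch below selects features by mutual information, fits a random forest, and averages AUROC and F1 over repeated stratified 5-fold splits. It uses scikit-learn on synthetic data; the function name, the number of trees, the class weighting, and the 0.5 probability cutoff are assumptions, and the genetic-algorithm parameter search is not reproduced. Placing the selection step inside the pipeline, so that it is refit on each training fold, is a design choice of this sketch rather than a documented detail of the study.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline


def cross_validated_scores(X, y, n_features=12, n_runs=20, n_splits=5):
    """Mean AUROC and F1 over repeated stratified k-fold splits."""
    aurocs, f1s = [], []
    for seed in range(n_runs):
        # Stratified splits keep the severe/nonsevere ratio in every fold.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        for train_idx, test_idx in cv.split(X, y):
            model = make_pipeline(
                # Feature selection is refit on each training fold.
                SelectKBest(partial(mutual_info_classif, random_state=seed),
                            k=n_features),
                # Hyperparameters here are illustrative assumptions.
                RandomForestClassifier(n_estimators=100,
                                       class_weight="balanced",
                                       random_state=seed),
            )
            model.fit(X[train_idx], y[train_idx])
            prob = model.predict_proba(X[test_idx])[:, 1]
            pred = (prob >= 0.5).astype(int)
            aurocs.append(roc_auc_score(y[test_idx], prob))
            f1s.append(f1_score(y[test_idx], pred, zero_division=0))
    return float(np.mean(aurocs)), float(np.mean(f1s))


# Synthetic stand-in for the 178 records (about 15% positive class).
X, y = make_classification(n_samples=178, n_features=30, n_informative=6,
                           weights=[0.85, 0.15], random_state=0)
auroc, f1 = cross_validated_scores(X, y, n_runs=2)
print(f"AUROC={auroc:.2f}, F1={f1:.2f}")
```

With real data, `X` would hold the imputed laboratory and CT features per record and `y` the severe/nonsevere label of that record.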
Forecasting models were built to forecast patient severity based on laboratory and CT signatures collected from nonsevere cases at admission. In these models, instead of the patients’ records, the patients themselves were considered as samples to build forecasting relationships. CT records were not collected as frequently as laboratory tests were performed, and initial, nonsevere CT records were not available for 3 severe cases. Therefore, we built two separate random forest models based on CT features and laboratory tests with 7 and 10 severe cases, respectively. Other model details were identical to those of the severity prediction models.
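The switch from record-level to patient-level samples can be sketched as follows. This is a minimal illustration under assumed data structures: the dictionary keys (`patient_id`, `day`, `severe`, `features`) and the function name are hypothetical, and the example records are made up. Each patient contributes the earliest record as features, is labeled by whether any later record is severe, and is skipped when no nonsevere admission record exists.

```python
from collections import defaultdict


def build_forecast_samples(records):
    """Turn per-day records into one forecasting sample per patient."""
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_id"]].append(rec)

    samples = []
    for patient_id, recs in by_patient.items():
        recs = sorted(recs, key=lambda r: r["day"])
        admission = recs[0]
        if admission["severe"]:
            # No nonsevere admission record is available for this patient.
            continue
        samples.append({
            "patient_id": patient_id,
            "features": admission["features"],
            # A patient with at least one severe record is a severe case.
            "becomes_severe": any(r["severe"] for r in recs),
        })
    return samples


# Made-up records for three hypothetical patients.
records = [
    {"patient_id": "A", "day": 1, "severe": False, "features": {"test_x": 1.2}},
    {"patient_id": "A", "day": 5, "severe": True, "features": {"test_x": 3.4}},
    {"patient_id": "B", "day": 1, "severe": False, "features": {"test_x": 0.9}},
    {"patient_id": "B", "day": 4, "severe": False, "features": {"test_x": 1.0}},
    {"patient_id": "C", "day": 1, "severe": True, "features": {"test_x": 2.8}},
]
samples = build_forecast_samples(records)
for s in samples:
    print(s["patient_id"], s["becomes_severe"])
```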
We collected clinical data of 46 patients with COVID-19 who were admitted to the People’s Hospital of Yicheng City between mid-January and early March 2020. We recorded 305 biochemical test results from 105 different tests, based on the clinical reports of all 46 study patients (
Characteristics and symptoms of study patients.
Characteristic | All cases (N=46) | Severe cases (n=10) | Nonsevere cases (n=36)
Age in years, mean (range) | 48.8 (24-71) | 56.8 (33-71) | 46.5 (24-71)
Sex, n (%) | | |
Male | 25 (54) | 6 (60) | 19 (53)
Female | 21 (46) | 4 (40) | 17 (47)
Exposure history, n (%) | | |
Wuhan | 24 (52) | 5 (50) | 19 (53)
Family | 4 (9) | 2 (20) | 2 (6)
Community | 5 (11) | 0 (0) | 5 (14)
None | 13 (28) | 3 (30) | 10 (28)
Comorbidities, n (%) | | |
Hypertension | 11 (24) | 5 (50) | 6 (17)
Cardiovascular disease | 6 (13) | 2 (20) | 4 (11)
Chronic liver disease | 3 (7) | 2 (20) | 1 (3)
Diabetes | 5 (11) | 3 (30) | 2 (6)
Leukoderma | 1 (2) | 0 (0) | 1 (3)
Chronic kidney disease | 1 (2) | 0 (0) | 1 (3)
Hyperuricemia | 1 (2) | 0 (0) | 1 (3)
Chronic lung disease | 2 (4) | 0 (0) | 2 (6)
Symptoms, n (%) | | |
Dry cough | 28 (61) | 6 (60) | 22 (61)
Cough with phlegm | 9 (20) | 2 (20) | 7 (19)
Fever, high | 8 (17) | 3 (30) | 5 (14)
Fever, mid | 20 (43) | 4 (40) | 10 (28)
Fever, mild | 14 (30) | 3 (30) | 17 (47)
Fatigue | 25 (54) | 9 (90) | 16 (44)
Anorexia | 33 (72) | 9 (90) | 24 (67)
Malaise | 34 (74) | 10 (100) | 24 (67)
Headache | 7 (15) | 3 (30) | 4 (11)
Nausea | 1 (2) | 0 (0) | 1 (3)
Diarrhea | 5 (11) | 2 (20) | 3 (8)
Dyspnea | 1 (2) | 1 (10) | 0 (0)
Chest congestion | 16 (35) | 5 (50) | 11 (31)
Shortness of breath after activity | 19 (41) | 6 (60) | 13 (36)
In all, 52% (24/46) of patients had traveled to or from Wuhan within the past month, and 20% (9/46) of patients had a clear exposure history in the local city (
Data processing yielded 61 laboratory test results, 36 of which were significantly related to severity. The eight related laboratory tests with the largest fold change are illustrated in
Correlation of laboratory tests with COVID-19 severity. (A) Top-8 laboratory tests ranked by fold change. (B) Principal component (PC) analysis of all laboratory tests. (C) Venn diagram of the top features selected by 3 different approaches: random forest importance score, mutual information, and fold change. (D) Area under receiver operating characteristic of classification using a signature of 12 laboratory tests. The asterisk annotations denote the following: * 1.00e-02<P≤5.00e-02, ** 1.00e-03<P≤1.00e-02, *** 1.00e-04<P≤1.00e-03, **** for P≤1.00e-04.
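The ranking and annotation scheme in this caption can be sketched in a few lines. The star thresholds follow the caption directly. The fold change definition (ratio of the severe group mean to the nonsevere group mean) and the empty label for P>0.05 are assumptions of this sketch, and the example values are made up.

```python
from statistics import mean


def fold_change(severe_values, nonsevere_values):
    """Ratio of group means (an assumed definition of fold change)."""
    return mean(severe_values) / mean(nonsevere_values)


def significance_stars(p_value):
    """Map a P value to the asterisk annotation used in the figure."""
    if p_value <= 1e-4:
        return "****"
    if p_value <= 1e-3:
        return "***"
    if p_value <= 1e-2:
        return "**"
    if p_value <= 5e-2:
        return "*"
    return ""  # P > 0.05 is not annotated in the caption (assumption).


# Made-up values for one hypothetical laboratory test.
severe = [420.0, 380.0, 510.0]
nonsevere = [210.0, 190.0, 230.0, 250.0]
print(round(fold_change(severe, nonsevere), 2), significance_stars(0.0005))
```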
To extract CT features, we first segmented the lungs from the CT volumetric images using a deep CNN, U-Net. Because the CNN was pretrained with several annotated datasets, including a COVID-19 dataset from MedSeg [
Computed tomography (CT) feature extraction. (A) Segmented lung images from the middle CT slice for a patient with a full course of COVID-19 from nonsevere to severe and then from severe to nonsevere. The patient’s severe records are presented in red color. (B) Intensity histograms of the volume CT within segmented lung masks for five consecutive records of the patient. (C) Peak location and Otsu threshold features from the intensity histogram on Day 18. (D) Variation of 3 different CT features along the course of the disease.
Computed tomography (CT) intensity distribution and extracted features of patients with COVID-19. (A) Intensity distribution of CT volumes from nonsevere cases. (B) Intensity distribution of CT volumes from severe cases. (C) Principal component analysis of all CT features. (D) All CT features between severe and nonsevere groups. “Peak” stands for peak location, and “height” stands for peak height. The asterisk annotations denote the following: * 1.00e-02<P≤5.00e-02, ** 1.00e-03<P≤1.00e-02, *** 1.00e-04<P≤1.00e-03, **** P≤1.00e-04.
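The histogram features named in these captions (peak location, peak height, and Otsu threshold) could be computed along the following lines. This is a minimal NumPy sketch rather than the study's code: the intensity range, the bin count, the normalization of peak height as a fraction of lung voxels, and the synthetic volume are all assumptions, and the lung mask is taken as given instead of being produced by a segmentation network.

```python
import numpy as np


def otsu_threshold(counts, centers):
    """Bin center that maximizes the between-class variance (Otsu's method)."""
    counts = counts.astype(float)
    w_low = np.cumsum(counts)                # voxels at or below each bin
    w_high = np.cumsum(counts[::-1])[::-1]   # voxels at or above each bin
    weighted = counts * centers
    mean_low = np.cumsum(weighted) / np.maximum(w_low, 1e-12)
    mean_high = (np.cumsum(weighted[::-1]) / np.maximum(w_high[::-1], 1e-12))[::-1]
    between = w_low[:-1] * w_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(between)]


def ct_histogram_features(volume, lung_mask, bins=128,
                          intensity_range=(-1000.0, 200.0)):
    """Peak location, peak height, and Otsu threshold inside the lung mask."""
    voxels = volume[lung_mask]
    counts, edges = np.histogram(voxels, bins=bins, range=intensity_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    density = counts / counts.sum()          # fraction of voxels per bin
    peak = int(np.argmax(density))
    return {
        "peak_location": float(centers[peak]),
        "peak_height": float(density[peak]),
        "otsu_threshold": float(otsu_threshold(counts, centers)),
    }


# Synthetic volume: mostly low-intensity voxels plus a denser minority.
rng = np.random.default_rng(0)
volume = rng.normal(-800.0, 50.0, size=(16, 32, 32))
dense = rng.random(volume.shape) < 0.2
volume[dense] = rng.normal(-200.0, 80.0, size=int(dense.sum()))
lung_mask = np.ones(volume.shape, dtype=bool)

features = ct_histogram_features(volume, lung_mask)
print(features)
```

In this synthetic case the peak sits in the low-intensity mode and the Otsu threshold falls between the two modes, which is the kind of separation the histogram panels illustrate.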
The CT feature extraction enables quantitative prediction with signatures of both CT and laboratory features. We first analyzed the Spearman correlation between the CT and laboratory features (
Prediction based on computed tomography (CT) and laboratory features. (A) Spearman correlation heatmap between CT and laboratory features. “Peak” stands for peak location, and “height” stands for peak height. A summary table describing all CT and laboratory features and their abbreviations is provided in Multimedia Appendix 2. (B) Model accuracy metrics with an increased number of features. (C) Area under receiver operating characteristic of classification using a signature of 15 CT and laboratory features.
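A Spearman correlation heatmap such as the one in panel A can be built from a matrix of pairwise rank correlations. The sketch below uses `scipy.stats.spearmanr` on synthetic data; the function name and the array layout (rows are records, columns are features) are assumptions, and it presumes that records with missing values have already been imputed or dropped.

```python
import numpy as np
from scipy.stats import spearmanr


def cross_spearman(ct_features, lab_features):
    """rho[i, j] is the Spearman correlation of CT feature i with lab feature j."""
    n_ct, n_lab = ct_features.shape[1], lab_features.shape[1]
    rho = np.empty((n_ct, n_lab))
    for i in range(n_ct):
        for j in range(n_lab):
            coefficient, _p_value = spearmanr(ct_features[:, i],
                                              lab_features[:, j])
            rho[i, j] = coefficient
    return rho


# Synthetic data: the first CT feature is a monotonic function of the
# first laboratory feature; the second CT feature is independent noise.
rng = np.random.default_rng(0)
lab = rng.normal(size=(50, 3))
ct = np.column_stack([np.exp(lab[:, 0]), rng.normal(size=50)])

rho = cross_spearman(ct, lab)
print(np.round(rho, 2))
```

Because Spearman correlation depends only on ranks, any strictly monotonic relationship yields a coefficient of 1, which is why it suits features measured on very different scales.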
Forecasting disease severity has significant clinical importance, as it allows clinicians to better prepare for the treatment course. In addition to predicting severity based on CT and laboratory signatures, we also developed a statistical model to forecast severity from patients’ records upon admission, when they were considered nonsevere. Although CT features are excellent predictors of severity, they are not as good for forecasting, yielding an AUROC of 0.68. In contrast, the random forest model based on laboratory tests yielded an AUROC of 0.81, indicating excellent forecasting predictability (
COVID-19 severity forecasted using the prediction model. (A) Forecasting severity using patient’s nonsevere records noted upon admission. (B) Laboratory tests showing a significant relation to the severity forecast.
Metrics of prediction and forecasting models. Mean and standard deviation values across 5 cross-validation splits are shown. AUROC: area under the receiver operating characteristics.
Metric | Prediction model: laboratory only, mean (SD) | Prediction model: laboratory and CTa, mean (SD) | Forecasting model: CT only, mean (SD) | Forecasting model: laboratory only, mean (SD)
Precision | 0.75 (0.2) | 0.82 (0.05) | 0.55 (0.22) | 0.61 (0.23)
Recall | 0.7 (0.15) | 0.79 (0.1) | 0.56 (0.23) | 0.61 (0.11)
AUROCb Score | 0.86 (0.1) | 0.93 (0.03) | 0.68 (0.22) | 0.81 (0.14)
F1 Score | 0.69 (0.17) | 0.81 (0.05) | 0.56 (0.22) | 0.60 (0.16)
Accuracy | 0.87 (0.04) | 0.88 (0.03) | 0.78 (0.12) | 0.83 (0.06)
aCT: computed tomography.
bAUROC: area under the receiver operating characteristic.
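For readers less familiar with the threshold-based metrics in this table, the sketch below computes precision, recall, F1, and accuracy from confusion-matrix counts. The counts are hypothetical and chosen only to match the class sizes of the dataset (27 severe and 151 nonsevere records); they are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical classifier: finds 21 of 27 severe records, 5 false alarms.
model = classification_metrics(tp=21, fp=5, fn=6, tn=146)

# Trivial baseline: labels every record as nonsevere.
baseline = classification_metrics(tp=0, fp=0, fn=27, tn=151)

print({k: round(v, 4) for k, v in model.items()})
print({k: round(v, 4) for k, v in baseline.items()})
```

The baseline reaches an accuracy of about 0.85 while detecting no severe record at all, which illustrates why AUROC and F1 are more informative than accuracy on an imbalanced dataset like this one.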
In this study, we collected clinical records from 46 patients with COVID-19 (27 severe and 151 nonsevere records) and developed a prediction model that combines radiological outcomes with clinical biochemical indexes to identify disease severity. The model achieved an AUROC score of 0.93 in identifying patients’ severity status. Furthermore, we established a model for forecasting disease severity based on laboratory test results recorded before the patients were classified as severe cases, which yielded an AUROC score of 0.81.
In the history of confrontation between human beings and pathogens, humans have always been prone to losing the battle when the development of effective medicine or vaccine is extremely difficult owing to the high variability of the pathogenic genome, such as in the case of influenza virus, HIV, or SARS. Even though the reported mortality rate of COVID-19 (1.4% [
Many studies highlight the potential hallmarks of COVID-19. Biochemical and radiological outcomes are the most widely recognized indexes in clinical treatment and decision making [
In conclusion, our model may make the course of clinical progression clearer, and we believe it could support the early identification of severely ill patients. Advanced interventions could then be applied earlier, potentially reducing mortality rates and alleviating the health care burden of managing COVID-19 cases.
Clinical laboratory data for study patients.
Summary of all clinical laboratory features as described in Figure 4A.
ARDS: acute respiratory distress syndrome
AUROC: area under the receiver operating characteristic
CNN: convolutional neural network
CT: computerized tomography
Ct: cycle threshold
ECMO: extracorporeal membrane pulmonary oxygenation
LDH: lactate dehydrogenase
PCR: polymerase chain reaction
RT: reverse transcription
We thank all patients and donors involved in this study. We appreciate the assistance received from Intanx Life Co. Ltd. (Shanghai) in data processing and consulting.
XL and FZ designed the research study and collected patient samples; DL, QZ, YT, Y, YB, Jimeng Li, Jiahang Li, YX, SX, and MS performed the research; DL, QZ, YT, SX, and MS analyzed the data; DL, QZ, YT, XL, and FZ wrote the manuscript.
None declared.