TY  - JOUR
AU  - Vaid, Akhil
AU  - Jaladanki, Suraj K
AU  - Xu, Jie
AU  - Teng, Shelly
AU  - Kumar, Arvind
AU  - Lee, Samuel
AU  - Somani, Sulaiman
AU  - Paranjpe, Ishan
AU  - De Freitas, Jessica K
AU  - Wanyan, Tingyi
AU  - Johnson, Kipp W
AU  - Bicak, Mesude
AU  - Klang, Eyal
AU  - Kwon, Young Joon
AU  - Costa, Anthony
AU  - Zhao, Shan
AU  - Miotto, Riccardo
AU  - Charney, Alexander W
AU  - Böttinger, Erwin
AU  - Fayad, Zahi A
AU  - Nadkarni, Girish N
AU  - Wang, Fei
AU  - Glicksberg, Benjamin S
PY  - 2021
DA  - 2021/1/27
TI  - Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach
JO  - JMIR Med Inform
SP  - e24207
VL  - 9
IS  - 1
KW  - federated learning
KW  - COVID-19
KW  - machine learning
KW  - electronic health records
AB  - Background: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability. Objective: We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. Methods: Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator. Results: The LASSO_federated model outperformed the LASSO_local model at 3 hospitals, and the MLP_federated model performed better than the MLP_local model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The LASSO_pooled model outperformed the LASSO_federated model at all hospitals, and the MLP_federated model outperformed the MLP_pooled model at 2 hospitals. Conclusions: The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
SN  - 2291-9694
UR  - http://medinform.jmir.org/2021/1/e24207/
UR  - https://doi.org/10.2196/24207
UR  - http://www.ncbi.nlm.nih.gov/pubmed/33400679
DO  - 10.2196/24207
ID  - info:doi/10.2196/24207
ER  - 