Published in Vol 6, No 1 (2018): Jan-Mar

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/8960.
Characterizing and Managing Missing Structured Data in Electronic Health Records: Data Analysis


Journals

  1. Ross E, Jung K, Dudley J, Li L, Leeper N, Shah N. Predicting Future Cardiovascular Events in Patients With Peripheral Artery Disease Using Electronic Health Record Data. Circulation: Cardiovascular Quality and Outcomes 2019;12(3) View
  2. Pendergrass S, Crawford D. Using Electronic Health Records To Generate Phenotypes For Research. Current Protocols in Human Genetics 2019;100(1) View
  3. Jetley G, Zhang H. Electronic health records in IS research: Quality issues, essential thresholds and remedial actions. Decision Support Systems 2019;126:113137 View
  4. Krittanawong C, Johnson K, Rosenson R, Wang Z, Aydar M, Baber U, Min J, Tang W, Halperin J, Narayan S. Deep learning for cardiovascular medicine: a practical primer. European Heart Journal 2019;40(25):2058 View
  5. Li R, Chen Y, Moore J. Integration of genetic and clinical information to improve imputation of data missing from electronic health records. Journal of the American Medical Informatics Association 2019;26(10):1056 View
  6. Filipe M, van Deukeren D, Kip M, Doeksen A, Pronk A, Verheijen P, Heikens J, Witkamp A, Richir M. Effect of the COVID-19 Pandemic on Surgical Breast Cancer Care in the Netherlands: A Multicenter Retrospective Cohort Study. Clinical Breast Cancer 2020;20(6):454 View
  7. Ford E, Rooney P, Hurley P, Oliver S, Bremner S, Cassell J. Can the Use of Bayesian Analysis Methods Correct for Incompleteness in Electronic Health Records Diagnosis Data? Development of a Novel Method Using Simulated and Real-Life Clinical Data. Frontiers in Public Health 2020;8 View
  8. Hu Z, Du D, Kaderali L. A new analytical framework for missing data imputation and classification with uncertainty: Missing data imputation and heart failure readmission prediction. PLOS ONE 2020;15(9):e0237724 View
  9. Agor J, Özaltın O, Ivy J, Capan M, Arnold R, Romero S. The value of missing information in severity of illness score development. Journal of Biomedical Informatics 2019;97:103255 View
  10. Callahan A, Shah N, Chen J. Research and Reporting Considerations for Observational Studies Using Electronic Health Record Data. Annals of Internal Medicine 2020;172(11_Supplement):S79 View
  11. Arena P, Mo J, Sabharwal C, Begier E, Zhou X, Gurtman A, Liu Q, Shen R, Wentworth C, Huang K. The incidence of stroke among selected patients undergoing elective posterior lumbar fusion: a retrospective cohort study. BMC Musculoskeletal Disorders 2020;21(1) View
  12. Chen R, Stewart W, Sun J, Ng K, Yan X. Recurrent Neural Networks for Early Detection of Heart Failure From Longitudinal Electronic Health Record Data. Circulation: Cardiovascular Quality and Outcomes 2019;12(10) View
  13. McGurk K, Dagliati A, Chiasserini D, Lee D, Plant D, Baricevic-Jones I, Kelsall J, Eineman R, Reed R, Geary B, Unwin R, Nicolaou A, Keavney B, Barton A, Whetton A, Geifman N, Wren J. The use of missing values in proteomic data-independent acquisition mass spectrometry to enable disease activity discrimination. Bioinformatics 2020;36(7):2217 View
  14. Verma M, Hontecillas R, Tubau-Juni N, Abedi V, Bassaganya-Riera J. Challenges in Personalized Nutrition and Health. Frontiers in Nutrition 2018;5 View
  15. Tantoso E, Wong W, Tay W, Lee J, Sinha S, Eisenhaber B, Eisenhaber F. Hypocrisy Around Medical Patient Data: Issues of Access for Biomedical Research, Data Quality, Usefulness for the Purpose and Omics Data as Game Changer. Asian Bioethics Review 2019;11(2):189 View
  16. Filipe M, Siesling S, Vriens M, van Diest P, Witkamp A, Mureau M. Socioeconomic status significantly contributes to the likelihood of immediate postmastectomy breast reconstruction in the Netherlands: A nationwide study. European Journal of Surgical Oncology 2021;47(2):245 View
  17. Goodday S, Kormilitzin A, Vaci N, Liu Q, Cipriani A, Smith T, Nevado-Holgado A. Maximizing the use of social and behavioural information from secondary care mental health electronic health records. Journal of Biomedical Informatics 2020;107:103429 View
  18. Abedi V, Li J, Shivakumar M, Avula V, Chaudhary D, Shellenberger M, Khara H, Zhang Y, Lee M, Wolk D, Yeasin M, Hontecillas R, Bassaganya-Riera J, Zand R. Increasing the Density of Laboratory Measures for Machine Learning Applications. Journal of Clinical Medicine 2020;10(1):103 View
  19. Choi Y, Hanrahan L, Norton D, Zhao Y. Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records. Biometrics 2022;78(1):324 View
  20. Zhang X, Yan C, Gao C, Malin B, Chen Y. Predicting Missing Values in Medical Data Via XGBoost Regression. Journal of Healthcare Informatics Research 2020;4(4):383 View
  21. Mosler F, Priebe S, Bird V. Routine measurement of satisfaction with life and treatment aspects in mental health patients – the DIALOG scale in East London. BMC Health Services Research 2020;20(1) View
  22. Baron J, Paranjape K, Love T, Sharma V, Heaney D, Prime M. Development of a “meta-model” to address missing data, predict patient-specific cancer survival and provide a foundation for clinical decision support. Journal of the American Medical Informatics Association 2021;28(3):605 View
  23. Tagawa S, Ramaswamy K, Huang A, Mardekian J, Schultz N, Wang L, Sandin R, Lechpammer S, George D. Survival outcomes in patients with chemotherapy-naive metastatic castration-resistant prostate cancer treated with enzalutamide or abiraterone acetate. Prostate Cancer and Prostatic Diseases 2021;24(4):1032 View
  24. Shung D. Advancing care for acute gastrointestinal bleeding using artificial intelligence. Journal of Gastroenterology and Hepatology 2021;36(2):273 View
  25. Beaulieu-Jones B, Yuan W, Brat G, Beam A, Weber G, Ruffin M, Kohane I. Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians? npj Digital Medicine 2021;4(1) View
  26. Peralta M, Jannin P, Haegelen C, Baxter J. Data imputation and compression for Parkinson's disease clinical questionnaires. Artificial Intelligence in Medicine 2021;114:102051 View
  27. Tomašev N, Harris N, Baur S, Mottram A, Glorot X, Rae J, Zielinski M, Askham H, Saraiva A, Magliulo V, Meyer C, Ravuri S, Protsyuk I, Connell A, Hughes C, Karthikesalingam A, Cornebise J, Montgomery H, Rees G, Laing C, Baker C, Osborne T, Reeves R, Hassabis D, King D, Suleyman M, Back T, Nielson C, Seneviratne M, Ledsam J, Mohamed S. Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records. Nature Protocols 2021;16(6):2765 View
  28. Hunt N, Gardarsdottir H, Bazelier M, Klungel O, Pajouheshnia R. A systematic review of how missing data are handled and reported in multi‐database pharmacoepidemiologic studies. Pharmacoepidemiology and Drug Safety 2021;30(7):819 View
  29. Mauer E, Lee J, Choi J, Zhang H, Hoffman K, Easthausen I, Rajan M, Weiner M, Kaushal R, Safford M, Steel P, Banerjee S. A predictive model of clinical deterioration among hospitalized COVID-19 patients by harnessing hospital course trajectories. Journal of Biomedical Informatics 2021;118:103794 View
  30. Antoniades A, Papaioannou M, Malatras A, Papagregoriou G, Müller H, Holub P, Deltas C, Schizas C. Integration of Biobanks in National eHealth Ecosystems Facilitating Long-Term Longitudinal Clinical-Omics Studies and Citizens' Engagement in Research Through eHealthBioR. Frontiers in Digital Health 2021;3 View
  31. Komamine M, Fujimura Y, Nitta Y, Omiya M, Doi M, Sato T. Characteristics of hospital differences in missing of clinical laboratory test results in a multi-hospital observational database contributing to MID-NET® in Japan. BMC Medical Informatics and Decision Making 2021;21(1) View
  32. Filipe M, Siesling S, Vriens M, van Diest P, Witkamp A. The association of socioeconomic status on treatment strategy in patients with stage I and II breast cancer in the Netherlands. Breast Cancer Research and Treatment 2021;189(2):541 View
  33. Cesare N, Were L. A multi-step approach to managing missing data in time and patient variant electronic health records. BMC Research Notes 2022;15(1) View
  34. Hall A, Davlyatov G, Orewa G, Mehta T, Feldman S. Multiple Electronic Health Record-Based Measures of Social Determinants of Health to Predict Return to the Emergency Department Following Discharge. Population Health Management 2022;25(6):771 View
  35. Pridham G, Rockwood K, Rutenberg A. Strategies for handling missing data that improve Frailty Index estimation and predictive power: lessons from the NHANES dataset. GeroScience 2022;44(2):897 View
  36. Pereira R, Abreu P, Rodrigues P. Partial Multiple Imputation With Variational Autoencoders: Tackling Not at Randomness in Healthcare Data. IEEE Journal of Biomedical and Health Informatics 2022;26(8):4218 View
  37. Getzen E, Ungar L, Mowery D, Jiang X, Long Q. Mining for equitable health: Assessing the impact of missing data in electronic health records. Journal of Biomedical Informatics 2023;139:104269 View
  38. Yang Q, Gao S, Lin J, Lyu K, Wu Z, Chen Y, Qiu Y, Zhao Y, Wang W, Lin T, Pan H, Chen M. A machine learning-based data mining in medical examination data: a biological features-based biological age prediction model. BMC Bioinformatics 2022;23(1) View
  39. Chen Y, Huang C, Lo Y, Chen Y, Lai F. Combining attention with spectrum to handle missing values on time series data without imputation. Information Sciences 2022;609:1271 View
  40. Thompson M, Hill B, Rakocz N, Chiang J, Geschwind D, Sankararaman S, Hofer I, Cannesson M, Zaitlen N, Halperin E. Methylation risk scores are associated with a collection of phenotypes within electronic health record systems. npj Genomic Medicine 2022;7(1) View
  41. Hespe C, Giskes K, Harris M, Peiris D. Findings and lessons learnt implementing a cardiovascular disease quality improvement program in Australian primary care: a mixed method evaluation. BMC Health Services Research 2022;22(1) View
  42. Börner N, Schoenberg M, Pöschke P, Pöllmann B, Koch D, Drefs M, Koliogiannis D, Böhm C, Werner J, Guba M. A custom build multidimensional medical combined imputation application for a transplantation dataset. Computer Methods and Programs in Biomedicine Update 2022;2:100083 View
  43. de Bock E, Filipe M, Pronk A, Boerma D, Heikens J, Verheijen P, Vriens M, Richir M. Factors affecting 30-day postoperative complications after emergency surgery during the COVID-19 outbreak: A multicentre cohort study. International Journal of Surgery Open 2021;35:100397 View
  44. Canoy D, Harvey N, Prieto-Alhambra D, Cooper C, Meyer H, Åsvold B, Nazarzadeh M, Rahimi K. Elevated blood pressure, antihypertensive medications and bone health in the population: revisiting old hypotheses and exploring future research directions. Osteoporosis International 2022;33(2):315 View
  45. Azimi V, Zaydman M. Optimizing Equity: Working towards Fair Machine Learning Algorithms in Laboratory Medicine. The Journal of Applied Laboratory Medicine 2023;8(1):113 View
  46. Deng H, Sun M, Wang Y, Zeng J, Yuan T, Li T, Li D, Chen W, Zhou P, Wang Q, Jiang H. Evaluating machine learning models for sepsis prediction: A systematic review of methodologies. iScience 2022;25(1):103651 View
  47. van Os H, Kanning J, Wermer M, Chavannes N, Numans M, Ruigrok Y, van Zwet E, Putter H, Steyerberg E, Groenwold R. Developing Clinical Prediction Models Using Primary Care Electronic Health Record Data: The Impact of Data Preparation Choices on Model Performance. Frontiers in Epidemiology 2022;2 View
  48. Batra S, Khurana R, Khan M, Boulila W, Koubaa A, Srivastava P. A Pragmatic Ensemble Strategy for Missing Values Imputation in Health Records. Entropy 2022;24(4):533 View
  49. Sarwar T, Seifollahi S, Chan J, Zhang X, Aksakalli V, Hudson I, Verspoor K, Cavedon L. The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges. ACM Computing Surveys 2023;55(2):1 View
  50. Hsu T, Lin C, Kuijjer M. Learning from small medical data—robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder. Bioinformatics Advances 2023;3(1) View
  51. Samad M, Abrar S, Diawara N. Missing value estimation using clustering and deep learning within multiple imputation framework. Knowledge-Based Systems 2022;249:108968 View
  52. Arena P, Mo J, Liu Q, Zhou X, Gong R, Wentworth C, Murugesan S, Huang K. The incidence of acute myocardial infarction after elective spinal fusions or joint replacement surgery in the United States: a large-scale retrospective observational cohort study in 322,585 patients. Patient Safety in Surgery 2021;15(1) View
  53. de Bock E, Filipe M, Herman E, Pronk A, Boerma D, Heikens J, Verheijen P, Vriens M, Richir M. Risk factors of postoperative intensive care unit admission during the COVID-19 pandemic: A multicentre retrospective cohort study. International Journal of Surgery Open 2023;55:100620 View
  54. Sanchez-Pinto L, Bhavani S, Atreya M, Sinha P. Leveraging Data Science and Novel Technologies to Develop and Implement Precision Medicine Strategies in Critical Care. Critical Care Clinics 2023;39(4):627 View
  55. Zhu S, Zheng W, Pang H. CPAE: Contrastive predictive autoencoder for unsupervised pre-training in health status prediction. Computer Methods and Programs in Biomedicine 2023;234:107484 View
  56. McConnell K, Hajat A, Sack C, Mooney S, Khosropour C. Association between any underlying health condition and COVID-19-associated hospitalization by age group, Washington State, 2020–2021: a retrospective cohort study. BMC Infectious Diseases 2023;23(1) View
  57. Jia Y, Liu Z, Guo J, He C, Zhou X, Xue M, Nie T, Sun T, Kang J, Lu Q, Jiang L, Liu S. Machine Learning and Bioinformatics Analysis for Laboratory Data in Pan‐Cancers Detection. Advanced Intelligent Systems 2023;5(12) View
  58. van Os H, Kanning J, Ferrari M, Bonten T, Kist J, Vos H, Vos R, Putter H, Groenwold R, Wermer M. Added Predictive Value of Female-Specific Factors and Psychosocial Factors for the Risk of Stroke in Women Under 50. Neurology 2023;101(8) View
  59. Xu D, Hu P, Fang X. Deep Learning-Based Imputation Method to Enhance Crowdsourced Data on Online Business Directory Platforms for Improved Services. Journal of Management Information Systems 2023;40(2):624 View
  60. Başakın E, Ekmekcioğlu Ö, Özger M. Providing a comprehensive understanding of missing data imputation processes in evapotranspiration-related research: a systematic literature review. Hydrological Sciences Journal 2023;68(14):2089 View
  61. Sondhi A, Weberpals J, Yerram P, Jiang C, Taylor M, Samant M, Cherng S. A systematic approach towards missing lab data in electronic health records: A case study in non‐small cell lung cancer and multiple myeloma. CPT: Pharmacometrics & Systems Pharmacology 2023;12(9):1201 View
  62. Komamine M, Fujimura Y, Omiya M, Sato T. Dealing with missing data in laboratory test results used as a baseline covariate: results of multi-hospital cohort studies utilizing a database system contributing to MID-NET® in Japan. BMC Medical Informatics and Decision Making 2023;23(1) View
  63. Tsiampalis T, Panagiotakos D. Methodological issues of the electronic health records’ use in the context of epidemiological investigations, in light of missing data: a review of the recent literature. BMC Medical Research Methodology 2023;23(1) View
  64. Blythe R, Parsons R, Barnett A, McPhail S, White N. Vital signs-based deterioration prediction model assumptions can lead to losses in prediction performance. Journal of Clinical Epidemiology 2023;159:106 View
  65. McConnell K, Hajat A, Sack C, Mooney S, Khosropour C. Associations Between Insurance, Race and Ethnicity, and COVID-19 Hospitalization Beyond Underlying Health Conditions: A Retrospective Cohort Study. AJPM Focus 2023;2(3):100120 View
  66. Beaulieu-Jones B, Villamar M, Scordis P, Bartmann A, Ali W, Wissel B, Alsentzer E, de Jong J, Patra A, Kohane I. Predicting seizure recurrence after an initial seizure-like episode from routine clinical notes using large language models: a retrospective cohort study. The Lancet Digital Health 2023;5(12):e882 View
  67. Lu H, Chou C, Lin Y, Uchiyama S, Terao C, Wang Y, Yang J, Liu T, Wong H, Chen S, Tsai F. The Genome-wide Association Study of Serum IgE Levels Demonstrated a Shared Genetic Background in Allergic Diseases. Clinical Immunology 2024;260:109897 View
  68. Yan C, Zhang Z, Nyemba S, Li Z. Generating Synthetic Electronic Health Record Data Using Generative Adversarial Networks: Tutorial. JMIR AI 2024;3:e52615 View
  69. Huguet N, Chen J, Parikh R, Marino M, Flocke S, Likumahuwa-Ackman S, Bekelman J, DeVoe J. Applying Machine Learning Techniques to Implementation Science. Online Journal of Public Health Informatics 2024;16:e50201 View
  70. Pereira R, Abreu P, Rodrigues P, Figueiredo M. Imputation of data Missing Not at Random: Artificial generation and benchmark analysis. Expert Systems with Applications 2024;249:123654 View
  71. Beaulieu-Jones B, Frau F, Bozzi S, Chandross K, Peterschmitt M, Cohen C, Coulovrat C, Kumar D, Kruger M, Lipnick S, Fitzsimmons L, Kohane I, Scherzer C. Disease progression strikingly differs in research and real-world Parkinson’s populations. npj Parkinson's Disease 2024;10(1) View
  72. Harder A, Olbricht G, Ekuma G, Hier D, Obafemi-Ajayi T. Multiple Imputation for Robust Cluster Analysis to Address Missingness in Medical Data. IEEE Access 2024;12:42974 View
  73. Kang H. On operational definitions of mortality. Kidney Research and Clinical Practice 2024;43(2):131 View
  74. Tiwaskar S, Rashid M, Gokhale P. Impact of machine learning-based imputation techniques on medical datasets- a comparative analysis. Multimedia Tools and Applications 2024 View
  75. Afkanpour M, Hosseinzadeh E, Tabesh H. Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review. BMC Medical Research Methodology 2024;24(1) View
  76. Nopsopon T, Brown A, Hahn G, Rank M, Huybrechts K, Akenroye A. Temporal variation in the effectiveness of biologics in asthma: Effect modification by changing patient characteristics. Respiratory Medicine 2024;234:107802 View
  77. Engelberg-Cook E, Shah J, Teixeira da Silva Hucke A, Vera-Garcia D, Dagher J, Donahue M, Belzil V, Oskarsson B. Prognostic Factors and Epidemiology of Amyotrophic Lateral Sclerosis in Southeastern United States. Mayo Clinic Proceedings: Innovations, Quality & Outcomes 2024;8(5):482 View
  78. Fernandes M, Westover M, Singhal A, Zafar S. Automated Extraction of Stroke Severity From Unstructured Electronic Health Records Using Natural Language Processing. Journal of the American Heart Association 2024 View
  79. Hu Y, Wu R, Lin Y, Lin T. A novel MissForest-based missing values imputation approach with recursive feature elimination in medical applications. BMC Medical Research Methodology 2024;24(1) View
  80. Ren W, Liu Z, Wu Y, Zhang Z, Hong S, Liu H. Moving Beyond Medical Statistics: A Systematic Review on Missing Data Handling in Electronic Health Records. Health Data Science 2024;4 View

Books/Policy Documents

  1. Allaart C, Mondrejevski L, Papapetrou P. Artificial Intelligence Applications and Innovations. View
  2. McGrath L, Wong J. Pragmatic Randomized Clinical Trials. View
  3. Simard J, Chaichian Y, Falasinnu T. Outcome Measures and Metrics in Systemic Lupus Erythematosus. View
  4. Tiwaskar S, Gokhale P. Artificial Intelligence for Innovative Healthcare Informatics. View
  5. Hugo J, Ibing S, Borchert F, Sachs J, Cho J, Ungaro R, Böttinger E. Artificial Intelligence in Medicine. View
  6. Pereira R, Rodrigues P, Figueiredo M, Abreu P. Computational Science – ICCS 2023. View
  7. Gunn K, Lu W, Song R. Statistics in Precision Health. View
  8. Kaur M, Singh V, Khan A, Sharma K, Mendoonca Junior F, Nayarisseri A. Deep Learning in Genetics and Genomics. View