Published in Vol 10, No 4 (2022): April

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/35734.
Utility Metrics for Evaluating Synthetic Health Data Generation Methods: Validation Study

Journals

  1. El Emam K, Mosquera L, Fang X. Validating a membership disclosure metric for synthetic health data. JAMIA Open 2022;5(4)
  2. Zhang Z, Yan C, Malin B. Keeping synthetic patients on track: feedback mechanisms to mitigate performance drift in longitudinal health data simulation. Journal of the American Medical Informatics Association 2022;29(11):1890
  3. Rajotte J, Bergen R, Buckeridge D, El Emam K, Ng R, Strome E. Synthetic data as an enabler for machine learning applications in medicine. iScience 2022;25(11):105331
  4. Gonzalez-Abril L, Angulo C, Ortega J, Lopez-Guerra J. Statistical Validation of Synthetic Data for Lung Cancer Patients Generated by Using Generative Adversarial Networks. Electronics 2022;11(20):3277
  5. Mosquera L, El Emam K, Ding L, Sharma V, Zhang X, Kababji S, Carvalho C, Hamilton B, Palfrey D, Kong L, Jiang B, Eurich D. A method for generating synthetic longitudinal health data. BMC Medical Research Methodology 2023;23(1)
  6. D’Amico S, Dall’Olio D, Sala C, Dall’Olio L, Sauta E, Zampini M, Asti G, Lanino L, Maggioni G, Campagna A, Ubezio M, Russo A, Bicchieri M, Riva E, Tentori C, Travaglino E, Morandini P, Savevski V, Santoro A, Prada-Luengo I, Krogh A, Santini V, Kordasti S, Platzbecker U, Diez-Campelo M, Fenaux P, Haferlach T, Castellani G, Della Porta M. Synthetic Data Generation by Artificial Intelligence to Accelerate Research and Precision Medicine in Hematology. JCO Clinical Cancer Informatics 2023;(7)
  7. El Emam K. Status of Synthetic Data Generation for Structured Health Data. JCO Clinical Cancer Informatics 2023;(7)
  8. Pun F, Ozerov I, Zhavoronkov A. AI-powered therapeutic target discovery. Trends in Pharmacological Sciences 2023;44(9):561
  9. Lenatti M, Paglialonga A, Orani V, Ferretti M, Mongelli M. Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models. IEEE Journal of Biomedical and Health Informatics 2023;27(8):3760
  10. Alloza C, Knox B, Raad H, Aguilà M, Coakley C, Mohrova Z, Boin É, Bénard M, Davies J, Jacquot E, Lecomte C, Fabre A, Batech M. A Case for Synthetic Data in Regulatory Decision‐Making in Europe. Clinical Pharmacology & Therapeutics 2023;114(4):795
  11. Ali Q. A trade-off between farm production and flood alleviation using land use tillage preferences as a natural flood management (NFM) strategy. Smart Agricultural Technology 2023;6:100361
  12. He S, Chong P, Yoon B, Chung P, Chen D, Marzouk S, Black K, Sharp W, Safari P, Goldstein J, Raja A, Lee J. Entropy removal of medical diagnostics. Scientific Reports 2024;14(1)
  13. Ghosheh G, Li J, Zhu T. A Survey of Generative Adversarial Networks for Synthesizing Structured Electronic Health Records. ACM Computing Surveys 2024;56(6):1
  14. Ling X, Menzies T, Hazard C, Shu J, Beel J. Trading Off Scalability, Privacy, and Performance in Data Synthesis. IEEE Access 2024;12:26642
  15. Vallevik V, Babic A, Marshall S, Elvatun S, Brøgger H, Alagaratnam S, Edwin B, Veeraragavan N, Befring A, Nygård J. Can I trust my fake data – A comprehensive quality assessment framework for synthetic tabular data in healthcare. International Journal of Medical Informatics 2024;185:105413
  16. El Emam K, Mosquera L, Fang X, El-Hussuna A. An evaluation of the replicability of analyses using synthetic health data. Scientific Reports 2024;14(1)
  17. Akiya I, Ishihara T, Yamamoto K. A Comparison of Synthetic Data Generation Techniques for Control Group Survival Data in Oncology Clinical Trials: Simulation Study (Preprint). JMIR Medical Informatics 2023
  18. Gangwal A, Ansari A, Ahmad I, Azad A, Wan Sulaiman W. Current strategies to address data scarcity in artificial intelligence-based drug discovery: A comprehensive review. Computers in Biology and Medicine 2024;179:108734
  19. Ntampakis N, Argyriou V, Diamantaras K, Goulianas K, Sarigiannidis P, Siniosoglou I. Introducing SPINE: A Holistic Approach to Synthetic Pulmonary Imaging Evaluation Through End-to-End Data and Model Management. IEEE Open Journal of Engineering in Medicine and Biology 2024;5:576
  20. Chaynikov Y, Sudakov V. On the estimation of integral risk of predictor Lipschitz functions in machine learning models. Keldysh Institute Preprints 2024;(53):1
  21. Kim J, Choo H, Shin S, Song K. Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network. Scientific Reports 2024;14(1)
  22. Benani A, Vibert J, Demuth S. Données synthétiques en médecine : génération, évaluation et limites. médecine/sciences 2024;40(8-9):661
  23. Wang E, Mott K, Zhang H, Gazit S, Chodick G, Burcu M. Validation Assessment of Privacy‐Preserving Synthetic Electronic Health Record Data: Comparison of Original Versus Synthetic Data on Real‐World COVID‐19 Vaccine Effectiveness. Pharmacoepidemiology and Drug Safety 2024;33(10)
  24. Smolyak D, Bjarnadóttir M, Crowley K, Agarwal R. Large language models and synthetic health data: progress and prospects. JAMIA Open 2024;7(4)
  25. Lautrup A, Hyrup T, Zimek A, Schneider-Kamp P. Systematic Review of Generative Modelling Tools and Utility Metrics for Fully Synthetic Tabular Data. ACM Computing Surveys 2025;57(4):1
  26. Lautrup A, Hyrup T, Zimek A, Schneider-Kamp P. Syntheval: a framework for detailed utility and privacy evaluation of tabular synthetic data. Data Mining and Knowledge Discovery 2025;39(1)

Books/Policy Documents

  1. Bullward A, Aljebreen A, Coles A, McInerney C, Johnson O. Process Mining Workshops.
  2. Kvak D, Březinová E, Biroš M, Hrubý R. Medical Imaging and Computer-Aided Diagnosis.
  3. Martiri E. Data Envelopment Analysis (DEA) Methods for Maximizing Efficiency.