TY  - JOUR
AU  - Wu, Patrick
AU  - Gifford, Aliya
AU  - Meng, Xiangrui
AU  - Li, Xue
AU  - Campbell, Harry
AU  - Varley, Tim
AU  - Zhao, Juan
AU  - Carroll, Robert
AU  - Bastarache, Lisa
AU  - Denny, Joshua C
AU  - Theodoratou, Evropi
AU  - Wei, Wei-Qi
PY  - 2019
DA  - 2019/11/29
TI  - Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation
JO  - JMIR Med Inform
SP  - e14325
VL  - 7
IS  - 4
KW  - electronic health record
KW  - genome-wide association study
KW  - phenome-wide association study
KW  - phenotyping
KW  - medical informatics applications
KW  - data science
AB  - Background: The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR). Objective: The goal of this paper was to develop and perform an initial evaluation of maps from the International Classification of Diseases, 10th Revision (ICD-10) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes to phecodes. Methods: We mapped ICD-10 and ICD-10-CM codes to phecodes using a number of methods and resources, such as concept relationships and explicit mappings from the Centers for Medicare & Medicaid Services, the Unified Medical Language System, Observational Health Data Sciences and Informatics, Systematized Nomenclature of Medicine-Clinical Terms, and the National Library of Medicine. We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM phecode map by investigating phenotype reproducibility and conducting a PheWAS. Results: We mapped >75% of ICD-10 and ICD-10-CM codes to phecodes. Of the unique codes observed in the UKBB (ICD-10) and VUMC (ICD-10-CM) cohorts, >90% were mapped to phecodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease for phenotypes sourced from the ICD-10-CM phecode map. Using the ICD-9-CM and ICD-10-CM maps, we conducted a PheWAS with a Lipoprotein(a) genetic variant, rs10455872, which replicated two known genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P<.001; odds ratio (OR) 1.60 [95% CI 1.43-1.80] vs ICD-10-CM: P<.001; OR 1.60 [95% CI 1.43-1.80]) and chronic ischemic heart disease (ICD-9-CM: P<.001; OR 1.56 [95% CI 1.35-1.79] vs ICD-10-CM: P<.001; OR 1.47 [95% CI 1.22-1.77]). Conclusions: This study introduces the beta versions of ICD-10 and ICD-10-CM to phecode maps that enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for PheWAS in the EHR.
SN  - 2291-9694
UR  - http://medinform.jmir.org/2019/4/e14325/
UR  - https://doi.org/10.2196/14325
UR  - http://www.ncbi.nlm.nih.gov/pubmed/31553307
DO  - 10.2196/14325
ID  - info:doi/10.2196/14325
ER  - 