TY  - JOUR
AU  - Zheng, Le
AU  - Wang, Yue
AU  - Hao, Shiying
AU  - Shin, Andrew Y
AU  - Jin, Bo
AU  - Ngo, Anh D
AU  - Jackson-Browne, Medina S
AU  - Feller, Daniel J
AU  - Fu, Tianyun
AU  - Zhang, Karena
AU  - Zhou, Xin
AU  - Zhu, Chunqing
AU  - Dai, Dorothy
AU  - Yu, Yunxian
AU  - Zheng, Gang
AU  - Li, Yu-Ming
AU  - McElhinney, Doff B
AU  - Culver, Devore S
AU  - Alfreds, Shaun T
AU  - Stearns, Frank
AU  - Sylvester, Karl G
AU  - Widen, Eric
AU  - Ling, Xuefeng Bruce
PY  - 2016
DA  - 2016/11/11
TI  - Web-based Real-Time Case Finding for the Population Health Management of Patients With Diabetes Mellitus: A Prospective Validation of the Natural Language Processing–Based Algorithm With Statewide Electronic Medical Records
JO  - JMIR Med Inform
SP  - e37
VL  - 4
IS  - 4
KW  - electronic medical record
KW  - natural language processing
KW  - diabetes mellitus
KW  - data mining
AB  - Background: Diabetes case finding based on structured medical records does not fully identify diabetic patients whose medical histories related to diabetes are available in the form of free text. Manual chart reviews have been used but involve high labor costs and long latency. Objective: This study developed and tested a Web-based diabetes case finding algorithm using both structured and unstructured electronic medical records (EMRs). Methods: This study was based on the health information exchange (HIE) EMR database that covers almost all health facilities in the state of Maine, United States. Using narrative clinical notes, a Web-based natural language processing (NLP) case finding algorithm was retrospectively (July 1, 2012, to June 30, 2013) developed with a random subset of HIE-associated facilities, which was then blind tested with the remaining facilities. The NLP-based algorithm was subsequently integrated into the HIE database and validated prospectively (July 1, 2013, to June 30, 2014). Results: Of the 935,891 patients in the prospective cohort, 64,168 diabetes cases were identified using diagnosis codes alone. Our NLP-based case finding algorithm prospectively found an additional 5756 uncodified cases (5756/64,168, 8.97% increase) with a positive predictive value of .90. Of the 21,720 diabetic patients identified by both methods, 6616 patients (6616/21,720, 30.46%) were identified by the NLP-based algorithm before a diabetes diagnosis was noted in the structured EMR (mean time difference = 48 days). Conclusions: The online NLP algorithm was effective in identifying uncodified diabetes cases in real time, leading to a significant improvement in diabetes case finding. The successful integration of the NLP-based case finding algorithm into the Maine HIE database indicates a strong potential for application of this novel method to achieve a more complete ascertainment of diagnoses of diabetes mellitus.
SN  - 2291-9694
UR  - http://medinform.jmir.org/2016/4/e37/
UR  - https://doi.org/10.2196/medinform.6328
UR  - http://www.ncbi.nlm.nih.gov/pubmed/27836816
DO  - 10.2196/medinform.6328
ID  - info:doi/10.2196/medinform.6328
ER  - 