Journal Description

JMIR Medical Informatics (JMI; ISSN 2291-9694; Impact Factor 3.188; Editor-in-Chief: Christian Lovis, MD, MPH, FACMI) is a PubMed/SCIE-indexed, top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation.

Published by JMIR Publications, JMIR Medical Informatics focuses on applied, translational research, with a broad readership that includes clinicians, CIOs, engineers, and industry and health informatics professionals.

JMIR Medical Informatics adheres to rigorous quality standards, involving a rapid and thorough peer-review process, professional copyediting, professional production of PDF, XHTML, and XML proofs (ready for deposit in PubMed Central/PubMed).


Recent Articles:

  • A Hematologist-Level Deep Learning Algorithm (BMSNet) for Assessing the Morphologies of Single Nuclear Balls in Bone Marrow Smears: Algorithm Development


    Background: Bone marrow aspiration and biopsy remain the gold standard for the diagnosis of hematological diseases despite the development of flow cytometry (FCM) and molecular and gene analyses. However, the interpretation of the results is laborious and operator dependent. Furthermore, the obtained results exhibit interobserver and intraobserver variation among specialists. Therefore, it is important to develop a more objective and automated analysis system. Several deep learning models have been developed and applied in medical image analysis but not in the field of hematological histology, especially for bone marrow smear applications. Objective: The aim of this study was to develop a deep learning model (BMSNet) for assisting hematologists in the interpretation of bone marrow smears for faster diagnosis and disease monitoring. Methods: From January 1, 2016, to December 31, 2018, 122 bone marrow smears were photographed and divided into a development cohort (N=42), a validation cohort (N=70), and a competition cohort (N=10). The development cohort included 17,319 annotated cells from 291 high-resolution photos. In total, 20 photos were taken for each patient in the validation cohort and the competition cohort. This study included eight annotation categories: erythroid, blasts, myeloid, lymphoid, plasma cells, monocyte, megakaryocyte, and unable to identify. BMSNet is a convolutional neural network with the YOLO v3 architecture, which detects and classifies single cells in a single model. Six visiting staff members participated in a human-machine competition, and the results from the FCM were regarded as the ground truth. Results: In the development cohort, according to 6-fold cross-validation, the average precision of the bounding box prediction without consideration of the classification was 67.4%. After removing the bounding box prediction error, the precision and recall of BMSNet were similar to those of the hematologists in most categories.
In detecting more than 5% of blasts in the validation cohort, the area under the curve (AUC) of BMSNet (0.948) was higher than the AUC of the hematologists (0.929) but lower than the AUC of the pathologists (0.985). In detecting more than 20% of blasts, the AUCs of the hematologists (0.981) and pathologists (0.980) were similar and were higher than the AUC of BMSNet (0.942). Further analysis showed that the performance difference could be attributed to the myelodysplastic syndrome cases. In the competition cohort, the mean value of the correlations between BMSNet and FCM was 0.960, and the mean values of the correlations between the visiting staff and FCM ranged between 0.952 and 0.990. Conclusions: Our deep learning model can assist hematologists in interpreting bone marrow smears by facilitating and accelerating the detection of hematopoietic cells. However, a detailed morphological interpretation still requires trained hematologists.
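The bounding-box average precision reported above depends on how predicted boxes are matched to annotated cells. The study's scoring code is not reproduced here, but the matching step can be sketched in plain Python as greedy intersection-over-union (IoU) matching; the 0.5 IoU threshold and the greedy strategy are illustrative assumptions, not details taken from the paper:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(preds, truths, thr=0.5):
    """Greedily match predictions to annotations; returns (tp, fp, fn)."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    fp = len(preds) - tp   # predictions with no annotated cell
    fn = len(unmatched)    # annotated cells the model missed
    return tp, fp, fn
```

Per-category precision and recall then follow as tp/(tp+fp) and tp/(tp+fn).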

  • Identifying the Characteristics of Patients With Cervical Degenerative Disease for Surgical Treatment From 17-Year Real-World Data: Retrospective Study


    Background: Real-world data (RWD) play important roles in evaluating treatment effectiveness in clinical research. In recent decades, with the development of more accurate diagnoses and better treatment options, inpatient surgery for cervical degenerative disease (CDD) has become increasingly common, yet little is known about the variations in patient demographic characteristics associated with surgical treatment. Objective: This study aimed to identify the characteristics of surgical patients with CDD using RWD collected from electronic medical records. Methods: This study included 20,288 inpatient surgeries registered from January 1, 2000, to December 31, 2016, among patients aged 18 years or older, and demographic data (eg, age, sex, admission time, surgery type, treatment, discharge diagnosis, and discharge time) were collected at baseline. Regression modeling and time series analysis were conducted to analyze the trend in each variable (total number of inpatient surgeries, mean age at surgery, sex, and average length of stay). A P value <.01 was considered statistically significant. The RWD in this study were collected from the Orthopedic Department at Peking University Third Hospital, and the study was approved by the institutional review board. Results: Over the last 17 years, the number of inpatient surgeries increased annually by an average of 11.13%, with some fluctuations. In total, 76.4% (15,496/20,288) of the surgeries were performed in patients with CDD aged 41 to 65 years, and there was no significant change in the mean age at surgery. More male patients were observed: the proportions of male and female patients who underwent surgery were 64.7% (13,126/20,288) and 35.3% (7162/20,288), respectively. Interestingly, the proportion of surgeries performed among female patients showed an increasing trend (P<.001), leading to a narrowing sex gap.
The average length of stay for surgical treatment decreased from 21 days to 6 days and showed a steady decline from 2012 onward. Conclusions: The RWD showed its capability in supporting clinical research. The mean age at surgery for CDD was consistent in the real-world population, the proportion of female patients increased, and the average length of stay decreased over time. These results may be valuable to guide resource allocation for the early prevention and diagnosis, as well as surgical treatment of CDD.
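The 11.13% average annual increase quoted above is a mean of year-over-year growth rates. As a minimal sketch (assuming simple year-over-year percentages rather than a fitted regression slope), the calculation looks like:

```python
def annual_growth_rates(counts_by_year):
    """Year-over-year growth rates from a {year: surgery_count} mapping."""
    years = sorted(counts_by_year)
    return [(counts_by_year[b] - counts_by_year[a]) / counts_by_year[a]
            for a, b in zip(years, years[1:])]

def mean_annual_growth(counts_by_year):
    """Average of the year-over-year growth rates."""
    rates = annual_growth_rates(counts_by_year)
    return sum(rates) / len(rates)
```

With synthetic counts {2000: 100, 2001: 110, 2002: 121}, both yearly rates are 10% and the mean annual growth is 0.10.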

  • Critical Predictors for the Early Detection of Conversion From Unipolar Major Depressive Disorder to Bipolar Disorder: Nationwide Population-Based...


    Background: Unipolar major depressive disorder (MDD) and bipolar disorder are two major mood disorders. The two disorders have different treatment strategies and prognoses. However, bipolar disorder may begin with depression and could be diagnosed as MDD in the initial stage, which may later contribute to treatment failure. Previous studies indicated that a high proportion of patients diagnosed with MDD will develop bipolar disorder over time. This kind of hidden bipolar disorder may contribute to the treatment resistance observed in patients with MDD. Objective: In this population-based study, our aim was to investigate the rate and risk factors of a diagnostic change from unipolar MDD to bipolar disorder during a 10-year follow-up. Furthermore, a risk stratification model was developed for MDD-to-bipolar disorder conversion. Methods: We conducted a retrospective cohort study involving patients who were newly diagnosed with MDD between January 1, 2000, and December 31, 2004, by using the Taiwan National Health Insurance Research Database. All patients with depression were observed until (1) diagnosis of bipolar disorder by a psychiatrist, (2) death, or (3) December 31, 2013. All patients with depression were divided into the following two groups, according to whether bipolar disorder was diagnosed during the follow-up period: converted group and nonconverted group. Six groups of variables within the first 6 months of enrollment, including personal characteristics, physical comorbidities, psychiatric comorbidities, health care usage behaviors, disorder severity, and psychotropic use, were extracted and were included in a classification and regression tree (CART) analysis to generate a risk stratification model for MDD-to-bipolar disorder conversion. Results: Our study enrolled 2820 patients with MDD. During the follow-up period, 536 patients were diagnosed with bipolar disorder (conversion rate=19.0%). 
The CART method identified five variables (kinds of antipsychotics used within the first 6 months of enrollment, kinds of antidepressants used within the first 6 months of enrollment, total psychiatric outpatient visits, kinds of benzodiazepines used within one visit, and use of mood stabilizers) as significant predictors of the risk of bipolar disorder conversion. This risk CART was able to stratify patients into high-, medium-, and low-risk groups with regard to bipolar disorder conversion. In the high-risk group, 61.5%-100% of patients with depression eventually developed bipolar disorder. On the other hand, in the low-risk group, only 6.4%-14.3% of patients with depression developed bipolar disorder. Conclusions: The CART method identified five variables as significant predictors of bipolar disorder conversion. In a simple two- to four-step process, these variables permit the identification of patients with low, intermediate, or high risk of bipolar disorder conversion. The developed model can be applied to routine clinical practice for the early diagnosis of bipolar disorder.
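The abstract describes a two- to four-step CART stratification over five predictors but does not give the fitted split points. The sketch below shows only the shape of such a rule; the variable subset and every cut point are illustrative placeholders, not the study's fitted splits:

```python
def stratify_conversion_risk(n_antipsychotics, n_antidepressants, outpatient_visits):
    """Toy CART-style risk stratification for MDD-to-bipolar conversion.

    The arguments mirror three of the paper's five predictors (kinds of
    antipsychotics, kinds of antidepressants, total psychiatric outpatient
    visits within the first 6 months); the thresholds are invented for
    illustration only.
    """
    if n_antipsychotics >= 2:
        return "high"
    if n_antidepressants >= 3 or outpatient_visits >= 12:
        return "medium"
    return "low"
```

A fitted tree would replace these hand-written branches with splits learned from the cohort, but the decision path a clinician walks at the bedside has exactly this if/else shape.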

  • Low-Density Lipoprotein Cholesterol Target Attainment in Patients With Established Cardiovascular Disease: Analysis of Routine Care Data


    Background: Direct feedback on quality of care is one of the key features of a learning health care system (LHS), enabling health care professionals to improve upon the routine clinical care of their patients during practice. Objective: This study aimed to evaluate the potential of routine care data extracted from electronic health records (EHRs) to obtain reliable information on low-density lipoprotein cholesterol (LDL-c) management in cardiovascular disease (CVD) patients referred to a tertiary care center. Methods: We extracted all LDL-c measurements from the EHRs of patients with a history of CVD referred to the University Medical Center Utrecht. We assessed LDL-c target attainment at the time of referral and per year. In patients with multiple measurements, we analyzed LDL-c trajectories, truncated at 6 follow-up measurements. Lastly, we performed a logistic regression analysis to investigate factors associated with improvement of LDL-c at the next measurement. Results: Between February 2003 and December 2017, 250,749 LDL-c measurements were taken from 95,795 patients, of whom 23,932 had a history of CVD. At the time of referral, 51% of patients had not reached their LDL-c target. A large proportion of patients (55%) had no follow-up LDL-c measurements. Most of the patients with repeated measurements showed no change in LDL-c levels over time: the transition probability to remain in the same category was up to 0.84. Sequence clustering analysis showed more women (odds ratio 1.18, 95% CI 1.07-1.10) in the cluster with both the most off-target measurements and the LDL-c values furthest from the target. Timing of drug prescription was difficult to determine from our data, limiting the interpretation of results regarding medication management. Conclusions: Routine care data can be used to provide feedback on quality of care, such as LDL-c target attainment.
These routine care data show high off-target prevalence and little change in LDL-c over time. Registrations of diagnosis; follow-up trajectory, including primary and secondary care; and medication use need to be improved in order to enhance usability of the EHR system for adequate feedback.
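The "transition probability to remain in the same category" mentioned above can be estimated directly from per-patient sequences of LDL-c categories. A minimal sketch (the category labels are hypothetical, not the study's definitions):

```python
from collections import Counter, defaultdict

def transition_probs(sequences):
    """Empirical transition probabilities from lists of category sequences.

    Returns {from_category: {to_category: probability}} by counting
    consecutive pairs across all patients.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}
```

The diagonal entries of this matrix (same category to same category) are the "remain" probabilities the abstract reports as reaching 0.84.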

  • Clinical Text Data in Machine Learning: Systematic Review


    Background: Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data. Objective: The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice. Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics. Results: The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. 
Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance. Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.
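Active learning, noted above as a strategy against the annotation bottleneck, typically ranks unlabeled documents by model uncertainty and sends the most ambiguous ones for manual annotation. A minimal uncertainty-sampling sketch for a binary classifier (the predict_proba callable is an assumed interface, not any particular library's API):

```python
def uncertainty_sample(pool, predict_proba, batch_size):
    """Pick the unlabeled items whose predicted positive-class probability
    is closest to 0.5, i.e. where the model is least certain."""
    ranked = sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))
    return ranked[:batch_size]
```

Each round, the selected batch is annotated, the model is retrained, and the loop repeats, which is how a small labeling budget is spent where it helps the model most.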

  • Discrepancies in Written Versus Calculated Durations in Opioid Prescriptions: Pre-Post Study


    Background: The United States is in the midst of an opioid epidemic. Long-term use of opioid medications is associated with an increased risk of dependence. The US Centers for Disease Control and Prevention makes specific recommendations regarding opioid prescribing, including that prescription quantities should not exceed the intended duration of treatment. Objective: The purpose of this study was to determine if opioid prescription quantities written at our institution exceed intended duration of treatment and whether enhancements to our electronic health record system improved any discrepancies. Methods: We examined the opioid prescriptions written at our institution for a 22-month period. We examined the duration of treatment documented in the prescription itself and calculated a duration based on the quantity of tablets and doses per day. We determined whether requiring documentation of the prescription duration affected these outcomes. Results: We reviewed 72,314 opioid prescriptions, of which 16.96% had a calculated duration that was greater than what was documented in the prescription. Making the duration a required field significantly reduced this discrepancy (17.95% vs 16.21%, P<.001) but did not eliminate it. Conclusions: Health information technology vendors should develop tools that, by default, accurately represent prescription durations and/or modify doses and quantities dispensed based on provider-entered durations. This would potentially reduce unintended prolonged opioid use and reduce the potential for long-term dependence.
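The discrepancy measured in this study is between the duration documented on the prescription and the duration implied by the quantity of tablets and doses per day. A minimal sketch of that check (rounding up to whole days is an assumption; the paper's exact calculation is not shown):

```python
import math

def calculated_duration_days(quantity, doses_per_day):
    """Days of treatment implied by the dispensed quantity, rounded up
    to whole days (the rounding choice is an assumption)."""
    return math.ceil(quantity / doses_per_day)

def exceeds_documented(quantity, doses_per_day, documented_days):
    """True when the calculated duration is longer than the duration
    written on the prescription itself."""
    return calculated_duration_days(quantity, doses_per_day) > documented_days
```

For example, 60 tablets at 4 doses per day implies 15 days of treatment, which exceeds a documented 10-day course.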

  • Towards an Adoption Framework for Patient Access to Electronic Health Records: Systematic Literature Mapping Study


    Background: Patient access to electronic health records (EHRs) is associated with increased patient engagement and health care quality outcomes. However, the adoption of patient portals and personal health records (PHRs) that facilitate this access is impeded by barriers. The Clinical Adoption Framework (CAF) has been developed to analyze EHR adoption, but this framework does not consider the patient as an end-user. Objective: We aim to extend the scope of the CAF to patient access to EHRs, develop guidance documentation for the application of the CAF, and assess the interrater reliability. Methods: We systematically reviewed existing systematic reviews on patients' access to EHRs and PHRs. Results of each review were mapped to one of the 43 CAF categories. Categories were iteratively adapted when needed. We measured the interrater reliability with Cohen’s unweighted kappa and statistics regarding the agreement among reviewers on mapping quotes of the reviews to different CAF categories. Results: We further defined the framework’s inclusion and exclusion criteria for 33 of the 43 CAF categories and achieved a moderate agreement among the raters, which varied between categories. Conclusions: In the reviews, categories about people, organization, system quality, system use, and the net benefits of system use were addressed more often than those about international and regional information and communication technology infrastructures, standards, politics, incentive programs, and social trends. Categories that were addressed less might have been underdefined in this study. The guidance documentation we developed can be applied to systematic literature reviews and implementation studies, patient and informal caregiver access to EHRs, and the adoption of PHRs.
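Interrater reliability via Cohen's unweighted kappa, as used above, compares observed agreement between two raters with the agreement expected by chance. A small self-contained implementation:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's unweighted kappa for two equal-length lists of labels."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled the same.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    cats = set(rater_a) | set(rater_b)
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in cats)
    return (p_o - p_e) / (1 - p_e)
```

A value around 0.41-0.60 is conventionally read as "moderate" agreement, the range the authors report.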

  • Peak Outpatient and Emergency Department Visit Forecasting for Patients With Chronic Respiratory Diseases Using Machine Learning Methods: Retrospective...


    Background: The overcrowding of hospital outpatient and emergency departments (OEDs) due to chronic respiratory diseases in certain weather or environmental pollution conditions degrades the quality of medical care and even limits its availability. Objective: To help OED managers schedule medical resource allocation during times of excessive health care demand following short-term fluctuations in air pollution and weather, we employed machine learning (ML) methods to predict peak OED arrivals of patients with chronic respiratory diseases. Methods: In this paper, we first identified 13,218 visits from patients with chronic respiratory diseases to OEDs in hospitals from January 1, 2016, to December 31, 2017. Then, we divided the data into three datasets: weather-based visits, air quality-based visits, and weather air quality-based visits. Finally, we developed ML methods to predict the peak events (peak demand days) of patients with chronic respiratory diseases (eg, asthma, respiratory infection, and chronic obstructive pulmonary disease) visiting OEDs, using the three weather and environmental pollution datasets from Guangzhou, China. Results: The adaptive boosting-based neural networks, tree bag, and random forest achieved the largest areas under the receiver operating characteristic curve, 0.698, 0.714, and 0.809, on the air quality dataset, the weather dataset, and the weather air quality dataset, respectively. Overall, random forests reached the best classification prediction performance. Conclusions: The proposed ML methods may act as a useful tool to adapt medical services in advance by predicting peak OED arrivals. Further, the developed ML methods are generic enough to cope with similar medical scenarios, provided that the data are available.
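Predicting "peak demand days" first requires turning daily visit counts into binary labels. The abstract does not state the peak definition used, so the sketch below uses a quantile cutoff as one plausible choice:

```python
def peak_day_labels(daily_visits, quantile=0.9):
    """Label a day as a peak event when its visit count exceeds the
    given empirical quantile of all days (the 0.9 cutoff is an
    illustrative assumption, not the paper's definition)."""
    ranked = sorted(daily_visits)
    cutoff = ranked[int(quantile * (len(ranked) - 1))]
    return [v > cutoff for v in daily_visits]
```

These labels, paired with same-day or lagged weather and air quality features, form the classification dataset a random forest would be trained on.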

  • Predicting Adverse Outcomes for Febrile Patients in the Emergency Department Using Sparse Laboratory Data: Development of a Time Adaptive Model


    Background: A timely decision in the initial stages for patients with an acute illness is important. However, only a few studies have determined the prognosis of patients based on insufficient laboratory data during the initial stages of treatment. Objective: This study aimed to develop and validate time adaptive prediction models to predict the severity of illness in the emergency department (ED) using highly sparse laboratory test data (test order status and test results) and a machine learning approach. Methods: This retrospective study used ED data from a tertiary academic hospital in Seoul, Korea. Two different models were developed based on laboratory test data: order status only (OSO) and order status and results (OSR) models. A binary composite adverse outcome was used, including mortality or hospitalization in the intensive care unit. Both models were evaluated using various performance criteria, including the area under the receiver operating characteristic curve (AUC) and balanced accuracy (BA). Clinical usefulness was examined by determining the positive likelihood ratio (PLR) and negative likelihood ratio (NLR). Results: Of 9491 eligible patients in the ED (mean age, 55.2 years, SD 17.7 years; 4839/9491, 51.0% women), the model development cohort and validation cohort included 6645 and 2846 patients, respectively. The OSR model generally exhibited better performance (AUC=0.88, BA=0.81) than the OSO model (AUC=0.80, BA=0.74). The OSR model was more informative than the OSO model to predict patients at low or high risk of adverse outcomes (P<.001 for differences in both PLR and NLR). Conclusions: Early-stage adverse outcomes for febrile patients could be predicted using machine learning models of highly sparse data including test order status and laboratory test results. 
This prediction tool could help medical professionals who are simultaneously treating the same patient share information, lead dynamic communication, and consequently prevent medical errors.
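The OSO and OSR models above differ only in their inputs: order status alone versus order status plus results. The exact feature encoding is not given in the abstract; one minimal encoding over a hypothetical test panel might look like:

```python
def oso_features(orders, panel):
    """Order status only: 1 if the test was ordered for this visit."""
    return [int(t in orders) for t in panel]

def osr_features(orders, results, panel):
    """Order status and results: the status indicators plus the numeric
    result for each test (0.0 when not ordered or still pending; the
    missing-value fill is an assumption)."""
    return oso_features(orders, panel) + [results.get(t, 0.0) for t in panel]
```

Encoding order status separately is what lets the model make a prediction in the early stage, before most results have come back.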

  • Insurance Customers’ Expectations for Sharing Health Data: Qualitative Survey Study


    Background: Insurance organizations are essential stakeholders in health care ecosystems. For addressing future health care needs, insurance companies require access to health data to deliver preventative and proactive digital health services to customers. However, extant research is limited in examining the conditions that incentivize health data sharing. Objective: This study aimed to (1) identify the expectations of insurance customers when sharing health data, (2) determine the perceived intrinsic value of health data, and (3) explore the conditions that aid in incentivizing health data sharing in the relationship between an insurance organization and its customer. Methods: A Web-based survey was distributed to randomly selected customers from a Finnish insurance organization through email. A single open-text answer was used for a qualitative data analysis through inductive coding, followed by a thematic analysis. Furthermore, the 4 constructs of commitment, power, reciprocity, and trust from the social exchange theory (SET) were applied as a framework. Results: From the 5000 customers invited to participate, we received 452 surveys (response rate: 9.0%). Customer characteristics were found to reflect customer demographics. Of the 452 surveys, 48 (10.6%) open-text responses were skipped by the customer, 57 (12.6%) customers had no expectations from sharing health data, and 44 (9.7%) customers preferred to abstain from a data sharing relationship. Using the SET framework, we found that customers expected different conditions to be fulfilled by their insurance provider based on the commitment, power, reciprocity, and trust constructs. Of the 452 customers who completed the surveys, 64 (14.2%) customers required that the insurance organization meets their data treatment expectations (commitment). 
Overall, 4.9% (22/452) of customers were concerned about their health data being used against them to profile their health, to increase insurance prices, or to deny health insurance claims (power). A total of 28.5% (129/452) of customers expected some form of benefit, such as personalized digital health services, and 29.9% (135/452) of customers expected finance-related compensation (reciprocity). Furthermore, 7.5% (34/452) of customers expected some form of empathy from the insurance organization through enhanced transparency or an emotional connection (trust). Conclusions: To aid in the design and development of digital health services, insurance organizations need to address the customers’ expectations when sharing their health data. We established the expectations of customers in the social exchange of health data and explored the perceived values of data as intangible goods. Actions by the insurance organization should aim to increase trust through a culture of transparency, commitment to treat health data in a prescribed manner, provide reciprocal benefits through digital health services that customers deem valuable, and assuage fears of health data being used to prevent providing insurance coverage or increase costs.

  • Toward Standardized Monitoring of Patients With Chronic Diseases in Primary Care Using Electronic Medical Records: Development of a Tool by Adapted Delphi...


    Background: Long-term care for patients with chronic diseases poses a huge challenge in primary care. There are deficits in care, especially regarding monitoring and creating structured follow-ups. Appropriate electronic medical records (EMR) could support this, but so far, no generic evidence-based template exists. Objective: The aim of this study is to develop an evidence-based standardized, generic template that improves the monitoring of patients with chronic conditions in primary care by means of an EMR. Methods: We used an adapted Delphi procedure to evaluate a structured set of evidence-based monitoring indicators for 5 highly prevalent chronic diseases (ie, diabetes mellitus type 2, asthma, arterial hypertension, chronic heart failure, and osteoarthritis). We assessed the indicators’ utility in practice and summarized them into a user-friendly layout. Results: This multistep procedure resulted in a monitoring tool consisting of condensed sets of indicators, which were divided into sublayers to maximize ergonomics. A cockpit serves as an overview of fixed goals and a set of procedures to facilitate disease management. An additional tab contains information on nondisease-specific indicators such as allergies and vital signs. Conclusions: Our generic template systematically integrates the existing scientific evidence for the standardized long-term monitoring of chronic conditions. It contains a user-friendly and clinically sensible layout. This template can improve the care for patients with chronic diseases when using EMRs in primary care.

  • Predicting Metabolic Syndrome With Machine Learning Models Using a Decision Tree Algorithm: Retrospective Cohort Study


    Background: Metabolic syndrome is a cluster of disorders that significantly influence the development and deterioration of numerous diseases. FibroScan is an ultrasound device that was recently shown to predict metabolic syndrome with moderate accuracy. However, previous research regarding prediction of metabolic syndrome in subjects examined with FibroScan has been mainly based on conventional statistical models. Alternatively, machine learning, whereby a computer algorithm learns from prior experience, has better predictive performance than conventional statistical modeling. Objective: We aimed to evaluate the accuracy of different decision tree machine learning algorithms to predict the state of metabolic syndrome in self-paid health examination subjects who were examined with FibroScan. Methods: Multivariate logistic regression was conducted for every known risk factor of metabolic syndrome. Principal components analysis was used to visualize the distribution of metabolic syndrome patients. We further applied various statistical machine learning techniques to visualize and investigate the pattern and relationship between metabolic syndrome and several risk variables. Results: Obesity, serum glutamic-oxaloacetic transaminase, serum glutamic pyruvic transaminase, controlled attenuation parameter score, and glycated hemoglobin emerged as significant risk factors in multivariate logistic regression. The area under the receiver operating characteristic curve values for classification and regression trees and for the random forest were 0.831 and 0.904, respectively. Conclusions: Machine learning technology facilitates the identification of metabolic syndrome in self-paid health examination subjects with high accuracy.
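The AUC values compared above (0.831 for classification and regression trees vs 0.904 for the random forest) can be computed without plotting a curve, using the rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A small self-contained sketch:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative one, with ties counting half (the Mann-Whitney view)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This O(n^2) pairwise form is fine for a sketch; production code would sort once and use ranks instead.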


Latest Submissions Open for Peer-Review:

  • Adverse Drug Event Detection From Clinical Notes in Electronic Health Records: A Joint Approach for Entity and Relation Extraction Based on Knowledge-aware Neural Attentive Deep Learning Models

    Date Submitted: Feb 25, 2020

    Open Peer Review Period: Feb 25, 2020 - Apr 21, 2020

    Background: An adverse drug event (ADE) is commonly defined as “an injury resulting from medical intervention related to a drug”. Providing information related to ADEs and alerting caregivers at the point-of-care can reduce the risk of prescription and diagnosis errors, and improve health outcomes. ADEs captured in Electronic Health Records (EHR) structured data, as either coded problems or allergies, are often incomplete, leading to underreporting. It is therefore important to develop capabilities to process unstructured EHR data in the form of clinical notes, which contain richer documentation of a patient’s adverse drug events. Several natural language processing (NLP) systems were previously proposed to automatically extract information related to ADEs. However, the results from these systems showed that significant improvement is still required for automatic extraction of ADEs from clinical notes. Objective: The objective of this study is to improve automatic extraction of ADEs and related information such as drugs and their reason for administration from patient clinical notes. Methods: This research was conducted using discharge summaries from the MIMIC-III database obtained through the National NLP Clinical Challenges (n2c2) annotated with Drugs, drug attributes (Strength, Form, Frequency, Route, Dosage, Duration), Adverse Drug Events, Reasons, and relations between drugs and other entities. We developed a deep learning–based system for extracting these drug-centric concepts and relations simultaneously using a joint method enhanced with contextualized embeddings, a position-attention mechanism, and knowledge representations. The joint method generated different sentence representations with respect to each drug, which were then used to extract related concepts and relations simultaneously. Contextualized representations trained on the MIMIC-III database were used to capture context-sensitive meanings of words.
The position-attention mechanism amplified benefits of the joint method by generating sentence representations that capture long-distance relations. Knowledge representations were obtained from graph embeddings created using the FAERS database to improve relation extraction, especially when contextual clues are insufficient. Results: Our system achieved new state-of-the-art results on the n2c2 dataset, with significant improvements in recognizing the crucial Drug-->Reason (F1 0.650 vs 0.579) and Drug-->ADE (0.490 vs 0.476) relations. Conclusions: We present a system for extracting drug-centric concepts and relations that outperformed current state-of-the-art results. We show that contextualized embeddings, position-attention mechanism, and knowledge graph embeddings effectively improve deep learning–based concept and relation extraction. This study demonstrates the further potential for deep learning–based methods to help extract real-world evidence from unstructured patient data for drug safety surveillance.
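A position-attention mechanism of the kind described above usually consumes token offsets relative to the target drug mention, so that each drug gets its own view of the sentence. As an illustrative sketch of that input step (the clipping window of 10 tokens is an assumption, not the submission's setting):

```python
def relative_positions(tokens, drug_index, max_dist=10):
    """Token offsets relative to the target drug mention, clipped to a
    window; a typical input feature for a position-attention mechanism."""
    return [max(-max_dist, min(max_dist, i - drug_index))
            for i in range(len(tokens))]
```

Running this once per drug mention is what produces the per-drug sentence representations the joint method then classifies for concepts and relations.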