This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
Unscheduled emergency department return visits (EDRVs) are key indicators for monitoring the quality of emergency medical care. A high return rate implies that the medical services provided by the emergency department (ED) failed to achieve the expected results of accurate diagnosis and effective treatment. Older adults are more susceptible to diseases and comorbidities than younger adults, and they exhibit unique and complex clinical characteristics that increase the difficulty of clinical diagnosis and treatment. Older adults also use more emergency medical resources than people in other age groups. Many studies have reviewed the causes of EDRVs among general ED patients; however, few have focused on older adults, although this is the age group with the highest rate of EDRVs.
The aim of this study is to establish a model for predicting unscheduled EDRVs within a 72-hour period among patients aged 65 years and older. In addition, we aim to investigate the effects of the influencing factors on their unscheduled EDRVs.
We used stratified and randomized data from Taiwan’s National Health Insurance Research Database and applied data mining techniques to construct a prediction model consisting of patient, disease, hospital, and physician characteristics. Records of ED visits by patients aged 65 years and older from 1996 to 2010 in the National Health Insurance Research Database were selected, and the final sample size was 49,252 records.
The decision tree of the prediction model achieved an acceptable overall accuracy of 76.80%. Economic status, chronic illness, and length of stay in the ED were the top three variables influencing unscheduled EDRVs. Those who stayed in the ED overnight or longer on their first visit were less likely to return. This study confirms the results of prior studies, which found that economically underprivileged older adults with chronic illness and comorbidities were more likely to return to the ED.
Medical institutions can use our prediction model as a reference to improve medical management and clinical services by understanding the reasons for 72-hour unscheduled EDRVs in older adult patients. A possible solution is to create mechanisms that incorporate our prediction model and develop a support system with customized medical education for older patients and their family members before discharge. Meanwhile, a reasonably longer length of stay in the ED may help evaluate treatments and guide prognosis for older adult patients, and it may further reduce the rate of their unscheduled EDRVs.
Many countries today face challenges related to the rapidly aging population. Advances in medical technology and the aging of post–World War II baby boomers have led to a greater proportion of adults aged over 65 years in many industrialized nations’ populations. This substantive shift in demographics not only increases the overall demand for health care and medical services but also influences economic and social welfare policies. Older adults are more susceptible to diseases and comorbidities than younger adults, and they exhibit unique and complex clinical characteristics that increase the difficulty of clinical diagnosis and treatment [
A high unscheduled EDRV rate implies that the medical services provided by the ED failed to achieve the expected results of accurate diagnosis and effective treatment [
Many studies have reviewed the causes of EDRVs among general ED patients [
Taiwan’s health care services have been ranked the highest worldwide by The Richest [
The factors influencing EDRVs can be categorized into approximately four areas: disease-related, patient-related, physician-related, and medical institution–related factors. One of the major disease-related reasons for ED visits is a pathological condition with unclear symptoms, signs, and diagnoses, and the primary pathological condition responsible for EDRVs, such as abdominal pain [
EDRVs are known to increase concurrently with age [
In summary, the literature confirms that disease-, patient-, physician-, and institution-related factors all influence the rate of unscheduled EDRVs. As older patients’ EDRVs are associated with high risks and high impacts, this study focused on older patients and investigated the effects of the aforementioned influencing factors on their unscheduled EDRVs.
The study was divided into two stages. The first stage entailed data selection and preprocessing. The second stage entailed data analysis. Machine learning techniques are unlikely to be restricted by statistical analysis assumptions or affected by collinear interactions between independent variables, and they demonstrate superior fault tolerance and learning capability. This study focuses on investigating the factors influencing the classification of 72-hour unscheduled EDRVs. The decision tree technique, one of the machine learning classification techniques, is easier for a nonstatistician to interpret and more intuitive to follow than other methods (eg, random forest and support vector machine) [
We used the NHIRD as the data source and selected records of ED visits by patients aged 65 years or older from 1996 to 2010, which yielded 162,264 older adult records out of 1,425,335 total ED visits. We then excluded 190 records of deaths and 26,912 records of patients hospitalized within 72 hours after the ED visit. In 2010, Taiwan’s Ministry of Health and Welfare amended the emergency triage classification (TC) from four to five classes. To prevent data inconsistency, 21,318 records following the new emergency triage reclassification were excluded from the scope of this research. Meanwhile, the improvements in medical technologies that the Ministry of Health and Welfare launched in 2005 might have significantly influenced the number of unscheduled EDRVs; therefore, 44,114 records from 1996 to 2004 were removed. Finally, 20,478 records with incomplete or illogical values were excluded to ensure the accuracy and consistency of the analyzed data. The final sample size was 49,252 records, including 3510 unscheduled EDRV records within 72 hours.
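The exclusion steps above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' preprocessing code: the counts are copied from the text, while the dictionary keys and the function name are illustrative labels chosen here.

```python
# Counts are taken from the text; the step names are illustrative labels.
initial_records = 162_264
exclusions = {
    "deaths": 190,
    "hospitalized_within_72h": 26_912,
    "five_class_triage_records": 21_318,
    "records_1996_to_2004": 44_114,
    "incomplete_or_illogical": 20_478,
}

def apply_exclusions(start, steps):
    """Subtract each exclusion in order and record the remaining count."""
    remaining = start
    flow = []
    for name, removed in steps.items():
        remaining -= removed
        flow.append((name, remaining))
    return flow

flow = apply_exclusions(initial_records, exclusions)
final_sample = flow[-1][1]
return_rate = 3510 / final_sample  # share of 72-hour unscheduled EDRVs

for name, remaining in flow:
    print(f"after excluding {name}: {remaining}")
print(f"72-hour EDRV share: {return_rate:.2%}")
```

Running the sketch reproduces the reported final sample of 49,252 records.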
To develop a prediction model for older patients’ unscheduled EDRVs, we applied the presence or absence of a
Among the aforementioned independent variables, only age was a continuous variable; all other variables were categorical variables with a nominal or ordinal scale. Moreover, the variables of chronic illness and radiography examination comprised several subvariables. Detailed information related to the included variables is presented in
We applied the C4.5 technique (ie, J48 in Weka) to create a decision tree for the classifications. Decision trees use a simple tree structure to represent a set of IF-THEN rules between independent and dependent variables. The tree structure consists of multiple internal and leaf nodes. In a decision tree, each internal node represents a single independent variable, each branch of a node represents one possible value or a set of possible values of the independent variable, and each leaf node represents a class label.
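C4.5 chooses the variable for each internal node by gain ratio, that is, information gain divided by the split information of the candidate variable. The sketch below illustrates the textbook criterion for a categorical variable only; it is not the Weka J48 implementation, and the toy values and the names `entropy` and `gain_ratio` are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Textbook gain ratio of one categorical variable against class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the variable.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    # A variable with a single value cannot split the data.
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: economic status (0 = low income) and return visit (1 = yes).
economic_status = [0, 0, 0, 1, 1, 1, 1, 1]
returned = [1, 1, 1, 0, 0, 0, 0, 1]
print(round(gain_ratio(economic_status, returned), 3))
```

A variable that separates the classes perfectly scores 1.0 in this sketch, and a constant variable scores 0.0.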
A 10-fold cross-validation method was used: the data set was randomly partitioned into 10 subsets, and the validation was repeated 10 times, with each subset serving once as the testing set. A confusion matrix was established to evaluate the performance of the classification model. Subsequently, we calculated the average accuracy rate of the classification results for the 10 testing sets. The sensitivity and specificity were also examined. Sensitivity refers to the ability of the prediction model to correctly identify the EDRVs among the sampled population, whereas specificity refers to its ability to correctly identify the samples with no return to the ED; accuracy refers to the proportion of all samples, whether return or nonreturn, that the model classifies correctly.
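The evaluation measures described above can be written out explicitly. The following sketch assumes binary labels (1 = return visit, 0 = no return) and that both classes appear in the test data; the function names and the toy predictions are hypothetical and do not come from the study.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and deal them into k disjoint folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN with 1 = return visit and 0 = no return."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fn, fp, tn

def summarize(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy from confusion matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

folds = k_fold_indices(20)  # each fold serves once as the testing set
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # hypothetical labels
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # hypothetical predictions
scores = summarize(*confusion_counts(y_true, y_pred))
print(scores)
```

In a full 10-fold run, these three measures would be computed on each testing fold and then averaged.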
The final sample size was 49,252 records, including 3510 unscheduled EDRV records within 72 hours. However, unscheduled EDRVs within 72 hours accounted for only 7.13% (3510/49,252) of the records; the remainder were emergency visits without an unscheduled return. This raises a class imbalance problem, which may lead the rare class (unscheduled EDRVs) to be ignored in the prediction model. To overcome this problem, we maintained an approximately 1:1 ratio of unscheduled EDRVs to emergency visits by randomly selecting emergency visits from the emergency visit samples (3659/45,742, 7.99%). Then, we combined the samples of unscheduled EDRVs and the selected emergency visits into a single test data set. We then proportionally increased the sample sizes of unscheduled EDRVs and emergency visits by older adult patients in each test data set by setting the attribute of supervised resample (biasToUniform=200) in the Weka software. After this resampling procedure, the average numbers of unscheduled EDRVs and emergency visits by older patients were 7231 and 7153, respectively. We obtained 30 test data sets by repeating the resampling and mixing procedures 30 times, and the test data sets were used for further decision tree analysis through 10-fold cross-validation. In this study, the decision tree achieved an average sensitivity of 76.65% for accurately predicting the unscheduled EDRVs, an average specificity of 76.95% for accurately predicting nonreturn to the ED, and an average overall prediction accuracy of 76.80%.
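The balancing step can be illustrated with plain random undersampling of the majority class. This is a simplified sketch under stated assumptions: it is not the Weka supervised resample filter used in the study, it produces an exact 1:1 ratio rather than the approximate ratio reported, and the record layout with an `edrv` field is hypothetical.

```python
import random

def undersample_majority(records, label_key="edrv", seed=0):
    """Randomly drop majority-class records until both classes are equal in size."""
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    sampled = rng.sample(majority, len(minority))
    balanced = minority + sampled
    rng.shuffle(balanced)  # mix the two classes into one data set
    return balanced

# Hypothetical toy data with roughly 7% return visits.
records = [{"id": i, "edrv": 1 if i % 14 == 0 else 0} for i in range(1000)]
balanced = undersample_majority(records)
n_returns = sum(r["edrv"] for r in balanced)
print(len(balanced), n_returns)
```

Repeating the call with different seeds would give several balanced data sets, in the spirit of the 30 repeated resampling runs described above.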
This study was approved by the institutional review board of Taichung Veterans General Hospital (No. SE20209B). As the NHIRD data set comprises deidentified secondary data for research purposes, written consent from the study participants was not obtained, and the institutional review board of Taichung Veterans General Hospital issued a formal written waiver of the need for consent.
Based on the gain ratio results of the C4.5 decision tree (implemented as J48 in Weka), ES, cancer drug treatment and monitoring, LOSED, cerebrovascular disease, DC, physician year of practice, patient age, LU, x-ray, DS, TC, and hospital level were critical variables for data classification and prediction. The top three influencing variables, in descending order, were ES, chronic illness-cancer drug treatment and monitoring (CICDTM), and LOSED. The 72-hour unscheduled EDRVs by older ED patients were negatively correlated with patients’ ES, positively correlated with their CICDTM, and negatively correlated with their LOSED. This indicates that patients from low-income households or those with CICDTM are at a higher risk of unscheduled EDRVs within 72 hours. The likelihood of EDRVs decreased markedly if older patients had an overnight or longer stay in the ED at their first visit.
The decision tree generated 11 prediction patterns (rules) for unscheduled EDRVs in older patients, which are presented in
As shown in the upper section of
Decision criteria for predicting older patients’ unscheduled emergency department return visits. ED: emergency department; LOS: length of stay.
Rule 1: economic status (ES)=0 (older patients from low-income households)
Rule 2: ES=1 and chronic illness-cancer drug treatment and monitoring (CICDTM)=0 (older patients from non–low-income households with cancer drug treatment and monitoring)
Rule 3: ES=1, CICDTM=1, length of stay in the emergency department (LOSED)=0, and chronic illness-cerebrovascular disease (CICD)=0 (older patients stay in the emergency department (ED) for less than 1 day, and with cerebrovascular disease)
Rule 4: ES=1, CICDTM=1, LOSED=0, CICD=1, and diagnostic categories (DC)=1, 2, 3, 5, and 10 (older patients stay in ED less than 1 day, with no cerebrovascular disease but infectious diseases and parasitic diseases, tumor, endocrine and immune diseases, mental illness, and genito-urinary system diseases)
Rule 5: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=7, and physician year of practice (PYP)≤8 (older patients stay in the ED for less than 1 day, with no cerebrovascular disease but circulatory system diseases, and treated by physician with 8 or fewer years of practice)
Rule 6: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=8, and (PYP≤6 or PYP>6 and patient age [PA]>75) (older patients stay in the ED for less than 1 day, with no cerebrovascular disease but respiratory diseases, and treated by a physician with 6 or fewer years of practice; or all the conditions are the same but treated by a physician with more than 6 years of practice, and PA is more than 75)
Rule 7: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=9, and level of urbanization (LU)=1, 4 (older patients stay in the ED for less than 1 day, with no cerebrovascular disease but digestive diseases, and live in a high LU or general town)
Rule 8: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=13, and x-ray=1 (older patients stay in the ED for less than 1 day, with no cerebrovascular disease but musculoskeletal system diseases, had no x-ray)
Rule 9: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=16, and (disease severity [DS]=0 and triage classification [TC]=3; LU=1 or LU=2 and PYP>8; or TC=2, PA≤82 or TC=4; or DS=1, 2, 3, 4; older patients stay in ED less than 1 day, with no cerebrovascular disease but have signs, symptoms, and diagnosis less clear, and DS, TC of 3, and live in high LU, and treated by physician with more than 8 years of practice and live in Remote town or all conditions are the same with TC=2 and aged 82 or less, or all conditions are the same as TC=4, or all conditions are the same with DS)
Rule 10: ES=1, CICDTM=1, LOSED=0, CICD=1, DC=17, PA>67, x-ray=0, and LU=1, 3, 4, 7 (older patients stay in the ED for less than 1 day, with no cerebrovascular disease but injury and poisoning, had no x-ray, and lived in a high LU or an emerging town, general town, or remote town)
Rule 11: ES=1, CICDTM=1, LOSED=1, PYP≤11, TC=3, and hospital level=1, 2 (older patients stay in ED less than 1 day, treated by a physician with 11 or fewer years of practice, with TC of 3, and visiting a regional hospital or district hospital)
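Because each rule is a conjunction of variable tests, the rule set maps directly onto IF-THEN code. The sketch below encodes only Rules 1 to 3 as they are written above and takes the numeric codes at face value (for example, ES=0 for low-income households); the function name and argument names are hypothetical, and Rules 4 to 11 are deliberately left out.

```python
def first_matching_rule(es, cicdtm, losed, cicd):
    """Return the number of the first of Rules 1-3 that a visit satisfies.

    Returns None when none of the three rules applies; Rules 4-11 are
    not covered by this sketch.
    """
    if es == 0:
        return 1  # Rule 1: ES=0
    if es == 1 and cicdtm == 0:
        return 2  # Rule 2: ES=1 and CICDTM=0
    if es == 1 and cicdtm == 1 and losed == 0 and cicd == 0:
        return 3  # Rule 3: ES=1, CICDTM=1, LOSED=0, and CICD=0
    return None

# Hypothetical visits, coded with the same variable codes as the rules.
print(first_matching_rule(es=0, cicdtm=1, losed=1, cicd=1))
print(first_matching_rule(es=1, cicdtm=1, losed=0, cicd=0))
```

A visit flagged by any rule would be predicted as a likely 72-hour unscheduled EDRV; the remaining rules could be added as further branches in the same way.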
Among the 28 investigated variables, as shown in
In this study, we confirmed that older ED patients with less economic privilege were more likely to return to the ED than those in the opposite group. Furthermore, these findings are consistent with those of a previous study [
In addition, older patients with chronic symptoms that remain prevalent or frequently relapse may prefer to return to the ED for rapid and convenient treatment, rather than visit an outpatient department. The rates of 72-hour unscheduled EDRVs were higher for older patients who required cancer drug treatment and monitoring or were diagnosed with chronic cerebrovascular diseases. These results confirm the findings of Liaw et al [
We also found that older patients with shorter LOSED had higher rates of EDRVs than those who stayed in the ED overnight or longer. Patients older than 65 years are known to have a lower metabolic rate [
In this study, some specific DC (infectious diseases and parasitic diseases, tumors, endocrine and immune diseases, mental illness, circulatory system diseases, respiratory diseases, digestive diseases, genito-urinary system diseases, musculoskeletal system diseases, signs, symptoms and diagnosis less clear, and injury and poisoning) were found to be highly related to unscheduled EDRVs under certain circumstances (patients from non–low-income households and LOS less than 1 day and patients without CICDTM and cerebrovascular disease). The results showed that only a portion of the DC (disease types) [
Older patients classified as class 3 or higher on the TC level had a higher likelihood of 72-hour unscheduled EDRVs if they were treated by physicians with less than 11 years of practice, a result partially consistent with a previous study [
Moreover, physicians often underestimate the TC of frail older patients because of the absence of prominent symptoms. This increases the risk of delayed treatment and the likelihood of unscheduled EDRVs. Platts-Mills et al [
As mentioned above, decision trees have been widely used in various clinical studies, and the analyzed results can be easily applied to clinical practice. The prediction model built with the decision tree achieved acceptable sensitivity, specificity, and overall prediction accuracy. Future researchers can use the results of this study as a reference and apply other methods, such as random forest or support vector machine, to generate a prediction model that may obtain higher accuracy. As the data were collected in Taiwan, caution is needed when generalizing the results of this study. Meanwhile, because of the limited content of the NHIRD, important variables other than claim-based data cannot be obtained. Furthermore, the insured area and degree of urbanization may differ from the actual area of residence. In addition, excluding the 20,478 records with incomplete or illogical values may have introduced selection bias. Future studies can use advanced interpolation techniques to explore the characteristics of the deleted records and extend the results of this study.
Compared with previous studies [
Medical and health care is an important segment of Taiwan’s
For physicians, our prediction model can be used as a reference to improve medical management and clinical services to reduce older patients’ 72-hour unscheduled EDRVs. Policymakers can use the results of this study to generate incentives for medical institutions to provide appropriate education to older patients and their family members before discharge. Medical institutions may create mechanisms that incorporate our prediction model and develop a decision-making support system for emergency return visits, similar to other clinical decision support systems [
In summary, this study is based on large population-based retrospective data from the NHIRD and uses machine learning techniques, which demonstrate superior fault tolerance and learning capability from massive data. The decision tree machine learning technique was further used for data analysis and validation because of its simplicity, interpretability, and applicability of the results compared with other machine learning techniques. Through the decision tree technique, decision rules with important factors influencing the unscheduled EDRV prediction model from the considerations of patient, disease, hospital, and physician characteristics were obtained. The decision rules may serve as a reference for the early detection of unscheduled EDRV in older adults. Further studies can be based on the findings of this study and integrate hospitals’ information systems or electronic medical records to generate appropriate rules for unscheduled EDRVs for older adults in different hospitals.
Patient characteristics and variables.
Vitae.
CICDTM: chronic illness-cancer drug treatment and monitoring
DC: diagnostic categories
DS: disease severity
ED: emergency department
EDRV: emergency department return visit
ES: economic status
LOS: length of stay
LOSED: length of stay in the emergency department
LU: level of urbanization
NHIRD: National Health Insurance Research Database
TC: triage classification
The authors would like to thank Dr Ya Han Hu for his elaboration of the research conception and design stage. In addition, this research was supported in part by the Ministry of Science and Technology of ROC, Taiwan, under contract number MOST109-2410-H-041-001.
In this study, each author has participated sufficiently in the work to take public responsibility for appropriate portions of the content; study conception and design was conducted by RFC and ICC; data acquisition was done by YYL and KCC; analysis and interpretation of data was performed by RFC, CHT, and YYL; ICC, KCC, CHT, and RFC drafted the manuscript; and ICC, KCC, and RFC were the referees for the report (
None declared.