This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
Asthma hospital encounters impose a heavy burden on the health care system. To improve preventive care and outcomes for patients with asthma, we recently developed a black-box machine learning model to predict whether a patient with asthma will have one or more asthma hospital encounters in the succeeding 12 months. Our model is more accurate than previous models. However, black-box machine learning models do not explain their predictions, which forms a barrier to widespread clinical adoption. To solve this issue, we previously developed a method to automatically provide rule-based explanations for the model’s predictions and to suggest tailored interventions without sacrificing model performance. For an average patient correctly predicted by our model to have future asthma hospital encounters, our explanation method generated over 5000 rule-based explanations, if any. However, the user of the automated explanation function, often a busy clinician, will want to quickly obtain the most useful information for a patient by viewing only the top few explanations. Therefore, a methodology is required to appropriately rank the explanations generated for a patient. However, this is currently an open problem.
The aim of this study is to develop a method to appropriately rank the rule-based explanations that our automated explanation method generates for a patient.
We developed a ranking method that struck a balance among multiple factors. Through a secondary analysis of 82,888 data instances of adults with asthma from the University of Washington Medicine between 2011 and 2018, we demonstrated our ranking method on the test case of predicting asthma hospital encounters in patients with asthma.
For each patient predicted to have asthma hospital encounters in the succeeding 12 months, the top few explanations returned by our ranking method typically have high quality and low redundancy. Many top-ranked explanations provide useful insights on the various aspects of the patient’s situation, which cannot be easily obtained by viewing the patient’s data in the current electronic health record system.
The explanation ranking module is an essential component of the automated explanation function, and it addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. In the next few years, we plan to test our explanation ranking method on predictive modeling problems addressing other diseases as well as on data from other health care systems.
RR2-10.2196/5039
Approximately 7.7% of Americans and over 339 million people worldwide have asthma [
Due to limited capacity, a care management program can serve at most 3% of patients [
To fill this gap, the aim of this study is to develop a method to appropriately rank the rule-based explanations generated by our automated explanation method [
We reused the following items from our previous papers [
max(
mean(
min(
norm(): normalization function
The institutional review board of the UWM approved this secondary analysis retrospective cohort study.
In Washington State, the UWM is the largest academic health care system. Its enterprise data warehouse stores clinical and administrative data from 3 hospitals and 12 clinics for adults. The patient cohort included all adult patients with asthma (aged ≥18 years) who received care at any of these UWM facilities between 2011 and 2018. In a specific year, a patient was considered asthmatic if the patient had one or more asthma diagnosis codes (International Classification of Diseases [ICD], Tenth Revision: J45.x; ICD, Ninth Revision: 493.0x, 493.1x, 493.8x, 493.9x) documented in the encounter billing database during the year [
Given a patient deemed asthmatic in an index year, we wanted to predict whether the patient would experience any asthma hospital encounter at the UWM in the succeeding 12 months, that is, any ED visit or inpatient stay at the UWM with asthma (ICD-10: J45.x; ICD-9: 493.0x, 493.1x, 493.8x, 493.9x) as its principal diagnosis. In predictive model training and testing, the patient’s outcome in the succeeding 12 months was predicted using the patient’s data up to the end of the index year.
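As an illustration of the outcome definition above, the following minimal Python sketch flags an asthma hospital encounter from a list of encounters. The regular expressions, the function names, and the encounter representation (a pair of encounter type and principal diagnosis code) are assumptions made for this example and are not taken from the study's implementation.

```python
import re

# Code patterns from the text: ICD-10 J45.x; ICD-9 493.0x, 493.1x, 493.8x, 493.9x.
# The exact regular expressions are an assumption of this sketch.
ICD10_ASTHMA = re.compile(r"^J45(\.\w+)?$")
ICD9_ASTHMA = re.compile(r"^493\.[0189]\d?$")


def is_asthma_code(code: str) -> bool:
    """Return True if the diagnosis code falls in the asthma code ranges."""
    code = code.strip().upper()
    return bool(ICD10_ASTHMA.match(code) or ICD9_ASTHMA.match(code))


def has_asthma_hospital_encounter(encounters) -> bool:
    """encounters: iterable of (encounter_type, principal_diagnosis_code).

    An encounter counts only if it is an ED visit or inpatient stay
    whose principal diagnosis is an asthma code.
    """
    return any(
        etype in {"ED", "inpatient"} and is_asthma_code(dx)
        for etype, dx in encounters
    )
```

An outpatient visit with an asthma code, or an ED visit with a non-asthma principal diagnosis, would not count toward the outcome under this reading.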
We used a structured administrative and clinical data set retrieved from the UWM’s enterprise data warehouse. This data set contained information recorded for the visits by the patient cohort to the 12 clinics and 3 hospitals of the UWM over the 9-year span of 2011-2019. As the prediction target was for the following 12 months, the effective data in the data set spanned across the 8-year period of 2011-2018.
We used the data from 2011 to 2017 as the training set to train the predictive model and to mine the association rules used by our automated explanation method. We used the data of 2018 as the test set to demonstrate our ranking method for the rule-based explanations generated by our automated explanation method.
Our UWM model used the XGBoost classification algorithm [
Our automated explanation method [
Our automated explanation method [
Before rule mining starts, an automated discretizing method based on the minimum description length principle [
Here, each item
The patient had ≥13 ED visits in the past year AND the patient had ≥4 systemic corticosteroid prescriptions in the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months.
The flow diagram of our automated explanation method coupled with our explanation ranking method.
Our automated explanation method imposes several constraints on the association rules used by it. In this section, we review some of the constraints that are relevant to our explanation ranking method. For an association rule
For each feature-value pair item used to create association rules, a clinician in the development team of the automated explanation function precompiles 0 or more interventions. An item linking to at least one intervention is called actionable. The interventions related to the actionable items on the left-hand side of a rule are automatically linked to that rule. A rule linking to at least one intervention is called actionable.
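The linkage between items, interventions, and rules described above can be sketched as follows. The lookup table, the item strings, and the function names are hypothetical. The one intervention text is borrowed from the example for patient 7, and treating "the patient is single" as an item with no linked intervention is purely an assumption for illustration.

```python
# Hypothetical item-to-intervention table. In the paper, a clinician
# precompiles 0 or more interventions for each feature-value pair item.
INTERVENTIONS = {
    "the patient had no outpatient visit during the past year": [
        "make sure that the patient has a primary care provider",
    ],
    "the patient is single": [],  # assumed to have no linked intervention
}


def linked_interventions(lhs_items, table=INTERVENTIONS):
    """Collect, without duplicates, the interventions linked to the
    actionable items on the left-hand side of a rule."""
    seen, result = set(), []
    for item in lhs_items:
        for intervention in table.get(item, []):
            if intervention not in seen:
                seen.add(intervention)
                result.append(intervention)
    return result


def is_actionable(lhs_items, table=INTERVENTIONS):
    """A rule is actionable if it links to at least one intervention."""
    return len(linked_interventions(lhs_items, table)) > 0
```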
For each patient predicted to have a poor outcome by the predictive model, the prediction is explained by the related association rules. For each such rule, the patient satisfies all of the feature-value pair items on its left-hand side. The poor outcome value appears on its right-hand side. Each rule delineates a reason for the patient’s predicted poor outcome. Every actionable rule is displayed along with its linked interventions. The user of the automated explanation function can choose from these tailored interventions for the patient. The rules mined from the training set typically cover common reasons for having poor outcomes. Nonetheless, some patients could have poor outcomes due to rare reasons, such as the patient was prescribed between three and seven asthma medications during the past year AND the patient was prescribed ≥11 distinct medications during the past year AND the patient has some drug or material allergy AND the patient had ≥1 active problem in the problem list during the past year. Hence, our explanation method usually explains the predictions for most, though not all, of the patients correctly predicted by the model to have poor outcomes.
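The matching step described above, in which a rule explains a prediction only if the patient satisfies every item on its left-hand side and the poor outcome value is on its right-hand side, might look like the sketch below. The `Rule` type, the feature names, and the discretized value labels are assumptions for this example. The first rule mirrors the example rule given earlier (≥13 ED visits and ≥4 systemic corticosteroid prescriptions).

```python
from typing import NamedTuple


class Rule(NamedTuple):
    lhs: frozenset  # feature-value pair items, each a (feature, discretized value) tuple
    rhs: str        # outcome value on the right-hand side


def rules_explaining(patient, rules, poor_outcome="poor"):
    """Return the rules whose right-hand side is the poor outcome and whose
    left-hand-side items are all satisfied by the patient."""
    return [
        rule for rule in rules
        if rule.rhs == poor_outcome
        and all(patient.get(feature) == value for feature, value in rule.lhs)
    ]


# Hypothetical discretized patient record and rule set.
patient = {"ed_visits": ">=13", "systemic_corticosteroids": ">=4", "marital_status": "single"}
rules = [
    Rule(frozenset({("ed_visits", ">=13"), ("systemic_corticosteroids", ">=4")}), "poor"),
    Rule(frozenset({("ed_visits", ">=13"), ("marital_status", "married")}), "poor"),
]
matched = rules_explaining(patient, rules)  # only the first rule applies
```

A patient whose poor outcome stems from a rare reason simply matches no rule, which is why some correct predictions remain unexplained.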
For an average patient whom the predictive model predicts to have a poor outcome, our automated explanation method finds many related association rules, if any. Multiple rules often share some common feature-value pair items on their left-hand sides. To avoid overwhelming the user of the automated explanation function and to enable the user to quickly obtain the most useful information by viewing only the top few rules, we need to appropriately rank the rules found for a patient. As a rule often has a long description, a standard computer screen can show only a few rules simultaneously. To reduce the burden on the user, we present the rules in a manner similar to how a web search engine presents its search results for a keyword query. We chose a small number
The main idea of our association rule ranking method is to consider multiple factors in the ranking process. The procedure incorporates these factors into a rule scoring function that strikes a balance among them and then ranks the rules found for a patient based on the scores computed for the rules in an iterative manner. In each iteration, the scores of the remaining rules are recomputed, and then, a rule is chosen from them. In the following, we describe our rule ranking method in detail.
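The iterative structure outlined above can be sketched as a greedy loop that recomputes the scores of the remaining rules in each iteration and then picks one. The scoring function below is only a placeholder (confidence minus a penalty for items already shown in chosen rules) and is not the paper's rule scoring function. The names `rank_rules`, `toy_score`, `top_k`, and the penalty weight of 0.5 are all assumptions of this sketch.

```python
def rank_rules(rules, score, top_k=3):
    """Greedy ranking: in each iteration, re-score every remaining rule
    given the rules already chosen, then pick the highest-scoring one."""
    remaining = list(rules)
    ranked = []
    while remaining and len(ranked) < top_k:
        best = max(remaining, key=lambda rule: score(rule, ranked))
        ranked.append(best)
        remaining.remove(best)
    return ranked


def toy_score(rule, chosen):
    """Placeholder score: confidence minus a redundancy penalty for
    left-hand-side items that already appear in chosen rules."""
    shown = set().union(*(r["items"] for r in chosen)) if chosen else set()
    overlap = len(rule["items"] & shown) / len(rule["items"])
    return rule["confidence"] - 0.5 * overlap


candidates = [
    {"name": "A", "items": frozenset({"x", "y"}), "confidence": 0.90},
    {"name": "B", "items": frozenset({"x", "y", "z"}), "confidence": 0.85},
    {"name": "C", "items": frozenset({"u"}), "confidence": 0.60},
]
order = [rule["name"] for rule in rank_rules(candidates, toy_score)]
```

In this toy run, rule B has higher confidence than rule C but is pushed to third place because two of its three items were already shown by rule A, which is the kind of redundancy control that re-scoring in each iteration makes possible.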
When ranking the association rules found for a patient, we consider five factors:
We incorporate the five factors listed above into a rule scoring function to strike a balance among them. For an association rule
its ranking score is a linear combination of five terms, one per factor:
At a high level,
The term mean(
The term
Let
In this section, we sequentially describe the five terms used in the rule scoring function in detail.
As norm() is a monotonically increasing function, all else being equal, the term norm(
As shown in
As −norm() is a monotonically decreasing function, all else being equal, the term −norm(
In the
The distribution of the commonality values of all of the association rules used by our automated explanation method for predicting asthma hospital encounters in patients with asthma at the University of Washington Medicine.
If only one association rule is found for a patient, there is no need to rank the rule. If ≥2 rules are found for the patient, we rank these rules iteratively. In the
The same feature-value pair item could appear on the left-hand side of ≥2 top-ranked association rules. The user of the automated explanation function tends to read both the rules and the items on the left-hand side of a rule in the display order. To help the user obtain the most useful information as quickly as possible, for each rule on display, we need to appropriately rank the items on its left-hand side. For this purpose, we considered two factors:
We incorporate the two factors listed above into an item scoring function to strike a balance between them. Consider the
The terms in the equation above are further explained below:
In the equation for
The term
Both exp(−
When the rank of an association rule is decided, we compute the ranking score for each feature-value pair item on the rule’s left-hand side. We then sort these items in descending order of their scores. Items with the same score are ordered randomly and given consecutive ranks.
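A minimal sketch of this item ordering step, assuming the item scores have already been computed and are passed in as a dictionary (the function name and the input format are hypothetical): shuffling first and then applying a stable sort yields a descending order in which tied items appear in random order.

```python
import random


def rank_items(item_scores, rng=None):
    """Sort items in descending order of score; items with the same
    score end up in random order relative to each other."""
    rng = rng or random.Random()
    items = list(item_scores)
    rng.shuffle(items)  # randomize the order among future ties
    # Python's sort is stable, so tied items keep their shuffled order.
    items.sort(key=lambda item: item_scores[item], reverse=True)
    return items


ranked_items = rank_items({"a": 0.9, "b": 0.5, "c": 0.5, "d": 0.1})
```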
We used the R programming language to implement our explanation ranking method.
We want to demonstrate various aspects of the results produced by our explanation ranking method. For this purpose, we chose 8 patients with asthma in the test set, each of whom our UWM model correctly predicted to have ≥1 asthma hospital encounter in 2019, and our automated explanation method could explain this prediction. For each patient, we show the top three explanations produced by our explanation ranking method. Each patient satisfied one or more of the following conditions and was an informative case:
The rule scoring function uses five parameters whose default values are as follows:
As explained before, when
Each UWM data instance used in this study corresponds to a distinct patient and index year pair and is used to predict the patient’s outcome in the succeeding 12 months. Tables S1 and S2 in
For an average patient with asthma, our explanation ranking method took <0.01 seconds to produce the top three explanations. This is sufficiently fast for providing real-time clinical decision support.
The test set included 134 patients with asthma, each of whom our UWM model correctly predicted to have ≥1 asthma hospital encounter in 2019, and our automated explanation method could explain this prediction. To show the reader various aspects of the results produced by our explanation ranking method, we chose 8 of these patients who were informative cases.
The top three association rules that our explanation ranking method produced for the first selected patient (patient 1). This patient satisfied condition 1.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had 2 or 3 ED^a visits related to asthma during the past year AND the patient was prescribed between 7 and 11 distinct asthma medications during the past year AND the patient was prescribed between 5 and 7 distinct asthma relievers during the past year AND the patient had ≥1 active problem in the problem list during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 46 | 24 (52.17) | 24 (2.03)
2 | The patient’s mean length of stay of an ED visit during the past year was >0.205 day AND the patient was prescribed ≥4 systemic corticosteroids during the past year AND the patient’s most recent ED visit related to asthma occurred no less than 26 days ago and no more than 100 days ago AND the patient was prescribed 2 distinct nebulizer medications during the past year AND the patient is not a White patient → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 28 | 14 (50) | 14 (1.18)
3 | The patient was prescribed nebulizer medications ≥8 times during the past year AND the patient had ≥5 no shows during the past year AND the patient had 2 or 3 ED visits related to asthma during the past year AND the patient’s mean temperature during the past year was ≤98.09 Fahrenheit AND the patient is ≤54 years old → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 32 | 18 (56.25) | 18 (1.52)
^aED: emergency department.
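As a worked check of the percentages reported for the rank 1 rule of patient 1, the sketch below recomputes them from the counts in the table. It assumes that the confidence percentage is the "value" count divided by the "total" count and that the commonality percentage is the same "value" count divided by the n=1184 given in the column header. This reading is inferred from the table columns, and the variable names are illustrative.

```python
def pct(numerator, denominator):
    """Percentage rounded to 2 decimal places."""
    return round(100 * numerator / denominator, 2)


lhs_total = 46       # "total, n" in the confidence columns
lhs_and_poor = 24    # "value, n" in the confidence columns
commonality_n = 1184 # n stated in the commonality column header

confidence = pct(lhs_and_poor, lhs_total)       # 52.17
commonality = pct(lhs_and_poor, commonality_n)  # 2.03
```

Both results agree with the table entries 24 (52.17) and 24 (2.03).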
The top three association rules that our explanation ranking method produced for the second selected patient (patient 2). This patient satisfied condition 1.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient’s most recent diagnosis of asthma with acute exacerbation or status asthmaticus was from ≤110 days ago AND the patient was prescribed ≥10 short-acting AND the patient had no outpatient visit during the past year AND the patient’s first encounter related to asthma was from ≥1 year ago → The patient will likely have ≥1 inpatient stay or ED^a visit for asthma in the succeeding 12 months. | 87 | 54 (62.07) | 54 (4.56)
2 | The patient was prescribed asthma medications ≥16 times during the past year AND the patient’s mean respiratory rate during the past year was >16.89 breaths per minute AND the patient’s most recent visit was an ED visit AND the patient is a Black or an African American patient AND the patient was totally allowed between 1 and 33 medication refills during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 32 | 18 (56.25) | 18 (1.52)
3 | The patient had between 8 and 16 asthma diagnoses during the past year AND the patient’s lowest SpO2^b level during the past year was between 8.0% and 94.5% AND the patient’s most recent ED visit related to asthma occurred no less than 26 days ago and no more than 100 days ago AND the patient is not a White patient AND the patient had ≤6 encounters during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 35 | 18 (51.43) | 18 (1.52)
^aED: emergency department.
^bSpO2: peripheral capillary oxygen saturation.
The top three association rules that our explanation ranking method produced for the third selected patient (patient 3). This patient satisfied condition 1.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient’s most recent diagnosis of asthma with acute exacerbation or status asthmaticus was from ≤110 days ago AND the patient’s most recent visit was an ED^a visit AND the patient had between 9 and 17 primary or principal asthma diagnoses during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 127 | 79 (62.2) | 79 (6.67)
2 | The patient had between 17 and 27 asthma diagnoses during the past year AND the patient’s most recent visit was an ED visit AND the patient had no visit to the primary care provider during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 68 | 38 (55.88) | 38 (3.21)
3 | The patient was prescribed ≥10 short-acting AND the highest severity of all asthma diagnoses of the patient during the past year was moderate or severe persistent asthma AND the patient was allowed ≥34 medication refills during the past year AND the patient is ≤54 years old → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 40 | 20 (50) | 20 (1.69)
^aED: emergency department.
The top three association rules that our explanation ranking method produced for the fourth selected patient (patient 4). This patient satisfied condition 2.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had ≥7 ED^a visits related to asthma during the past year AND the patient is single → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 37 | 34 (91.89) | 34 (2.87)
2 | The patient had between 9 and 17 primary or principal asthma diagnoses during the past year AND the patient’s most recent outpatient visit related to asthma was from ≥365 days ago → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 105 | 66 (62.86) | 66 (5.57)
3 | The patient had ≥28 asthma diagnoses during the past year AND the patient had no outpatient visit during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 19 | 16 (84.21) | 16 (1.35)
^aED: emergency department.
The top three association rules that our explanation ranking method produced for the fifth selected patient (patient 5). This patient satisfied condition 5.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had ≥20 diagnoses of asthma with acute exacerbation during the past year AND the patient was prescribed ≥10 short-acting → The patient will likely have ≥1 inpatient stay or ED^a visit for asthma in the succeeding 12 months. | 82 | 48 (58.54) | 48 (4.05)
2 | The patient had ≥28 asthma diagnoses during the past year AND the patient was prescribed nebulizer medications ≥8 times during the past year AND the patient had no outpatient visit to the primary care provider during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 55 | 37 (67.27) | 37 (3.13)
3 | The patient had ≥18 primary or principal asthma diagnoses during the past year AND the patient was prescribed ≥8 distinct asthma relievers during the past year AND the patient’s mean heart rate during the past year was >80 beats per minute → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 116 | 58 (50) | 58 (4.9)
^aED: emergency department.
The top three association rules that our explanation ranking method produced for the sixth selected patient (patient 6). This patient satisfied conditions 3 and 4.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had 2 or 3 ED^a visits related to asthma during the past year AND the patient’s most recent outpatient visit related to asthma was from ≤104 days ago AND the patient was prescribed ≤2 inhaled corticosteroids during the past year AND the patient is ≤54 years old AND the patient’s relative change of weight during the past year was ≤3% → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 40 | 22 (55) | 22 (1.86)
2 | The patient had between 3 and 8 diagnoses of asthma with (acute) exacerbation during the past year AND the patient had 2 or 3 ED visits related to asthma during the past year AND the patient is not a White patient AND the patient was prescribed ≤2 distinct asthma medications during the past year AND the patient is single → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 25 | 14 (56) | 14 (1.18)
3 | The patient’s most recent outpatient visit related to asthma was from ≤104 days ago AND the patient had 2 or 3 ED visits related to asthma during the past year AND the patient was prescribed ≥1 unit of medications during the past year AND the patient had no public insurance on the last day of the past year AND the patient had between 1 and 13 outpatient visits during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 32 | 16 (50) | 16 (1.35)
^aED: emergency department.
The top three association rules that our explanation ranking method produced for the seventh selected patient (patient 7). This patient satisfied conditions 1 and 2.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had ≥7 ED^a visits related to asthma during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 51 | 39 (76.47) | 39 (3.29)
2 | The patient had between 17 and 27 asthma diagnoses during the past year AND the patient had no outpatient visit during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 48 | 28 (58.33) | 28 (2.36)
3 | The patient’s mean length of stay of an ED visit during the past year was between 0.025 and 0.205 day AND the patient had ≥3 ED visits during the past year AND the patient was prescribed ≥3 asthma relievers that are neither short-acting AND the patient was prescribed ≥4 systemic corticosteroids during the past year AND the patient is single → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 116 | 58 (50) | 58 (4.9)
^aED: emergency department.
The top three association rules that our explanation ranking method produced for the eighth selected patient (patient 8). This patient satisfied condition 5.
Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%)
1 | The patient had between 9 and 17 primary or principal asthma diagnoses during the past year AND the patient was prescribed asthma medications ≥16 times during the past year AND the patient had no outpatient visit to the primary care provider during the past year AND the patient is not a White patient → The patient will likely have ≥1 inpatient stay or ED^a visit for asthma in the succeeding 12 months. | 87 | 45 (51.72) | 45 (3.8)
2 | For the patient’s most recent visit, the time from making the request to the actual visit was ≤0.6 day AND the patient was prescribed asthma medications ≥16 times during the past year AND the patient is a Black or an African American patient AND the patient’s first encounter related to asthma was from ≥1 year ago AND the patient’s lowest SpO2^b level during the past year was between 94.5% and 95.5% → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 19 | 12 (63.16) | 12 (1.01)
3 | The patient was prescribed ≥12 distinct asthma medications during the past year AND the patient had ≥12 encounters during the past year AND the patient’s most recent outpatient visit related to asthma was from ≤104 days ago AND the patient had ≤82 laboratory tests during the past year AND the patient is not a White patient → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | 19 | 12 (63.16) | 12 (1.01)
^aED: emergency department.
^bSpO2: peripheral capillary oxygen saturation.
The interventions linked to each of the top three association rules that our explanation ranking method produced for patient 7.
Rank | Association rule | Linked interventions
1 | The patient had ≥7 ED^a visits related to asthma during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | An intervention linked to the item “the patient had ≥7 ED visits related to asthma during the past year” is to use control strategies to prevent needing emergency care.
2 | The patient had between 17 and 27 asthma diagnoses during the past year AND the patient had no outpatient visit during the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | An intervention linked to the item “the patient had between 17 and 27 asthma diagnoses during the past year” is to give the patient suggestions on how to improve asthma control. An intervention linked to the item “the patient had no outpatient visit during the past year” is to make sure that the patient has a primary care provider and to suggest the patient to regularly visit the provider.
3 | The patient’s mean length of stay of an ED visit during the past year was between 0.025 and 0.205 day AND the patient had ≥3 ED visits during the past year AND the patient was prescribed ≥3 asthma relievers that are neither short-acting AND the patient was prescribed ≥4 systemic corticosteroids during the past year AND the patient is single → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. | An intervention linked to the items “the patient’s mean length of stay of an ED visit during the past year was between 0.025 and 0.205 day” and “the patient had ≥3 ED visits during the past year” is to use control strategies to prevent needing emergency care. An intervention linked to the items “the patient was prescribed ≥3 asthma relievers that are neither short-acting
^aED: emergency department.
As illustrated by the cases shown in
To make good clinical decisions for a patient, the clinician needs to understand the patient’s situation well. For each of the eight selected patients, the top three rule-based explanations produced by our explanation ranking method provide succinct summaries on a wide range of aspects of the patient’s situation, such as demographics, encounters, vital signs, laboratory tests, and medications. From these summaries, the user of the automated explanation function can quickly gain a comprehensive understanding of the patient’s situation related to the prediction target. This saves the user a significant amount of time and effort. In comparison, to gain this understanding in a clinical setting, even if a clinician knows all of the features needed for this purpose, the clinician currently often needs to spend a significant amount of time laboriously checking many pages of information scattered in various places in the EHR system and performing manual calculations. For example, patient 1 had a total of >1000 encounters recorded in the EHR system at the UWM over time. In 2018, this patient had 164 encounters, only two of which were related to asthma, and both were ED visits. As
To gain a comprehensive understanding of a patient’s situation quickly, a clinician could ask the patient to describe his or her situation. However, the patient often cannot perform this well. For example, patients 1, 3, and 7 had severe mental disorders, which affected their memory and ability to describe their situation. This was a common scenario. Over 29.99% (4393/14,644) of patients with asthma at the UWM have mental disorders. Moreover, when making clinical decisions, the clinician does not always have direct access to the patient. For instance, when identifying candidate patients for care management, care managers are sitting in a back office and cannot talk to patients. In either of these two cases, the summaries provided by the top few rule-based explanations can help the clinician gain an understanding of the patient.
Often, many features must be used to adequately summarize a patient’s situation related to the prediction target. In a busy clinical environment, a clinician cannot be expected to enumerate all of these features in a short amount of time. The top few rule-based explanations that our explanation ranking method produces for a patient cover the values of various features summarizing the patient’s situation related to the prediction target. This saves the user of the automated explanation function from having to manually think of these features and to compute their values.
The EHR system provides some browsing and basic search functions. However, for certain important features summarizing a patient’s situation related to the prediction target, we cannot easily obtain their values by using these functions to check the patient’s EHR data. The top few rule-based explanations that our explanation ranking method produces for a patient cover the values of several such features. This saves the user of the automated explanation function a significant amount of work. For example, many different asthma medications exist. In 2018, patient 2 had 740 medication prescriptions. It is difficult and time-consuming to manually compute the number of asthma medication prescriptions and the total number of short-acting
A patient with asthma often has several other diseases, which could distract clinicians, causing them to pay insufficient attention to the patient’s asthma and to record incorrect data on the patient in the EHR system. For example, in 2018, asthmatic patient 3 also had major depressive disorder, anxiety, posttraumatic stress disorder, visual disturbance, chronic pain, and knee osteoarthritis. In the patient’s problem list, these diseases were recorded as major problems, whereas asthma was recorded as a minor problem. However, the patient had 15 primary asthma diagnoses, some of which were severe persistent asthma and indicated that asthma was a major problem for the patient at that time. In 2020, asthma was first recorded as two major problems in the patient’s problem list: one on asthma exacerbation and another on persistent asthma with status asthmaticus. As shown in
This can help the user of the automated explanation function identify suitable interventions for the patient. For example, as shown in
In this section, for each of conditions 1-5, we choose one example patient satisfying it and show how this patient was an informative case.
As an example case for condition 1, patient 1 had 164 encounters and 644 medication prescriptions in 2018. As shown in
As an example case for condition 2, patient 7 had eight asthma-related encounters in 2018, all of which were ED visits. As shown in
Patient 6 provides an example for condition 3. As shown in
As an example case for condition 4, patient 6 had only three encounters and one medication order in 2018, and consequently, only a small amount of information was recorded for this patient in the EHR system that year. As shown in
As an example case for condition 5, patient 8 had no hospital encounters related to asthma in 2018. As shown in
We performed 5 sensitivity analysis experiments, 1 for each of the 5 parameters
In comparison with the case where all five parameters took their default values and for each of the three parameters
In comparison with the case where all five parameters took their default values, the average percentage change in the unique feature-value pair items contained in the top min (3,
In comparison with the case where all five parameters took their default values, the average percentage change in the unique feature-value pair items contained in the top min (3,
In a busy clinical environment, the explanation ranking module is essential for our automated explanation function for machine learning predictions to provide high-quality real-time decision support. For an average patient with asthma correctly predicted by our UWM model to have future asthma hospital encounters, our automated explanation method generated over 5000 rule-based explanations, if any. Within a negligible amount of time, our explanation ranking method can appropriately rank them and return the few highest-ranked explanations. These few explanations typically have high quality and low redundancy. From these few explanations, the user of the automated explanation function can gain useful insights on various aspects of the patient’s situation. Many of these insights cannot be easily obtained by viewing the patient’s data in the current EHR system. With further improvements in model accuracy, our UWM model coupled with our automated explanation method and our explanation ranking method could be deployed to better guide the use of asthma care management to save costs and improve patient outcomes.
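The rank-and-truncate step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Rule` class, its `score` field, and the functions `top_rules` and `unique_items` are hypothetical names introduced here, and each rule's ranking score is assumed to have been computed already by a rule scoring function that is not reproduced in this sketch.

```python
import heapq
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple

# A feature-value pair item, e.g. ("feature name", "value or value range").
Item = Tuple[str, str]


@dataclass(frozen=True)
class Rule:
    """Hypothetical container for one rule-based explanation."""
    items: FrozenSet[Item]  # feature-value pair items on the rule's left-hand side
    score: float            # ranking score, assumed to be precomputed


def top_rules(rules: List[Rule], k: int = 3) -> List[Rule]:
    """Return the min(k, n) highest-scored rules, best first.

    heapq.nlargest avoids fully sorting the rule list, which can hold
    thousands of rules for a single patient.
    """
    return heapq.nlargest(min(k, len(rules)), rules, key=lambda r: r.score)


def unique_items(rules: List[Rule]) -> Set[Item]:
    """Collect the distinct feature-value pair items covered by the given rules."""
    covered: Set[Item] = set()
    for rule in rules:
        covered |= rule.items
    return covered
```

Calling `unique_items(top_rules(rules))` gives the distinct feature-value pair items that the user would see by default, which is one simple way to inspect how much redundancy remains among the displayed explanations.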
Similar to our automated explanation method, our explanation ranking method is general purpose and does not rely on any specific property of a particular prediction target, disease, patient cohort, or health care system. Our automated explanation method coupled with our explanation ranking method can be used for any predictive modeling problem on any tabular data set. This provides a unique solution to the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice.
In our sensitivity analysis, when we changed any parameter used in our explanation ranking method from its default value, the resulting average percentage change in the unique feature-value pair items contained in the top min(3,
Both the rule scoring and item scoring functions have several parameters. On the basis of the preferences of the users of the automated explanation function and the specific needs of the particular health care application, the developer of the automated explanation function could change some of these parameters from their default values. In the UWM test case used in this study, all association rules used by our automated explanation method were actionable. For some other predictive modeling problems, certain rules used by our automated explanation method are nonactionable [
Different patients have different distributions of the ranking scores for the association rules found for the patients. No single threshold on the ranking score works for all patients. Thus, we use a threshold on the number of rules rather than a threshold on the ranking score to determine the top rules that will be displayed by default. This is similar to the case with a web search engine such as Google. Google does not use any ranking score threshold to determine the search results that will be displayed on each search result page. Instead, by default, Google displays 10 search results on each search result page. The user can request to see more search results by clicking the
Understanding how a predictive model works requires a global interpretation. Understanding a single prediction of a model requires only local interpretation [
To use our automated explanation method in clinical practice, we could implement our automated explanation method together with our explanation ranking method as a software library with an application programming interface. For any clinical decision support software that uses a machine learning predictive model, we could use the application programming interface to add the automated explanation function into the software to explain the model’s predictions.
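As a rough illustration of what such an application programming interface could look like, the sketch below exposes a single `explain` call that returns the top few explanations for one patient. All names (`Condition`, `Rule`, `ExplanationAPI`, `explain`) are hypothetical and are not part of any existing library. The sketch also assumes that each rule condition is an inclusive numeric range on one feature and that ranking scores are precomputed, which simplifies the method described in this paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Condition:
    """Hypothetical rule condition: feature value within [low, high]."""
    feature: str
    low: float
    high: float

    def holds(self, patient: Dict[str, float]) -> bool:
        value = patient.get(self.feature)
        return value is not None and self.low <= value <= self.high


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule with a precomputed ranking score and display text."""
    conditions: Tuple[Condition, ...]
    score: float
    text: str


class ExplanationAPI:
    """Sketch of a library entry point for clinical decision support software."""

    def __init__(self, rules: List[Rule]) -> None:
        self._rules = list(rules)

    def explain(self, patient: Dict[str, float], k: int = 3) -> List[str]:
        """Return the texts of the k highest-scored rules that apply to the patient."""
        matched = [
            rule for rule in self._rules
            if all(cond.holds(patient) for cond in rule.conditions)
        ]
        matched.sort(key=lambda rule: rule.score, reverse=True)
        return [rule.text for rule in matched[:k]]
```

In this design, the host software would pass the patient's feature values to `explain` after the predictive model flags the patient, and it would request a larger `k` when the user asks to see more explanations.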
As surveyed in the book written by Molnar [
As surveyed in previous studies [
This work has three limitations that are excellent areas for future work:
This study used data from a single health care system. In the future, it would be beneficial to test our explanation ranking method on data from other health care systems.
This study tested our explanation ranking method for predicting one specific target in one disease. In the future, it would be beneficial to test our method on predictive modeling problems that address other prediction targets and diseases.
The data set used in this work contains no information on patients’ encounters outside the UWM. This forced us to limit the prediction target to asthma hospital encounters at the UWM rather than asthma hospital encounters in any health care system. In addition, the features used in this study were computed solely from the data recorded for the patients’ encounters at the UWM. In the future, it would be worth investigating how the top few explanations produced by our explanation ranking method would differ if we had data on the patients’ encounters in other health care systems.
In this study, we developed a method to rank the rule-based explanations generated by our automated explanation method for machine learning predictions. Within a negligible amount of time, our explanation ranking method ranks the explanations and returns the few highest-ranked explanations. These few explanations typically have high quality and low redundancy. Many of them provide useful insights on the various aspects of the patient’s situation, which cannot be easily obtained by viewing the patient’s data in the current EHR system. Both our automated explanation method and our explanation ranking method are designed based on general computer science principles and rely on no special property of any specific disease, prediction target, patient cohort, or health care system. Although only tested in the case of predicting asthma hospital encounters in patients with asthma, our explanation ranking method is general and can be used for any predictive modeling problem on any tabular data set. The explanation ranking module is an essential component of the automated explanation function, which addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. In the next few years, we plan to test our explanation ranking method on predictive modeling problems addressing other diseases as well as on data from other health care systems.
A summary of the demographic and clinical characteristics of patients with asthma at the University of Washington Medicine.
ED: emergency department
EHR: electronic health record
ICD: International Classification of Diseases
UWM: University of Washington Medicine
XGBoost: extreme gradient boosting
The authors thank Brian Kelly for useful discussions. GL was partially supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number R01HL142503. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
XZ participated in designing the study, conducting a literature review, writing the paper’s first draft, performing the computer coding implementation, and conducting experiments. GL conceptualized and designed the study, conducted a literature review, and rewrote the entire paper. Both authors read and approved the final manuscript.
None declared.