Original Paper
Abstract
Background: Asthma hospital encounters impose a heavy burden on the health care system. To improve preventive care and outcomes for patients with asthma, we recently developed a black-box machine learning model to predict whether a patient with asthma will have one or more asthma hospital encounters in the succeeding 12 months. Our model is more accurate than previous models. However, black-box machine learning models do not explain their predictions, which forms a barrier to widespread clinical adoption. To solve this issue, we previously developed a method to automatically provide rule-based explanations for the model’s predictions and to suggest tailored interventions without sacrificing model performance. For an average patient correctly predicted by our model to have future asthma hospital encounters, our explanation method generated over 5000 rule-based explanations, if any. However, the user of the automated explanation function, often a busy clinician, will want to quickly obtain the most useful information for a patient by viewing only the top few explanations. Therefore, a methodology is required to appropriately rank the explanations generated for a patient. However, this is currently an open problem.
Objective: The aim of this study is to develop a method to appropriately rank the rule-based explanations that our automated explanation method generates for a patient.
Methods: We developed a ranking method that struck a balance among multiple factors. Through a secondary analysis of 82,888 data instances of adults with asthma from the University of Washington Medicine between 2011 and 2018, we demonstrated our ranking method on the test case of predicting asthma hospital encounters in patients with asthma.
Results: For each patient predicted to have asthma hospital encounters in the succeeding 12 months, the top few explanations returned by our ranking method typically have high quality and low redundancy. Many top-ranked explanations provide useful insights on the various aspects of the patient’s situation, which cannot be easily obtained by viewing the patient’s data in the current electronic health record system.
Conclusions: The explanation ranking module is an essential component of the automated explanation function, and it addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. In the next few years, we plan to test our explanation ranking method on predictive modeling problems addressing other diseases as well as on data from other health care systems.
International Registered Report Identifier (IRRID): RR2-10.2196/5039
doi:10.2196/28287
Introduction
Background
Approximately 7.7% of Americans and over 339 million people worldwide have asthma [
, ]. Asthma incurs a total medical cost of US $50 billion [ ], 1,564,440 emergency department (ED) visits, and 182,620 inpatient stays annually in the United States [ ]. A primary goal of asthma management is to decrease the number of asthma hospital encounters, namely, ED visits and inpatient stays. The state-of-the-art approach for achieving this goal is to deploy a predictive model to identify patients at high risk of having poor outcomes in the future. Once identified, the patient is placed into a care management program. The program will assign a care manager to regularly contact the patient to assess asthma control status, adjust asthma medications when needed, and help schedule appointments for health and other relevant services. Many health plans, including those in 9 of 12 metropolitan communities [ ], and many health care systems, such as the University of Washington Medicine (UWM), Intermountain Healthcare, and Kaiser Permanente Northern California, currently use this approach [ ]. When used correctly, this approach prevents up to 40% of future asthma hospital encounters [ , - ].

Due to limited capacity, a care management program can serve at most 3% of patients [
]. To maximize the effectiveness of these programs, an accurate predictive model should be used to identify the highest-risk patients. For this purpose, we recently developed a machine learning model powered by extreme gradient boosting (XGBoost) [ ] on UWM data to predict which patients with asthma will have asthma hospital encounters in the succeeding 12 months [ ]. Compared with previous models [ , - ], this model is more accurate and improves the area under the receiver operating characteristic curve by ≥0.09. In addition, we previously developed a method to automatically explain the model’s predictions in the form of rules and to suggest tailored interventions without sacrificing model performance [ , ]. Our method works for any black-box machine learning predictive model built on tabular data and addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. Among all the published automated explanation methods for machine learning predictions [ , ], only our method can automatically recommend tailored interventions. For an average patient whom our UWM model correctly predicted to have future asthma hospital encounters, our method generated over 5000 rule-based explanations, if any [ ]. The amount of nonredundant information in these explanations is usually two orders of magnitude less than the number of explanations, as multiple explanations often share some common components. The user of the automated explanation function wants to quickly obtain the most useful information for a patient by viewing only the top few explanations. Therefore, we need to appropriately rank the explanations generated for each patient. Appropriately ranking explanations is currently an open problem and is particularly important for the adoption of our automated explanation method in a busy clinical environment.

Objectives
To fill this gap, the aim of this study is to develop a method to appropriately rank the rule-based explanations generated by our automated explanation method [
, ] for a patient. We demonstrated our explanation ranking method in a test case that predicts asthma hospital encounters in patients with asthma.

Methods
Items Reused From Our Previous Papers
We reused the following items from our previous papers [
, ]: patient cohort, prediction target (ie, the dependent variable), features (ie, independent variables), data set, data preprocessing method, predictive model, cutoff threshold for binary classification, and automated explanation method. A list of symbols used in this paper is provided in .
List of Symbols
- Cr: confidence of the association rule r
- d: decay constant
- f(d, pi, r): exponential decay function computed for the feature-value pair item pi on the left-hand side of the association rule r
- f: feature
- m: number of feature-value pair items on the left-hand side of an association rule
- max(vr(x)): maximum value of the variable vr(x) across all the rules found for the patient
- mean(f(r)): mean of f(d, pi, r) over all the feature-value pair items on the left-hand side of the association rule r
- min(vr(x)): minimum value of the variable vr(x) across all the rules found for the patient
- n: maximum number of top-ranked explanations that are allowed to be displayed initially
- norm(): normalization function
- Nr: number of feature-value pair items on the left-hand side of the association rule r
- p: feature-value pair item
- pi: the i-th feature-value pair item on the left-hand side of an association rule
- q: number of association rules generated by our automated explanation method for the patient
- r: association rule
- scorep: ranking score of the feature-value pair item p
- scorer: ranking score of the association rule r
- Sr: commonality of the association rule r
- t, ti: number of times that a feature-value pair item appears in the higher-ranked rules
- u: a value or a range
- v: outcome value
- vr(x): variable whose value on the association rule r is x
- wa: weight for the term δactionable(r) in the rule scoring function
- wb: weight for the term δactionable(p) in the item scoring function
- wc: weight for the term norm(Cr) in the rule scoring function
- wd: weight for the term mean(f(r)) in the rule scoring function
- wg: weight for the term exp(−d·t) in the item scoring function
- wn: weight for the term norm(Nr) in the rule scoring function
- ws: weight for the term norm(log10Sr) in the rule scoring function
- x: value
- δactionable(p): indicator function for whether the feature-value pair item p is actionable
- δactionable(r): indicator function for whether the association rule r is actionable
Ethics Approval
The institutional review board of the UWM approved this secondary analysis retrospective cohort study.
Patient Cohort
In Washington State, the UWM is the largest academic health care system. Its enterprise data warehouse stores clinical and administrative data from 3 hospitals and 12 clinics for adults. The patient cohort included all adult patients with asthma (aged ≥18 years) who received care at any of these UWM facilities between 2011 and 2018. In a specific year, a patient was considered asthmatic if the patient had one or more asthma diagnosis codes (International Classification of Diseases [ICD], Tenth Revision: J45.x; ICD, Ninth Revision: 493.0x, 493.1x, 493.8x, 493.9x) documented in the encounter billing database during the year [
, , ]. We excluded the patients who died during that year.

Prediction Target
Given a patient deemed asthmatic in an index year, we wanted to predict whether the patient would experience any asthma hospital encounter at the UWM in the succeeding 12 months, that is, any ED visit or inpatient stay at the UWM with asthma (ICD-10: J45.x; ICD-9: 493.0x, 493.1x, 493.8x, 493.9x) as its principal diagnosis. In predictive model training and testing, the patient’s outcome in the succeeding 12 months was predicted using the patient’s data until the end of the year.
Data Set
We used a structured administrative and clinical data set retrieved from the UWM’s enterprise data warehouse. This data set contained information recorded for the visits by the patient cohort to the 12 clinics and 3 hospitals of the UWM over the 9-year span of 2011-2019. As the prediction target was for the following 12 months, the effective data in the data set spanned across the 8-year period of 2011-2018.
The Training and Test Set Split
We used the data from 2011 to 2017 as the training set to train the predictive model and to mine the association rules used by our automated explanation method. We used the data of 2018 as the test set to demonstrate our ranking method for the rule-based explanations generated by our automated explanation method.
Predictive Model and Features
Our UWM model used the XGBoost classification algorithm [
] and 71 features to predict the prediction target. As our UWM model was built on a single computer whose memory could hold the entire data set, the exact greedy algorithm was used to find the best split for tree learning in XGBoost [ ]. These 71 features are listed in Table S2 of our previous paper [ ]. They were constructed based on the structured attributes in our data set and described various aspects of the patient’s situation, such as demographics, encounters, diagnoses, laboratory tests, procedures, vital signs, and medications. An example feature is the patient’s mean length of stay for an ED visit in the past year. Every input data instance to our predictive model includes these 71 features. Features that are the same as or similar to these 71 features were formerly used to predict asthma hospital encounters in patients with asthma and to provide automated explanations on Intermountain Healthcare data as well as on Kaiser Permanente Southern California data [ , - ]. For binary classification, we set the cutoff threshold at the top 10% of patients predicted to be at the highest risk. Our previous study [ ] showed that on the test set, our model reached an area under the receiver operating characteristic curve of 0.902, an accuracy of 90.6% (13,268/14,644), a sensitivity of 70.2% (153/218), a specificity of 90.91% (13,115/14,426), a positive predictive value of 10.45% (153/1464), and a negative predictive value of 99.51% (13,115/13,180).

Review of Our Automated Explanation Method
Success Stories
Our automated explanation method [
, ] was designed as a general method that works for any machine learning predictive model built on tabular data. We initially demonstrated our method for predicting the diagnosis of type 2 diabetes [ ]. Later, we successfully applied our method to predict asthma hospital encounters in patients with asthma on Intermountain Healthcare data [ ], UWM data [ ], and Kaiser Permanente Southern California data [ ]. Other researchers have also successfully applied our method to project lung transplantation or death in patients with cystic fibrosis [ ]; to project cardiac death in patients with cancer; and to use projections to manage the heart transplant waiting list, posttransplant follow-ups, and preventive care in patients with cardiovascular diseases [ ].

Main Idea
Our automated explanation method [
, ] uses class-based association rules [ , ] mined from historical data to explain a model’s predictions and to recommend tailored interventions. As shown in , the association rules are constructed separately from the predictive model and are used solely to provide explanations rather than to make predictions. Thus, our automated explanation method can work with any machine learning predictive model built on tabular data with no performance penalty. That is, our method falls into the category of model-agnostic explanation methods, which are widely used to automatically explain machine learning predictions [ , ].

Before rule mining starts, an automated discretizing method based on the minimum description length principle [ , ] is first applied to the training set to convert continuous features into categorical features. The association rules are then mined from the training set using a standard method, such as Apriori [ ]. Each rule shows that a feature pattern is linked to an outcome value and has the form

p1 AND p2 AND ... AND pm → v (1)
Here, each item pi (1≤i≤m) is a feature-value pair (f, u). u is either the specific value of feature f or a range in which the value of f falls. For binary classification of a good versus a poor outcome, v is the poor outcome value; for example, the patient will have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months. For a patient fulfilling all of p1, p2, ..., and pm, the rule indicates that the patient’s outcome is likely to be v. An example rule is given below:
The patient had ≥13 ED visits in the past year AND the patient had ≥4 systemic corticosteroid prescriptions in the past year → The patient will likely have ≥1 inpatient stay or ED visit for asthma in the succeeding 12 months.
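A rule of this form applies to a patient exactly when the patient satisfies every feature-value pair item on its left-hand side. The following Python sketch illustrates one way to represent the example rule above and test it against a patient record; the authors' implementation is in R, and all names here (`satisfies`, the feature keys, the range encoding) are hypothetical illustrations, not the paper's code.

```python
# Illustrative sketch: represent each feature-value pair item (f, u) as a
# (feature, low, high) triple, where the range [low, high] encodes u.
def satisfies(patient, lhs):
    """Return True if the patient meets every feature-value pair item in lhs."""
    return all(low <= patient[feature] <= high for feature, low, high in lhs)

# Left-hand side of the example rule: >=13 ED visits AND >=4 systemic
# corticosteroid prescriptions in the past year (feature names are made up).
rule_lhs = [("ed_visits_past_year", 13, float("inf")),
            ("systemic_corticosteroid_rx_past_year", 4, float("inf"))]

patient = {"ed_visits_past_year": 15, "systemic_corticosteroid_rx_past_year": 5}
print(satisfies(patient, rule_lhs))  # True: the rule applies to this patient
```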
Constraints Put on the Association Rules
Our automated explanation method imposes several constraints on the association rules used by it. In this section, we review some of the constraints that are relevant to our explanation ranking method. For an association rule
p1 AND p2 AND ...AND pm → v, (2)
commonality measures its coverage in the context of v; among all of the data instances linking to v, commonality is the percentage of data instances fulfilling p1, p2, ..., and pm. Meanwhile, confidence measures its precision; among all of the data instances fulfilling p1, p2, ..., and pm, the confidence is the percentage of data instances linking to v. For every association rule used by our automated explanation method, we require its commonality to be greater than or equal to a given minimum commonality threshold, such as 1%; its confidence to be greater than or equal to a given minimum confidence threshold, such as 50%; and its left-hand side to have no more than a given number (eg, 5) of feature-value pair items. As detailed in our previous papers [
, ], by setting the thresholds to these values, we can fulfill three goals concurrently. First, explanations can be given to most patients whom our UWM model correctly predicts as having ≥1 asthma hospital encounter in the succeeding 12 months. Second, the rule has sufficiently high confidence for the user of the automated explanation function to trust the rule. Third, no rule is overly complex.

The Explanation Method
For each feature-value pair item used to create association rules, a clinician in the development team of the automated explanation function precompiles 0 or more interventions. An item linking to at least one intervention is called actionable. The interventions related to the actionable items on the left-hand side of a rule are automatically linked to that rule. A rule linking to at least one intervention is called actionable.
For each patient predicted to have a poor outcome by the predictive model, the prediction is explained by the related association rules. For each such rule, the patient satisfies all of the feature-value pair items on its left-hand side. The poor outcome value appears on its right-hand side. Each rule delineates a reason for the patient’s predicted poor outcome. Every actionable rule is displayed along with its linked interventions. The user of the automated explanation function can choose from these tailored interventions for the patient. The rules mined from the training set typically cover common reasons for having poor outcomes. Nonetheless, some patients could have poor outcomes due to rare reasons, such as the patient was prescribed between three and seven asthma medications during the past year AND the patient was prescribed ≥11 distinct medications during the past year AND the patient has some drug or material allergy AND the patient had ≥1 active problem in the problem list during the past year. Hence, our explanation method usually explains the predictions for most, though not all, of the patients correctly predicted by the model to have poor outcomes.
Ranking the Rule-Based Explanations Generated by Our Automated Explanation Method
Overview
For an average patient whom the predictive model predicts to have a poor outcome, our automated explanation method finds many related association rules, if any. Multiple rules often share some common feature-value pair items on their left-hand sides. To avoid overwhelming the user of the automated explanation function and to enable the user to quickly obtain the most useful information by viewing only the top few rules, we need to appropriately rank the rules found for a patient. As a rule often has a long description, a standard computer screen can show only a few rules simultaneously. To reduce the burden on the user, we present the rules in a manner similar to how a web search engine presents its search results for a keyword query. We chose a small number n, such as 3. The user can opt to change the value of n, for example, based on the size of the computer screen. If ≤n rules are found for the patient, we display all of these rules. Otherwise, if >n rules are found for the patient, we display the top n rules by default. If desired, the user can request to see more rules, for example, by dragging a vertical scroll bar or by clicking the next page button.
The main idea of our association rule ranking method is to consider multiple factors in the ranking process. The procedure incorporates these factors into a rule scoring function that strikes a balance among them and then ranks the rules found for a patient based on the scores computed for the rules in an iterative manner. In each iteration, the scores of the remaining rules are recomputed, and then, a rule is chosen from them. In the following, we describe our rule ranking method in detail.
Factors Considered in the Association Rule Ranking Process
When ranking the association rules found for a patient, we consider five factors:
- Factor 1: All else being equal, a rule with a higher confidence is more precise and should rank higher.
- Factor 2: All else being equal, a rule with a higher commonality covers a larger portion of patients with poor outcomes and should rank higher.
- Factor 3: All else being equal, a rule with fewer feature-value pair items on its left-hand side is easier to comprehend and should rank higher.
- Factor 4: In information retrieval, search engine users want to see diversified search results [ - ]. Similarly, the user of the automated explanation function wants to see diversified information in the top-ranked rules. Hence, all else being equal, a rule whose left-hand side has more items appearing in the higher-ranked rules should rank lower. The more times the items on the left-hand side of this rule appear in those rules, the lower this rule should rank.
- Factor 5: The user of the automated explanation function wants to find suitable interventions for the patient. Thus, all else being equal, an actionable rule should rank higher than a nonactionable rule.
The Rule Scoring Function
We incorporate the five factors listed above into a rule scoring function to strike a balance among them. For an association rule
r: p1 AND p2 AND ...AND pm → v, (3)
its ranking score is a linear combination of five terms, one per factor:
scorer = wc·norm(Cr) + ws·norm(log10Sr) − wn·norm(Nr) + wd·mean(f(r)) + wa·δactionable(r) (4)
At a high level,
- Cr denotes r’s confidence. The term norm(Cr) has a weight wc>0 and addresses factor 1.
- Sr denotes r’s commonality. The term norm(log10Sr) has a weight ws>0 and addresses factor 2.
- Nr denotes the number of feature-value pair items on r’s left-hand side. The term norm(Nr) has a weight wn>0 and addresses factor 3.
- The term mean(f(r)) has a weight wd>0 and addresses factor 4. For each i (1≤i≤m), the function f(d, pi, r) is computed based on the number of times the item pi appears in the higher-ranked rules. The value of f(d, pi, r) is always between 0 and 1. Consequently, the value of mean(f(r)) is always between 0 and 1.
- The term δactionable(r) is the indicator function for whether r is actionable, has a weight wa>0, and addresses factor 5.
Let vr(x) denote the variable, such as confidence, whose value on the association rule r is x. min(vr(x)) and max(vr(x)) denote the minimum and maximum values of vr(x) across all the rules found for the patient, respectively. If max(vr(x))≠min(vr(x)), the function norm(x) = [x−min(vr(x))]/[max(vr(x))−min(vr(x))] normalizes x to a value between 0 and 1. If max(vr(x))=min(vr(x)), all of the rules found for the patient have the same value of vr(x), and thus, there is no need to consider vr(x) in ranking these rules. In this case, norm(x) is set to 0.
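This min-max normalization can be sketched in a few lines of Python (the authors' implementation is in R; the function name `norm` follows the paper's notation, and the sample confidence values are made up):

```python
# Min-max normalization across all rules found for one patient.
def norm(x, values):
    """Map x into [0, 1] relative to the values seen across the patient's rules."""
    lo, hi = min(values), max(values)
    if hi == lo:              # all rules share the same value: drop this factor
        return 0.0
    return (x - lo) / (hi - lo)

confidences = [0.55, 0.70, 0.90]   # hypothetical confidence values of 3 rules
print(norm(0.70, confidences))     # (0.70 - 0.55) / (0.90 - 0.55) ≈ 0.43
```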
Cr, log10Sr, and Nr have different value ranges. To make Cr, log10Sr, and Nr comparable with each other, we use norm() to put them into the same range of 0 to 1. mean(f(r)) and δactionable(r) also fall within this range. To reflect that factors 1, 2, and 3 are equally important, we set the default values of wc, ws, and wn to 1. To encourage the top-ranked rules to include diversified feature-value pair items, we wanted wd’s value to be >1 and set wd’s default value to 50. To strongly push the actionable rules to rank higher than the nonactionable rules, we wanted wa’s value to be ≫1 and set wa’s default value to 100. The value of wa does not impact the score differences and, hence, the relative rankings among the actionable rules. When wa is >wc+ws+wn+wd, the actionable rules always have larger scores than the nonactionable rules because norm(Cr), norm(log10Sr), norm(Nr), and mean(f(r)) are all between 0 and 1.
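The dominance claim above can be checked numerically: because each bounded term lies in [0, 1], the lowest possible score of an actionable rule exceeds the highest possible score of a nonactionable rule under the default weights. The sketch below is an illustration of that arithmetic, not the paper's code.

```python
# Default weights from the rule scoring function.
wc, ws, wn, wd, wa = 1, 1, 1, 50, 100

# Worst case for an actionable rule: every bounded term at its least favorable
# value (norm terms 0, norm(Nr) = 1, mean(f(r)) = 0), plus the wa bonus.
worst_actionable = wc*0 + ws*0 - wn*1 + wd*0 + wa*1    # = 99

# Best case for a nonactionable rule: every bounded term at its most favorable
# value, with no wa bonus.
best_nonactionable = wc*1 + ws*1 - wn*0 + wd*1 + wa*0  # = 52

print(worst_actionable > best_nonactionable)  # True: actionable rules always win
```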
Detailed Description of the Five Terms Used in the Rule Scoring Function
In this section, we sequentially describe the five terms used in the rule scoring function in detail.
As norm() is a monotonically increasing function, all else being equal, the term norm(Cr) gives a larger ranking score to an association rule with a higher confidence Cr.
As shown in
, the commonality values for the association rules used by our automated explanation method have a skewed distribution. Most of the commonality values are clustered in the lower-value range. The commonality values of the rules generated by our automated explanation method for a patient are a sample from this distribution. We want the same weight ws to work for different patients, regardless of how the sample is taken from this distribution. Thus, for every patient, we want the variance of the terms computed on the corresponding rules’ commonality values to have approximately the same scale. For this purpose, we use the log10() function to transform the commonality values so that the resulting values are distributed more evenly than the raw values. As both norm() and log10() are monotonically increasing functions, norm(log10()) is also a monotonically increasing function. All else being equal, the term norm(log10Sr) gives a larger ranking score to a rule with a higher commonality Sr.

As −norm() is a monotonically decreasing function, all else being equal, the term −norm(Nr) assigns a larger ranking score to an association rule with a smaller number Nr of feature-value pair items on its left-hand side.
In the k-th iteration of the association rule ranking process, the top k−1 rules have already been determined. We work on identifying the k-th ranked rule. For each feature-value pair item pi on the left-hand side of a rule r that is found for the patient and whose rank has not yet been decided, we compute the exponential decay function f(d, pi, r) = exp(−d·ti). Here, d>0 is the decay constant, with a default value of 5. ti is the number of times pi appears in the top k−1 rules. A larger value of ti results in a smaller value of f(d, pi, r). Recall that the term mean(f(r)) is the mean of f(d, pi, r) over all the items on r’s left-hand side. All else being equal, mean(f(r)) assigns a smaller ranking score to a rule whose left-hand side has more items appearing in the top k−1 rules.
δactionable(r) is equal to 1 if the association rule r is actionable and is equal to 0 if r is nonactionable. All else being equal, the term δactionable(r) assigns a larger ranking score to an actionable rule compared with that of a nonactionable rule.
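Putting the five terms together, the rule scoring function of equation 4 can be sketched as follows. This is a Python illustration (the authors' implementation is in R): the rule dictionary layout, the `score_rule` name, and the `top_item_counts` mapping (item → number of appearances in the already-ranked rules) are all hypothetical.

```python
import math

def norm(x, values):
    """Min-max normalization across the patient's rules, as defined above."""
    lo, hi = min(values), max(values)
    return 0.0 if hi == lo else (x - lo) / (hi - lo)

def score_rule(r, rules, top_item_counts, wc=1, ws=1, wn=1, wd=50, wa=100, d=5):
    """Equation 4: linear combination of the five factor terms."""
    conf = norm(r["confidence"], [x["confidence"] for x in rules])          # factor 1
    comm = norm(math.log10(r["commonality"]),
                [math.log10(x["commonality"]) for x in rules])              # factor 2
    size = norm(r["n_items"], [x["n_items"] for x in rules])                # factor 3
    # factor 4: exponential decay f(d, pi, r) per item, averaged over the LHS
    decay = sum(math.exp(-d * top_item_counts.get(p, 0))
                for p in r["lhs"]) / len(r["lhs"])
    return (wc*conf + ws*comm - wn*size + wd*decay
            + wa*(1 if r["actionable"] else 0))                             # factor 5

# Two hypothetical rules found for one patient; no rules ranked yet (empty counts).
r1 = {"confidence": 0.9, "commonality": 0.05, "n_items": 2,
      "lhs": ["a", "b"], "actionable": True}
r2 = {"confidence": 0.5, "commonality": 0.01, "n_items": 3,
      "lhs": ["a", "c"], "actionable": False}
print(score_rule(r1, [r1, r2], {}))  # 1 + 1 - 0 + 50 + 100 = 152.0
```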
The Iterative Association Rule Ranking Process
If only one association rule is found for a patient, there is no need to rank the rule. If ≥2 rules are found for the patient, we rank these rules iteratively. In the k-th iteration, we compute the ranking score for every rule r that is found for the patient and whose rank has not yet been determined. Compared with the case in the previous iteration, the score needs to be updated if and only if the value of mean(f(r)) changes, that is, if and only if any feature-value pair item on r’s left-hand side also appears on the left-hand side of the (k−1)-th ranked rule. Among all the rules that are found for the patient and whose ranks have not yet been determined, we select the rule with the highest score as the k-th ranked rule. If ≥2 of these rules have the same highest score, we choose one of them randomly as the k-th ranked rule.
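The iterative process above can be sketched as the following loop. This is an illustrative Python version (the authors implemented in R): `rank_rules` and the toy scoring function are hypothetical, and ties are broken by taking the first rule rather than randomly, a simplification of the paper's random tie-breaking.

```python
from collections import Counter

def rank_rules(rules, score_rule):
    """Iteratively pick the highest-scoring remaining rule; rescoring each
    iteration lets the item-diversity term react to the rules ranked so far."""
    ranked, remaining = [], list(rules)
    item_counts = Counter()                # times each item appears in ranked rules
    while remaining:
        best = max(remaining, key=lambda r: score_rule(r, rules, item_counts))
        ranked.append(best)
        remaining.remove(best)
        item_counts.update(best["lhs"])    # penalize these items in later rounds
    return ranked

# Toy score: confidence minus a penalty per repeated left-hand-side item,
# just to show the diversity effect (not the paper's full equation 4).
def toy_score(r, rules, counts):
    return r["confidence"] - 0.5 * sum(counts[p] for p in r["lhs"])

rules_demo = [{"name": "A", "confidence": 0.9, "lhs": ["x"]},
              {"name": "B", "confidence": 0.8, "lhs": ["x"]},
              {"name": "C", "confidence": 0.7, "lhs": ["y"]}]
# B shares item "x" with the top-ranked A, so the more diverse C outranks it.
print([r["name"] for r in rank_rules(rules_demo, toy_score)])  # ['A', 'C', 'B']
```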
For Each Association Rule on Display, Sort the Feature-Value Pair Items on Its Left-Hand Side
The same feature-value pair item could appear on the left-hand side of ≥2 top-ranked association rules. The user of the automated explanation function tends to read both the rules and the items on the left-hand side of a rule in the display order. To help the user obtain the most useful information as quickly as possible, for each rule on display, we need to appropriately rank the items on its left-hand side. For this purpose, we considered two factors:
- Factor 6: The user wants to see new information as quickly as possible. Hence, all else being equal, an item for a rule that already appears in the higher-ranked rules should rank lower. As the number of times the item appears in higher-ranked rules increases, the rank of the item should decrease.
- Factor 7: The user wants to find suitable interventions for the patient. Thus, all else being equal, an actionable item should rank higher than a nonactionable item.
We incorporate the two factors listed above into an item scoring function to strike a balance between them. Consider the k-th ranked association rule. For each feature-value pair item p on its left-hand side, p’s ranking score is a linear combination of two terms, one per factor:
scorep = wg·exp(−d·t) + wb·δactionable(p) (5)
The terms in the equation above are further explained below:
- In the equation for scorep above, d is the same decay constant used in f(d, pi, r) in the rule scoring function. t is the number of times p appears in the top k−1 rules. The larger the value of t, the smaller the value of the exponential decay function exp(−d·t). Hence, all else being equal, the exp(−d·t) term assigns a smaller ranking score to an item that appears more times in the top k−1 rules. This addresses factor 6.
- The term δactionable(p) is an indicator function for whether p is actionable. The term δactionable(p) is equal to 1 if p is actionable and is equal to 0 if p is nonactionable. All else being equal, the δactionable(p) term causes an actionable item to have a higher ranking score than that of a nonactionable item. This addresses factor 7.
Both exp(−d·t) and δactionable(p) are between 0 and 1. For the weight wg>0 of the term exp(−d·t), we set its default value to 1. For the weight wb>0 of the term δactionable(p), we set its default value to 2, which is >1. The value of wb has no impact on the score differences and, hence, the relative ranking among the actionable items on the left-hand side of the association rule. When wb is >wg, the actionable items always have larger scores than those of the nonactionable items because exp(−d·t) is between 0 and 1.
When the rank of an association rule is decided, we compute the ranking score for each feature-value pair item on the rule’s left-hand side. We then sort these items in descending order of their scores. Items with the same score are randomly ordered and given consecutive ranks.
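The item scoring function of equation 5 and the sort it drives can be sketched as follows (Python illustration; the authors implemented in R). Here `counts[p]` is the number of times item p appears in the top k−1 rules and `actionable` is the set of actionable items; both names are hypothetical, and ties are sorted stably rather than randomly, a simplification.

```python
import math

def score_item(p, counts, actionable, wg=1, wb=2, d=5):
    """Equation 5: exponential decay on prior appearances plus actionability bonus."""
    return wg * math.exp(-d * counts.get(p, 0)) + wb * (1 if p in actionable else 0)

def sort_lhs(lhs, counts, actionable):
    """Order a rule's left-hand-side items by descending ranking score."""
    return sorted(lhs, key=lambda p: score_item(p, counts, actionable), reverse=True)

# "a" is actionable (score 1 + 2 = 3); "b" is fresh (score 1); "c" already
# appeared once in higher-ranked rules (score exp(-5) ≈ 0.007).
print(sort_lhs(["b", "a", "c"], {"c": 1}, {"a"}))  # ['a', 'b', 'c']
```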
Computer Coding Implementation
We used the R programming language to implement our explanation ranking method.
Providing Informative Examples of the Explanation Ranking Results
We want to demonstrate various aspects of the results produced by our explanation ranking method. For this purpose, we chose 8 patients with asthma in the test set, each of whom our UWM model correctly predicted to have ≥1 asthma hospital encounter in 2019, and our automated explanation method could explain this prediction. For each patient, we show the top three explanations produced by our explanation ranking method. Each patient satisfied one or more of the following conditions and was an informative case:
- Condition 1: The patient had numerous encounters, laboratory tests, or medication prescriptions in 2018, reflecting a complex condition. In this case, we want to show how well the top three explanations capture and summarize the patient’s key information related to asthma outcome prediction.
- Condition 2: All or most of the asthma-related encounters that the patient had in 2018 were ED visits. Such a patient often had poor asthma control because of poor treatment adherence. In this case, we want to show how well the interventions linking to the top three explanations address the poor asthma control.
- Condition 3: For each of the top three association rules produced for the patient, the rule’s confidence value is close to the minimum confidence threshold. The rule’s commonality value is close to the minimum commonality threshold. In this case, we want to illustrate these borderline rules. Recall that below either threshold, a rule will not be used by our automated explanation method.
- Condition 4: The top three rules produced for the patient share several common feature-value pair items on their left-hand sides. This could happen, for example, when our automated explanation method finds only a few rules for the patient because the patient had only a small amount of information recorded in the electronic health record (EHR) system during the past year. In this case, we want to demonstrate the information redundancy in these rules.
- Condition 5: A patient at high risk for future asthma hospital encounters often had ≥1 hospital encounter related to asthma during the past year. The patient being examined does not fall into this category. The patient had several feature values correlated with future asthma hospital encounters but no hospital encounter related to asthma during the past year. In this case, we want to show how well the top three explanations capture these feature values.
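Conditions 3 and 4 refer to a rule’s confidence and commonality values and to the minimum thresholds below which a rule is discarded. The following is a rough sketch of how these two quantities can be computed on tabular data; the data layout and function names are our illustrative assumptions, not the study’s actual code:

```python
# Illustrative sketch (not the study's implementation): confidence and
# commonality of an association rule on a tabular data set.
# Confidence = P(poor outcome | rule's left-hand side matches);
# commonality = fraction of all positive instances matched by the rule.

def rule_confidence_and_commonality(rows, lhs, n_positive):
    """rows: list of dicts (one per data instance, with an "outcome" key);
    lhs: dict of feature -> value on the rule's left-hand side;
    n_positive: total number of positive data instances (eg, 1184)."""
    matched = [r for r in rows if all(r.get(f) == v for f, v in lhs.items())]
    positive_matched = [r for r in matched if r["outcome"] == 1]
    confidence = len(positive_matched) / len(matched) if matched else 0.0
    commonality = len(positive_matched) / n_positive
    return confidence, commonality

# A rule is kept only if it clears both minimum thresholds
# (50% confidence and 1% commonality in this study).
def rule_is_usable(confidence, commonality,
                   min_confidence=0.5, min_commonality=0.01):
    return confidence >= min_confidence and commonality >= min_commonality
```

For instance, a rule matching 46 instances, 24 of which are positive out of 1184 positives overall, has a confidence of 52.17% and a commonality of 2.03%, matching the first row of the first table below.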
Sensitivity Analysis of the Parameters Used in the Rule Scoring Function
The rule scoring function uses six parameters whose default values are as follows: wc=1, ws=1, wn=1, wd=50, d=5, and wa=100. To assess the impact of the five parameters wc, ws, wn, wd, and d on the association rule ranking results, we performed five experiments. In each experiment, we changed the value of one of these five parameters and kept the other parameters at their default values. In comparison with the case of all parameters taking their default values, we measured the average percentage change in the unique feature-value pair items contained in the top min(3, q) rules for a patient, where q denotes the number of rules generated by our automated explanation method for the patient. The percentage change in the unique items was defined as 100 × the number of changed unique items / the number of unique items in the top min(3, q) rules. The average was taken over all patients in the test set, each of whom was predicted to have ≥1 asthma hospital encounter in 2019 and had at least one applicable rule (ie, q≥1). Multiple rules often differ from each other by only one item on their left-hand sides. In addition, switching items among the top few rules for a patient has little impact on the total amount of information that the user of the automated explanation function obtains from these rules. Thus, we measured the number of changed unique items in the top few rules per patient instead of the number of changed top rules per patient or the number of changed items per top rule.
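The percentage-change measurement described above can be sketched as follows. The data structures and the use of a symmetric difference to count changed unique items are our illustrative assumptions:

```python
# Illustrative sketch of the sensitivity metric: the percentage change in
# the unique feature-value pair items contained in the top min(3, q) rules
# for a patient, comparing a new parameter setting against the defaults.

def top_items(ranked_rules, k=3):
    """ranked_rules: list of rules, each represented as the set of
    feature-value pair items on its left-hand side, sorted by rank."""
    items = set()
    for rule in ranked_rules[:k]:  # slicing handles q < k automatically
        items |= rule
    return items

def pct_change_in_unique_items(default_rules, new_rules, k=3):
    old = top_items(default_rules, k)
    new = top_items(new_rules, k)
    changed = old.symmetric_difference(new)  # items that appear or disappear
    return 100 * len(changed) / len(old) if old else 0.0

def average_pct_change(patients_default, patients_new, k=3):
    """Average over patients with at least one applicable rule (q >= 1)."""
    changes = [pct_change_in_unique_items(d, n, k)
               for d, n in zip(patients_default, patients_new) if d]
    return sum(changes) / len(changes) if changes else 0.0
```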
As explained before, when wa > wc + ws + wn + wd, the actionable rules always rank higher than the nonactionable rules. Meanwhile, the concrete value of wa has no impact on the ranking of the actionable rules. All the rules that our automated explanation method used on the UWM data set were actionable [ ]. Thus, we did not perform a sensitivity analysis on wa. For a similar reason, we did not perform a sensitivity analysis on the weights wg and wb used in the item scoring function.

Results
The Demographic and Clinical Characteristics of Our Patient Cohort
Each UWM data instance used in this study corresponds to a distinct patient and index year pair and is used to predict the patient’s outcome in the succeeding 12 months. Tables S1 and S2 in
show our patient cohort’s demographic and clinical characteristics during 2011-2017 and 2018 separately. These two sets of characteristics were similar to each other. During 2011-2017, 1.74% (1184/68,244) of data instances were linked to asthma hospital encounters in the succeeding 12 months. During 2018, 1.49% (218/14,644) of data instances were linked to asthma hospital encounters in the succeeding 12 months. A detailed comparison of these two sets of characteristics is presented in our previous paper [ ].

Execution Time
For an average patient with asthma, our explanation ranking method took <0.01 seconds to produce the top three explanations. This is sufficiently fast for providing real-time clinical decision support.
Informative Examples of the Explanation Ranking Results
The Top Three Association Rules That Our Explanation Ranking Method Produced in Each Informative Example
The test set included 134 patients with asthma, each of whom our UWM model correctly predicted to have ≥1 asthma hospital encounter in 2019, and our automated explanation method could explain this prediction. To show the reader various aspects of the results produced by our explanation ranking method, we chose 8 of these patients who were informative cases.
The following tables present the top three association rules that our explanation ranking method produced for each of the eight patients. For each of the top three rules produced for the seventh selected patient, a separate table lists the interventions linked to the rule.

| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 46 | 24 (52.17) | 24 (2.03) |
| 2 |  | 28 | 14 (50) | 14 (1.18) |
| 3 |  | 32 | 18 (56.25) | 18 (1.52) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 87 | 54 (62.07) | 54 (4.56) |
| 2 |  | 32 | 18 (56.25) | 18 (1.52) |
| 3 |  | 35 | 18 (51.43) | 18 (1.52) |
aED: emergency department.
bSpO2: peripheral capillary oxygen saturation.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 127 | 79 (62.2) | 79 (6.67) |
| 2 |  | 68 | 38 (55.88) | 38 (3.21) |
| 3 |  | 40 | 20 (50) | 20 (1.69) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 37 | 34 (91.89) | 34 (2.87) |
| 2 |  | 105 | 66 (62.86) | 66 (5.57) |
| 3 |  | 19 | 16 (84.21) | 16 (1.35) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 82 | 48 (58.54) | 48 (4.05) |
| 2 |  | 55 | 37 (67.27) | 37 (3.13) |
| 3 |  | 116 | 58 (50) | 58 (4.9) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 40 | 22 (55) | 22 (1.86) |
| 2 |  | 25 | 14 (56) | 14 (1.18) |
| 3 |  | 32 | 16 (50) | 16 (1.35) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 51 | 39 (76.47) | 39 (3.29) |
| 2 |  | 48 | 28 (58.33) | 28 (2.36) |
| 3 |  | 116 | 58 (50) | 58 (4.9) |
aED: emergency department.
| Rank | Association rule | Confidence of the rule: total, n | Confidence of the rule: value, n (%) | Commonality of the rule (n=1184), n (%) |
|---|---|---|---|---|
| 1 |  | 87 | 45 (51.72) | 45 (3.8) |
| 2 |  | 19 | 12 (63.16) | 12 (1.01) |
| 3 |  | 19 | 12 (63.16) | 12 (1.01) |
aED: emergency department.
bSpO2: peripheral capillary oxygen saturation.
| Rank | Association rule | Linked interventions |
|---|---|---|
| 1 |  |  |
| 2 |  |  |
| 3 |  |  |
aED: emergency department.
As illustrated by the cases shown in
- , the top few explanations that our explanation ranking method produces for a patient offer five benefits for clinical decision support. We describe these five benefits sequentially in the following sections.

Benefit 1: The Top Few Explanations Provide Succinct Summaries on a Wide Range of Aspects of the Patient’s Situation
To make good clinical decisions for a patient, the clinician needs to understand the patient’s situation well. For each of the eight selected patients, the top three rule-based explanations produced by our explanation ranking method provide succinct summaries on a wide range of aspects of the patient’s situation, such as demographics, encounters, vital signs, laboratory tests, and medications. From these summaries, the user of the automated explanation function can quickly gain a comprehensive understanding of the patient’s situation related to the prediction target. This saves the user a significant amount of time and effort. In comparison, to gain this understanding in a clinical setting, even if a clinician knows all of the features needed for this purpose, the clinician currently often needs to spend a significant amount of time laboriously checking many pages of information scattered in various places in the EHR system and performing manual calculations. For example, patient 1 had a total of >1000 encounters recorded in the EHR system at the UWM over time. In 2018, this patient had 164 encounters, only two of which were related to asthma, and both were ED visits. As shows, the statistic of two ED visits related to asthma is reflected by the first item on the left-hand side of the first association rule produced for this patient. As another example, in 2018, patient 2 had 740 medication prescriptions, 153 of which were asthma medication prescriptions covering a total of 72 short-acting β-2 agonists. As shows, the statistic of 72 short-acting β-2 agonists is reflected by the first item on the left-hand side of the first rule produced for this patient. The statistic of 153 asthma medication prescriptions is reflected by the first item on the left-hand side of the second rule produced for this patient. The cases with the other items on the left-hand sides of the top three rules produced for these two patients were similar.

To gain a comprehensive understanding of a patient’s situation quickly, a clinician could ask the patient to describe his or her situation. However, the patient often cannot do this well. For example, patients 1, 3, and 7 had severe mental disorders, which affected their memory and ability to describe their situation. This was a common scenario: approximately 30% (4393/14,644) of patients with asthma at the UWM have mental disorders. Moreover, when making clinical decisions, the clinician does not always have direct access to the patient. For instance, when identifying candidate patients for care management, care managers sit in a back office and cannot talk to patients. In either of these two cases, the summaries provided by the top few rule-based explanations can help the clinician gain an understanding of the patient.
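The chart-review burden described above amounts to simple aggregations over the patient’s EHR records. The following is a minimal sketch, assuming hypothetical record fields and a simplified asthma diagnosis check (ICD-10 category J45); it is not the study’s feature-computation code:

```python
# Illustrative sketch: computing two of the feature values mentioned above
# from EHR-style encounter and prescription records. The field names and
# the drug list are hypothetical placeholders.

def count_asthma_ed_visits(encounters, year=2018):
    """Number of ED visits in the given year that carry an asthma
    diagnosis code (primary or not); ICD-10 asthma codes start with J45."""
    return sum(
        1 for e in encounters
        if e["year"] == year
        and e["type"] == "ED"
        and any(code.startswith("J45") for code in e["diagnosis_codes"])
    )

def count_saba_prescriptions(prescriptions, year=2018):
    """Total short-acting beta-2 agonist prescriptions in the given year."""
    saba_names = {"albuterol", "levalbuterol"}  # assumed, non-exhaustive list
    return sum(
        1 for p in prescriptions
        if p["year"] == year and p["drug"].lower() in saba_names
    )
```

The rule-based explanations surface the results of such aggregations directly, sparing the clinician from performing them by hand.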
Benefit 2: Showing the Top Few Explanations Can Save the User of the Automated Explanation Function From Having to Manually Think of Many Features Summarizing the Patient’s Situation and Computing Their Values
Often, many features must be used to adequately summarize a patient’s situation related to the prediction target. In a busy clinical environment, a clinician cannot be expected to enumerate all of these features in a short amount of time. The top few rule-based explanations that our explanation ranking method produces for a patient cover the values of various features summarizing the patient’s situation related to the prediction target. This saves the user of the automated explanation function from having to manually think of these features and to compute their values.
Benefit 3: The Top Few Explanations Can Provide Information Not Easily Obtainable From Using the Existing Search and Browsing Functions of the EHR System to Check the Patient’s Data
The EHR system provides some browsing and basic search functions. However, for certain important features summarizing a patient’s situation related to the prediction target, we cannot easily obtain their values by using these functions to check the patient’s EHR data. The top few rule-based explanations that our explanation ranking method produces for a patient cover the values of several such features. This saves the user of the automated explanation function a significant amount of work. For example, many different asthma medications exist. In 2018, patient 2 had 740 medication prescriptions. It is difficult and time-consuming to manually compute the number of asthma medication prescriptions and the total number of short-acting β-2 agonists prescribed for this patient in 2018. In comparison, as mentioned before, these two statistics are directly reflected by the first and second rules produced for this patient. As a second example, in 2018, patient 7 had 14 ED visits, eight of which were related to asthma. For two of these eight ED visits, asthma was not the primary diagnosis. To compute the patient’s number of ED visits related to asthma in 2018, a clinician needs to find all of the patient’s ED visits in 2018 and check each of them to see whether it has an asthma diagnosis code. This requires a nontrivial amount of time. In comparison, as
shows, the statistic of eight ED visits related to asthma is directly reflected by the first item on the left-hand side of the first rule produced for this patient. As a third example, in 2018, patient 8 had 12 outpatient visits, none of which was to the patient’s primary care provider. To compute the patient’s number of outpatient visits to the primary care provider, a clinician needs to find all of the patient’s outpatient visits in 2018 and manually check each of them to see whether it involved the patient’s primary care provider. This requires a nontrivial amount of time. In comparison, as shows, the third item on the left-hand side of the first rule produced for this patient directly shows that the patient had 0 outpatient visits to the primary care provider in 2018.

Benefit 4: The Top Few Explanations Can Help the User of the Automated Explanation Function Avoid Overlooking Certain Important Information of the Patient and Discover Errors in the Data Recorded on the Patient in the EHR System
A patient with asthma often has several other diseases, which could distract the clinicians and cause them to pay insufficient attention to the patient’s asthma and record incorrect data on the patient in the EHR system. For example, in 2018, patient 3, who had asthma, also had major depressive disorder, anxiety, posttraumatic stress disorder, visual disturbance, chronic pain, and knee osteoarthritis. In the patient’s problem list, these diseases were recorded as major problems, whereas asthma was recorded as a minor problem. However, the patient had 15 primary asthma diagnoses, some of which were for severe persistent asthma and indicated that asthma was a major problem for the patient at that time. Not until 2020 was asthma first recorded as two major problems in the patient’s problem list: one on asthma exacerbation and another on persistent asthma with status asthmaticus. As shown in
, the first and third rules produced for the patient covered the patient’s number of asthma diagnoses and the highest severity of these diagnoses in 2018, reflecting that the patient had severe persistent asthma at that time. This can help the user of the automated explanation function avoid overlooking this aspect and discover that asthma should have been recorded as a major problem in the patient’s problem list in 2018.

Benefit 5: The Top Few Explanations Can Help the User of the Automated Explanation Function Identify Certain Problems of the Patient Not Easily Findable in the EHR System
This can help the user of the automated explanation function identify suitable interventions for the patient. For example, as shown in
, the first and second rules produced for patient 6 showed that this patient had quite a few ED visits related to asthma, yet very few asthma medications were prescribed for this patient in 2018. This patient did not adhere to albuterol prescriptions because of personal preference. Realizing this, the user could consider adopting the intervention of replacing albuterol with another asthma medication that the patient is willing to take. As another example, as shown in and , for patients 4 and 7, the top three rules produced for each patient revealed that the patient had many ED visits related to asthma but no outpatient visit in 2018. These two patients were found to be homeless. With this information, the user could consider providing social resources to reduce the socioeconomic burden of homelessness, which leads to ineffective access to health care.

Description of the 5 Example Patient Cases, One Case for Each of Conditions 1-5
In this section, for each of conditions 1-5, we choose one example patient satisfying it and show how this patient was an informative case.
As an example case for condition 1, patient 1 had 164 encounters and 644 medication prescriptions in 2018. As shown in
, the top three explanations produced for this patient effectively capture and summarize various aspects of the patient’s key information related to future asthma hospital encounters.

As an example case for condition 2, patient 7 had eight asthma-related encounters in 2018, all of which were ED visits. As shown in , the top three explanations produced for this patient revealed that the patient had many asthma diagnoses, had no outpatient visit, and was prescribed ≥4 systemic corticosteroids during 2018, reflecting poor asthma control. As shown in , the interventions linked to the top three explanations address various aspects related to poor asthma control.

Patient 6 provides an example for condition 3. As shown in , for each of the top three association rules produced for this patient, the rule’s confidence value is close to the minimum confidence threshold of 50%, and the rule’s commonality value is close to the minimum commonality threshold of 1%. These three rules cover a wide range of aspects of the patient’s situation, including demographics, encounters, diagnoses, vital signs, and medications.

As an example case for condition 4, patient 6 had only three encounters and one medication order and, consequently, a small amount of information recorded in the EHR system in 2018. As shown in , the top three explanations produced for this patient share three common feature-value pair items on their left-hand sides. Despite having moderate information redundancy, these explanations still cover a wide range of aspects of the patient’s situation, including demographics, encounters, diagnoses, vital signs, and medications.

As an example case for condition 5, patient 8 had no hospital encounters related to asthma in 2018. As shown in , the top three explanations produced for this patient capture several feature values of the patient correlated with future asthma hospital encounters, such as the patient having between 9 and 17 primary or principal asthma diagnoses during the past year, having ≥16 asthma medication prescriptions during the past year, having no outpatient visit to the primary care provider during the past year, and having ≥12 encounters during the past year.

Sensitivity Analysis Results of the Parameters Used in the Rule Scoring Function
We performed 5 sensitivity analysis experiments, 1 for each of the 5 parameters wc, ws, wn, wd, and d used in the rule scoring function. In each experiment, we changed the corresponding parameter’s value and kept the other parameters at their default values. In comparison with the case where all 5 parameters took their default values and for each of these 5 parameters,
- show the average percentage change in the unique feature-value pair items contained in the top min(3, q) association rules for a patient versus the parameter’s value. In each figure, the vertical dotted line represents the default value of the corresponding parameter. For each parameter value tested, the average percentage change in the unique items was relatively small (<20%). The only exceptions are the cases of wd=0 and d=0, where the average percentage change in the unique items was 43.57% (453.18/1040). In both cases, our explanation ranking method ignores the need for the top-ranked rules to provide diversified information (factor 4).

Discussion
Principal Findings
In a busy clinical environment, the explanation ranking module is essential for our automated explanation function for machine learning predictions to provide high-quality real-time decision support. For an average patient with asthma correctly predicted by our UWM model to have future asthma hospital encounters, our automated explanation method generated over 5000 rule-based explanations, if any. Within a negligible amount of time, our explanation ranking method can appropriately rank them and return the few highest-ranked explanations. These few explanations typically have high quality and low redundancy. From these few explanations, the user of the automated explanation function can gain useful insights on various aspects of the patient’s situation. Many of these insights cannot be easily obtained by viewing the patient’s data in the current EHR system. With further improvements in model accuracy, our UWM model coupled with our automated explanation method and our explanation ranking method could be deployed to better guide the use of asthma care management to save costs and improve patient outcomes.
Similar to our automated explanation method, our explanation ranking method is general purpose and does not rely on any specific property of a particular prediction target, disease, patient cohort, or health care system. Our automated explanation method coupled with our explanation ranking method can be used for any predictive modeling problem on any tabular data set. This provides a unique solution to the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice.
In our sensitivity analysis, when we changed any parameter used in our explanation ranking method from its default value, the resulting average percentage change in the unique feature-value pair items contained in the top min(3, q) association rules for a patient was typically <20%. This is not a large change, as most (>80%) of the distinct feature-value pair items contained in these rules and, subsequently, most of the information seen by the user of the automated explanation function remain the same. For instance, if the top min(3, q) association rules contain 15 unique feature-value pair items, at most three of these feature-value pair items would vary due to the change in the parameter value, whereas the other 12 or more remain the same as before. Thus, each parameter used in our explanation ranking method has a reasonably large stable range, within which the top few explanations produced by our method do not vary greatly as the parameter value changes. The default value of the parameter was within this stable range. According to our test results, the stable ranges are 0 to 10 for wc, 0 to 10 for ws, 0 to 10 for wn, 25 to 200 for wd, and 0.5 to 15 for d.
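To make the roles of these parameters concrete, the following sketch shows one plausible shape for a rule scoring function with the weights wc, ws, wn, and wd, the diversity parameter d, and the actionability weight wa. The factor definitions here are our illustrative assumptions rather than the study’s actual formula; what the sketch preserves is the dominance property discussed earlier: with each factor score bounded in [0, 1] and wa > wc + ws + wn + wd, every actionable rule outranks every nonactionable rule.

```python
# Illustrative sketch (not the study's actual formula) of a rule scoring
# function using the six parameters wc, ws, wn, wd, d, and wa. Each factor
# score is a placeholder scaled to [0, 1].

def rule_score(rule, already_selected_items,
               wc=1, ws=1, wn=1, wd=50, d=5, wa=100):
    confidence_score = rule["confidence"]        # in [0, 1]
    commonality_score = rule["commonality"]      # in [0, 1]
    brevity_score = 1 / len(rule["items"])       # fewer items -> higher
    # Diversity: penalize items already shown in higher-ranked rules,
    # with d controlling how sharply overlap is punished.
    overlap = len(rule["items"] & already_selected_items)
    diversity_score = 1 / (1 + d * overlap)
    score = (wc * confidence_score + ws * commonality_score
             + wn * brevity_score + wd * diversity_score)
    # Because wa > wc + ws + wn + wd, this term dominates: every
    # actionable rule scores higher than every nonactionable rule.
    if rule["actionable"]:
        score += wa
    return score
```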
Adjusting Certain Parameters Used in the Rule Scoring and the Item Scoring Functions
Both the rule scoring and item scoring functions have several parameters. On the basis of the preferences of the users of the automated explanation function and the specific needs of the particular health care application, the developer of the automated explanation function could change some of these parameters from their default values. In the UWM test case used in this study, all association rules used by our automated explanation method were actionable. For some other predictive modeling problems, certain rules used by our automated explanation method are nonactionable [
]. In this case, if we want to allow some nonactionable rules to rank higher than some non-top-scored actionable rules for any patient, we need to reduce the weight wa. Similarly, if we want to allow some nonactionable items to rank higher than some actionable items in any non-top-scored rule that our automated explanation method finds for any patient, we need to reduce the weight wb.

Considerations on the Threshold That Is Used to Determine the Top Rules That Will Be Displayed by Default
Different patients have different distributions of the ranking scores for the association rules found for the patients. No single threshold on the ranking score works for all patients. Thus, we use a threshold on the number of rules rather than a threshold on the ranking score to determine the top rules that will be displayed by default. This is similar to the case with a web search engine such as Google. Google does not use any ranking score threshold to determine the search results that will be displayed on each search result page. Instead, by default, Google displays 10 search results on each search result page. The user can request to see more search results by clicking the next button.
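The count-based display policy described above can be sketched in a few lines; this pagination helper is our hypothetical illustration, not part of the study’s software:

```python
# Illustrative sketch: display a fixed number of top-ranked explanations
# per page (a count threshold), rather than cutting off at a ranking
# score, with further pages available on request.

def paginate(ranked_explanations, page, page_size=3):
    """ranked_explanations: explanations sorted by ranking score, best
    first. Returns the explanations shown on the given 0-based page."""
    start = page * page_size
    return ranked_explanations[start:start + page_size]
```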
Considerations Regarding Potential Clinical Use
Understanding how a predictive model works requires a global interpretation. Understanding a single prediction of a model requires only local interpretation [
, ]. Our automated explanation method provides local interpretations. For clinical applications, the user of the automated explanation function is frequently a clinician who has little or no background in machine learning, can see only the prediction results but not the internals of the machine learning predictive model, cares about understanding the prediction for an individual patient but not much about how the predictive model works internally, and possibly does not even know which predictive model is used because the model is often embedded in the clinical software. In this case, it does not matter whether the explanations provided by the automated explanation function match how the predictive model works internally, as long as the explanations can help the user understand the prediction for a specific patient. For a patient predicted to have a poor outcome, our automated explanation method will give the same set of explanations regardless of which machine learning model is used to make the prediction. In the case where a deep learning model built on longitudinal data is used to make predictions, we can use the method proposed in our paper [ ] to extract temporal features from the deep learning model and longitudinal data, use these temporal features to convert the longitudinal data to tabular data, and then apply our automated explanation method to a predictive model built on the tabular data.

To use our automated explanation method in clinical practice, we could implement it together with our explanation ranking method as a software library with an application programming interface. For any clinical decision support software that uses a machine learning predictive model, we could use the application programming interface to add the automated explanation function to the software to explain the model’s predictions.
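Such a library could expose an application programming interface along the following lines. All class, field, and method names here are hypothetical placeholders, not an existing implementation:

```python
# Hypothetical API sketch for packaging the automated explanation method
# and the explanation ranking method as a reusable library.

class ExplanationEngine:
    def __init__(self, rules, interventions):
        self.rules = rules                    # association rules mined offline
        self.interventions = interventions    # rule id -> linked interventions

    def explain(self, patient_row, top_k=3):
        """Return the top_k highest-ranked rule-based explanations whose
        left-hand sides match the patient, each with linked interventions."""
        applicable = [
            r for r in self.rules
            if all(patient_row.get(f) == v for f, v in r["lhs"].items())
        ]
        ranked = sorted(applicable, key=lambda r: r["score"], reverse=True)
        return [{"rule": r,
                 "interventions": self.interventions.get(r["id"], [])}
                for r in ranked[:top_k]]
```

The host clinical decision support software would call `explain` with a patient’s tabular feature values whenever the predictive model flags the patient.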
Related Work
As surveyed in the book written by Molnar [
] and the previous papers written by several research groups [ , - ], other researchers have proposed many automated methods to explain machine learning predictions. Some of these methods are used for traditional machine learning algorithms, whereas others are specifically designed for deep learning algorithms [ ]. The explanations given by most of these methods are not in a rule form. Many of these methods can handle only a specific machine learning algorithm or degrade the performance measures of the predictive model. None of these methods can automatically suggest tailored interventions. Ribeiro et al [ ] and Rudin and Shaposhnik [ ] used rules to explain any machine learning model’s predictions automatically. However, automatically recommending tailored interventions is still beyond the reach of the methods proposed by Ribeiro et al [ ] and Rudin and Shaposhnik [ ], as the rules are not generated until the prediction time. In comparison, our automated explanation method mines the association rules before the prediction time, provides rule-based explanations, works for any machine learning predictive model built on tabular data, does not degrade model performance, and automatically recommends tailored interventions. Compared with other types of explanations, rule-based explanations can more directly recommend tailored interventions and are easier to understand.

As surveyed in previous studies [
, , ], association rules have been used in various applications to discover interesting patterns in the data and to make predictions. Various methods have been proposed to rank the rules mined from a data set for these purposes [ , - ]. In comparison, we mine and rank association rules to automatically explain machine learning predictions and to recommend tailored interventions.Limitations
This work has three limitations that are excellent areas for future work:
- This study used data from a single health care system. In the future, it would be beneficial to test our explanation ranking method on data from other health care systems.
- This study tested our explanation ranking method for predicting one specific target in one disease. In the future, it would be beneficial to test our method on predictive modeling problems that address other prediction targets and diseases.
- The data set used in this work contains no information on patients’ encounters outside the UWM. This forced us to limit the prediction target to asthma hospital encounters at the UWM rather than asthma hospital encounters in any health care system. In addition, the features used in this study were computed solely from the data recorded for the patients’ encounters at the UWM. In the future, it would be worth investigating how the top few explanations produced by our explanation ranking method would differ if we have data on the patients’ encounters in other health care systems.
Conclusions
In this study, we developed a method to rank the rule-based explanations generated by our automated explanation method for machine learning predictions. Within a negligible amount of time, our explanation ranking method ranks the explanations and returns the few highest-ranked explanations. These few explanations typically have high quality and low redundancy. Many of them provide useful insights on the various aspects of the patient’s situation, which cannot be easily obtained by viewing the patient’s data in the current EHR system. Both our automated explanation method and our explanation ranking method are designed based on general computer science principles and rely on no special property of any specific disease, prediction target, patient cohort, or health care system. Although only tested in the case of predicting asthma hospital encounters in patients with asthma, our explanation ranking method is general and can be used for any predictive modeling problem on any tabular data set. The explanation ranking module is an essential component of the automated explanation function, which addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. In the next few years, we plan to test our explanation ranking method on predictive modeling problems addressing other diseases as well as on data from other health care systems.
Acknowledgments
The authors thank Brian Kelly for useful discussions. GL was partially supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under award number R01HL142503. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Authors' Contributions
XZ participated in designing the study, conducting a literature review, writing the paper’s first draft, performing the computer coding implementation, and conducting experiments. GL conceptualized and designed the study, conducted a literature review, and rewrote the entire paper. Both authors read and approved the final manuscript.
Conflicts of Interest
None declared.
A summary of the demographic and clinical characteristics of patients with asthma at the University of Washington Medicine.
PDF File (Adobe PDF File), 94 KB

References
- Most recent National Asthma Data. Centers for Disease Control and Prevention. 2020. URL: https://www.cdc.gov/asthma/most_recent_national_asthma_data.htm [accessed 2021-01-29]
- Chronic respiratory diseases: asthma. World Health Organization. 2020. URL: https://www.who.int/news-room/q-a-detail/chronic-respiratory-diseases-asthma [accessed 2021-01-31]
- Nurmagambetov T, Kuwahara R, Garbe P. The economic burden of asthma in the United States, 2008-2013. Ann Am Thorac Soc 2018 Mar;15(3):348-356. [CrossRef] [Medline]
- Mays GP, Claxton G, White J. Managed care rebound? Recent changes in health plans' cost containment strategies. Health Aff (Millwood) 2004;Suppl Web Exclusives:427-436 [FREE Full text] [CrossRef] [Medline]
- Lieu TA, Quesenberry CP, Sorel ME, Mendoza GR, Leong AB. Computer-based models to identify high-risk children with asthma. Am J Respir Crit Care Med 1998 Apr;157(4 Pt 1):1173-1180. [CrossRef] [Medline]
- Caloyeras JP, Liu H, Exum E, Broderick M, Mattke S. Managing manifest diseases, but not health risks, saved PepsiCo money over seven years. Health Aff (Millwood) 2014 Jan;33(1):124-131. [CrossRef] [Medline]
- Greineder DK, Loane KC, Parks P. A randomized controlled trial of a pediatric asthma outreach program. J Allergy Clin Immunol 1999 Mar;103(3 Pt 1):436-440. [CrossRef] [Medline]
- Kelly CS, Morrow AL, Shults J, Nakas N, Strope GL, Adelman RD. Outcomes evaluation of a comprehensive intervention program for asthmatic children enrolled in Medicaid. Pediatrics 2000 May;105(5):1029-1035. [CrossRef] [Medline]
- Axelrod RC, Zimbro KS, Chetney RR, Sabol J, Ainsworth VJ. A disease management program utilizing life coaches for children with asthma. J Clin Outcomes Manag 2001;8(6):38-42 [FREE Full text]
- Axelrod RC, Vogel D. Predictive modeling in health plans. Dis Manag Health Outcomes 2003;11(12):779-787. [CrossRef]
- Chen T, Guestrin C. XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016 Presented at: KDD'16: The 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13-17, 2016; San Francisco, CA, USA p. 785-794. [CrossRef]
- Tong Y, Messinger AI, Wilcox AB, Mooney SD, Davidson GH, Suri P, et al. Forecasting future asthma hospital encounters of patients with asthma in an academic health care system: predictive model development and secondary analysis study. J Med Internet Res 2021 Apr 16;23(4):e22796 [FREE Full text] [CrossRef] [Medline]
- Schatz M, Cook EF, Joshua A, Petitti D. Risk factors for asthma hospitalizations in a managed care organization: development of a clinical prediction rule. Am J Manag Care 2003 Aug;9(8):538-547 [FREE Full text] [Medline]
- Grana J, Preston S, McDermott PD, Hanchak NA. The use of administrative data to risk-stratify asthmatic patients. Am J Med Qual 1997;12(2):113-119. [CrossRef] [Medline]
- Loymans RJ, Honkoop PJ, Termeer EH, Snoeck-Stroband JB, Assendelft WJ, Schermer TR, et al. Identifying patients at risk for severe exacerbations of asthma: development and external validation of a multivariable prediction model. Thorax 2016 Sep;71(9):838-846. [CrossRef] [Medline]
- Eisner MD, Yegin A, Trzaskoma B. Severity of asthma score predicts clinical outcomes in patients with moderate to severe persistent asthma. Chest 2012 Jan;141(1):58-65. [CrossRef] [Medline]
- Sato R, Tomita K, Sano H, Ichihashi H, Yamagata S, Sano A, et al. The strategy for predicting future exacerbation of asthma using a combination of the Asthma Control Test and lung function test. J Asthma 2009 Sep;46(7):677-682. [CrossRef] [Medline]
- Osborne ML, Pedula KL, O'Hollaren M, Ettinger KM, Stibolt T, Buist AS, et al. Assessing future need for acute care in adult asthmatics: the Profile of Asthma Risk Study: a prospective health maintenance organization-based study. Chest 2007 Oct;132(4):1151-1161. [CrossRef] [Medline]
- Miller MK, Lee JH, Blanc PD, Pasta DJ, Gujrathi S, Barron H, TENOR Study Group. TENOR risk score predicts healthcare in adults with severe or difficult-to-treat asthma. Eur Respir J 2006 Dec;28(6):1145-1155 [FREE Full text] [CrossRef] [Medline]
- Peters D, Chen C, Markson LE, Allen-Ramey FC, Vollmer WM. Using an asthma control questionnaire and administrative data to predict health-care utilization. Chest 2006 Apr;129(4):918-924. [CrossRef] [Medline]
- Yurk RA, Diette GB, Skinner EA, Dominici F, Clark RD, Steinwachs DM, et al. Predicting patient-reported asthma outcomes for adults in managed care. Am J Manag Care 2004 May;10(5):321-328 [FREE Full text] [Medline]
- Loymans RJ, Debray TP, Honkoop PJ, Termeer EH, Snoeck-Stroband JB, Schermer TR, et al. Exacerbations in adults with asthma: a systematic review and external validation of prediction models. J Allergy Clin Immunol Pract 2018;6(6):1942-1952. [CrossRef] [Medline]
- Lieu TA, Capra AM, Quesenberry CP, Mendoza GR, Mazar M. Computer-based models to identify high-risk adults with asthma: is the glass half empty of half full? J Asthma 1999 Jun;36(4):359-370. [CrossRef] [Medline]
- Schatz M, Nakahiro R, Jones CH, Roth RM, Joshua A, Petitti D. Asthma population management: development and validation of a practical 3-level risk stratification scheme. Am J Manag Care 2004 Jan;10(1):25-32 [FREE Full text] [Medline]
- Forno E, Fuhlbrigge A, Soto-Quirós ME, Avila L, Raby BA, Brehm J, et al. Risk factors and predictive clinical scores for asthma exacerbations in childhood. Chest 2010 Nov;138(5):1156-1165 [FREE Full text] [CrossRef] [Medline]
- Xiang Y, Ji H, Zhou Y, Li F, Du J, Rasmy L, et al. Asthma exacerbation prediction and risk factor analysis based on a time-sensitive, attentive neural network: retrospective cohort study. J Med Internet Res 2020 Jul 31;22(7):e16981 [FREE Full text] [CrossRef] [Medline]
- Tong Y, Messinger AI, Luo G. Testing the generalizability of an automated method for explaining machine learning predictions on asthma patients' asthma hospital visits to an academic healthcare system. IEEE Access 2020;8:195971-195979 [FREE Full text] [CrossRef] [Medline]
- Luo G, Johnson MD, Nkoy FL, He S, Stone BL. Automatically explaining machine learning prediction results on asthma hospital visits in asthmatic patients: secondary analysis. JMIR Med Inform 2020 Dec 31;8(12):e21965 [FREE Full text] [CrossRef] [Medline]
- Molnar C. Interpretable Machine Learning. Morrisville, NC: lulu.com; 2020.
- Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. A survey of methods for explaining black box models. ACM Comput Surv 2019 Jan 23;51(5):93. [CrossRef]
- Desai JR, Wu P, Nichols GA, Lieu TA, O'Connor PJ. Diabetes and asthma case identification, validation, and representativeness when using electronic health data to construct registries for comparative effectiveness and epidemiologic research. Med Care 2012 Jul;50 Suppl:30-35. [CrossRef] [Medline]
- Wakefield DB, Cloutier MM. Modifications to HEDIS and CSTE algorithms improve case recognition of pediatric asthma. Pediatr Pulmonol 2006 Oct;41(10):962-971. [CrossRef] [Medline]
- Luo G, Nau CL, Crawford WW, Schatz M, Zeiger RS, Rozema E, et al. Developing a predictive model for asthma-related hospital encounters in patients with asthma in a large, integrated health care system: secondary analysis. JMIR Med Inform 2020 Nov 09;8(11):e22689 [FREE Full text] [CrossRef] [Medline]
- Luo G, Nau CL, Crawford WW, Schatz M, Zeiger RS, Koebnick C. Generalizability of an automatic explanation method for machine learning prediction results on asthma-related hospital visits in patients with asthma: quantitative analysis. J Med Internet Res 2021 Apr 15;23(4):e24153 [FREE Full text] [CrossRef] [Medline]
- Luo G, He S, Stone BL, Nkoy FL, Johnson MD. Developing a model to predict hospital encounters for asthma in asthmatic patients: secondary analysis. JMIR Med Inform 2020 Jan 21;8(1):e16080 [FREE Full text] [CrossRef] [Medline]
- Luo G. Automatically explaining machine learning prediction results: a demonstration on type 2 diabetes risk prediction. Health Inf Sci Syst 2016;4:2 [FREE Full text] [CrossRef] [Medline]
- Alaa AM, van der Schaar M. Prognostication and risk factors for cystic fibrosis via automated machine learning. Sci Rep 2018 Jul 26;8(1):11242 [FREE Full text] [CrossRef] [Medline]
- Alaa AM, van der Schaar M. AutoPrognosis: automated clinical prognostic modeling via Bayesian optimization with structured kernel learning. In: Proceedings of 35th International Conference on Machine Learning. 2018 Presented at: ICML'18: 35th International Conference on Machine Learning; July 10-15, 2018; Stockholm, Sweden p. 139-148.
- Thabtah FA. A review of associative classification mining. The Knowledge Engineering Review 2007 Mar 01;22(1):37-65. [CrossRef]
- Liu B, Hsu W, Ma Y. Integrating classification and association rule mining. In: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining. 1998 Presented at: KDD'98: 4th International Conference on Knowledge Discovery and Data Mining; August 27-31, 1998; New York City, NY p. 80-86.
- Fayyad UM, Irani KB. Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence. 1993 Presented at: IJCAI'93: 13th International Joint Conference on Artificial Intelligence; August 28-September 3, 1993; Chambéry, France p. 1022-1029.
- Luo G, Thomas SB, Tang C. Automatic home medical product recommendation. J Med Syst 2012 Apr;36(2):383-398. [CrossRef] [Medline]
- Luo G, Tang C, Yang H, Wei X. MedSearch: a specialized search engine for medical information retrieval. In: Proceedings of the 17th ACM Conference on Information and Knowledge Management. 2008 Presented at: CIKM'08: Conference on Information and Knowledge Management; October 26-30, 2008; Napa Valley, CA, USA p. 143-152. [CrossRef]
- Santos RL, Macdonald C, Ounis I. Search result diversification. Foundations and Trends in Information Retrieval 2015;9(1):1-90. [CrossRef]
- Luo G. A roadmap for semi-automatically extracting predictive and clinically meaningful temporal features from medical data for predictive modeling. Glob Transit 2019;1:61-82 [FREE Full text] [CrossRef] [Medline]
- Du M, Liu N, Hu X. Techniques for interpretable machine learning. Commun ACM 2020;63(1):68-77. [CrossRef]
- Gilpin LH, Bau D, Yuan BZ, Bajwa A, Specter M, Kagal L. Explaining explanations: an overview of interpretability of machine learning. In: Proceedings of the 5th IEEE International Conference on Data Science and Advanced Analytics. 2018 Presented at: DSAA'18: IEEE 5th International Conference on Data Science and Advanced Analytics; October 1-3, 2018; Turin, Italy p. 80-89. [CrossRef]
- Samek W, Montavon G, Lapuschkin S, Anders CJ, Muller K. Explaining deep neural networks and beyond: a review of methods and applications. Proc IEEE 2021 Mar;109(3):247-278. [CrossRef]
- Ribeiro MT, Singh S, Guestrin C. Anchors: high-precision model-agnostic explanations. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018 Presented at: AAAI'18: 32nd AAAI Conference on Artificial Intelligence; February 2-7, 2018; New Orleans, LA p. 1527-1535.
- Rudin C, Shaposhnik Y. Globally-consistent rule-based summary-explanations for machine learning models: application to credit-risk evaluation. In: Proceedings of INFORMS 11th Conference on Information Systems and Technology. 2019 Presented at: CIST'19: 11th Conference on Information Systems and Technology; October 19-20, 2019; Seattle, WA p. 1-19. [CrossRef]
- Altaf W, Shahbaz M, Guergachi A. Applications of association rule mining in health informatics: a survey. Artif Intell Rev 2017;47(3):313-340. [CrossRef]
- Pazhanikumar K, Arumugaperumal S. Association rule mining and medical application: a detailed survey. Int J Comput Appl 2013 Oct 18;80(17):10-19. [CrossRef]
- Yang G, Shimada K, Mabu S, Hirasawa K. A personalized association rule ranking method based on semantic similarity and evolutionary computation. In: Proceedings of the IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence). 2008 Presented at: CEC'08: IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence); June 1-6, 2008; Hong Kong, China p. 487-494. [CrossRef]
- Bouker S, Saidi R, Yahia SB, Nguifo EM. Ranking and selecting association rules based on dominance relationship. In: Proceedings of the IEEE 24th International Conference on Tools with Artificial Intelligence. 2012 Presented at: ICTAI'12: IEEE 24th International Conference on Tools with Artificial Intelligence; November 7-9, 2012; Athens, Greece p. 658-665. [CrossRef]
- Chen MC. Ranking discovered rules from data mining with multiple criteria by data envelopment analysis. Expert Syst Appl 2007 Nov;33(4):1110-1116. [CrossRef]
Abbreviations
ED: emergency department
EHR: electronic health record
ICD: International Classification of Diseases
UWM: University of Washington Medicine
XGBoost: extreme gradient boosting
Edited by C Lovis; submitted 06.03.21; peer-reviewed by P Elkin, A Rovetta; comments to author 17.05.21; revised version received 19.05.21; accepted 06.06.21; published 11.08.21
Copyright©Xiaoyi Zhang, Gang Luo. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 11.08.2021.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.