Automatically Explaining Machine Learning Predictions on Severe Chronic Obstructive Pulmonary Disease Exacerbations: Retrospective Cohort Study

Background: Chronic obstructive pulmonary disease (COPD) is a major cause of death and places a heavy burden on health care. To optimize the allocation of precious preventive care management resources and improve the outcomes for high-risk patients with COPD, we recently built the most accurate model to date to predict severe COPD exacerbations, which need inpatient stays or emergency department visits, in the following 12 months. Our model is a machine learning model. As is the case with most machine learning models, our model does not explain its predictions, forming a barrier for clinical use. Previously, we designed a method to automatically provide rule-type explanations for machine learning predictions and suggest tailored interventions with no loss of model performance. This method has been tested before for asthma outcome prediction but not for COPD outcome prediction. Objective: This study aims to assess the generalizability of our automatic explanation method for predicting severe COPD exacerbations. Methods: The patient cohort included all patients with COPD who visited the University of Washington Medicine facilities between 2011 and 2019. In a secondary analysis of 43,576 data instances, we used our formerly developed automatic explanation method to automatically explain our model’s predictions and suggest tailored interventions. Results: Our method explained the predictions for 97.1% (100/103) of the patients with COPD whom our model correctly predicted to have severe COPD exacerbations in the following 12 months and the predictions for 73.6% (134/182) of the patients with COPD who had ≥ 1 severe COPD exacerbation in the following 12 months. Conclusions: Our automatic explanation method worked well for predicting severe COPD exacerbations. After further improving our method, we hope to use it to facilitate future clinical use of our model.


Introduction
Background
Chronic obstructive pulmonary disease (COPD) is a leading cause of death [1] and affects 6.5% of American adults [2]. In the United States, COPD leads to 0.7 million inpatient stays and 1.5 million emergency department (ED) visits every year [2]. Severe COPD exacerbations are exacerbations that need inpatient stays or ED visits [3]. These exacerbations often result in irreversible deterioration in health status and lung function [4][5][6][7][8][9] and account for 90.3% of the US $32.1 billion in total annual medical costs associated with COPD in the United States [2,10]. Many of these exacerbations, which include 47% of inpatient stays and many ED visits because of COPD, are regarded as preventable with suitable outpatient care [3,11]. To reduce severe COPD exacerbations, many health care systems and health plans use predictive models to identify high-risk patients [12] for preventive care management [13]. Once a patient is enrolled in the care management program, care managers regularly follow up with the patient on the phone to assess the patient's health status and help schedule health and related services. For patients with COPD, successful care management can cut up to 40% of their inpatient stays [14] and 27% of their ED visits [15].
As a care management program can take ≤3% of patients because of resource limits [16], the effectiveness of the program depends critically on the performance of the predictive model that is used. To optimize the allocation of precious care management resources and improve the outcomes for high-risk patients with COPD, we recently built the most accurate model to date to predict severe COPD exacerbations in the following 12 months [17]. Our model achieved an area under the receiver operating characteristic curve of 0.866, a sensitivity of 56.6% (103/182), and a specificity of 91.17% (6698/7347). In comparison, to the best of our knowledge, each published prior model for this prediction target had an area under the receiver operating characteristic curve ≤0.809 and a sensitivity <50% when the specificity was set at approximately 91%. Our model is based on the machine learning algorithm of extreme gradient boosting (XGBoost) [52]. As is the case with most machine learning models, our model does not explain its predictions, forming a barrier for clinical use [53]. Offering explanations is essential for care managers to make sense of and trust the model's predictions to make care management enrollment decisions and identify suitable interventions. Currently, there is no consensus on what explanation means for machine learning predictions. In this paper, by explaining the prediction that a machine learning model makes on a patient, we mean finding ≥1 rule whose left-hand side is fulfilled by the patient and whose right-hand side is consistent with the prediction. Previously, we developed a method to automatically provide rule-type explanations for any machine learning model's predictions on tabular data and suggest tailored interventions with no loss of model performance [54][55][56][57][58]. This method has been tested before for asthma outcome prediction but not for COPD outcome prediction.

Objective
The goal of this particular study is to assess the generalizability of our automatic explanation method for predicting severe COPD exacerbations. After further improving our method in the future, our eventual goal is that care managers can use our method to make COPD care management enrollment and intervention decisions more quickly and reliably.

Ethics Approval and Study Design
The institutional review board of the University of Washington Medicine (UWM) approved this retrospective cohort study (STUDY00000118) using administrative and clinical data.

Patient Population
In Washington state, the UWM is the largest academic health care system. The enterprise data warehouse of the UWM contains administrative and clinical data from 12 clinics and 3 hospitals. This study used the same patient cohort as our previous predictive modeling study [17]. The patient cohort included all patients with COPD who visited the UWM facilities between 2011 and 2019. As adapted from the literature [59][60][61][62], a patient was deemed to have COPD if the patient was aged at least 40 years and met at least one of the following criteria:

Data Set
This study used the same structured data set as our previous predictive model paper [17]. The data set contained the administrative and clinical data of the patient cohort's encounters at the 12 UWM clinics and 3 UWM hospitals between 2011 and 2020.

Prediction Target (Dependent or Outcome Variable)
This study used the same prediction target as our previous predictive model [17]. For a patient with COPD and ≥1 encounter at the UWM in a particular year (index year), we used patient data up to the end of the year to predict the outcome: whether the patient would have ≥1 severe COPD exacerbation in the following 12 months. A severe COPD exacerbation is defined as an inpatient stay or an ED visit with a principal diagnosis of COPD (

Data Preprocessing, Predictive Model, and Features (Independent Variables)
We applied the same methods as in our previous predictive model paper [17] to perform data preprocessing. Using the upper and lower bounds provided by a clinical expert in our team, as well as the upper and lower bounds from the Guinness World Records, we pinpointed the biologically implausible values, marked them missing, and normalized each numerical feature. Our model used 229 features and the XGBoost classification algorithm [52] to make predictions. As listed in the second table in the web-based multimedia appendix of our previous paper [17], these features were calculated on the attributes in our structured data set and covered various aspects such as vital signs, diagnoses, visits, procedures, medications, laboratory tests, and patient demographics. An example feature is the number of days since the patient had the last diagnosis of acute COPD exacerbation. Each input data instance to the predictive model contained these 229 features, corresponded to a distinct patient and index year pair, and was used to predict the outcome of the patient in the following 12 months. As in our previous predictive model paper [17], the cutoff threshold for binary classification was set at the top 10% of patients with the largest predicted risk. A care management program can take ≤3% of patients because of resource limits [16]. After using our model to identify the top 10% of patients with the largest predicted risk and using our automatic explanation method to explain the predictions, care managers could review patient charts, consider factors such as social dimensions, and choose ≤3% of patients for care management enrollment. A value of 10% was chosen to strike a balance between covering a large percentage of patients who would have ≥1 severe COPD exacerbation in the following 12 months and keeping the care managers' workload manageable.
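The top-10% cutoff described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `flag_top_fraction` and the choice to round the number of flagged instances down are assumptions.

```python
def flag_top_fraction(risk_scores, fraction=0.10):
    """Mark the `fraction` of instances with the largest predicted risk.

    Assumption: the number of flagged instances is rounded down.
    """
    n = len(risk_scores)
    n_flagged = int(fraction * n)
    # Indices sorted by descending predicted risk (stable for ties).
    order = sorted(range(n), key=lambda i: risk_scores[i], reverse=True)
    flagged = set(order[:n_flagged])
    return [i in flagged for i in range(n)]


# Ten toy risk scores: only the single highest-risk instance is flagged.
scores = [0.02, 0.91, 0.10, 0.35, 0.07, 0.60, 0.01, 0.05, 0.03, 0.08]
flags = flag_top_fraction(scores)
```

Under this rounding assumption, a test set of 7529 instances yields 752 flagged instances, which agrees with the 103 true positives plus the 649 false positives (7347 − 6698) implied by the reported sensitivity and specificity.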

Overview
Previously, we developed a method to automatically provide rule-type explanations for any machine learning model's predictions on tabular data and suggest tailored interventions with no loss of model performance [54][55][56][57][58]. When creating the automatic explanation function before the prediction time, our method requires ≥1 expert in the function's design team to manually provide some information, such as marking the feature-value pairs that could have a positive correlation with the bad outcome value and compiling interventions for these feature-value pair items. This can typically be performed in a few man-hours. Once this information is obtained and stored in the function's knowledge base, our method can automatically explain the machine learning model's predictions and suggest tailored interventions at the prediction time.

Main Idea
Our automatic explanation method [54][55][56][57][58] uses 2 models at the same time to separate making predictions and providing explanations. Each model plays a different role. The first model is used to predict the outcome. This model can be any model that takes continuous and categorical features as its inputs and is typically chosen to be the model that performs the best at making predictions. The second model comprises class-based association rules [63,64] mined from the training set. We use the second model to explain the first model's predictions rather than to make predictions. After we convert each continuous feature into ≥1 categorical feature via automatic discretization [63,65], the association rules are mined using the Apriori algorithm, whereas other standard methods such as frequent pattern growth can also be used [64]. Every rule shows that a feature pattern links to a value z of the outcome variable in the form of:

$$p_1 \text{ AND } p_2 \text{ AND } \ldots \text{ AND } p_k \rightarrow z.$$

Here, each item $p_i$ ($1 \le i \le k$) is a feature-value pair (x, c), indicating that feature x has a value c if c is a value or a value within c if c is a range. The values of k and z can vary by rules. For the binary classification of good versus bad outcomes, z is usually the bad outcome value. The rule indicates that a patient's outcome tends to take the value z if the patient satisfies all of $p_1, p_2, \ldots,$ and $p_k$. The following is an example of a rule: The patient's last diagnosis of acute COPD exacerbation was from the past 81.4 days AND the patient's COPD reliever prescriptions in the past year included >10 distinct medications → The patient will probably have at least one severe COPD exacerbation in the following 12 months.
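To make the rule form concrete, the sketch below represents a rule as a tuple of feature-value pair items and checks whether a patient fulfills its left-hand side. The class names, the feature names, and the half-open interval convention (low < value ≤ high) are assumptions made for illustration and are not taken from the study's implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A feature-value pair (x, c); c is either a category or a numeric range."""
    feature: str
    value: object = None       # set for categorical features
    low: float = -math.inf     # range bounds for discretized features
    high: float = math.inf

    def holds_for(self, patient):
        v = patient.get(self.feature)
        if v is None:          # a missing value cannot fulfill the item
            return False
        if self.value is not None:
            return v == self.value
        return self.low < v <= self.high


@dataclass(frozen=True)
class Rule:
    items: tuple               # p_1, ..., p_k
    outcome: str               # z

    def matches(self, patient):
        return all(item.holds_for(patient) for item in self.items)


# The example rule from the text, with hypothetical feature names.
rule = Rule(
    items=(
        Item("days_since_last_acute_copd_dx", high=81.4),
        Item("distinct_reliever_meds", low=10),
    ),
    outcome="severe_exacerbation",
)
patient = {"days_since_last_acute_copd_dx": 30.0, "distinct_reliever_meds": 12}
print(rule.matches(patient))   # True
```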

Mining and Pruning Rules
Each rule has two quality measures: commonality and confidence. For a rule

$$p_1 \text{ AND } p_2 \text{ AND } \ldots \text{ AND } p_k \rightarrow z, \tag{1}$$

its commonality is defined as the percentage of data instances satisfying $p_1, p_2, \ldots,$ and $p_k$ among all the data instances linked to z. Its confidence is defined as the percentage of data instances linked to z among all the data instances satisfying $p_1, p_2, \ldots,$ and $p_k$. Commonality measures the coverage of a rule within the context of z. Confidence measures the precision of a rule.
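The two definitions can be checked on a toy data set. In the sketch below, each data instance is a pair of (set of items it satisfies, outcome); the item names and the data are made up for illustration.

```python
def commonality_and_confidence(lhs, outcome, instances):
    """Compute (commonality, confidence) of the rule `lhs -> outcome`.

    instances: iterable of (set_of_items, outcome_value) pairs.
    """
    lhs = frozenset(lhs)
    n_lhs = n_outcome = n_both = 0
    for items, z in instances:
        satisfies = lhs <= items          # instance fulfills every item
        linked = (z == outcome)
        n_lhs += satisfies
        n_outcome += linked
        n_both += satisfies and linked
    commonality = n_both / n_outcome if n_outcome else 0.0
    confidence = n_both / n_lhs if n_lhs else 0.0
    return commonality, confidence


instances = [
    ({"recent_exacerbation", "many_relievers"}, "bad"),
    ({"recent_exacerbation", "many_relievers"}, "bad"),
    ({"recent_exacerbation"}, "bad"),
    ({"many_relievers"}, "bad"),
    ({"recent_exacerbation", "many_relievers"}, "good"),
    ({"recent_exacerbation"}, "good"),
    (set(), "good"),
    (set(), "good"),
]
commonality, confidence = commonality_and_confidence(
    {"recent_exacerbation", "many_relievers"}, "bad", instances
)
```

Here 2 of the 4 instances linked to the bad outcome satisfy both items (commonality 50%), and 2 of the 3 instances satisfying both items are linked to the bad outcome (confidence about 66.7%).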
The process of mining and pruning rules is controlled by five parameters: the number of top features that are used to form rules, upper limit of the number of items on the left-hand side of a rule, lower limit of confidence, lower limit of commonality, and upper limit of the confidence difference. Our method uses rules that each contains at most the upper limit number of items on its left-hand side, has a commonality that is greater than or equal to the lower limit of commonality, and has a confidence that is greater than or equal to the lower limit of confidence.
Our automatic explanation method is intended to be used for real-time clinical decision support. Once the first model provides its predicted outcome of a patient, we need to use the second model to provide automatic explanations for the prediction quickly, ideally in less than a second. For this purpose, we need to control the number of association rules in the second model to help reduce the overhead of retrieving and ranking the relevant rules at the prediction time. We used the following three techniques to cut the number of rules:
1. Some machine learning algorithms, such as XGBoost [52], automatically calculate the importance value of each feature. When the data set included many features, we used only the top few features in the first model with the highest importance values to form rules. Usually, we set the number of top features to be used to the maximum possible number without making the association rule mining process run out of memory.
2. A rule $r_1$ was dropped if there exists another rule $r_2$ satisfying three conditions: $r_1$ and $r_2$ have the same value on their right-hand sides; the items on the left-hand side of $r_2$ are a proper subset of the items on the left-hand side of $r_1$ (ie, $r_2$ is more general than $r_1$); and the confidence of $r_2$ is greater than or equal to the confidence of $r_1$ minus the upper limit of the confidence difference.
3. All distinct feature-value pairs were examined and labeled by a clinical expert in the automatic explanation function's design team. When forming rules, we used only those feature-value pairs that the clinical expert deemed could have a positive correlation with the bad outcome value.
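The second technique, dropping a rule when a more general rule has nearly the same confidence, can be sketched as below. This is a naive quadratic scan for illustration only; with millions of mined rules, an indexed implementation would be needed. The rule encoding as (left-hand side, outcome, confidence) and the toy rules are assumptions.

```python
def prune_rules(rules, max_conf_diff):
    """Drop every rule r1 for which some rule r2 has the same right-hand side,
    a left-hand side that is a proper subset of r1's, and
    confidence(r2) >= confidence(r1) - max_conf_diff.

    rules: list of (frozenset_of_items, outcome, confidence) triples.
    """
    kept = []
    for lhs1, z1, conf1 in rules:
        dominated = any(
            z2 == z1 and lhs2 < lhs1 and conf2 >= conf1 - max_conf_diff
            for lhs2, z2, conf2 in rules
        )
        if not dominated:
            kept.append((lhs1, z1, conf1))
    return kept


rules = [
    (frozenset({"a"}), "bad", 0.60),
    (frozenset({"a", "b"}), "bad", 0.70),       # only 0.10 above {"a"}
    (frozenset({"a", "b", "c"}), "bad", 0.90),  # 0.30 above {"a"}
    (frozenset({"d"}), "bad", 0.55),
]
remaining = prune_rules(rules, max_conf_diff=0.15)
```

With an upper limit of 0.15, the rule on {a, b} is dropped because the more general rule on {a} is within 0.15 of its confidence, whereas the rule on {a, b, c} is kept because no more general rule comes that close.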
For every feature-value pair item used to form association rules, a clinical expert in the automatic explanation function's design team compiled ≥0 interventions. An item is termed actionable if it is associated with ≥1 intervention. These interventions are automatically attached to the rules whose left-hand sides contain this item. A rule is termed actionable if its left-hand side contains ≥1 actionable item and, in turn, is associated with ≥1 intervention. In theory, for each combination of feature-value pair items that appears on the left-hand side of ≥1 mined rule, the clinical expert could compile additional interventions to be automatically attached to the rules whose left-hand sides contain this combination if these interventions have not already been compiled for any individual feature-value pair item in the combination. In practice, we have not needed to do this for predicting severe COPD exacerbations, whereas such a need could occur in some other clinical prediction tasks in the future.
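A minimal sketch of how interventions compiled per item could be attached to rules follows. The knowledge-base layout, the item keys, and the choice of which item has no intervention are hypothetical; the intervention texts are taken from the example tables.

```python
# Hypothetical knowledge base: item -> interventions compiled by the expert.
KNOWLEDGE_BASE = {
    "recent_acute_copd_exacerbation": [
        "Provide education on managing COPD and more frequent follow-ups",
        "Ensure use of appropriate COPD medications",
    ],
    "many_distinct_reliever_meds": [
        "Simplify COPD medications to once-a-day formulations or combination medications",
        "Ensure use of appropriate COPD medications",
    ],
    "item_without_intervention": [],   # an item with 0 interventions
}


def interventions_for_rule(lhs_items, knowledge_base):
    """Collect the interventions of all items on the left-hand side, without repeats."""
    seen, result = set(), []
    for item in lhs_items:
        for intervention in knowledge_base.get(item, []):
            if intervention not in seen:
                seen.add(intervention)
                result.append(intervention)
    return result


def is_actionable(lhs_items, knowledge_base):
    """A rule is actionable if at least one of its items has an intervention."""
    return any(knowledge_base.get(item) for item in lhs_items)


lhs = ["recent_acute_copd_exacerbation", "many_distinct_reliever_meds"]
attached = interventions_for_rule(lhs, KNOWLEDGE_BASE)
```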

Explaining the Predictions
For each patient predicted by the first model to have a bad outcome, we explained the prediction by presenting the association rules in the second model whose left-hand sides are fulfilled by the patient and whose right-hand sides have the bad outcome value. The rules were sorted using the method given in our paper [57]. This method incorporates 5 factors into a rule-scoring function, striking a balance among them. These factors include confidence, commonality, number of items on the left-hand side of the rule, whether the rule is actionable, and the degree of information redundancy with the higher-ranked rules. The rules are ranked based on the computed scores in an iterative fashion. Every rule offers an explanation for why the patient is predicted to have a bad outcome. For each actionable rule that is presented, the associated interventions are shown next to it. This helps the user of the automatic explanation function pinpoint suitable interventions for the patient. Typically, the rules in the second model provide common reasons for a patient to have a bad outcome. Although some patients could have bad outcomes because of rare reasons not covered by these rules, the second model usually explains most, although not all, of the bad outcomes correctly predicted by the first model.
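The retrieval and iterative ranking step can be illustrated with a toy scoring function. The sketch below is not the rule-scoring function of [57]: the weights, the form of the score, and the redundancy penalty are placeholders chosen only to show how the 5 factors could be combined and how re-scoring after each pick demotes rules that repeat items already shown.

```python
def toy_score(rule, shown_items, knowledge_base):
    """Placeholder score combining the 5 factors named in the text."""
    lhs = rule["lhs"]
    actionable = any(knowledge_base.get(item) for item in lhs)
    new_fraction = len(lhs - shown_items) / len(lhs)   # 1.0 means no redundancy
    base = (rule["confidence"] + rule["commonality"]
            + 0.1 * actionable - 0.05 * len(lhs))
    return base * (0.5 + 0.5 * new_fraction)


def explain(patient_items, rules, knowledge_base, top_n=3):
    """Return up to top_n bad-outcome rules fulfilled by the patient, ranked iteratively."""
    candidates = [r for r in rules
                  if r["outcome"] == "bad" and r["lhs"] <= patient_items]
    ranked, shown_items = [], set()
    while candidates and len(ranked) < top_n:
        best = max(candidates, key=lambda r: toy_score(r, shown_items, knowledge_base))
        candidates.remove(best)
        ranked.append(best)
        shown_items |= best["lhs"]     # later rules are penalized for repeating these
    return ranked


rules = [
    {"lhs": frozenset({"a", "b"}), "outcome": "bad", "confidence": 0.80, "commonality": 0.10},
    {"lhs": frozenset({"a", "c"}), "outcome": "bad", "confidence": 0.78, "commonality": 0.10},
    {"lhs": frozenset({"d"}), "outcome": "bad", "confidence": 0.60, "commonality": 0.05},
    {"lhs": frozenset({"e"}), "outcome": "bad", "confidence": 0.90, "commonality": 0.20},
]
knowledge_base = {"a": ["intervention x"], "d": ["intervention y"]}
top = explain({"a", "b", "c", "d"}, rules, knowledge_base)
```

In this toy run, the rule on {e} is never shown because the patient does not fulfill it, and the rule on {a, c} falls to third place after the rule on {a, b} is picked because half of its items are already covered.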

Parameter Setting
Our model [17] used 229 features to predict patient outcomes. In this study, we used the top 80 features that our model ranked with the highest importance values to form association rules. Regardless of whether all 229 features or only the top 80 features were used, our model had the same area under the receiver operating characteristic curve of 0.866.
As in our prior study on automatically explaining predictions of asthma outcomes on the UWM data [55], we set the upper limit of the number of items on the left-hand side of a rule to 5, the lower limit of commonality to 1%, and the lower limit of confidence to 50%. The last 2 values were commonly used to mine association rules [63], whereas commonality was essentially support computed on all the data instances linked to the bad outcome [54]. The first value struck a balance between the explanation power of our automatic explanation method and not making the rules too complex to understand. To set the upper limit value of the confidence difference, we plotted the number of association rules remaining from the rule pruning process versus the upper limit of the confidence difference. Our prior automatic explanation papers [54][55][56][58] showed that the number of remaining rules first decreased rapidly as the upper limit of the confidence difference increased and then slowly decreased after the upper limit of the confidence difference became large enough. The upper limit value of the confidence difference was set at a point where a further increase in the confidence difference had a minor impact on reducing the number of remaining rules.

Split of the Training and Test Sets
We adopted the method from our previous predictive model paper [17] to split the entire data set into the training and test sets. As the outcomes were from the following year, the data set contained 9 years of effective data (2011-2019) over the 10-year period of 2011 to 2020. To reflect how our predictive model and our automatic explanation method will be used in clinical practice in the future, we used the 2011 to 2018 data as the training set to train our model and compute the association rules used by our automatic explanation method and the 2019 data as the test set to assess the performance of our model and our automatic explanation method.
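The temporal split can be written in a few lines. In this sketch, each data instance is a dictionary with a hypothetical `index_year` key; the year boundaries come from the text.

```python
def temporal_split(instances, last_training_year=2018, test_year=2019):
    """Split (patient, index year) instances by index year, not at random."""
    train = [row for row in instances if row["index_year"] <= last_training_year]
    test = [row for row in instances if row["index_year"] == test_year]
    return train, test


instances = [
    {"patient_id": 1, "index_year": 2011},
    {"patient_id": 1, "index_year": 2018},
    {"patient_id": 1, "index_year": 2019},
    {"patient_id": 2, "index_year": 2019},
]
train, test = temporal_split(instances)
```

Splitting by year rather than at random keeps all information from the test year out of both model training and rule mining, which mirrors how the model and the rules would be applied to future patients.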

Providing Examples of Automatic Explanations
To give the reader a concrete feeling of the results produced by our automatic explanation method, we randomly selected 3 example patients from the patients who were correctly predicted by our model to have ≥1 severe COPD exacerbation in the following 12 months and for whom our automatic explanation method could offer ≥1 explanation. For each example patient, we listed the top 3 explanations given by our automatic explanation method.

Performance Metrics
We examined the performance of our automatic explanation method using the following performance metrics from our prior automatic explanation papers [54][55][56][58]. Regarding the explanation power of our automatic explanation method, a performance metric is the percentage of patients for whom our method could provide explanations among the patients with COPD who were correctly predicted by our model to have ≥1 severe COPD exacerbation in the following 12 months. We assessed both the average and median number of (actionable) rules matching such a patient. A rule matches a patient if the patient satisfies all items on its left-hand side.
As shown by our prior automatic explanation papers [54][55][56][58], many rules matching a patient often differ from each other by only 1 item on their left-hand sides. In this case, the number of rules greatly exceeded the amount of nonrepeated information contained in these rules. To provide a comprehensive overview of the amount of information provided by the automatic explanations, we examined the distributions of (1) the number of (actionable) rules and (2) the number of unique actionable items in the rules matching a patient who was correctly predicted by our model to have ≥1 severe COPD exacerbation in the following 12 months.

Characteristics of Our Patient Cohort
Each data instance corresponds to a distinct patient and index year pair. Tables 1 and 2 summarize the patient demographic and clinical characteristics of the data instances in the training and test sets, respectively. These 2 sets of characteristics were relatively similar to each other. In the training set, 5.66% (2040/36,047) of the data instances were related to severe COPD exacerbations in the following 12 months. In the test set, 2.42% (182/7529) of the data instances were related to severe COPD exacerbations in the following 12 months. A detailed comparison of these 2 sets of characteristics was provided in our previous predictive model paper [17].

The Number of Association Rules
Using the top 80 features ranked with the highest importance values in our predictive model, 7,729,134 association rules were mined from the training set. Figure 1 shows the number of remaining rules versus the upper limit of the confidence difference. The number of remaining rules first rapidly decreased as the upper limit of the confidence difference increased and then slowly decreased after the upper limit of the confidence difference became ≥0.15. We set the upper limit of the confidence difference to the value of 0.15, resulting in 492,803 remaining rules.

Examples of the Produced Automatic Explanations
To give the reader a concrete feeling of the results produced by our automatic explanation method, we randomly selected 3 example patients from the patients who were correctly predicted by our model to have ≥1 severe COPD exacerbation in the following 12 months and for whom our automatic explanation method could offer ≥1 explanation. Tables 3-5 show the top 3 explanations that our automatic explanation method provided for every example patient. Each table has three columns: rank, rule, and item on the rule's left-hand side; interpretation of the item; and interventions linked to the item.

Table 3

Rank 1: The patient's last diagnosis of acute COPD exacerbation was from the past 81.4 days AND the patient's COPD reliever prescriptions in the past year included >10 distinct medications → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient's last diagnosis of acute COPD exacerbation was from the past 81.4 days
Interpretation: Having a recent acute COPD exacerbation shows a need for better control of the disease.
Interventions:
• Provide education on managing COPD and more frequent follow-ups
• Ensure use of appropriate COPD medications
• Consider influenza shot, pneumonia vaccination, or smoking cessation
• Assess the need for pulmonary rehabilitation or home care
• Ensure that the patient has a primary care provider or is referred to a specialist

Item: The patient's COPD reliever prescriptions in the past year included >10 distinct medications
Interpretation: Using many rescue medications for COPD indicates ineffective regimen, poor treatment adherence, or poor control of the disease.
Interventions:
• Simplify COPD medications to once-a-day formulations or combination medications
• Address concerns for adverse interactions between medications
• Provide education on the correct use of COPD medications or inhalers
• Consider strategies to improve medication adherence such as providing reminders for taking medications in time
• Medication reconciliation review by a physician or a pharmacist

Rank 2: The patient had between 8 and 19 diagnoses of acute COPD exacerbation in the past year AND the patient's last COPD diagnosis was from the past 25.6 days AND the patient's nebulizer medication prescriptions in the past year included >11 medications → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient had between 8 and 19 diagnoses of acute COPD exacerbation in the past year
Interpretation: Frequently having acute COPD exacerbations shows a need for better control of the disease.

Remaining text for the first two items (incomplete in the source): "Consider influenza shot, pneumonia vaccination, or smoking cessation"; "[...] stay indicates poor control of the disease."; "Assess the need for pulmonary rehabilitation or home care"

Item: The patient's nebulizer medication prescriptions in the past year included >11 medications
Interpretation: Using many medications for COPD with a nebulizer indicates an ineffective regimen, poor treatment adherence, or poor control of the disease. Using nebulizer medications could be a sign of having a mild exacerbation or more severe COPD.
Interventions:
• Simplify COPD medications to once-a-day formulations or combination medications
• Address concerns for adverse interactions between medications
• Provide education on the correct use of COPD medications or inhalers
• Consider strategies to improve medication adherence such as providing reminders for taking medications in time
• Medication reconciliation review by a physician or a pharmacist

Rank 3: The patient's average length of an inpatient stay in the past year was between 0.61 and 7.66 days AND the patient's last outpatient visit on COPD occurred in the past 82.4 days AND the patient's nebulizer medication prescriptions in the past year included >11 medications AND the patient's maximum percentage of neutrophils in the past year was >76.5% → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient's average length of an inpatient stay in the past year was between 0.61 and 7.66 days
Interpretation: Having a long inpatient stay can indicate that the patient has a more severe disease or comorbidities.
Interventions:
• Assess the need for home care or pulmonary rehabilitation

Remaining text (incomplete in the source): "[...] of the disease and a need for additional support to control COPD."

Item: The patient's nebulizer medication prescriptions in the past year included >11 medications
Interpretation: Using many medications for COPD with a nebulizer indicates an ineffective regimen, poor treatment adherence, or poor control of the disease. Using nebulizer medications could be a sign of having a mild exacerbation or more severe COPD.
Interventions:
• Simplify COPD medications to once-a-day formulations or combination medications
• Address concerns for adverse interactions between medications
• Provide education on the correct use of COPD medications or inhalers
• Consider strategies to improve medication adherence such as providing reminders for taking medications in time
• Medication reconciliation review by a physician or a pharmacist

Table 4

Rank 1: The patient's last diagnosis of acute COPD exacerbation was from the past 81.4 days AND the patient had >2 ED visits in the past 6 months AND the patient's nebulizer medication prescriptions in the past year included >11 medications → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient's last diagnosis of acute COPD exacerbation was from the past 81.4 days
Interpretation: Having a recent acute COPD exacerbation shows a need for better control of the disease.

Remaining text (incomplete in the source): "Provide education on the correct use of COPD medications or inhalers"; "[...] nebulizer medications could be a sign of having a mild exacerbation or more severe COPD."; "Consider strategies to improve medication adherence such as providing reminders for taking medications in time"; "Medication reconciliation review by a physician or a pharmacist"

Rank 2: The patient's maximum BMI in the past year was <22.81 AND the patient's last ED visit related to COPD occurred no less than 27.2 days ago and no more than 94.3 days ago AND the patient's average length of stay of an ED visit in the past year was between 0.03 and 0.29 day AND the patient had between 2 and 4 encounters related to acute COPD exacerbation or respiratory failure in the past year → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient's maximum BMI in the past year was <22.81
Interpretation: Having an unintentional weight loss can indicate comorbidities or other complications, such as malnutrition or metabolic syndrome.
Interventions:
• Optimize nutritional status to address low BMI
• Provide dietary education and advise appropriate exercise

Item: The patient's last ED visit related to COPD occurred no less than 27.2 days ago and no more than 94.3 days ago
Interpretation: Having a recent ED visit related to COPD shows a need for better control of the disease.
Interventions:
• Provide education on managing COPD and more frequent follow-ups
• Ensure use of appropriate COPD medications
• Consider influenza shot, pneumonia vaccination, or smoking cessation
• Assess the need for pulmonary rehabilitation or home care
• Ensure that the patient has a primary care provider or is referred to a specialist

Item: The patient's average length of stay of an ED visit in the past year was between 0.03 and 0.29 day
Interpretation: Using the ED indicates poor control of conditions or a lack of access to primary, specialty, or home care.
Interventions:
• Provide education on managing COPD and more frequent follow-ups
• Ensure use of appropriate COPD medications
• Consider influenza shot, pneumonia vaccination, or smoking cessation
• Assess the need for pulmonary rehabilitation or home care
• Ensure that the patient has a primary care provider or is referred to a specialist

Item: The patient had between 2 and 4 encounters related to acute COPD exacerbation or respiratory failure in the past year
Interpretation: Frequently having acute COPD exacerbations or respiratory failures shows a need for better control of the disease.
Interventions:
• Provide education on managing COPD and more frequent follow-ups
• Ensure use of appropriate COPD medications
• Consider influenza shot, pneumonia vaccination, or smoking cessation
• Assess the need for pulmonary rehabilitation or home care
• Ensure that the patient has a primary care provider or is referred to a specialist

Rank 3: The patient had between 3 and 5 ED visits in the past year AND the patient's minimum SpO2 in the past year was between 17% and 89.5% AND the patient's maximum percentage of neutrophils in the past year was >76.5% AND the patient smoked >0.48 pack of cigarettes per day in the past year → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient had between 3 and 5 ED visits in the past year
Interpretation: Using the ED indicates poor control of conditions or a lack of access to primary, specialty, or home care.
Interventions:
• Provide education on managing COPD and more frequent follow-ups
• Ensure use of appropriate COPD medications
• Consider influenza shot, pneumonia vaccination, or smoking cessation
• Assess the need for pulmonary rehabilitation or home care
• Ensure that the patient has a primary care provider or is referred to a specialist

Table 5

Rank 1: The patient had between 24 and 49 COPD diagnoses in the past year AND the patient had >11 nebulizer medication prescriptions in the past year AND the patient is Black or an African American → the patient will probably have at least one severe COPD exacerbation in the following 12 months

Item: The patient had between 24
Interpretation: Frequently receiving COPD diagnoses indicates poor control of the disease.

Performance of the Automatic Explanation Method
The automatic explanation method was evaluated using the test set. For the patients with COPD who were correctly predicted by our model to have severe COPD exacerbations in the following 12 months, Figure 2 shows the distribution of the number of actionable rules matching a patient. This distribution is concentrated at small values and has a long right tail. As the number of actionable rules matching a patient increases, the frequency of cases in the corresponding equal-width bucket tends to decrease rapidly, although not monotonically. The largest number of actionable rules matching a patient is rather large (111,062). Nevertheless, only 1 patient matches so many rules. For the patients with COPD who were correctly predicted by our model to have severe COPD exacerbations in the following 12 months, Figure 3 shows the distribution of the number of unique actionable items in the rules matching a patient. The largest number of unique actionable items in the rules matching a patient is 57, which is much smaller than the largest number of actionable rules matching a patient. As shown in Tables 3-5, the same intervention could be linked to ≥1 distinct actionable item in the rules matching a patient.
Figure 3. The distribution of the number of unique actionable items in the rules matching a patient who was correctly predicted by our model to have ≥1 severe chronic obstructive pulmonary disease exacerbation in the following 12 months.
Our automatic explanation method explained the predictions for 73.6% (134/182) of the patients with COPD who had ≥1 severe COPD exacerbation in the following 12 months.
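The per-patient counts reported above (actionable rules matched, unique actionable items, and interventions shared across items) can be illustrated with a minimal Python sketch. It is not the implementation used in this study. The class names, feature names, and intervention strings are hypothetical. The sketch assumes that a rule matches a patient when every item on its left-hand side holds, and that a rule is actionable when at least one of its items has a linked intervention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One condition on a rule's left-hand side (hypothetical representation)."""
    feature: str
    low: float                 # inclusive lower bound
    high: float                # inclusive upper bound
    interventions: tuple = ()  # empty tuple: the item is not actionable

    def matches(self, patient):
        value = patient.get(self.feature)
        return value is not None and self.low <= value <= self.high

    @property
    def actionable(self):
        return bool(self.interventions)


@dataclass(frozen=True)
class Rule:
    """An association rule whose right-hand side is the predicted outcome."""
    items: tuple

    def matches(self, patient):
        # Assumption: a rule matches only if all of its items hold.
        return all(item.matches(patient) for item in self.items)

    @property
    def actionable(self):
        # Assumption: one actionable item makes the whole rule actionable.
        return any(item.actionable for item in self.items)


def summarize(patient, rules):
    """Return (# actionable rules matched, # unique actionable items, interventions)."""
    matched = [r for r in rules if r.actionable and r.matches(patient)]
    unique_items = {i for r in matched for i in r.items if i.actionable}
    # The same intervention can be linked to several items, so deduplicate.
    interventions = sorted({iv for i in unique_items for iv in i.interventions})
    return len(matched), len(unique_items), interventions


# Hypothetical items; the thresholds of the first two follow the rank 1 rule.
dx = Item("copd_diagnoses_past_year", 24, 49, ("placeholder intervention A",))
neb = Item("nebulizer_rx_past_year", 12, float("inf"),
           ("placeholder intervention A", "placeholder intervention B"))
age = Item("age", 65, 200)  # hypothetical non-actionable item

rules = [Rule((dx, neb)), Rule((dx, age)), Rule((age,))]
patient = {"copd_diagnoses_past_year": 30, "nebulizer_rx_past_year": 14, "age": 70}

n_rules, n_items, suggested = summarize(patient, rules)
print(n_rules, n_items, suggested)
```

In this toy example, two actionable rules match the patient but share one item, so the number of unique actionable items cannot exceed the number of distinct items across those rules. This mirrors why the largest item count (57) is far smaller than the largest rule count (111,062).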

Principal Findings
Our automatic explanation method generalizes well in predicting severe COPD exacerbations. Our method explained the predictions for 97.1% (100/103) of the patients with COPD who were correctly predicted by our model to have severe COPD exacerbations in the following 12 months. This percentage is comparable with the corresponding percentages of 87.6% to 97.6% that we previously obtained to explain the predictions of asthma outcomes [54][55][56]. This percentage is sufficiently large to apply our automatic explanation method to routine clinical use for COPD management. After further improving the performance of our model for predicting severe COPD exacerbations and our automatic explanation method, we hope our model can be used in conjunction with our automatic explanation method to provide decision support for allocating COPD care management resources and improve outcomes.
Our automatic explanation method explained the predictions for 73.6% (134/182) of the patients with COPD who had ≥1 severe COPD exacerbation in the following 12 months. This percentage is lower than 97.1% (100/103), the success rate at which our method explained the predictions for the patients with COPD whom our model correctly predicted to have severe COPD exacerbations in the following 12 months. The gap is likely because of the correlation between the prediction results of our model and the association rules. Among the patients whom our model correctly predicted to have severe COPD exacerbations in the following 12 months, many seem to be easy cases for using association rules to explain the outcomes. Among the patients who had severe COPD exacerbations but were incorrectly predicted by our model to have no severe COPD exacerbation in the following 12 months, many seem to be difficult cases for any model to correctly predict or explain the outcomes.
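The two reported percentages imply a success rate for the patients our model missed, which can be worked out with a short calculation. The sketch below is illustrative only. It assumes that the 103 correctly predicted patients are a subset of the 182 patients who had ≥1 severe COPD exacerbation and that the 100 explained patients are counted among the 134. The function name is hypothetical.

```python
def explained_rate_among_missed(explained_tp, tp, explained_pos, pos):
    """Derive the explanation rate for patients with the outcome whom the model missed.

    Assumes the correctly predicted patients (tp) are a subset of all patients
    with the outcome (pos), and likewise for the explained counts.
    """
    missed = pos - tp                                # missed patients with the outcome
    explained_missed = explained_pos - explained_tp  # missed patients with an explanation
    return explained_missed, missed, explained_missed / missed


explained_missed, missed, rate = explained_rate_among_missed(100, 103, 134, 182)
print(f"{explained_missed}/{missed} = {rate:.1%}")  # prints "34/79 = 43.0%"
```

Under these assumptions, the method found matching rules for 34 of the 79 missed patients, which is consistent with the interpretation that these are harder cases to explain.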

Related Work
Several years ago, we designed our automatic explanation method to handle relatively balanced data and demonstrated our method for predicting the diagnosis of type 2 diabetes [58]. Later, other researchers demonstrated our method on several other clinical predictive modeling tasks, such as predicting lung transplantation or mortality in patients with cystic fibrosis [66] and predicting cardiac mortality in patients with cancer [67]. Recently, we extended our automatic explanation method so it can also handle imbalanced data, where one value of the outcome variable appears much less often than another. We demonstrated our extended method for predicting hospital encounters for asthma in patients with asthma in 3 health care systems separately [54][55][56]. Imbalanced data also appear in the case of predicting severe COPD exacerbations, which is the use case of this paper.
As discussed in the reviews [68,69], other researchers have developed a variety of methods to automatically explain the predictions made by machine learning models. Many of these methods lower the model performance or work only for a specific machine learning algorithm. Most of these methods provide explanations that are not of rule types. More importantly, none of these methods can automatically suggest tailored interventions, which is desired in many clinical applications. In comparison, our automatic explanation method has four properties that make it particularly suitable for providing clinical decision support: (1) it provides rule-type explanations, which are easier to understand than other kinds of explanations; (2) it works for any machine learning model on tabular data; (3) it does not lower model performance; and (4) it automatically suggests tailored interventions.
Rudin et al [70], Ribeiro et al [71], Rasouli et al [72], Pastor and Baralis [73], Guidotti et al [74], and Panigutti et al [75] used rules to automatically explain machine learning predictions. These rules are not known before the time of prediction, making it impossible to use them to automatically suggest tailored interventions at the time of prediction. Except for the case of Pastor and Baralis [73], these rules are not association rules. In comparison, our automatic explanation method mines association rules before the time of prediction and uses them to automatically suggest tailored interventions at the time of prediction.

Limitations
This study has 5 limitations that are worth addressing in future work.
First, this study used data from a single health care system. It is worth assessing our automatic explanation method's performance in explaining the predictions of severe COPD exacerbations in other health care systems.
Second, this study focuses on the prediction of one outcome: whether a patient with COPD will have ≥1 severe COPD exacerbation in the following 12 months. It is worth assessing our automatic explanation method's performance in explaining the predictions of other outcomes.
Third, our automatic explanation method currently works for explaining the predictions that traditional non-deep-learning machine learning algorithms make on tabular data. It is worth investigating the extension of our method to handle the predictions made by deep learning models on longitudinal data [76,77].
Fourth, we currently know no optimal way to present automatic explanations and automatically suggested interventions. It is worth investigating an optimal way to present this information based on a user-centered design.
Finally, researchers have previously assessed the impact of automatic explanations on decision-making for several other applications [78][79][80][81][82], but not for care management. For the automatic explanation function for predicting severe COPD exacerbations presented in this paper, it is worth assessing the impact of showing automatic explanations and automatically suggested interventions on care management enrollment and intervention decisions.

Conclusions
Our automatic explanation method generalizes well in predicting severe COPD exacerbations. After further improving the performance of our model for predicting severe COPD exacerbations and our automatic explanation method, we hope our model can be used in conjunction with our automatic explanation method to provide decision support for allocating COPD care management resources and improve outcomes.