This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
It is important to exploit all available data on patients in settings such as intensive care burn units (ICBUs), where several variables are recorded over time. It is possible to take advantage of the multivariate patterns that model the evolution of patients to predict their survival. However, pattern discovery algorithms generate a large number of patterns, of which only some are relevant for classification.
We propose to use the diagnostic odds ratio (DOR) to select multivariate sequential patterns used in the classification in a clinical domain, rather than employing frequency properties.
We used data obtained from the ICBU at the University Hospital of Getafe, where 6 temporal variables for 465 patients were registered every day for 5 days, and to model the evolution of these clinical variables, we used multivariate sequential patterns obtained by applying 2 different discretization methods to the continuous attributes. We compared 4 ways in which to employ the DOR for pattern selection: (1) we used it as a threshold to select patterns with a minimum DOR; (2) we selected patterns whose differential DORs are higher than a threshold with regard to their extensions; (3) we selected patterns whose DOR CIs do not overlap; and (4) we proposed the combination of threshold and nonoverlapping CIs to select the most discriminative patterns. As a baseline, we compared our proposals with Jumping Emerging Patterns, one of the most frequently used frequency-based techniques for pattern selection.
We compared the number and length of the patterns eventually selected, classification performance, and pattern and model interpretability. We show that discretization has a great impact on the accuracy of the classification model, but that a trade-off must be found between classification accuracy and the physicians’ capacity to interpret the patterns obtained. We also found that the experiments combining threshold and nonoverlapping CIs (Option 4) produced the fewest and shortest patterns, at the cost of a loss of accuracy that remains acceptable given the gain in clinician interpretability. The best classification model according to this trade-off is a JRIP classifier with only 5 patterns (20 items) that was built using unsupervised correlation preserving discretization and the differential DOR in a beam search for the best pattern. It achieves a specificity of 56.32% and an area under the receiver operating characteristic curve of 0.767.
A method for the classification of patients’ survival can benefit from the use of sequential patterns, as these patterns capture knowledge about the temporal evolution of the variables in the ICBU. We have shown that the DOR can be used in several ways, and that it is a suitable measure with which to select discriminative and interpretable quality patterns.
Advances in the collection and storage of data have led to the emergence of complex temporal data sets, in which the data instances are traces of complex behavior characterized by time series of multiple variables.
In the clinical domain, patients who have incurred severe burns are treated in intensive care burn units (ICBUs). The first 5 days are fundamental: there is a resuscitation phase during the first 2 days and a stabilization phase during the following 3 days, and the patient’s evolution (incomings, diuresis, fluid balance, pH, bicarbonate, base excess) is registered over this period. These variables are not considered in scores for mortality prediction and may play a relevant role in improving the current knowledge of the problem.
Designing algorithms that are capable of learning patterns and classification models from such data is one of the most challenging topics in data mining research [
The number of patterns initially generated is usually very large, but only a few of these patterns are likely to be of interest to the domain expert who analyzes the data. There are several reasons for this: many of the patterns are irrelevant or obvious, many provide no new knowledge regarding the domain, and many are similar to, or subsumed by, others. Interestingness measures are, therefore, required to reduce the number of patterns, thus increasing the utility, usefulness, and relevance of the patterns discovered [
In addition to traditional multidimensional analysis and data mining tasks, one interesting task is that of discovering notable changes and comparative differences. This leads to gradient mining and discriminant analysis [
Discriminative pattern mining is one of the most important techniques in data mining. This challenging task comprises a group of pattern mining techniques designed to discover a set of significant patterns that occur with disproportionate frequencies in different class-labeled data sets [
The exploration of discriminative patterns generally includes 2 aspects: frequency and statistical significance. On the one hand, the frequency of a pattern can be assessed by its support, which is defined as the percentage of transactions (in our case, patients) that contain the pattern. A pattern is frequent if its support value is higher than a given threshold.
On the other hand, the statistical significance of discriminative patterns can be measured by using various statistical tests. A pattern is deemed significant if its value under a given statistical measure meets user-defined conditions, for example, being no less (or no greater) than a given threshold. Any statistical measure capable of quantifying the differences between classes, such as the odds ratio, information gain, or chi-square, is generally applicable, and the choice of measure will not typically affect the overall performance of discriminative pattern discovery algorithms [
Many specific quantitative indicators of diagnostic test performance have been introduced into the clinical domain, such as sensitivity and specificity, positive and negative predictive values, chance-corrected measures of agreement, likelihood ratios, and the area under the receiver operating characteristic curve (AUC), among others. There is, however, a single indicator of diagnostic performance, the diagnostic odds ratio (DOR), which is closely linked to these existing indicators, facilitates the formal meta-analysis of studies on diagnostic test performance, and is derived from logistic models [
We propose and compare 4 approaches in which the DOR is used as a statistical measure to select a reduced number of patterns, and we put forward the use of these patterns as predictors in a classification model. The calculation of the DOR for a pattern enables us to use a terminology that is closer to the language of clinicians, in which a pattern is considered to be a risk factor or to have a protection factor.
The first approach consists of using the DOR as a minimum threshold with which to select patterns. In the second approach, we calculate the difference in the DOR of a sequential pattern with respect to its extensions, and we establish a threshold for this difference to reduce the number of patterns selected. One advantage of this approach is that it can be used for early pruning within the pattern discovery algorithm. In the third approach, we calculate a CI for the DOR, and use this CI to prune patterns that are not statistically different from their extension patterns. Finally, we combine the second and third approaches to select patterns with different properties.
We have verified that these proposals provide acceptable results by building a model for the classification of patients’ survival using their daily evolution in an ICBU, employing multivariate sequential patterns. We have additionally compared the 4 approaches with the selection of patterns based on classical frequency-based measures, namely Jumping Emerging Patterns (JEPs).
A sequence database is based on ordered elements or events, recorded with or without a concrete notion of time. There are many applications involving sequence data, such as economic and sales forecasting, speech or audio signals, web click streams, or biological sequences. The mining of frequently occurring ordered events or subsequences as patterns was first introduced by Agrawal and Srikant [
The purpose of sequential pattern mining is to discover interesting subsequences in a sequence database, that is, sequential relationships between items that are of interest to the user. Various measures can be used to estimate how interesting a subsequence is. In the original sequential pattern mining problem, the support measure is used. The support (or absolute support) of a sequence
Sequential pattern mining is the task of finding all the frequent subsequences in a sequence database. A sequence
With regard to the algorithms employed to mine sequential patterns, there are 3 pioneer proposals: the GSP algorithm with the a priori strategy [
The researchers refer the reader to [
Classification rule mining attempts to discover a small set of rules in the database to form an accurate classifier.
Initial approaches that combined pattern mining and classification models employed a strict stepwise approach, in which a set of patterns was computed once and those patterns were subsequently used in models. However, a large number of methods were later proposed, whose aim was to integrate pattern mining, feature selection, and model construction [
Some of these are Classification Based on Predictive Association Rules (CPAR), Classification Based on Multiple Association Rules (CMAR) [
The classification of sequence patterns is one of the most popular methodologies whose power has been demonstrated by multiple studies [
The sequence classification methods can be divided into 3 large categories [
The first category is that of feature-based classification, during which a sequence is transformed into a feature vector, after which conventional classification methods are applied. Feature selection plays an important role in this kind of methods.
The second category is sequence distance–based classification. The distance function that measures the similarity between sequences determines the quality of the classification in a significant manner.
The third category is model-based classification, such as using the hidden Markov model and other statistical models to classify sequences.
Conventional classification methods, such as neural networks or decision trees, are designed to classify feature vectors. One way to solve the problem of sequence classification is to transform a sequence into a vector of features by means of feature selections. Sequences can be classified by employing conventional classification methods, such as support vector machine and decision trees.
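This feature-based route can be sketched as follows: each discovered pattern becomes one binary feature, and the resulting vectors can feed any conventional classifier. This is a minimal illustration, not the paper's pipeline; the containment test here naively flattens the sequence (ignoring order) for brevity, and all item names are made up.

```python
def contains_flat(sequence, pattern):
    """Naive containment check that flattens the sequence and ignores
    temporal order; a real pipeline would test sequential containment."""
    return pattern <= set().union(*sequence)

def to_features(sequence, patterns):
    """Encode a sequence as a 0/1 vector of pattern occurrences."""
    return [int(contains_flat(sequence, p)) for p in patterns]

# Toy patients (one itemset per day) and toy patterns.
patients = [({"inc_low"}, {"diu_high"}), ({"inc_high"}, {"ph_low"})]
patterns = [{"diu_high"}, {"inc_high", "ph_low"}]
X = [to_features(s, patterns) for s in patients]
print(X)  # [[1, 0], [0, 1]] -- ready for a decision tree or rule learner
```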
Several researchers have worked toward building sequence classifiers based on frequent sequential patterns. Lesh et al [
Tseng and Lee [
With regard to the ICBU, few studies have dealt with the problem of survival prediction using machine learning or intelligent data analysis [
In the original sequential pattern mining problem, the main measure used is support. The assumption is that frequent subsequences are of interest to the user.
A first important limitation of the traditional sequential pattern mining problem is that a huge number of patterns may be generated by the algorithms, depending on how the
Many other rule interestingness measures are already used in data mining, machine learning, and statistics. Geng and Hamilton [
In this paper we focus on the probability-based objective measures used in the clinical domain. Some examples of objective rule interestingness measures that are often used in epidemiology as statistical metrics are presented in
Usual clinical objective rule interestingness measures for rules in the form of A→c.
Measure | Formula
Support | sup(A→c) = P(A ∧ c)
Confidence | conf(A→c) = P(c | A)
Coverage | cov(A→c) = P(A)
Prevalence | prev(A→c) = P(c)
Specificity | spec(A→c) = P(¬A | ¬c)
Accuracy | acc(A→c) = P(A ∧ c) + P(¬A ∧ ¬c)
Diagnostic odds ratio | DOR(A→c) = (TP × TN) / (FP × FN)
Relative risk | RR(A→c) = P(c | A) / P(c | ¬A)
Relative risk and the DOR are statistical metrics that are often used in epidemiological studies. They are consistent: a larger odds ratio leads to a larger relative risk, and vice versa. Under the rare disease assumption, the DOR approximates the relative risk [
Li et al [
Most of the conventional frequent pattern–based classification algorithms follow 2 steps [
In fact, the most important consideration in sequence classification is not that of finding the complete rule set, but rather that of discovering the most discriminative patterns. In this respect, more attention has recently been paid to discriminative frequent pattern discovery for effective classification.
Heierman et al [
In the clinical domain, univariate frequent episodes of Sequential Organ Failure Assessment (SOFA) subscores during the first days after admission were identified in Toma et al [
After mining JEPs, Ghosh [
ICBUs are specialized units in which the main pathologies treated are inhalation injuries and severe burns. Early mortality prediction after admission is essential before an aggressive or conservative therapy can be recommended. Severity scores are simple but useful tools for physicians when evaluating the state of the patient [
The evolution of other parameters during the resuscitation phase (first 2 days) and during the stabilization phase (3 following days) may, however, also be important. The initial evaluation and resuscitation of patients with large burns that require inpatient care can be guided only loosely by formulas and rules. The inherent inaccuracy of formulas requires the continuous reevaluation and adjustment of infusions based on resuscitation targets. Incomings, diuresis, fluid balance, acid-base balance (pH, bicarbonate, base excess), and others help define objectives and assess the evolution and treatment response.
In the ICBU, a patient’s evolution is registered but not considered in scores for mortality prediction. In a previous paper [
Let
The number of instances of items in a sequence is called the length of the sequence. A sequence of length l is called an l-sequence.
Each itemset in a sequence represents the set of events that occur at the same time (same timestamp). A different itemset appears at a different time.
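These definitions (a sequence as an ordered list of itemsets, and support as the fraction of patients whose sequence contains a pattern) can be sketched as follows. The helper names and toy items are illustrative, not the paper's FaSPIP implementation.

```python
def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence`: every itemset of
    the pattern is included in some itemset of the sequence, preserving order."""
    i = 0
    for itemset in sequence:
        if i < len(pattern) and pattern[i] <= itemset:
            i += 1
    return i == len(pattern)

def support(database, pattern):
    """Relative support: fraction of sequences (patients) containing `pattern`."""
    return sum(contains(s, pattern) for s in database) / len(database)

# Toy database: one itemset per day, items are discretized variable values.
patients = [
    ({"diu_high"}, {"inc_low", "ph_low"}, {"inc_low"}),
    ({"diu_high", "inc_low"}, {"ph_low"}, {"inc_high"}),
    ({"inc_high"}, {"diu_high"}, {"ph_low"}),
]
print(support(patients, ({"diu_high"}, {"ph_low"})))  # all 3 contain it -> 1.0
```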
Sequence
The temporal representation of the patterns is principally carried out using time point representation or time interval representation.
In the time interval representation, there are different ways in which to relate intervals to each other, of which the best known is Allen’s interval algebra [
Time point–based data are a special case of the time interval–based data, in which both the beginning and the end points occur at the same time (for each interval) and the relations between these points become simpler (before, equals or co-occurs, and after), usually denoted as (<, =, >). Furthermore, because the “after” operator (>) is the inverse of the “before” relation (<), if we always consider a relation from the point that occurs first, it is not necessary to use the “after” relation. For instance, if we have A>B, we will instead say B<A.
It is, therefore, possible to define patterns or sequences with only these 2 relations (<, =). Two patterns
We have used the FaSPIP algorithm [
In candidate generation, FaSPIP distinguishes between 2 operations to extend a sequence with an item, thus creating a new sequence: Sequence extensions (S-extensions), when the frequent points take place after, and Itemset extensions (I-extensions), when the points take place at the same time as the last item in the pattern. For instance, given the sequence
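The two extension operations can be illustrated by modeling a pattern as a tuple of itemsets; this is a sketch of the idea only, not FaSPIP's actual data structures.

```python
def s_extension(pattern, item):
    """Sequence extension: `item` occurs strictly after the last itemset."""
    return pattern + (frozenset({item}),)

def i_extension(pattern, item):
    """Itemset extension: `item` occurs at the same time as the last itemset."""
    return pattern[:-1] + (pattern[-1] | {item},)

p = (frozenset({"a"}), frozenset({"b"}))
print(s_extension(p, "c"))  # (a)(b)(c): c takes place after b
print(i_extension(p, "c"))  # (a)(bc):  c co-occurs with b
```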
The classical approach employed for pattern selection is based on the frequency of the patterns. Emerging patterns (EPs) or contrast sets are a type of knowledge pattern that describes significant changes (differences or trends) between 2 classes of data [
Like other rules or patterns composed of conjunctive combinations of elements, EPs can be easily understood and used directly by clinicians.
Furthermore, the concept of JEPs [
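JEP selection can be sketched as keeping the patterns whose support is nonzero in exactly one class (infinite growth rate). This is a minimal sketch with a pluggable containment predicate; the toy example uses flattened itemsets and plain subset containment for brevity.

```python
def jumping_emerging_patterns(patterns, class_a, class_b, contains):
    """Keep patterns that occur in one class and never in the other."""
    jeps = []
    for p in patterns:
        sup_a = sum(contains(s, p) for s in class_a)
        sup_b = sum(contains(s, p) for s in class_b)
        if (sup_a == 0) != (sup_b == 0):  # support in exactly one class
            jeps.append(p)
    return jeps

# Toy example: survivors vs nonsurvivors, subset containment.
survivors = [{"a"}, {"a", "c"}]
nonsurvivors = [{"b"}, {"b", "c"}]
candidates = [{"a"}, {"b"}, {"c"}]
jeps = jumping_emerging_patterns(candidates, survivors, nonsurvivors,
                                 lambda s, p: p <= s)
print(jeps)  # [{'a'}, {'b'}]: {'c'} occurs in both classes, so it is discarded
```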
Clinicians must rely on the correct interpretation of diagnostic data in a variety of clinical environments. A 2×2 table is an essential tool to present the data regarding epidemiological studies for diagnostic test evaluation (
Other multiple tests with which to improve diagnostic decision making in different clinical situations have also been suggested. For example, Glas et al [
2×2 Contingency table.
Test | Reference test: target disorder | Reference test: no target disorder
Positive | TPa | FPb
Negative | FNc | TNd
aTP: true positive.
bFP: false positive.
cFN: false negative.
dTN: true negative.
The DOR is used to measure the discriminative power of a diagnostic test: the ratio of the odds of a positive test result among the diseased to the odds of a positive test result among the nondiseased. The DOR is not prevalence dependent, and may be easier to understand, as it is a familiar epidemiological measure. It can be expressed in terms of sensitivity and specificity.
The value of a DOR ranges from 0 to infinity. To calculate the DOR, the potential problems involving division by 0 are solved by adding 0.5 to the selected cells in the diagnostic 2×2 table.
The further the odds ratio is from 1, the more likely it is that those with the disease are exposed when compared with those without the disease (risk factor). A value of 1 means that a test does not discriminate between patients with the disorder and those without it. Values lower than 1 suggest a reduced risk of disease associated with exposure (protection factor).
CIs for range estimates can be conventionally calculated as shown in the next equation:

ln(DOR) ± z × SE(ln DOR), with SE(ln DOR) = √(1/TP + 1/FP + 1/FN + 1/TN)

where z is the standard normal deviate for the chosen confidence level (1.96 for a 95% CI), and the limits of the interval for the DOR itself are obtained by exponentiating.
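The DOR, the 0.5 correction for empty cells, and the CI described above can be sketched as follows (a minimal illustration, not the authors' code; the function name is ours).

```python
import math

def dor_with_ci(tp, fp, fn, tn, z=1.96):
    """Diagnostic odds ratio with a log-scale confidence interval.

    Applies the 0.5 correction to every cell of the 2x2 table whenever
    any cell is zero, as described in the text, to avoid division by 0.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lo = math.exp(math.log(dor) - z * se_log)
    hi = math.exp(math.log(dor) + z * se_log)
    return dor, (lo, hi)

dor, (lo, hi) = dor_with_ci(90, 10, 20, 80)
print(dor)  # (90*80)/(10*20) = 36.0, a strong risk factor (DOR > 1)
```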
Li et al [
Several studies based on the nonoverlapping of the DOR have been performed. Toti et al [
The database contains 480 patient records, collected between 1992 and 2002. In this database, the temporal attributes that allow the monitoring and evaluation of the response to the treatment of patients are recorded once a day for 5 days. All attributes are continuous variables and represent the value accumulated during 24 hours. The registered variables are (1) the total of administered liquids, measured in cubic centimeters (cc), represented in the patterns as
We have removed from the database only those patients who died during the course of the study or those for whom it was not possible to estimate the duration of their hospital stay. After this cleansing, 465 patients remained, of whom 378 patients (81.3%) survived, 324 patients (69.7%) were male, and 201 patients (43.2%) had inhalation injuries.
Attribute summary.
Attribute | Minimum | Maximum | Mean | SD
Age (years) | 9 | 95 | 46.42 | 20.34 |
Weight (kg) | 25 | 120 | 71.05 | 10.77 |
Length of stay (days) | 3 | 162 | 25.02 | 24.24 |
Total burn surface area (%) | 1 | 90 | 31.28 | 20.16 |
Deep burn surface area (%) | 0 | 90 | 17.01 | 17.41 |
Simplified Acute Physiology Score | 6 | 58 | 20.67 | 9.49 |
We carried out the experiments by following the 4-step knowledge discovery process described in our previous paper [
In the first step, the preprocessing was carried out by employing 2 different discretization methods for the continuous attributes. One method was attribute discretization performed by an expert. This method provided the patterns with greater interpretability, because they are expressed in clinical language. The other method is the unsupervised correlation preserving discretization (UCPD), because it provided the best classification in comparison to several automatic discretization algorithms [
In the second step, we used the FaSPIP algorithm [
The best results were not produced with the lowest supports, which seems to imply that there is no overfitting.
The third step consisted of reducing the number of patterns found to select only those that would be relevant for the classification. If the support used in the previous step is low, the number of frequent patterns increases acutely: the pattern explosion phenomenon is one important disadvantage of using patterns as predictors for classifiers.
We decided to use a baseline experiment to compare it with our proposed methods. We therefore employed the frequency property (because it is frequently used to measure interestingness) to select discriminative patterns. To this end, we selected only JEPs that are not common in the subset of nonsurvivors and survivors, thus enabling us to remove common behavior or a patient’s evolution that is not discriminative.
Finally, the fourth step consisted of building a classification model with the constraint that it had to be interpretable. We wished to obtain a model with a small number of patterns that would be easy for the physician to interpret. In this case, we used a rule learner and a decision tree.
On the one hand, we used Repeated Incremental Pruning to Produce Error Reduction (RIPPER) as a rule learner. With this sequential covering algorithm, rules are learned one at a time, and each time a rule is learned, the tuples covered by that rule are removed. This process is repeated until there are no more training examples or until the quality of an obtained rule falls below a user-specified threshold. JRIP (the implementation of RIPPER in WEKA) is one of the best classification algorithms as regards combining human readability and accuracy [
On the other hand, we chose the J48 decision tree, WEKA’s implementation of the C4.5 algorithm. This employs a greedy technique that is a variant of ID3, which determines the most predictive attribute at each step and splits a node based on this attribute. Mohamed et al [
In all cases, we configured the classifiers with the minimum number of elements in each leaf set to 2%, and with the minimal weight of rule instances within a split also set to 2%. The accuracy, sensitivity, specificity, and AUC were calculated using a 10-fold cross validation.
Number of interesting patterns selected after mining on the subset of survivors and on the set of nonsurvivors for UCPDa and expert discretization
Discretization and support (%) | Survival + death initial patterns | Baseline JEPsb | Experiment 1, DORc (<.08, >16) | Experiment 1, DORc (<.04, >32) | Experiment 2, differential DOR (all) | Experiment 2, differential DOR (best) | Experiment 3, nonoverlapping DOR (all) | Experiment 3, nonoverlapping DOR (best) | Experiment 4, differential + nonoverlapping DOR (all) | Experiment 4, differential + nonoverlapping DOR (best)
Expert discretization
10 | 46,041 + 83,015 | 391 | 2065 | 750 | 2795 | 2359 | 858 | 746 | 236 | 198
8 | 88,084 + 241,866 | 4931 | 14,424 | 5798 | 10,655 | 8781 | 2195 | 1856 | 701 | 504
6 | 224,952 + 492,504 | 47,113 | 51,352 | 41,059 | 32,406 | 26,157 | 4545 | 3803 | 1556 | 1293
UCPD
16 | 238,337 + 49,947 | 2179 | 14,158 | 2766 | 2401 | 1990 | 1529 | 1415 | 325 | 272
14 | 396,238 + 68,654 | 7556 | 33,979 | 7483 | 4153 | 3465 | 2296 | 2052 | 487 | 411
12 | 647,943 + 137,546 | 22,940 | 65,564 | 16,272 | 9907 | 8173 | 6418 | 5228 | 1397 | 1212
aUCPD: unsupervised correlation preserving discretization.
bJEP: Jumping Emerging Pattern.
cDOR: diagnostic odds ratio.
The study was approved by the Ethics Committee of Hospital Universitario de Getafe (38/17, approved on 30/11/2017). This research was conducted using data obtained for clinical purposes. Informed consent was not required.
The results of the baseline experiment and the results of our 4 different proposals using the DOR are shown below. The number of patterns generated in the subset of survivors and in the set of nonsurvivors with different supports is shown in
In the discussion, we explore 3 aspects: classification performance, number and length of patterns selected, and classification interpretability.
Number (and percentage) of interesting patterns by length (from 2 to 10) for 8% expert discretization, selecting all the patterns when possible.
Pattern length | Baseline JEPsa (n=4931) | Experiment 1a, DORb (<0.08, >16) (n=14,424) | Experiment 1b, DOR (<0.04, >32) (n=5798) | Experiment 2, differential DOR (n=10,655) | Experiment 3, nonoverlapping DOR (n=2195) | Experiment 4, differential + nonoverlapping DOR (n=701)
2 | 0 (0) | 5 (0.0) | 0 (0) | 289 (2.7) | 76 (3.5) | 39 (5.6) |
3 | 41 (0.8) | 187 (1.3) | 49 (0.8) | 2063 (19.4) | 461 (21.0) | 198 (28.2) |
4 | 542 (11.0) | 1610 (11.2) | 552 (9.5) | 3912 (36.7) | 857 (39.0) | 299 (42.7) |
5 | 1377 (27.9) | 4176 (29.0) | 1545 (26.6) | 3004 (28.2) | 612 (27.9) | 140 (20.0) |
6 | 1518 (30.8) | 4811 (33.4) | 1960 (33.8) | 1155 (10.8) | 175 (8.0) | 23 (3.3) |
7 | 987 (20.0) | 2698 (18.7) | 1190 (20.5) | 212 (2) | 14 (0.6) | 2 (0.3) |
8 | 372 (7.5) | 785 (5.4) | 407 (7.0) | 20 (0.2) | 0 (0) | 0 (0) |
9 | 84 (1.7) | 139 (1.0) | 85 (1.5) | 0 (0) | 0 (0) | 0 (0) |
10 | 10 (0.2) | 13 (0.1) | 10 (0.2) | 0 (0) | 0 (0) | 0 (0) |
aJEP: Jumping Emerging Pattern.
bDOR: diagnostic odds ratio.
In the baseline experiment, we searched for discriminative patterns, one of the most important techniques in data mining [
Results of the baseline experiment with JEPs.a,b
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCc
JRIP, expert discretization
10 | 7 | 33 | 4.71 | 100.00 | 43.68 | 89.46 | 0.709
8 | 15 | 79 | 5.27 | | 58.62 | |
6 | 16 | 80 | 5 | 100.00 | 44.83 | 89.68 | 0.720
JRIP, UCPDd
16 | 8 | 29 | 3.63 | 100.00 | 52.87 | 91.18 | 0.763
14 | | | | | | |
12 | 12 | 48 | 4 | 100.00 | 59.77 | 92.47 | 0.796
J48, expert discretization
10 | 8 | 37 | 4.63 | 100.00 | 40.23 | 88.82 | 0.704
8 | | | | | | |
6 | 18 | 87 | 4.83 | 100.00 | 44.83 | 89.68 | 0.729
J48, UCPDd
16 | 7 | 34 | 4.86 | 100.00 | 47.13 | 90.11 | 0.711
14 | | | | | | |
12 | 12 | 51 | 4.25 | 100.00 | 62.07 | 92.90 | 0.833
aJEP: Jumping Emerging Pattern.
bHighest specificity is in italics.
cAUC: area under the receiver operating characteristic curve.
dUCPD: unsupervised correlation preserving discretization.
As will be noted, the JEPs make it possible to achieve a sensitivity of 100%, but the specificity has lower values. This is due to the fact that the data set is imbalanced with a majority of survivors, and the patterns cover only those patients that will survive or those that will die. It is necessary to achieve a higher specificity to predict the nonsurvivors, so the highest specificity is in italics in
The expert discretization is preferred by clinicians, because it is based principally on reference range values. But note that it is possible to improve the results by using an automatic discretization, such as UCPD (see [
When using expert discretization, the highest specificity (58.62%) is obtained using the JRIP classifier with 8% support.
This classifier requires 15 patterns, with a total length of 79 items, with the average length per pattern being 5.27 items. As an example, we show a pattern found in the subset of nonsurvivors. For each variable, the subindex
There is also an interesting pattern that appears in all the 5 experiments for the subset of nonsurvivors:
It would, therefore, be possible to interpret this pattern as “a patient will die if his/her diuresis is very high on one day, and during the next 2 days there is a low income with a very high diuresis the following day.”
In this experiment, we calculated the DOR for each pattern as shown in the “Methods” section. In clinical language, a DOR>1 implies that exposure to the pattern is a risk factor. Conversely, a DOR<1 implies that the pattern is a protection factor, and selecting a DOR threshold with a very low value therefore selects patterns associated with a reduced risk of disease. A value of DOR=1 signifies that the pattern does not discriminate between patients with the disorder and those without it.
The selection of patterns with either a high value or a low value for the DOR will therefore generate discriminative patterns. It is necessary to establish a manual threshold for the value of the DOR to choose the patterns. We have carried out 2 experiments. In the first experiment (1a), we selected the patterns with a DOR value higher than 16 or lower than 0.08, and in the second experiment (1b), we used stricter values, double or half those thresholds, that is, a DOR value higher than 32 or lower than 0.04. This allowed us to reduce the number of patterns (
Results of Experiment 1a using the DORa (<0.08, >16).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
JRIP, expert discretization
10 | 13 | 67 | 5.15 | 90.21 | 62.07 | 84.95 | 0.766
8 | 18 | 89 | 4.94 | 88.62 | 58.62 | 83.01 | 0.759
6 | 16 | 80 | 5 | 91.80 | 47.13 | 83.44 | 0.702
JRIP, UCPDc
16 | 8 | 29 | 3.62 | 100.00 | 52.87 | 91.18 | 0.763
14 | 11 | 43 | 3.91 | 100.00 | 62.07 | 92.90 | 0.787
12 | 12 | 48 | 4 | 100.00 | 59.77 | 92.47 | 0.796
J48, expert discretization
10 | 10 | 46 | 4.6 | 91.27 | 55.17 | 84.52 | 0.716
8 | 12 | 58 | 4.83 | 93.12 | 54.02 | 85.81 | 0.720
6 | 14 | 67 | 4.79 | 94.44 | 52.87 | 86.67 | 0.706
J48, UCPDc
16 | 8 | 33 | 4.13 | 100.00 | 41.38 | 89.03 | 0.716
14 | 12 | 47 | 3.92 | 100.00 | 62.07 | 92.90 | 0.828
12 | 12 | 46 | 3.83 | 100.00 | 59.77 | 92.47 | 0.816
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
Results of Experiment 1b using the DORa (<0.04, >32).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
JRIP, expert discretization
10 | 10 | 49 | 4.9 | 93.65 | 50.57 | 85.59 | 0.710
8 | 17 | 84 | 4.94 | 94.18 | 55.17 | 86.88 | 0.767
6 | 16 | 80 | 5 | 95.50 | 37.93 | 84.73 | 0.656
JRIP, UCPDc
16 | 8 | 29 | 3.62 | 100.00 | 52.87 | 91.18 | 0.763
14 | 11 | 43 | 3.91 | 100.00 | 62.07 | 92.90 | 0.787
12 | 12 | 48 | 4 | 100.00 | 59.77 | 92.47 | 0.796
J48, expert discretization
10 | 11 | 50 | 4.55 | 97.09 | 44.83 | 87.31 | 0.704
8 | 14 | 67 | 4.79 | 95.50 | 62.07 | 89.25 | 0.801
6 | 16 | 87 | 5.44 | 98.15 | 48.28 | 88.82 | 0.715
J48, UCPDc
16 | 7 | 26 | 3.71 | 100.00 | 47.13 | 90.11 | 0.727
14 | 11 | 45 | 4.09 | 100.00 | 60.92 | 92.69 | 0.792
12 | 14 | 55 | 3.93 | 100.00 | 60.92 | 92.69 | 0.822
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
If we choose expert discretization, with a JRIP classifier and the highest values of the DOR (
This pattern, with a DOR value of 72.30, classifies a group of patients that will die, although we know that there will be minimal errors (1 patient survives).
We selected the pattern
A sequential pattern
For a better interpretation of the DOR, we calculated the risk factor probability
In our experiment we, therefore, selected the patterns with 2 conditions: (1) when the difference between the risk factor probability
We additionally used 2 alternative strategies to select patterns: it is possible to maintain all the extensions with a difference in the DOR value that is higher than a threshold or to explore the extensions with a beam search, in which case we select only the most promising extension with the highest DOR difference among all extensions.
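The beam search variant can be sketched as follows. The mapping from a DOR to a "risk factor probability" is assumed here to be p = DOR/(DOR + 1), so that DOR = 1 (no discrimination) maps to 0.5; the paper's exact definition may differ, and `dor_of`, the pattern names, and the threshold value are all illustrative.

```python
def risk_probability(dor):
    """Map a DOR to a probability-like score in [0, 1); assumed form
    p = DOR / (DOR + 1), so a nondiscriminative DOR of 1 gives 0.5."""
    return dor / (dor + 1.0)

def beam_select(pattern, extensions, dor_of, min_diff=0.1):
    """Beam search step: keep only the extension whose risk probability
    differs most from the parent pattern's, if that gain exceeds `min_diff`."""
    parent = risk_probability(dor_of(pattern))
    best, best_diff = None, min_diff
    for ext in extensions:
        diff = abs(risk_probability(dor_of(ext)) - parent)
        if diff > best_diff:
            best, best_diff = ext, diff
    return best

dors = {"p": 2.0, "p+a": 8.0, "p+b": 2.2}  # hypothetical DOR values
print(beam_select("p", ["p+a", "p+b"], dors.get))  # 'p+a' (largest gain)
```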
With regard to the number of patterns selected (
If we study the length of the patterns (
However, the results obtained with UCPD are similar, and with the JRIP classification and beam search we require the lowest number of patterns and items of all the experiments: only 5 patterns with a total length of 20 items are needed to attain 56.32% specificity.
Results of Experiment 2a using the differential DORa (keeping all pattern extensions).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 28 | 100 | 3.57 | 89.42 | 49.43 | 81.94 | 0.662
J48, expert, 8 | 21 | 89 | 4.24 | 86.51 | 62.07 | 81.94 | 0.773
J48, expert, 6 | 18 | 84 | 4.67 | 96.30 | 44.83 | 86.67 | 0.694
J48, UCPDc, 16 | 21 | 81 | 3.86 | 93.65 | 49.43 | 85.38 | 0.677
J48, UCPD, 14 | 15 | 56 | 3.73 | 94.97 | 56.32 | 87.74 | 0.759
J48, UCPD, 12 | 12 | 52 | 4.33 | 100.00 | 58.62 | 92.26 | 0.788
JRIP, expert, 10 | 4 | 13 | 3.25 | 90.74 | 31.03 | 79.57 | 0.620
JRIP, expert, 8 | 8 | 25 | 3.13 | 86.77 | 29.89 | 76.13 | 0.600
JRIP, expert, 6 | 3 | 7 | 2.33 | 89.68 | 29.89 | 78.49 | 0.594
JRIP, UCPD, 16 | 10 | 37 | 3.70 | 92.86 | 24.14 | 80.00 | 0.594
JRIP, UCPD, 14 | 11 | 41 | 3.73 | 94.18 | 33.33 | 82.80 | 0.674
JRIP, UCPD, 12 | 8 | 26 | 3.25 | 96.03 | 62.07 | 89.68 | 0.831
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
Results of Experiment 2b using the differential DORa (using beam search for best pattern extension).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 20 | 73 | 3.65 | 89.15 | 44.83 | 80.86 | 0.642
J48, expert, 8 | 21 | 88 | 4.19 | 87.57 | 62.07 | 82.80 | 0.783
J48, expert, 6 | 18 | 84 | 4.67 | 97.35 | 43.68 | 87.31 | 0.710
J48, UCPDc, 16 | 21 | 81 | 3.86 | 93.65 | 49.43 | 85.38 | 0.675
J48, UCPD, 14 | 15 | 56 | 3.73 | 94.71 | 56.32 | 87.53 | 0.760
J48, UCPD, 12 | 12 | 52 | 4.33 | 100.00 | 57.47 | 92.04 | 0.764
JRIP, expert, 10 | 18 | 59 | 3.28 | 89.15 | 27.59 | 77.63 | 0.582
JRIP, expert, 8 | 5 | 17 | 3.4 | 90.48 | 21.84 | 77.63 | 0.569
JRIP, expert, 6 | 8 | 29 | 3.62 | 91.53 | 31.03 | 80.22 | 0.623
JRIP, UCPD, 16 | 9 | 31 | 3.44 | 91.01 | 28.74 | 79.35 | 0.618
JRIP, UCPD, 14 | 19 | 71 | 3.74 | 94.18 | 34.48 | 83.01 | 0.683
JRIP, UCPD, 12 | 5 | 20 | 4 | 97.09 | 56.32 | 89.46 | 0.767
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
The J48 classification tree used to classify with expert discretization and 8% support, using beam search for the best pattern extension, attains 62.07% specificity and requires 21 patterns, with an average length of 4.19 items per pattern. This is the lowest average of all the experiments carried out using the J48 classifier with expert discretization. Among these 21 patterns are 2 patterns with only 2 items, which are used to classify the survivors:
The first pattern,
Furthermore, we have discovered a pattern with which to classify the nonsurvivors that can also be found in the J48 tree classifiers of the subsequent experiments, and that was not selected in the classification algorithms used in the previous experiments:
This pattern has a DOR value of DOR(
In this experiment, we have selected patterns based on the nonoverlapping of 95% CI of the DOR (as stated in [
We also obtain a reduced number of patterns with respect to the previous experiment (
In general, the classification performance is similar to that of the previous experiments, although with the JRIP classification using expert discretization, we obtain better results when selecting only the best child.
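The CI-based pruning of this experiment can be sketched as follows, assuming the standard log-method interval ln DOR ± 1.96 · sqrt(1/TP + 1/FN + 1/FP + 1/TN) for the 95% CI (the confusion counts below are hypothetical, not taken from the data set):

```python
import math

def dor_ci(tp, fn, fp, tn, z=1.96):
    """95% CI of the DOR via the log method; a 0.5 continuity
    correction is applied when any cell is zero."""
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (x + 0.5 for x in (tp, fn, fp, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return math.exp(log_dor - z * se), math.exp(log_dor + z * se)

def nonoverlapping(ci_a, ci_b):
    """True when the two intervals are disjoint, ie, the extension is
    significantly more (or less) discriminative than its parent."""
    return ci_a[1] < ci_b[0] or ci_b[1] < ci_a[0]

parent_ci = dor_ci(15, 72, 10, 368)  # moderately discriminative pattern
child_ci = dor_ci(25, 62, 1, 377)    # much more discriminative extension
nonoverlapping(parent_ci, child_ci)  # True: the extension is kept
```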
The J48 classification tree used to classify with expert discretization and 8% support, using beam search for the best pattern extension, obtains 58.62% specificity and a higher sensitivity than the previous experiment; 16 patterns are required.
One of the shortest patterns that we find in the J48 classification tree is:
This pattern has a DOR value of 27.93 in the interval (6.71, 116.26). Its super-pattern
Results of Experiment 3a using the nonoverlapping CI of DORa (keeping all pattern extensions).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 10 | 41 | 4.1 | 93.92 | 48.28 | 85.38 | 0.721
J48, expert, 8 | 16 | 77 | 4.81 | 94.97 | 58.62 | 88.17 | 0.741
J48, expert, 6 | 18 | 90 | 5 | 96.56 | 56.32 | 89.03 | 0.768
J48, UCPDc, 16 | 18 | 70 | 3.89 | 97.35 | 57.47 | 89.89 | 0.794
J48, UCPD, 14 | 11 | 43 | 3.91 | 99.74 | 62.07 | 92.69 | 0.803
J48, UCPD, 12 | 11 | 47 | 4.27 | 100.00 | 57.47 | 92.04 | 0.786
JRIP, expert, 10 | 11 | 37 | 3.36 | 93.65 | 41.38 | 83.87 | 0.682
JRIP, expert, 8 | 13 | 60 | 4.62 | 91.80 | 33.33 | 80.86 | 0.641
JRIP, expert, 6 | 7 | 30 | 4.29 | 96.56 | 42.53 | 86.45 | 0.722
JRIP, UCPD, 16 | 6 | 23 | 3.83 | 96.30 | 41.38 | 86.02 | 0.727
JRIP, UCPD, 14 | 9 | 33 | 3.67 | 98.94 | 56.32 | 90.97 | 0.803
JRIP, UCPD, 12 | 14 | 58 | 4.14 | 96.30 | 60.92 | 89.68 | 0.793
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
Results of Experiment 3b using the nonoverlapping CI of DORa (using beam search for best pattern extension).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 10 | 41 | 4.1 | 94.18 | 51.72 | 86.24 | 0.742
J48, expert, 8 | 16 | 77 | 4.81 | 94.71 | 58.62 | 87.96 | 0.739
J48, expert, 6 | 18 | 90 | 5 | 96.83 | 55.17 | 89.03 | 0.758
J48, UCPDc, 16 | 16 | 68 | 4.25 | 96.30 | 55.17 | 88.60 | 0.798
J48, UCPD, 14 | 13 | 51 | 3.92 | 100.00 | 62.07 | 92.90 | 0.795
J48, UCPD, 12 | 11 | 45 | 4.09 | 100.00 | 60.92 | 92.69 | 0.812
JRIP, expert, 10 | 6 | 20 | 3.33 | 94.44 | 48.28 | 85.81 | 0.735
JRIP, expert, 8 | 16 | 62 | 3.88 | 95.24 | 41.38 | 85.16 | 0.700
JRIP, expert, 6 | 12 | 51 | 4.25 | 95.77 | 52.87 | 87.74 | 0.747
JRIP, UCPD, 16 | 16 | 66 | 4.13 | 95.50 | 40.23 | 85.16 | 0.695
JRIP, UCPD, 14 | 12 | 44 | 3.67 | 97.88 | 54.02 | 89.68 | 0.747
JRIP, UCPD, 12 | 15 | 60 | 4 | 99.21 | 55.17 | 90.97 | 0.788
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
The last proposal consists of using the previous 2 approaches together (Experiments 2 and 3): we prune the patterns based both on the overlapping of the DOR CIs and on the difference between the risk (or protection) factor probabilities. In both cases, we maintain the same thresholds.
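A sketch of this combined criterion, assuming the standard odds-to-probability conversion p = DOR / (1 + DOR) for the risk factor probability (the paper's exact formula may differ; the counts and threshold below are hypothetical):

```python
import math

def dor_and_ci(tp, fn, fp, tn, z=1.96):
    """Return the DOR and its 95% CI (log method, with a 0.5
    continuity correction when any cell is zero)."""
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (x + 0.5 for x in (tp, fn, fp, tn))
    d = (tp * tn) / (fp * fn)
    se = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    return d, (math.exp(math.log(d) - z * se), math.exp(math.log(d) + z * se))

def keep_extension(parent_counts, child_counts, prob_threshold=0.05):
    """Keep an extension only if (1) its risk factor probability differs
    from the parent's by more than `prob_threshold` AND (2) the 95% CIs
    of the two DORs do not overlap."""
    d_p, ci_p = dor_and_ci(*parent_counts)
    d_c, ci_c = dor_and_ci(*child_counts)
    prob_gain = abs(d_c / (1 + d_c) - d_p / (1 + d_p))  # odds -> probability
    disjoint = ci_p[1] < ci_c[0] or ci_c[1] < ci_p[0]
    return prob_gain > prob_threshold and disjoint

keep_extension((15, 72, 10, 368), (25, 62, 1, 377))  # True: both criteria hold
```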
In this experiment we substantially reduced the number of patterns generated (
Note that when the number of patterns is too low, we do not usually achieve a good classification result. In this experiment, however, with 8% support, expert discretization, and the J48 classifier, only 504 patterns were generated, and we obtained a result similar to the previous ones using only 13 patterns in the classifier, with a sensitivity of 96.30% and a specificity of 57.47% when using beam search for the best pattern extension (
The classification performance, as is shown in
Let us now analyze the pattern that is selected in this experiment and in all the previous experiments:
Results of Experiment 4b using the differential DORa and the nonoverlapping CI (using beam search for best pattern extension).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 10 | 35 | 3.5 | 95.50 | 41.38 | 85.38 | 0.694
J48, expert, 8 | 13 | 55 | 4.23 | 96.30 | 57.47 | 89.03 | 0.770
J48, expert, 6 | 16 | 75 | 4.69 | 98.41 | 50.57 | 89.46 | 0.739
J48, UCPDc, 16 | 20 | 74 | 3.7 | 93.92 | 50.57 | 85.81 | 0.758
J48, UCPD, 14 | 7 | 28 | 4 | 96.83 | 58.62 | 89.68 | 0.808
J48, UCPD, 12 | 12 | 50 | 4.17 | 100.00 | 59.77 | 92.47 | 0.812
JRIP, expert, 10 | 6 | 21 | 3.5 | 92.59 | 25.29 | 80.00 | 0.597
JRIP, expert, 8 | 14 | 43 | 3.07 | 91.80 | 29.89 | 80.22 | 0.614
JRIP, expert, 6 | 15 | 57 | 3.8 | 92.59 | 29.89 | 80.86 | 0.626
JRIP, UCPD, 16 | 10 | 37 | 3.7 | 96.83 | 35.63 | 85.38 | 0.671
JRIP, UCPD, 14 | 10 | 36 | 3.6 | 98.68 | 32.18 | 86.24 | 0.673
JRIP, UCPD, 12 | 15 | 59 | 3.93 | 98.68 | 50.57 | 89.68 | 0.759
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
Results of Experiment 4a using the differential DORa and the nonoverlapping CI (keeping all pattern extensions).
Classifier, discretization, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCb
J48, expert, 10 | 13 | 42 | 3.23 | 94.18 | 44.83 | 84.95 | 0.672
J48, expert, 8 | 13 | 55 | 4.23 | 95.50 | 55.17 | 87.96 | 0.743
J48, expert, 6 | 17 | 78 | 4.59 | 97.88 | 47.13 | 88.39 | 0.711
J48, UCPDc, 16 | 20 | 74 | 3.7 | 94.97 | 50.57 | 86.67 | 0.761
J48, UCPD, 14 | 7 | 28 | 4 | 98.41 | 58.62 | 90.97 | 0.804
J48, UCPD, 12 | 12 | 50 | 4.17 | 100.00 | 65.52 | 93.55 | 0.820
JRIP, expert, 10 | 4 | 13 | 3.25 | 93.12 | 29.89 | 81.29 | 0.622
JRIP, expert, 8 | 12 | 40 | 3.33 | 94.44 | 29.89 | 82.37 | 0.625
JRIP, expert, 6 | 20 | 74 | 3.7 | 91.80 | 39.08 | 81.94 | 0.668
JRIP, UCPD, 16 | 7 | 24 | 3.43 | 94.44 | 27.59 | 81.94 | 0.632
JRIP, UCPD, 14 | 6 | 23 | 3.83 | 97.35 | 32.18 | 85.16 | 0.653
JRIP, UCPD, 12 | 16 | 63 | 3.94 | 98.68 | 59.77 | 91.40 | 0.795
aDOR: diagnostic odds ratio.
bAUC: area under the receiver operating characteristic curve.
cUCPD: unsupervised correlation preserving discretization.
We have proposed different ways of using the DOR as a single indicator of diagnostic performance, carrying out a classification of the survival of patients in an ICBU by modeling their daily evolution with multivariate sequential patterns. We now discuss the factors to be considered when seeking a trade-off between interpretability and classification performance.
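As context for this discussion, the kind of model being interpreted can be sketched as follows: a multivariate sequential pattern matches a patient when its itemsets occur, in order, as subsets of the patient's daily records (this is a generic subsequence check, not the authors' code; the variable names and levels are hypothetical):

```python
def contains(pattern, days):
    """pattern: list of itemsets (sets of (variable, level) items);
    days: one itemset per recorded day, in chronological order.
    Each pattern itemset must be a subset of some day, in order."""
    i = 0
    for day in days:
        if i < len(pattern) and pattern[i] <= day:
            i += 1
    return i == len(pattern)

# Three days of (hypothetical) discretized observations for one patient:
patient = [
    {("pH", "normal"), ("creatinine", "high")},
    {("pH", "low"), ("creatinine", "high")},
    {("pH", "low"), ("creatinine", "normal")},
]
pattern = [{("pH", "normal")}, {("pH", "low"), ("creatinine", "normal")}]
contains(pattern, patient)  # True: days 1 and 3 match the itemsets in order
```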
In relation to interpretability, one model is more interpretable than another if a human can comprehend its decisions more easily. In this sense, the presented method has 3 advantages: (1) the readability and interpretability of the content of the patterns, (2) the reduced length of the patterns, and (3) the small set of significant patterns selected to build the classifier.
Of these 3 advantages, the most direct one for clinicians is that the patterns themselves are expressed in a language they understand, so they do not have to spend time mapping what they read in a pattern onto their usual way of working. Moreover, the patterns provide not only static information about the patient at admission time, as is usual, but also the evolution of the patient. For example, a pattern like
For the second factor, if we study the length of the patterns eventually selected (
Overall, these shorter patterns produce worse classification results when we use expert discretization with a JRIP classifier. On the one hand, expert discretization generally performs worse because it is not based on statistical or information-theoretic criteria specifically designed for classification purposes; on the other hand, JRIP provides the best performance in terms of the complexity of the tree structure, while J48 produces a higher classification accuracy (as the authors explain in [
With respect to the third factor, a model that achieves a good classification result with a low number of patterns (and consequently of items) is preferable. In
The baseline experiment (using JEPs) and Experiment 3 (nonoverlapping CIs of the DOR) do not depend on a threshold value, and we also obtain a reasonably small number of patterns. Nevertheless, the threshold value established in the other experiments (Experiments 1, 2, and 4) changes the number of patterns eventually selected. We have therefore made 2 variations in Experiment 1 (using DOR), by restricting the minimum DOR value that is necessary to select patterns (
When we work with imbalanced data, as is usual in medical domains, it is necessary to highlight the correct classification of rarely occurring cases when compared with other general cases. It is consequently necessary to check the highest specificity to choose the best classification result, which in our experiments is produced by using UCPD automatic discretization with JEPs as a classical frequency-based discriminative measure. JEPs have usually been used to build accurate classifiers, while UCPD exploits the underlying correlation structure in the data so as to obtain the discrete intervals and ensure that the inherent correlations are preserved.
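A worked example of why specificity is emphasized under this imbalance (the 378/87 survivor/nonsurvivor split is implied by the 465 patients and the reported percentages, eg, 62.07% = 54/87; the "everyone survives" classifier is a hypothetical baseline, not from the paper):

```python
def metrics(tp, fn, fp, tn):
    """Survivors are the positive class, as in the tables above."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# A trivial classifier predicting "survives" for all 465 patients
# looks accurate yet detects no nonsurvivor at all:
metrics(378, 0, 87, 0)  # sensitivity 1.0, specificity 0.0, accuracy ~0.813
```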
Moreover, we have generally shown that this automatic discretization produces better classifications than expert discretization. However, clinicians prefer to use a reference-range discretization for laboratory and physiologic values. This signifies that, for example, they prefer to use the interval (7.35, 7.45) as a normal value for
Comparison of experimental results with the highest specificity using expert discretization.
Experiment, classifier, and pattern support (%) | Number of patterns | Total length (items) | Average length (items/pattern) | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUCa
Baseline (JEPb), J48, 8 | 17 | 84 | 4.94 | 100.00 | 56.32 | 91.83 | 0.782
Baseline (JEP), JRIP, 8 | 15 | 79 | 5.27 | 100.00 | 58.62 | 92.26 | 0.777
Experiment 1 (DORc), J48, 8 | 17 | 84 | 4.94 | 94.18 | 55.17 | 86.88 | 0.767
Experiment 1 (DOR), JRIP, 8 | 14 | 67 | 4.79 | 95.50 | 62.07 | 89.25 | 0.801
Experiment 2 (differential DOR), J48, 8 | 21 | 88 | 4.19 | 87.57 | 62.07 | 82.80 | 0.783
Experiment 2 (differential DOR), JRIP, 6 | 8 | 29 | 3.62 | 91.53 | 31.03 | 80.22 | 0.623
Experiment 3 (nonoverlapping CI of DOR), J48, 8 | 16 | 77 | 4.81 | 94.71 | 58.62 | 87.96 | 0.739
Experiment 3 (nonoverlapping CI of DOR), JRIP, 6 | 12 | 51 | 4.25 | 95.77 | 52.87 | 87.74 | 0.747
Experiment 4 (differential and nonoverlapping DOR), J48, 8 | 13 | 55 | 4.23 | 96.30 | 57.47 | 89.03 | 0.770
Experiment 4 (differential and nonoverlapping DOR), JRIP, 6 | 15 | 57 | 3.8 | 92.59 | 29.89 | 80.86 | 0.626
aAUC: area under the receiver operating characteristic curve.
bJEP: Jumping Emerging Pattern.
cDOR: diagnostic odds ratio.
If we therefore consider only expert discretization, the best classification result is achieved in Experiment 1b (using DOR), with a specificity of 62.07% and an AUC value of 0.801 (
The classification model that is easiest to comprehend and has high specificity requires only 5 patterns (with a total length of 20 items) and is achieved with UCPD and a JRIP classifier in Experiment 2b (differential DOR) using beam search for the best pattern. It obtains a specificity of 56.32% and an AUC value of 0.767 (
If we take into consideration only expert discretization, with a J48 classifier we need at least 13 patterns (with a total length of 55 items) to obtain a specificity of 57.47% and an AUC value of 0.770 (
In this research, we have developed a model to predict the survival of patients by considering 2 aspects: the relevance of the temporal evolution of the patients as part of the model and the interpretability of the model for physicians. We have achieved these aspects by (1) using multivariate sequential patterns in classification models that can be easily understood by experts, (2) using a reduced number of patterns, and (3) using a language that is well known by clinicians with regard to both the discretization of values and the measures of interest of the patterns.
The main contribution of this work is the proposal and evaluation of 4 ways in which to employ DOR to reduce the number of patterns and to select only the most discriminative ones, because pattern explosion is a principal problem in pattern-based classifiers. We have compared the 4 proposals with a baseline experiment using JEPs. This is, to the best of our knowledge, the first time that some of these approaches have been proposed and compared in scientific literature.
With regard to the number of patterns, the best option is to use both a differential and a nonoverlapping DOR (as in Experiment 4). As we increased the restrictions applied, we significantly reduced the number of patterns, thus attaining more general, simpler, and more interesting patterns. With expert discretization and 10% support, there are, for example, only 198 patterns (using beam search for the best pattern extension), and, very interestingly, these patterns cover all the patients who did not survive. Although it is beyond the scope of this paper, it would be interesting for a clinician to carry out a manual interpretation of these patterns.
This experiment provides the second contribution of this paper, because we have shown that beam search with the DOR could be used in the algorithm to extract sequential patterns for classification rather than using a traditional algorithm for sequential pattern mining.
Despite the efforts made to reduce the amount and the length of patterns in Experiments 2-4, in which we have compared each pattern with its extensions, the classifier built is less accurate. The shorter patterns are easier to understand, more general, and describe the population well, but simultaneously cover survivors and nonsurvivors.
With regard to accuracy, the best classification results are, not surprisingly, produced using JEPs along with UCPD. JEPs have been extensively used to build accurate classifiers and produce better results when we use a discretization based on statistical or information theory that is specifically intended for classification. Nevertheless, we require interpretable patterns that are easy for the clinician to understand, and must therefore use a reference range discretization created by an expert. If we consider only expert discretization, the highest specificity is attained using only the DOR to select the patterns (as in Experiment 1;
With regard to interpretability, we can observe that discretization has a great impact on classification performance at the expense of interpretability, because more and longer patterns are required. With UCPD, we require only 5 patterns (with a total length of 20 items) to build a rule set and to obtain 56.32% specificity when we use the differential DOR (see Experiment 2). With expert discretization, we need at least 13 patterns (with a total length of 55 items) to obtain a specificity of 57.47% using both a differential and a nonoverlapping DOR to select the patterns (see Experiment 4).
Our future research will consist of exploring domain-based measures to evaluate clinical patterns or to reduce the number of patterns in postprocessing to an even greater extent. In this respect, we intend to investigate more specific properties, such as closed, maximal, or minimal patterns as a trade-off between improving classification performance and not losing information or representativeness of the population. The researchers additionally intend to explore other measures and search strategies that could be integrated into new algorithms.
AUC: area under the receiver operating characteristic curve
CBA: Classification Based on Associations
CBS: Classify-By-Sequence
CMAR: Classification Based on Multiple Association Rules
CPAR: Classification Based on Predictive Association Rules
DOR: diagnostic odds ratio
EP: emerging pattern
FN: false negative
FP: false positive
ICBU: intensive care burn unit
JEP: Jumping Emerging Pattern
MMAC: Multi-class, Multi-label Associative Classification
RIPPER: Repeated Incremental Pruning to Produce Error Reduction
SOFA: Sequential Organ Failure Assessment
TN: true negative
TP: true positive
UCPD: unsupervised correlation preserving discretization
This work was partially funded by the SITSUS project (Ref: RTI2018-094832-B-I00) and the CONFAINCE project (Ref: PID2021-122194OB-I00), both supported by the Spanish Ministry of Science and Innovation and the Spanish Agency for Research (MCIN/AEI/10.13039/501100011033) and, as appropriate, by the ERDF "A way of making Europe".
None declared. This work does not relate to the employment of AG at Amazon.