Abstract
Background: Chronic pain is a complex condition that affects more than a quarter of people worldwide. The development and progression of chronic pain are unique to each individual due to the contribution of interacting biological, psychological, and social factors. The subjective nature of the experience of chronic pain can make its clinical assessment and prognosis challenging. Personalized digital health apps, such as Manage My Pain (MMP), are popular pain self-tracking tools that can also be leveraged by clinicians to support patients. Recent advances in machine learning technologies open an opportunity to use data collected in pain apps to make predictions about a patient’s prognosis.
Objective: This study applies machine learning methods using real-world user data from the MMP app to predict clinically significant improvements in pain-related outcomes among patients at the Toronto General Hospital Transitional Pain Service.
Methods: Information entered into the MMP app by 160 Transitional Pain Service patients over a 1-month period, including profile information, pain records, daily reflections, and clinical questionnaire responses, was used to extract 245 relevant variables, referred to as features, for use in a machine learning model. The machine learning model was developed using logistic regression with recursive feature elimination to predict clinically significant improvements in pain interference, assessed by the PROMIS Pain Interference 8a v1.0 questionnaire. The model was tuned and the important features were selected using 10-fold cross-validation. Leave-one-out cross-validation was used to test the model’s performance.
Results: The model predicted patient improvement in pain interference with 79% accuracy and an area under the receiver operating characteristic curve of 0.82. It showed balanced class accuracies between improved and nonimproved patients, with a sensitivity of 0.76 and a specificity of 0.82. Feature importance analysis indicated that all MMP app data, not just clinical questionnaire responses, were key to classifying patient improvement.
Conclusions: This study demonstrates that data from a digital health app can be integrated with clinical questionnaire responses in a machine learning model to effectively predict which chronic pain patients will show clinically significant improvement. The findings emphasize the potential of machine learning methods in real-world clinical settings to improve personalized treatment plans and patient outcomes.
doi:10.2196/67178
Keywords
Introduction
Chronic pain affects more than a quarter of people worldwide and carries substantial personal and economic burdens [ ]. It is a complex condition that can be challenging to assess clinically due to the individualized and subjective nature of symptoms [ , ]. The development and progression of chronic pain are unique to each patient and difficult to predict at the individual level due to the contribution of interacting biological, psychological, and social factors [ , - ]. Physiological markers associated with chronic pain can provide useful insights about a patient’s condition [ ]; however, clinical assessment of the subjective magnitude of pain severity and its impact on function and quality of life relies primarily on self-report measures [ , ]. Self-tracking of symptoms, medications, and daily activities is a popular approach to gaining insights about symptom trends in chronic conditions [ , ], and digital apps have become a particularly useful tool to support self-tracking of pain and related symptoms [ - ]. With recent advances in machine learning technologies, digital symptom tracking opens an opportunity to evaluate the numerous factors that contribute to pain symptoms and functioning and to make predictions about an individual patient’s progress [ ]. This work examines whether information obtained from a pain tracking app can be used to accurately predict improvement in clinical pain-related outcomes in patients at a transitional pain clinic.

Manage My Pain (MMP) is a digital health app designed by ManagingLife with a patient-centric approach, aimed at helping patients and health care professionals measure, manage, and communicate pain, function, and medication use both at home and in clinical settings. The app has over 100,000 users worldwide and is available in 7 languages on both mobile and web platforms. MMP has been integrated into the Transitional Pain Service (TPS), a multidisciplinary pain clinic at Toronto General Hospital (TGH), to support symptom assessment and patient engagement with symptom tracking [ ]. The TPS at TGH is a pioneering clinic that treats patients during the transitional period when acute pain after a major surgical procedure is at risk of becoming chronic [ , ]. The clinic also treats complex patients with chronic pain to support them in medication management and opioid weaning [ ]. With this patient population, the clinic relies on regular symptom tracking to monitor patient progress and to intervene appropriately during critical phases of pain treatment. The MMP app provides a comprehensive digital platform where patients can fill out intake and follow-up questionnaires, track their symptoms, view symptom patterns and trends, and access educational resources about managing pain. Clinicians, in turn, can view each patient’s record in the app to gain insights into their ongoing pain and symptom patterns and to support patient-clinician communication and decision-making about treatment. The MMP app has been the standard of care at the TPS since May 2020. Evaluations of the app showed that it was acceptable to TPS patients [ ], and patients who used it reported significantly lower anxiety and pain catastrophizing scores [ ].

The aim of this study is to use machine learning techniques to predict clinically significant improvements in pain-related outcomes among TPS patients who used the MMP app. In previous studies, we used symptom tracking, profile, and usage data from users of the MMP app to predict variability in reported pain levels over time (ie, pain volatility) [ , ]. In this study, we focused on a population of users from the TPS clinic and incorporated data from clinical questionnaires alongside symptom tracking, profile, and usage data to predict clinical outcomes related to pain interference. Pain interference refers to the impact pain has on engagement in daily activities and participation [ ]. It is related to the perceived severity of pain [ ] and is considered a key aspect of the pain experience and a primary outcome in many clinical trials [ ]. It is an informative measure from a clinical treatment and pain management standpoint because it focuses on patients’ daily functioning rather than pain intensity itself [ - ]. It is therefore a valuable measure to consider in predicting meaningful improvement in patient outcomes. We hypothesize that a machine learning model using data entered by TPS patients into the MMP app during the first 30 days of use can accurately predict subsequent pain interference scores reported within the next 5-month period.

Methods
Ethical Considerations
The study protocol was reviewed and approved by the McGill University Research Ethics Board (File Number 23-12-016). Informed consent to the use of data was obtained when users registered an account through the MMP app and agreed to its End User Licence Agreement [ ]. Privacy and confidentiality of user data were protected in accordance with ManagingLife’s Privacy Policy [ ]. All user data used in the study dataset were deidentified. Users did not receive compensation for the use of their data in the study.

Manage My Pain App
The MMP app is available on Android, iOS, and the web. The main features of the app are the daily reflection and the pain record, which allows in-the-moment logging of pain experiences. Initially, users interact with the MMP app by responding to a daily push notification that prompts them to reflect on their day and rate it by completing a daily reflection at a default time of 8 PM. Users have the option to customize or disable the timing and frequency of notifications. The daily reflection, based on Acceptance and Commitment Therapy principles known for their efficacy in chronic pain management [ , ], asks users, “What did you do that mattered to you today?” Users then rate their day on a visual analog scale from 0 (Nothing) to 10 (Everything I wanted) and can record any meaningful activities. Following the daily reflection, users are invited to complete a pain record, in which they assess their current pain level by responding to “How is your pain right now?” and rating it on a scale from 0 (No pain) to 10 (Worst ever). Additionally, users can detail up to 7 aspects of their pain episodes, including body location, symptoms, characteristics, aggravating factors, medications, interventions, and environment. There is no limit to the number of pain records a user can enter in a day, and these entries are independent of completing a daily reflection. Users can also enhance their MMP app profile with personal information about their medications, health conditions, and demographics such as age, height, weight, and gender. The screen interface of each of the different features of the MMP app is shown in .
TPS Clinic Patient Flow
The TPS treats patients who are at risk of developing chronic pain after surgery and patients with chronic pain who have complex needs. Patients are typically referred to the TPS during the perioperative period, before or after a surgical procedure, or they are referred through the Toronto Academic Pain Medicine Institute [ ]. Prior to their initial assessment by a TPS physician, patients are asked to fill out a set of clinical intake questionnaires (see below for details). Patients are invited to access the questionnaires on the MMP app, and staff at the clinic support patients in accessing the app and registering an MMP account. Patients are also informed about the other features available in the app to track their pain and daily activities, view symptom patterns and trends, and learn about managing pain in the Pain Guide. The TPS clinical team works with patients to address their needs through a multidisciplinary approach to pain care that includes medical treatment alongside psychological care and physical rehabilitation. Patients are asked to fill out follow-up clinical questionnaires using the MMP app at subsequent visits to the clinic. Patients are followed by the TPS for up to 6 months, at which point they are typically discharged to primary care.

Clinic Questionnaires
The TPS clinic uses a battery of standard clinical questionnaires to evaluate patients’ pain-related symptoms at intake and to assess treatment progress over time. The following questionnaires are regularly assigned using the MMP app at both intake and follow-up visits and are common across all TPS patients: the Numeric Rating Scale (NRS) to rate pain severity on an 11-point scale [ ]; the PROMIS Pain Interference 8a v1.0 (PROMIS PI) to measure the extent to which pain hinders an individual’s engagement with physical, mental, cognitive, emotional, recreational, and social activities [ ]; the Pain Catastrophizing Scale (PCS) to assess catastrophic thinking related to pain [ ]; the Generalized Anxiety Disorder-7 (GAD-7) as a measure of general anxiety [ ]; and the Patient Health Questionnaire-9 (PHQ-9) as a measure for screening, diagnosing, monitoring, and measuring the severity of depression [ ]. Additional questionnaires are assigned as needed to meet the specific needs of each patient.

Study Dataset
A total of 780 TGH TPS patients entered responses to clinic questionnaires and pain experience records in the MMP app between May 2020 and March 2024, producing 14,127 questionnaire responses and 30,033 pain experience records. For this study, we selected users who had at least 1 PROMIS PI questionnaire response in a predictor period and at least 1 PROMIS PI questionnaire response in an outcome period, resulting in 160 users. The predictor period was defined as the first 30 days of MMP app use, and the outcome period was set between 30 days and 6 months (183 d) from the first app use. There were 680 users who recorded a response in the predictor period, 182 users who recorded a response in the outcome period, and 160 users who had a response in both the predictor and outcome periods. Therefore, 160 users were selected for this study.
We aimed to predict users who improved based on changes in their final score on the PROMIS PI questionnaire between the predictor and outcome periods. The final score on the PROMIS PI questionnaire is generated by converting the total raw score into a T-score for each participant using a web-based calculation tool [ ]. The first questionnaire response from the predictor period and the first subsequent recorded response in the outcome period were used in the prediction model.

Clinical questionnaires have a research-backed minimal clinically important difference (MCID) indicator that guides clinicians in evaluating the progress of symptoms [ ]. The MCID was used in this study as a marker of patient improvement. The MCID for the PROMIS PI questionnaire is 2 [ ]. Patients who showed an improvement in PROMIS PI questionnaire T-scores greater than the MCID of 2 were classified as improved.
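As a minimal sketch of this labeling step (column names are hypothetical; lower PROMIS PI T-scores indicate less pain interference, so improvement corresponds to a decrease):

```python
import pandas as pd

MCID = 2  # minimal clinically important difference for PROMIS PI T-scores

def label_improvement(scores: pd.DataFrame) -> pd.Series:
    """Label patients as improved (1) or not improved (0).

    Assumes one row per patient with hypothetical columns:
    't_predictor' -- first T-score in the predictor period (days 0-30)
    't_outcome'   -- first T-score in the outcome period (days 30-183)
    """
    decrease = scores["t_predictor"] - scores["t_outcome"]
    return (decrease > MCID).astype(int)
```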
Data Preprocessing and Feature Extraction

Overview
The data were preprocessed to remove any anomalous MMP app records (eg, missing values where entries were expected) and to convert any categorical questionnaire responses into a numerical format.
A total of 245 relevant variables, referred to as features in the field of machine learning, were extracted from the available data. Of these, 194 features were extracted from the patients’ MMP app user profiles, pain records, daily reflections, and app use records, and the remaining 51 features were extracted from the patients’ questionnaire responses. Instances with missing values were imputed using the mean value of each feature, and each feature was subsequently z-score normalized, as sketched below. The features were divided into 8 categories, described in the following subsections.
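The following is a minimal sketch of this preprocessing, assuming the features have already been assembled into a numeric matrix (the data here are simulated for illustration):

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simulated stand-in for the real feature matrix: 160 patients x 245 features,
# with roughly 10% of entries missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(160, 245))
X[rng.random(X.shape) < 0.1] = np.nan

preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # per-feature mean imputation
    ("scale", StandardScaler()),                 # z-score normalization
])
X_preprocessed = preprocess.fit_transform(X)
```

In a cross-validated pipeline, these transforms would be fit on each training split only, so that imputation means and scaling statistics do not leak from the test data.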
Demographics
Demographics (6 features) consisted of gender, age, age category (unknown, 0 to 19, 20 to 29, 30 to 39, 40 to 49, 50 to 59, 60 to 69, and 70+ years), height, weight, and body mass index, entered by users into their profiles.
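Two of these features can be derived from the others; a brief illustration (function names are hypothetical):

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index computed from profile weight and height."""
    return weight_kg / height_m ** 2

def age_category(age: int | None) -> str:
    """Bin age into the categories used as a demographic feature."""
    if age is None:
        return "unknown"
    if age < 20:
        return "0 to 19"
    if age >= 70:
        return "70+"
    lo = (age // 10) * 10
    return f"{lo} to {lo + 9}"
```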
Medications
Medications (10 features) consisted of the total number of medications reported by users in their profile and 9 binary features for the specific medications reported, including opioids, tricyclic antidepressants, anticonvulsants, cannabinoids, serotonin-norepinephrine reuptake inhibitors, nonsteroidal anti-inflammatory drugs, acetaminophen, metamizole, and benzodiazepines.
Health Conditions
Health Conditions (9 features) consisted of the number of health conditions reported by users in their profile and a categorical version of that count (unknown, 1 condition, 2 conditions, 3 conditions, and more than 3 conditions). Five binary features indicated whether or not a user reported one of the most commonly observed health conditions: fibromyalgia, headaches or migraines, back pain, arthritis, or depression/anxiety. One feature represented an indication of neuropathic pain, determined by the app and characterized by reports of sensations of pins and needles or tingling, burning, numbness, or electric shocks, or an aggravating factor of light touch or clothing. The final feature was based on the presence of mental health issues, as indicated by reports of anxiety, depression, negative mood, or stress.
Pain Record Statistics
Pain Record Statistics (11 features) consisted of the mean and SD of pain severity ratings; the mean and SD of the absolute values of changes between consecutive severity ratings; whether the average pain rating was mild (<4), moderate (≥4 to ≤7), or severe (>7); the mean and SD of pain; the number of pain records in the predictor period; the slope of the trendline of the severity scores; and the absolute value of that slope.
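As an illustration of the trend-related features, assuming each pain record carries a day offset and a 0-10 severity rating (the sample values here are made up):

```python
import numpy as np

days = np.array([1, 3, 7, 12, 20, 28])    # day of each pain record
severity = np.array([8, 7, 7, 6, 5, 5])   # 0-10 severity ratings

slope = np.polyfit(days, severity, deg=1)[0]          # trendline slope
abs_slope = abs(slope)                                # magnitude of the trend
mean_abs_change = np.mean(np.abs(np.diff(severity)))  # mean absolute change
                                                      # between consecutive ratings
```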
Pain Descriptors
Pain Descriptors (127 features) consisted of descriptors of the pain experience entered into the app, including body locations (32 features), symptoms (21 features), characteristics (21 features), environment (8 features), aggravating factors (15 features), effective factors (15 features), and ineffective factors (15 features).
Daily Reflections
Daily Reflections (21 features) consisted of the mean and SD of the daily reflection score, the number of daily reflections in the predictor period, and descriptors of meaningful activities that contributed to the daily reflection rating.
App Usage
App Usage (10 features) consisted of the number of completed sections in the user profile, the number of days with a pain record, the percentage of descriptor elements completed in pain records, whether users were referred to the app via an institution, provider, or payer, and the average hour, day, week, or month of their pain records. Note that in the current dataset, all users were referred to the app via an institution.
Questionnaires
Questionnaires (51 features) consisted of responses and outcome scores on the 5 clinic questionnaires regularly assigned to all TPS clinic patients at both intake and follow-up visits: (1) PROMIS PI, 8 questions and 3 outcome scores; (2) NRS, 4 questions and 4 outcome scores; (3) PHQ-9, 9 questions and 1 outcome score; (4) PCS, 13 questions and 1 outcome score; and (5) GAD-7, 7 questions and 1 outcome score. Please refer to the multimedia appendix to see the questions included in each questionnaire.

Prediction Model
Given the relatively small sample size, we used binary logistic regression, a simple model that has been shown to perform well with limited datasets [ ]. Binary logistic regression is a method for binary classification that models the probability of each class as a function of the input variables [ ]. It operates by fitting a logistic function to the data. The output of logistic regression lies between 0 and 1, representing the probability that a given input point belongs to the class labeled as 1 (the improved class). This is achieved by calculating a linear combination of the input features passed through the logistic function. The coefficients are learned by minimizing a cost function. To keep the coefficients constrained to a reasonable size and to reduce overfitting, we added an L2 regularization term to the cost function, which penalizes large coefficient values. Additionally, we weighted each class inversely proportional to its frequency to correct for class imbalance. The model was implemented with the sklearn library [ ] in Python, using the liblinear solver [ ], as it performs well on smaller datasets.

We then performed feature selection to identify the significant features and improve the model’s generalizability by reducing overfitting to the training data. We implemented recursive feature elimination (RFE) with cross-validation [ ]. RFE identifies the features influencing a model’s prediction by systematically eliminating the least important ones. First, the training dataset is split into 10 train/test subsets using 10-fold cross-validation, ensuring that the feature elimination process is validated across different subsets of the data. Starting with all features, a logistic regression model is trained and its performance is evaluated on the validation set. Features are ranked by the magnitude of their coefficients, and the least important feature is removed from the set. This process is repeated iteratively, each time removing the least important feature, and the model’s performance is assessed with cross-validation at each step. The optimal number of features is determined by the point where the model’s cross-validation performance is highest. We used the area under the receiver operating characteristic curve (AUC) as the performance metric. We then implemented a second 10-fold cross-validation to optimize the regularization strength in the logistic regression model using only the selected features.
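A minimal sketch of these model-building steps with scikit-learn (the regularization grid and the `X_train`/`y_train` variables are illustrative, not taken from the study):

```python
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Logistic regression as described above: L2 penalty, balanced class weights,
# and the liblinear solver.
base_model = LogisticRegression(
    penalty="l2",
    class_weight="balanced",
    solver="liblinear",
)

# Recursive feature elimination with 10-fold cross-validation, scored by AUC;
# the least important feature (smallest |coefficient|) is dropped at each step.
selector = RFECV(base_model, step=1, cv=10, scoring="roc_auc")
selector.fit(X_train, y_train)  # X_train, y_train: preprocessed training data
X_selected = selector.transform(X_train)

# Second 10-fold cross-validation to tune the regularization strength C
# on the selected features only (grid values are hypothetical).
tuner = GridSearchCV(
    base_model, param_grid={"C": [0.01, 0.1, 1, 10]}, cv=10, scoring="roc_auc"
)
tuner.fit(X_selected, y_train)
model = tuner.best_estimator_
```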
Model Evaluation

The model algorithm was validated using leave-one-out cross-validation to assess how well the model would perform in practice on unseen data. In this approach, data from one MMP user are used as the test set while data from the remaining users are used as the training set. This process is repeated iteratively such that each subject is used exactly once as the test instance. In each iteration, the entire proposed algorithm is rerun, with only the data in the training set used to train the model. This method allows the model to be evaluated on every possible training and test set combination, providing a comprehensive measure of how well it performs across the entire dataset. We evaluated the model using 4 metrics: overall accuracy, accuracy on the improved class (sensitivity), accuracy on the not-improved class (specificity), and the AUC. Because both the dataset and the approach are novel, there are currently no standards to compare against. Therefore, we also evaluated 3 other standard machine learning models in place of the logistic regression model to compare performance: AdaBoost, random forest, and linear support vector machine (SVM). AdaBoost is an ensemble learning algorithm that iteratively combines weak classifiers to improve overall accuracy by focusing on misclassified instances [ ]. Random forest, another ensemble method, builds multiple decision trees and combines their predictions, offering robustness to overfitting and the ability to capture nonlinear relationships in the data [ ]. The linear SVM is a classification algorithm that identifies the optimal hyperplane to separate data points into distinct classes, making it effective for high-dimensional datasets [ ].
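A simplified sketch of the leave-one-out evaluation described above; `fit_pipeline` is a hypothetical helper that reruns the full procedure (preprocessing, RFE, and tuning) on the given training split, and `X`, `y` hold the features and improvement labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import LeaveOneOut

y_true, y_pred, y_prob = [], [], []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = fit_pipeline(X[train_idx], y[train_idx])  # refit everything per fold
    y_true.append(y[test_idx][0])
    y_pred.append(model.predict(X[test_idx])[0])
    y_prob.append(model.predict_proba(X[test_idx])[0, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)          # accuracy on the improved class
specificity = tn / (tn + fp)          # accuracy on the not-improved class
accuracy = (tp + tn) / len(y_true)    # overall accuracy
auc = roc_auc_score(y_true, y_prob)   # area under the ROC curve
```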
Feature Importance Estimation

Logistic regression is valued for its simplicity and interpretability [ ]. Feature importance within this model is estimated by analyzing the coefficients. Larger absolute values of these coefficients indicate a stronger impact on the outcome, with positive coefficients increasing the log odds of the outcome (making it more likely) and negative coefficients decreasing the log odds (making it less likely). In our approach, the RFE algorithm selects varying numbers of features in each training fold. We first identified the features that were consistently selected across all training folds. We then calculated the average coefficient for each of these features across all training folds and ranked them by the absolute values of these averages. This method highlights the features that the model relies on to predict the likelihood of a patient’s improvement.
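A sketch of this aggregation, assuming the leave-one-out loop above also stored, for each fold, the RFE selection mask and a coefficient vector expanded to the full feature length (zeros for eliminated features):

```python
import numpy as np

masks = np.asarray(masks)   # (n_folds, n_features) booleans from RFE
coefs = np.asarray(coefs)   # (n_folds, n_features) per-fold coefficients

always_selected = masks.all(axis=0)                 # chosen in every fold
mean_coef = coefs[:, always_selected].mean(axis=0)  # average across folds
order = np.argsort(-np.abs(mean_coef))              # rank by |mean coefficient|
```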
Results

Sample and Dataset Characteristics
The characteristics of the sample of TPS patient users of the MMP app who were included in the study, and of their records in the MMP app, are shown in the table below. Using an MCID of 2, 72 of the 160 patients (45%) showed improvement on the PROMIS PI questionnaire between the predictor and the outcome period. An overview of the PROMIS PI questionnaire response characteristics is shown in the subsequent table.

Category and variable | Value |
Users | |
| 160 |
| 40.4 (16.6) |
| 129 |
| 14 |
| 19 |
| 127 |
| 3.5 (3.4) |
| 90 |
| 3.7 (3.3) |
| 58 |
MMP records | |
| 124 |
| 4009 |
| 123 |
| 2820 |
| 75 |
| 1189 |
| 5.1 (2.6) |
| 4.0 (2.7) |
| 18,545 |
| 13,541 |
| 0.3 |
| 5004 |
| 411 (438) |
aMMP: Manage My Pain app.
bValues derived from total app use for each user.
Variable | Value |
Total responses in predictor period | 199 |
Total responses in outcome period | 236 |
Days between first response and outcome response, mean (SD) | 74.8 (39.4) |
T-score in predictor period, mean (SD) | 65.6 (6.7) |
T-score in outcome period, mean (SD) | 63.1 (7.4) |
T-score change, mean (SD) | −2.5 (6.5) |
aRange: 22‐183 days.
Prediction Results
The model was evaluated on 160 subjects using leave-one-out cross-validation. When evaluated without RFE, including all features in the model, the accuracy was 74%. Including RFE improved performance by reducing overfitting. Using fewer features, the model had an overall accuracy of 79%, with even performance across the improved and not improved classes. The accuracy for subjects who improved (sensitivity) was 76%, and for subjects who did not improve (specificity) it was 82%. The confusion matrix of the prediction results is shown below. On average, the algorithm selected 88 (SD 13) of the 245 features. The receiver operating characteristic curve is shown in . The AUC was 0.82.

| Predicted improved class | Predicted not improved class | Total actual |
Actual improved class | 55 | 17 | 72 |
Actual not improved class | 16 | 72 | 88 |
Total predicted | 71 | 89 | 160 |

Comparison
Three additional models were evaluated to compare against our chosen approach: AdaBoost, random forest, and linear SVM. To compare directly against the logistic regression model, the same prediction pipeline was used. First, the data were preprocessed, and then each model was trained with RFE to optimize the feature set. While all features were input into the pipeline, each model arrived at different RFE feature selections across the training folds. Default hyperparameters were used for each of the models. The table below shows the results for each model. The logistic regression model consistently outperformed the other models across all metrics.

Model | Not-improved class accuracy | Improved class accuracy | Accuracy | AUC
Logistic regression | 0.82 | 0.76 | 0.79 | 0.82 |
Linear SVM | 0.74 | 0.67 | 0.71 | 0.75 |
Random forest | 0.81 | 0.35 | 0.60 | 0.62 |
AdaBoost | 0.67 | 0.57 | 0.62 | 0.61 |
aAUC: area under the receiver operating characteristic curve.
bSVM: support vector machine.
Feature Importance
In our study, a varying number of features were selected across the 160 training folds, with an average of 88 of 245 features chosen. Notably, 37 features consistently appeared across all folds. The table below displays the mean coefficients from the logistic regression model for these 37 features, indicating their importance in prediction.

Rank | Feature | Mean coefficient
1 | PCS question 1: “I worry all the time about whether the pain will end” | 1.685 |
2 | Body locations: Legs | 1.440 |
3 | PROMIS PI question 3: “How much did pain interfere with your ability to participate in social activities?” | 1.437 |
4 | PROMIS PI question 5: “How much did pain interfere with the things you usually do for fun?” | 1.286 |
5 | PROMIS PI question 1: “How much did pain interfere with your day-to-day activities?” | 1.285 |
6 | Meaningful activities: Exercised | 1.253 |
7 | Aggravating factors: Stress | −1.252 |
8 | PCS question 2: “I feel I can’t go on” | −1.123 |
9 | NRS pain question 3: “Please rate your pain by marking the box beside the number that tells how much pain you have right now” | −1.108 |
10 | NRS pain score: Right now | −1.108 |
11 | Effective factors: Massage | 1.040 |
12 | Ineffective factors: Talking to someone | 1.027 |
13 | Meaningful activities: “Connected with supportive people online or through text” | −1.000 |
14 | Pain trend | −0.994 |
15 | Number of conditions (Categories) | 0.983 |
16 | Locations: Neck | −0.979 |
17 | PCS question 12: “There’s nothing I can do to reduce the intensity of the pain” | 0.978 |
18 | Locations: Head (Right) | −0.977 |
19 | PHQ-9 question 7: “Trouble concentrating on things, such as reading the newspaper or watching television” | −0.892 |
20 | Locations: Joints | 0.891 |
21 | Environment: Home | 0.881 |
22 | Medications: Tricyclic antidepressants | −0.880 |
23 | GAD-7 question 4: “Trouble relaxing” | −0.860 |
24 | PCS question 5: “I feel I can’t stand it anymore” | −0.825 |
25 | Symptoms: Insomnia | −0.825 |
26 | Meaningful activities: Errands outside the home | −0.815 |
27 | Pain characteristic: Custom entry | 0.790 |
28 | Height | 0.789 |
29 | Percent of completed descriptor elements | 0.772 |
30 | Mean time of day of pain record entry | 0.746 |
31 | Effective factors: Rest | 0.736 |
32 | Medications: Acetaminophen | −0.714 |
33 | Conditions: Back pain | 0.713 |
34 | Environment: Work | −0.665 |
35 | Effective factors: Ice | 0.660 |
36 | PROMIS PI score: Raw score | 0.476 |
37 | Symptoms: Dizziness | −0.410 |
aPCS: Pain Catastrophizing Scale.
bPROMIS PI: PROMIS Pain Interference 8a v1.0.
cNRS: Numeric Rating Scale.
dPHQ-9: Patient Health Questionnaire-9.
eGAD-7: Generalized Anxiety Disorder-7.
Questionnaire Comparison
While this work focused on optimizing a model for predicting improved scores on the PROMIS PI measure, we repeated the same algorithm for each of the questionnaires included in the dataset. As shown in the table below, responses on the PROMIS PI had the best performance, with an AUC of 0.82. Responses on the GAD-7 and PHQ-9 showed moderate predictive ability, with AUCs of 0.64 and 0.74, respectively. Responses on the GAD-7, PHQ-9, and PCS demonstrated lower rates of improvement and produced a more imbalanced dataset, possibly leading to decreased algorithm performance. Responses on the NRS for pain severity, on the other hand, indicated a similar percentage of improved MMP users as the PROMIS PI. Despite the balanced dataset, the predictive model did not achieve any meaningful results, with an AUC of 0.51, nearly equivalent to random chance.

Questionnaire | MCID | Improved/total users, n/N (%) | Accuracy | Not-improved class accuracy | Improved class accuracy | Balanced accuracy | AUC
PROMIS PI | 2 [ ] | 72/160 (45%) | 0.79 | 0.82 | 0.76 | 0.79 | 0.82 |
GAD-7 | 4 [ ] | 52/180 (29%) | 0.70 | 0.70 | 0.71 | 0.70 | 0.64 |
PHQ-9 | 5 [ ] | 34/169 (20%) | 0.72 | 0.76 | 0.59 | 0.67 | 0.74 |
PCS | 38% [ ] | 47/173 (27%) | 0.60 | 0.68 | 0.38 | 0.53 | 0.57 |
NRS | 1 [ ] | 89/173 (46%) | 0.53 | 0.55 | 0.51 | 0.53 | 0.51 |
aMCID: minimal clinically important difference.
bAUC: area under the receiver operating characteristic curve.
cPROMIS PI: PROMIS Pain Interference 8a v1.0.
dGAD-7: Generalized Anxiety Disorder-7.
ePHQ-9: Patient Health Questionnaire-9.
fPCS: Pain Catastrophizing Scale.
gNRS: Numeric Rating Scale.
Discussion
Principal Findings
This study examined whether a machine learning model could predict clinical outcomes related to pain in a population of TPS clinic patients who used the MMP digital health solution to track symptoms and answer clinic questionnaires. Using MMP app data entered by patients over 30 days, a logistic regression model predicted clinically significant improvement in pain interference, measured by the PROMIS PI questionnaire, with 79% accuracy. The model showed balanced accuracy between the improved and not improved classes, with a sensitivity of 0.76 and a specificity of 0.82.
Analysis of the features used in the model showed that all MMP app data, not just questionnaire responses, were relevant in predicting patient improvement. This finding underscores the critical role of all types of data in the algorithm’s efficacy. Features like exercise showed a positive correlation with improved outcomes, while stress was negatively correlated, aligning with clinical expectations. However, many top-ranked features lacked such clinical clarity. This is not unexpected as the machine learning model integrates all features collectively to make predictions, preventing the isolation of individual feature impacts. Some features might be correlated with other variables that influence improvement, and they should not be interpreted independently in this type of prediction approach. Despite these complexities, our results affirm the value of leveraging extensive datasets, allowing the model to identify influential factors beyond traditional assumptions and providing a robust statistical foundation to determine the factors that predict improvement.
Predicting the development and progression of a chronic pain condition has important clinical implications. Currently, clinicians rely on known risk factors implicated in the development and progression of chronic pain conditions [ - ] to inform clinical decision-making regarding treatment and pain management. However, the individualized nature of chronic pain and the interacting contributions of physical, emotional, and social factors impose considerable challenges in accurately predicting patient outcomes [ , ]. Recent efforts have applied machine learning to large datasets to demonstrate that individualized pain risk scores can be determined from a set of biopsychosocial factors [ ]. However, applying this approach in the clinical context is limited by the availability of relevant data for a specific patient population. This study bridges this gap by demonstrating that a pain-tracking app can be used in a real-world clinical setting to gather relevant data in a short period of time and effectively predict clinically significant outcomes related to pain. Information from the MMP app can be used by clinicians alongside traditional approaches to patient assessment to more effectively guide critical decision-making around medication management, such as tapering opioids, and the allocation of finite clinic resources to patients with the greatest need. It is important to note that the approach presented here is not intended to identify specific predictors of improvement, but rather to help clinicians evaluate which patients are more or less likely to improve so they can prioritize health care resources accordingly.
The findings presented here are limited to a small sample of a specific patient population who chose to use the MMP app, which may have introduced a degree of selection bias into the dataset used for prediction modeling. Additionally, the dataset had many missing data points. Missing data frequently occur in real-world self-reported data sources, such as the one used here [ ]. To replace missing values in our dataset, we relied on mean imputation. However, the missing data in the current dataset and the mean imputation approach may have contributed to unrecognized bias in the prediction model and affected the study outcome. Finally, only one clinical outcome was considered in the prediction model, and further work is needed to identify how other clinically relevant outcomes can be incorporated into a more comprehensive prediction tool. Further refinement is also needed to increase the accuracy of the prediction model. The next steps in this ongoing work focus on integrating additional data from electronic patient records and facilitating greater engagement from patients with the MMP app. The current real-world use of the app was sufficient to achieve useful predictive insights, despite some caveats in their interpretation. Additional efforts to facilitate user engagement, data completion, and integration will provide a richer dataset for the prediction algorithm and help improve its predictive capacity.

Conclusion
This study builds on a growing body of work showing the capacity of pain apps like MMP not only to provide retrospective insights on symptom trends, but also to serve as a clinical outcome prediction tool. Effectively predicting the progression of pain has the potential to improve clinical decision-making and personalized prevention and treatment of chronic pain. The findings of this study demonstrate that existing digital solutions like the MMP app offer a feasible approach to integrating patients’ self-tracking and clinical data in a machine learning algorithm to develop accurate prediction models that can be used in a real-world clinical setting.
Acknowledgments
JS is supported by Mitacs and the MedTech Talent Accelerator. JK was supported by a Tier 1 Canadian Institutes of Health Research Canada Research Chair in Health Psychology at York University. HC is supported by a Merit award from the University of Toronto, Department of Anesthesia.
Data Availability
The datasets generated or analyzed during this study are not publicly available, in accordance with ManagingLife’s Privacy Policy. ManagingLife limits data access to its academic collaborators who have executed a data licensing agreement and undergone the company’s security and privacy training.
Authors' Contributions
JS, TJ, JK, HC, and QAR conceptualized the overall study design and methodology. JS, TJ, HL-R, and QAR contributed to data curation and formal data analysis. JS and AML wrote the original draft of the manuscript. All authors reviewed, edited, and approved the final manuscript for submission.
Conflicts of Interest
TJ is the founder and CEO of ManagingLife, Inc. JS is an independent contractor for ManagingLife. JK and HC are unpaid members of the ManagingLife Advisory Board, providing guidance on the product and the company’s research initiatives. All other authors declare no conflicts of interest.
Clinic questionnaires.
DOCX File, 24 KB

References
- Cohen SP, Vase L, Hooten WM. Chronic pain: an update on burden, best practices, and new advances. Lancet. May 29, 2021;397(10289):2082-2097. [CrossRef] [Medline]
- Treede RD, Rief W, Barke A, et al. Chronic pain as a symptom or a disease: the IASP classification of chronic pain for the International Classification of Diseases (ICD-11). PAIN. Jan 2019;160(1):19-27. [CrossRef] [Medline]
- Wideman TH, Edwards RR, Walton DM, Martel MO, Hudon A, Seminowicz DA. The multimodal assessment model of pain. Clin J Pain. 2019;35(3):212-221. [CrossRef]
- Fillingim RB. Individual differences in pain: understanding the mosaic that makes pain personal. PAIN. Apr 2017;158 Suppl 1(Suppl 1):S11-S18. [CrossRef] [Medline]
- Tanguay-Sabourin C, Fillingim M, Guglietti GV, et al. A prognostic risk score for development and spread of chronic pain. Nat Med. Jul 2023;29(7):1821-1831. [CrossRef] [Medline]
- Katz J, Pagé MG, Weinrib A, Clarke H. Identification of risk and protective factors in the transition from acute to chronic post surgical pain. In: Clinical Pain Management. John Wiley & Sons, Ltd; 2022:50-59. [CrossRef] ISBN: 978-1-119-70117-0
- Edwards RR, Dworkin RH, Sullivan MD, Turk DC, Wasan AD. The role of psychosocial processes in the development and maintenance of chronic pain. J Pain. Sep 2016;17(9 Suppl):T70-T92. [CrossRef] [Medline]
- Fillingim RB, Loeser JD, Baron R, Edwards RR. Assessment of chronic pain: domains, methods, and mechanisms. J Pain. Sep 2016;17(9 Suppl):T10-T20. [CrossRef] [Medline]
- Breivik H, Borchgrevink PC, Allen SM, et al. Assessment of pain. Br J Anaesth. Jul 2008;101(1):17-24. [CrossRef] [Medline]
- Lomborg S, Frandsen K. Self-tracking as communication. Information, Communication & Society. Jul 2, 2016;19(7):1015-1027. [CrossRef]
- Feng S, Mäntymäki M, Dhir A, Salmela H. How self-tracking and the quantified self promote health and well-being: systematic review. J Med Internet Res. Sep 21, 2021;23(9):e25171. [CrossRef] [Medline]
- Janevic MR, Murnane E, Fillingim RB, Kerns RD, Reid MC. Mapping the design space of technology-based solutions for better chronic pain care: introducing the pain tech landscape. Psychosom Med. Sep 1, 2023;85(7):612-618. [CrossRef] [Medline]
- Weinrib A, Azam MA, Latman VV, Janmohamed T, Clarke H, Katz J. Manage my pain: a patient-driven mobile platform to prevent and manage chronic postsurgical pain. In: Novel Applications of Virtual Communities in Healthcare Settings. IGI Global; 2018:93-126. [CrossRef]
- Schroeder J, Chung CF, Epstein DA, et al. Examining self-tracking by people with migraine: goals, needs, and opportunities in a chronic health condition. 2018. Presented at: DIS ’18: Proceedings of the 2018 Designing Interactive Systems Conference; Jun 9-13, 2018:135-148; Hong Kong China. [CrossRef]
- Katz J, Weinrib AZ, Clarke H. Chronic postsurgical pain: from risk factor identification to multidisciplinary management at the Toronto General Hospital Transitional Pain Service. Can J Pain. 2019;3(2):49-58. [CrossRef] [Medline]
- Katz J, Weinrib A, Fashler SR, et al. The Toronto General Hospital Transitional Pain Service: development and implementation of a multidisciplinary program to prevent chronic postsurgical pain. J Pain Res. 2015;8:695-702. [CrossRef] [Medline]
- Clarke H, Azargive S, Montbriand J, et al. Opioid weaning and pain management in postsurgical patients at the Toronto General Hospital Transitional Pain Service. Can J Pain. 2018;2(1):236-247. [CrossRef] [Medline]
- Slepian PM, Peng M, Janmohamed T, et al. Engagement with Manage My Pain mobile health application among patients at the Transitional Pain Service. Digit HEALTH. 2020;6. [CrossRef] [Medline]
- Bhatia A, Kara J, Janmohamed T, et al. User engagement and clinical impact of the Manage My Pain app in patients with chronic pain: a real-world, multi-site trial. JMIR Mhealth Uhealth. Mar 4, 2021;9(3):e26528. [CrossRef] [Medline]
- Rahman QA, Janmohamed T, Pirbaglou M, et al. Defining and predicting pain volatility in users of the Manage My Pain app: analysis using data mining and machine learning methods. J Med Internet Res. Nov 15, 2018;20(11):e12001. [CrossRef] [Medline]
- Rahman QA, Janmohamed T, Clarke H, Ritvo P, Heffernan J, Katz J. Interpretability and class imbalance in prediction models for pain volatility in Manage My Pain app users: analysis using feature selection and majority voting methods. JMIR Med Inform. Nov 20, 2019;7(4):e15601. [CrossRef] [Medline]
- Amtmann D, Cook KF, Jensen MP, et al. Development of a PROMIS item bank to measure pain interference. PAIN. Jul 2010;150(1):173-182. [CrossRef] [Medline]
- Jensen MP, Tomé-Pires C, de la Vega R, Galán S, Solé E, Miró J. What determines whether a pain is rated as mild, moderate, or severe? the importance of pain beliefs and pain interference. Clin J Pain. May 2017;33(5):414-421. [CrossRef] [Medline]
- Askew RL, Cook KF, Revicki DA, Cella D, Amtmann D. Evidence from diverse clinical populations supported clinical validity of PROMIS pain interference and pain behavior. J Clin Epidemiol. May 2016;73:103-111. [CrossRef] [Medline]
- Miettinen T, Kautiainen H, Mäntyselkä P, Linton SJ, Kalso E. Pain interference type and level guide the assessment process in chronic pain: categorizing pain patients entering tertiary pain treatment with the brief pain inventory. PLoS ONE. 2019;14(8):e0221437. [CrossRef] [Medline]
- Wilson M. Integrating the concept of pain interference into pain management. Pain Manag Nurs. Jun 2014;15(2):499-505. [CrossRef] [Medline]
- Pelletier R, Bourbonnais D, Higgins J, Mireault M, Harris PG, Danino MA. Pain interference may be an important link between pain severity, impairment, and self-reported disability in participants with wrist/hand pain. J Hand Ther. 2020;33(4):562-570. [CrossRef] [Medline]
- End user licence agreement. Manage My Pain. URL: https://managemypainapp.com/eula [Accessed 2025-01-15]
- Privacy policy. Manage My Pain. URL: https://managemypainapp.com/privacy-policy [Accessed 2025-01-15]
- Hughes LS, Clark J, Colclough JA, Dale E, McMillan D. Acceptance and commitment therapy (ACT) for chronic pain: a systematic review and meta-analyses. Clin J Pain. Jun 2017;33(6):552-568. [CrossRef] [Medline]
- Ma TW, Yuen ASK, Yang Z. The efficacy of acceptance and commitment therapy for chronic pain: a systematic review and meta-analysis. Clin J Pain. Mar 1, 2023;39(3):147-157. [CrossRef] [Medline]
- Di Renna T, Burke E, Bhatia A, et al. Improving access to chronic pain care with central referral and triage: the 6-year findings from a single-entry model. Can J Pain. 2024;8(1):2297561. [CrossRef] [Medline]
- Sullivan MJL, Bishop SR, Pivik J. The pain catastrophizing scale: development and validation. Psychol Assess. 1995;7(4):524-532. [CrossRef]
- Spitzer RL, Kroenke K, Williams JBW, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. May 22, 2006;166(10):1092-1097. [CrossRef] [Medline]
- Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. Sep 2001;16(9):606-613. [CrossRef] [Medline]
- Jaeschke R, Singer J, Guyatt GH. Measurement of health status. ascertaining the minimal clinically important difference. Control Clin Trials. Dec 1989;10(4):407-415. [CrossRef] [Medline]
- Chen CX, Kroenke K, Stump TE, et al. Estimating minimally important differences for the PROMIS pain interference scales: results from 3 randomized clinical trials. Pain. Apr 2018;159(4):775-782. [CrossRef] [Medline]
- Hosmer Jr. DW, Lemeshow S, Sturdivant RX. Applied Logistic Regression. John Wiley & Sons, Ltd; 2013. [CrossRef] ISBN: 978-1-118-54838-7
- Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. Nov 1, 2011;12:2825-2830.
- Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ. LIBLINEAR: a library for large linear classification. J Mach Learn Res. Jun 1, 2008;9:1871-1874.
- Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Mach Learn. 2002;46(1/3):389-422. [CrossRef]
- Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci. Aug 1997;55(1):119-139. [CrossRef]
- Breiman L. Random forests. Mach Learn. 2001;45(1):5-32. [CrossRef]
- Suthaharan S. Support vector machine. In: Suthaharan S, editor. Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning. Springer US; 2016:207-235. [CrossRef] ISBN: 978-1-4899-7641-3
- Toussaint A, Hüsing P, Gumz A, et al. Sensitivity to change and minimal clinically important difference of the 7-item generalized anxiety disorder questionnaire (GAD-7). J Affect Disord. Mar 15, 2020;265:395-401. [CrossRef] [Medline]
- Löwe B, Kroenke K, Herzog W, Gräfe K. Measuring depression outcome with a brief self-report instrument: sensitivity to change of the patient health questionnaire (PHQ-9). J Affect Disord. Jul 2004;81(1):61-66. [CrossRef] [Medline]
- Scott W, Wideman TH, Sullivan MJL. Clinically meaningful scores on pain catastrophizing before and after multidisciplinary rehabilitation: a prospective study of individuals with subacute pain after whiplash injury. Clin J Pain. Mar 2014;30(3):183-190. [CrossRef] [Medline]
- Salaffi F, Stancati A, Silvestri CA, Ciapetti A, Grassi W. Minimal clinically important changes in chronic musculoskeletal pain intensity measured on a numerical rating scale. Eur J Pain. Aug 2004;8(4):283-291. [CrossRef] [Medline]
- Katz J, Seltzer Z. Transition from acute to chronic postsurgical pain: risk factors and protective factors. Expert Rev Neurother. May 2009;9(5):723-744. [CrossRef] [Medline]
- Rosenberger DC, Pogatzki-Zahn EM. Chronic post-surgical pain - update on incidence, risk factors and preventive treatment options. BJA Educ. May 2022;22(5):190-196. [CrossRef] [Medline]
- Hebert SV, Green MA, Mashaw SA, et al. Assessing risk factors and comorbidities in the treatment of chronic pain: a narrative review. Curr Pain Headache Rep. Jun 2024;28(6):525-534. [CrossRef] [Medline]
- Chou R, Shekelle P. Will this patient develop persistent disabling low back pain? JAMA. Apr 7, 2010;303(13):1295-1302. [CrossRef] [Medline]
- Lipton RB, Buse DC, Nahas SJ, et al. Risk factors for migraine disease progression: a narrative review for a patient-centered approach. J Neurol. Dec 2023;270(12):5692-5710. [CrossRef] [Medline]
- Stevans JM, Delitto A, Khoja SS, et al. Risk factors associated with transition from acute to chronic low back pain in US patients seeking primary care. JAMA Netw Open. Feb 1, 2021;4(2):e2037371. [CrossRef] [Medline]
- van Hecke O, Torrance N, Smith BH. Chronic pain epidemiology - where do lifestyle factors fit in? Br J Pain. Nov 2013;7(4):209-217. [CrossRef] [Medline]
- Liu F, Panagiotakos D. Real-world data: a brief review of the methods, applications, challenges and opportunities. BMC Med Res Methodol. Nov 5, 2022;22(1):287. [CrossRef] [Medline]
Abbreviations
AUC: area under the receiver operating characteristic curve |
GAD-7: Generalized Anxiety Disorder-7 scale |
MCID: minimal clinically important difference |
MMP: Manage My Pain |
NRS: Numeric Rating Scale |
PCS: Pain Catastrophizing Scale |
PHQ-9: Patient Health Questionnaire-9 |
PROMIS PI: PROMIS Pain Interference 8a v1.0 scale |
RFE: recursive feature elimination |
SVM: support vector machine |
TGH: Toronto General Hospital |
TPS: Transitional Pain Service |
Edited by Andrew Coristine; submitted 09.10.24; peer-reviewed by Edgar Ross, Robert Jamison; final revised version received 06.02.25; accepted 17.02.25; published 28.03.25.
Copyright© James Skoric, Anna M Lomanowska, Tahir Janmohamed, Heather Lumsden-Ruegg, Joel Katz, Hance Clarke, Quazi Abidur Rahman. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 28.3.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.