Published in Vol 10, No 8 (2022): August

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/37829.
Synergy Between Public and Private Health Care Organizations During COVID-19 on Twitter: Sentiment and Engagement Analysis Using Forecasting Models

Original Paper

Department of Computer Science, Lakehead University, Thunder Bay, ON, Canada

*these authors contributed equally

Corresponding Author:

Aditya Singhal, MSc

Department of Computer Science

Lakehead University

955 Oliver Rd

Thunder Bay, ON, P7B 5E1

Canada

Phone: 1 807 709 9571

Email: asinghal@lakeheadu.ca


Background: Social media platforms (SMPs) are frequently used by various pharmaceutical companies, public health agencies, and nongovernment organizations (NGOs) for communicating health concerns, new advancements, and potential outbreaks. Although the benefits of using them as a tool have been extensively discussed, the online activity of various health care organizations on SMPs during COVID-19 in terms of engagement and sentiment forecasting has not been thoroughly investigated.

Objective: The purpose of this research is to analyze the nature of information shared on Twitter, understand the public engagement generated on it, and forecast the sentiment score for various organizations.

Methods: Data were collected from the Twitter handles of 5 pharmaceutical companies, 10 US and Canadian public health agencies, and the World Health Organization (WHO) from January 1, 2017, to December 31, 2021. A total of 181,469 tweets were divided into 2 phases for the analysis, before COVID-19 and during COVID-19, based on the confirmation of the first COVID-19 community transmission case in North America on February 26, 2020. We conducted content analysis to generate health-related topics using natural language processing (NLP)-based topic-modeling techniques, analyzed public engagement on Twitter, and performed sentiment forecasting using 16 univariate moving-average and machine learning (ML) models to understand the correlation between public opinion and tweet contents.

Results: We used the topics modeled with nonnegative matrix factorization (NMF) from the tweets authored by the selected health care organizations (c_umass=–3.6530 and –3.7944 before and during COVID-19, respectively). The topics were chronic diseases, health research, community health care, medical trials, COVID-19, vaccination, nutrition and well-being, and mental health. In terms of user impact, WHO (user impact=4171.24) had the highest impact overall, followed by public health agencies, the Centers for Disease Control and Prevention (CDC; user impact=2895.87), and the National Institutes of Health (NIH; user impact=891.06). Among pharmaceutical companies, Pfizer’s user impact was the highest at 97.79. Furthermore, for sentiment forecasting, autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) models performed best on the majority of the subsets of data (divided as per the health care organization and period), with the mean absolute error (MAE) between 0.027 and 0.084, the mean square error (MSE) between 0.001 and 0.011, and the root-mean-square error (RMSE) between 0.031 and 0.105.

Conclusions: Our findings indicate that people engage more on topics such as COVID-19 than medical trials and customer experience. In addition, there are notable differences in the user engagement levels across organizations. Global organizations, such as WHO, show wide variations in engagement levels over time. The sentiment forecasting method discussed presents a way for organizations to structure their future content to ensure maximum user engagement.

JMIR Med Inform 2022;10(8):e37829

doi:10.2196/37829


Introduction

Background

Social media platforms (SMPs), such as Twitter, Facebook, and Reddit, are commonly used by people to access health information. In the United States, 8 in 10 internet users access health information online, and 74% of these use SMPs. Meanwhile, public health agencies and pharmaceutical companies often use social media to engage with the public [1]. SMPs significantly contribute to the community by providing a communication platform for the public, patients, and health care professionals (HCPs) to talk about health concerns, eventually leading to better outcomes [2]. Additionally, SMPs also function as a medium to motivate patients by promoting health care education and providing the latest information to the community [1]. Analyzing social media content in the health care domain can reveal important dimensions, such as audience reach (eg, followers and subscribers), post source (eg, pharmaceutical companies, public health agencies), and post interactivity (eg, number of likes, retweets) [3]. A recent study discussed a machine learning (ML) approach to examining COVID-19 on Twitter [4]. Although it identifies discussion themes, there is no research on understanding the content shared by public health agencies and private organizations.

Related Works

The positive impacts of using SMPs by patients and HCPs have been previously discussed [5]. Patients feel empowered and develop positive relationships with their HCPs. For instance, Ventola [1] discussed SMPs as a tool to share and promote healthy habits, share information, and interact with the public. Li et al [6] presented an analysis of social media's impact on the public. Their research discusses public perceptions of health-related content being classified as true, debatable, or false; the study shows that people have a strong tendency to adopt collective opinions while sharing health-related statements on social media.

There are different topic-clustering and content analysis techniques available to identify the characteristics of stakeholders (eg, pharmaceutical companies’ tweets for drug information) on SMPs [7,8]. A previous study presented an overview of techniques used for sentiment analysis in health care [9]. The researchers discuss multiple lexicon-based and ML-based approaches. The previous discussion on pharmaceutical companies has focused on COVID-19 vaccine–related public opinions [10,11]. Using latent Dirichlet allocation (LDA) and valence aware dictionary and sentiment reasoner (VADER), researchers have examined topics, trends, and sentiments over time [10].

Prior research work has also focused on the response of G7 leaders during COVID-19 on Twitter [12,13]. The research classified viral tweets into appropriate categories, the most common being informative. Furthermore, researchers have recently presented a discussion on the harms and benefits of using Twitter during COVID-19 [14]. An epidemiological study conducted in 2020 investigated the news-sharing behavior on Twitter. It concluded that although tweets that include news articles sharing pandemic information are popular, they cannot substitute for public health agencies, organizations, or HCPs [15]. In addition, the study of public sentiments via artificial intelligence (AI) can provide a way to frame public health policies [16].

COVID-19 led to a rapid change in public sentiments over a short span of time [17]. People expressed sentiments of joy and gratitude toward good health and sadness and anger at the loss of life and stay-at-home orders [17,18]. Understanding public perceptions toward health-related content is important. Although the majority of people have a positive attitude toward social media, some feel more attention is required to promote the credibility of shared information [19]. Attempts have been made to capture people’s reactions to the pandemic; however, they are limited in scope. One study investigated the concerns about public health interventions in North America via topic modeling [20], while another examined the role of beliefs and susceptibility information in public engagement on Twitter [21]. Statistical analysis also shows that health care organizations have to come forward to engage more with consumers [22]. The importance of risk communication strategies while using SMPs should not be underestimated [23].

Although a tweet’s engagement and sentiment can only be calculated once it has been posted, forecasting presents a fascinating way to predict the sentiments beforehand. Time series–based strategies, such as autoregressive integrated moving average (ARIMA) and vector autoregressions (VAR), have been used for forecasting emotions from SMPs [24,25]. The seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) model was recently used to gain insights into people’s current emotional state via sentiment nowcasting on Twitter [26].

ML and natural language processing (NLP) algorithms have been recently used in various instances; for example, Bayesian ridge and ridge regression models were used for emotion prediction and health care analysis on large-scale data sets [27,28]. The elastic net and lasso regression have been previously used for health care access management and information exchange [29,30], while linear regression, decision tree, and random forest models are commonly used for epidemic-level disease tracking [31]. Different regression boosting algorithms, such as AdaBoost, light gradient boost, and gradient boost, have also been used for disease outbreak prediction [31]. Prophet, a Python library package, was recently used for COVID-19 outbreak prediction [32].

Objective

The implications of social media communication by HCPs have been extensively discussed [33,34]. Although these studies focus on the advantages and methods of extracting health- and disease-related content from social media, there is currently a lack of understanding of how social media usage by public health agencies, nongovernment organizations (NGOs), and pharmaceutical companies resonates with society. Additionally, the study of tweets’ sentiments can supplement existing models for generating content for future tweets. Predicting the tweet sentiment is 1 way to achieve this goal. Therefore, it is crucial to convert this textual content into information for formulating future strategies and gaining valuable insights into perceptions of social media users.

The remainder of the paper is structured as follows: First, the Methods section presents a preliminary analysis of topic modeling using the best-performing clustering algorithm, followed by sentiment and engagement analysis using CardiffNLP’s twitter-roberta-base-sentiment model and time series–based sentiment forecasting using 16 univariate models on the complete data set. The Results section outlines the topics obtained, which were used to generate heatmaps that give insights into topicwise tweets. Next, we discuss user engagement and user impact to understand whether specific periods of higher engagement coincided with offline events. In addition, we discuss results from the best-performing sentiment-forecasting models. Finally, in the Discussion section, we draw conclusions and present an outline for future work.


Methods

Data Set

The data for this study (181,469 tweets) were gathered from the accounts of major US and Canadian health care organizations, pharmaceutical companies, and the World Health Organization (WHO) using the Twitter Academic API for Research v2 [35] during the time frame of January 1, 2017, to December 31, 2021. The top 5 pharmaceutical companies were selected based on the recommendations made by HCPs on Twitter [36]. Table 1 lists the number of tweets scraped for each Twitter handle. Each organization is referred to as a user, and the type of organization (ie, pharmaceutical company, public health agency, NGO) is referred to as a user group for the scope of this study.

The complete timeline was divided into 2 phases for analysis, before COVID-19 and during COVID-19, based on the confirmation of the first COVID-19 community transmission case in North America on February 26, 2020 [37]. Figure 1 presents an overview of the research framework.

Table 1. Distribution of tweets for the selected user accounts of 3 types of organizations.

| Name of organization (Twitter handle) | Before COVID-19, n (%) | During COVID-19, n (%) | Total tweets, N |
| --- | --- | --- | --- |
| Public health agencies | | | |
| Centers for Disease Control and Prevention (CDCgov) | 8435 (58.6) | 5963 (41.4) | 14,398 |
| Centers for Disease Control and Prevention (CDC_eHealth) | 1376 (86.3) | 219 (13.7) | 1594 |
| Government of Canada for Indigenous (GCIndigenous) | 3505 (54.0) | 2989 (46.0) | 6494 |
| Health Canada and PHAC (GovCanHealth) | 7878 (17.2) | 37,907 (82.8) | 45,785 |
| US Department of Health & Human Services (HHSGov) | 7890 (56.9) | 5969 (43.1) | 13,859 |
| Indian Health Service (IHSgov) | 1090 (44.7) | 1346 (55.3) | 2436 |
| Canadian Food Inspection Agency (InspectionCan) | 4145 (62.2) | 2516 (37.8) | 6661 |
| National Institutes of Health (NIH) | 5837 (71.6) | 2314 (28.4) | 8151 |
| National Indian Health Board (NIHB1) | 1247 (51.1) | 1195 (48.9) | 2442 |
| US Food and Drug Administration (US_FDA) | 5810 (59.7) | 3925 (40.3) | 9735 |
| Total | 47,213 (42.3) | 64,343 (57.7) | 111,555 |
| Pharmaceutical companies | | | |
| AstraZeneca (AstraZeneca) | 3462 (78.2) | 963 (21.8) | 4425 |
| Biogen (biogen) | 1819 (61.9) | 1120 (38.1) | 2939 |
| Glaxo SmithKline (GSK) | 4200 (69.3) | 1857 (30.7) | 6057 |
| Johnson & Johnson (JNJNews) | 4813 (71.4) | 1926 (28.6) | 6739 |
| Pfizer (pfizer) | 3637 (64.1) | 2039 (35.9) | 5676 |
| Total | 17,931 (69.4) | 7905 (30.6) | 25,836 |
| NGO^a | | | |
| World Health Organization (WHO) | 24,775 (56.2) | 19,303 (43.8) | 44,078 |

^a NGO: nongovernment organization.

Figure 1. Overall research framework. WHO: World Health Organization.

Content Analysis

The content of each user was divided into 2 phases, before and during COVID-19. We performed topic modeling on the tweets authored by the organizations and used the topics yielded by the best-performing topic model to explore the most and least discussed topics with the help of heatmaps. Additionally, we examined the top 10 hashtags used by these organizations.

Preprocessing

First, all nonalphabetic characters (numbers, punctuation, new-line characters, and extra spaces) and Uniform Resource Locators (URLs) were removed from all tweets using the regular expression module (re 2.2.1) [38]. The cleaned text was then tokenized using the nltk 3.2.5 library [39]. Next, stopwords were removed, followed by stemming using PorterStemmer and lemmatizing using the WordNetLemmatizer from nltk.
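The cleaning and tokenization steps described above can be sketched with the standard library alone. This is an illustrative approximation, not the study's pipeline: the stopword set is a tiny hypothetical stand-in for the nltk list, whitespace splitting stands in for the nltk tokenizer, and stemming and lemmatization are omitted.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stopword set; the study used nltk's stopword list.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "for", "on", "at"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_PATTERN = re.compile(r"[^A-Za-z\s]")


def clean_tweet(text: str) -> List[str]:
    """Return lowercase alphabetic tokens with URLs and stopwords removed."""
    text = URL_PATTERN.sub(" ", text)        # remove URLs before punctuation is stripped
    text = NON_ALPHA_PATTERN.sub(" ", text)  # remove digits, punctuation, and symbols
    tokens = text.lower().split()            # split() also collapses new lines and extra spaces
    return [tok for tok in tokens if tok not in STOPWORDS]


print(clean_tweet("Get vaccinated today!\nDetails: https://example.org/info #COVID19"))
```

URLs are removed first because stripping punctuation beforehand would break them into fragments that look like ordinary words.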

Topic Modeling

Researchers have used term frequency–inverse document frequency (TF-IDF) to create document embeddings for tweets [40]. Following their approach, we preprocessed and generated document embeddings for tweets and input them to 5 different clustering algorithms: LDA, parallel LDA, nonnegative matrix factorization (NMF), latent semantic indexing (LSI), and the hierarchical Dirichlet process (HDP). Each clustering algorithm was executed 5 times with different random seed values to account for the short and noisy nature of tweets. We calculated 2 coherence scores for the topic models, c_umass [41] and c_v [42], to confirm performance consistency over multiple runs.
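To make the c_umass score more concrete, the following is a minimal sketch of a UMass-style coherence computation for a single topic. It assumes the common formulation based on document co-occurrence counts with a smoothing constant of 1 and averages over word pairs; library implementations (including the one used in the study) may differ in smoothing and aggregation, and the function name is hypothetical. Under this formulation, scores closer to 0 indicate that a topic's top words co-occur more often.

```python
import math
from itertools import combinations
from typing import List, Sequence, Set


def umass_coherence(top_words: Sequence[str], documents: List[Set[str]]) -> float:
    """Mean pairwise UMass-style score; top_words are ordered from most to least probable."""

    def doc_freq(*words: str) -> int:
        # Number of documents that contain every one of the given words.
        return sum(1 for doc in documents if all(w in doc for w in words))

    scores = []
    for i, j in combinations(range(len(top_words)), 2):  # i < j, so word i ranks higher
        higher, lower = top_words[i], top_words[j]
        denominator = doc_freq(higher)
        if denominator == 0:
            continue  # skip words that never occur in the corpus
        scores.append(math.log((doc_freq(higher, lower) + 1) / denominator))
    return sum(scores) / len(scores) if scores else float("nan")


docs = [{"covid", "vaccine"}, {"covid", "mask"}, {"cancer", "trial"}]
print(umass_coherence(["covid", "vaccine"], docs))  # words that co-occur
print(umass_coherence(["covid", "cancer"], docs))   # words that never co-occur
```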

We used Gensim LDA [43], Gensim LDA multicore (parallel LDA) [44], and Gensim LSI [44,45] models. For NMF and HDP models, we used online NMF for large corpora [46] and online variational inference [46,47] models, respectively.

Heatmaps

Heatmaps were generated using seaborn to analyze the volume of tweets for each topic. The topics yielded by the best-performing topic model as per the time phase (ie, before and during COVID-19) were leveraged to generate heatmaps. Each cell represented the total count of tweets for a particular topic by an organization. For example, among pharmaceutical companies, AstraZeneca had the highest number of tweets (n=1729, 49.9%) before COVID-19 for chronic diseases.

Hashtags

The top 10 hashtags mentioned in the users’ tweets were evaluated using the advertools 0.13.0 module [48]. This tool extracts hashtags in social media posts. It was used for analyzing the similarities and differences in the tweeting behavior before and during COVID-19 and conducting topic analysis.
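As a rough illustration of the hashtag ranking (the study used advertools; this sketch uses only a regular expression and is not that library's API), the top hashtags can be counted as follows. The pattern and the lowercasing of tags are assumptions made for the example.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

HASHTAG_PATTERN = re.compile(r"#(\w+)")  # assumed pattern: '#' followed by word characters


def top_hashtags(tweets: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent hashtags (case-insensitive) with their counts."""
    counts = Counter(
        tag.lower() for tweet in tweets for tag in HASHTAG_PATTERN.findall(tweet)
    )
    return counts.most_common(n)


sample = ["#COVID19 update #vaccine", "Stay safe #covid19", "#Cancer research"]
print(top_hashtags(sample, n=2))
```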

Sentiment Analysis

Sentiment analysis is an NLP approach used to categorize the sentiments appearing in Twitter messages based on the keywords used in each tweet. We tested different models that classify a user’s tweet into 1 of 3 categories: positive, negative, or neutral. Although there is no common threshold for how many tweets should be sampled when testing a model, previous studies have used samples ranging from around 2000 tweets [49-51] to several thousand tweets [52-54]. For this study, we sampled 3000 tweets uniformly distributed over the span of our data collection time frame and from all Twitter handles. The tweets were then labeled by 3 distinct annotators, and the sentiment category with the most votes was chosen as the overall sentiment. CardiffNLP’s twitter-roberta-base-sentiment model [55], which is trained on a Twitter corpus of 60 million tweets, was used to obtain sentiment labels on the sampled data set. We compared the human annotations with the model labels; they agreed on 69.96% of the sampled tweets, and the model was therefore used to predict the sentiment of the remaining tweets of the users.
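The annotation step can be sketched as a majority vote followed by a simple agreement percentage. The text does not say how a 3-way disagreement among annotators was resolved, so this sketch assumes such tweets are set aside (returned as None) and excluded from the comparison; the function names are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label with the most votes, or None when the top labels tie (assumed rule)."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def agreement_percentage(human: List[Optional[str]], model: List[str]) -> float:
    """Percentage of tweets where the model label equals the human majority label."""
    pairs = [(h, m) for h, m in zip(human, model) if h is not None]
    matches = sum(1 for h, m in pairs if h == m)
    return 100.0 * matches / len(pairs)


human_labels = [
    majority_label(["positive", "positive", "neutral"]),
    majority_label(["neutral", "neutral", "negative"]),
    majority_label(["positive", "negative", "neutral"]),  # no majority
]
model_labels = ["positive", "negative", "neutral"]
print(agreement_percentage(human_labels, model_labels))
```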

Engagement Analysis

For a given user, Twitter defines the engagement rate [56] as presented in Equation (1):

where “Engagement is the summation of the number of likes, replies, retweets, media views, tweet expansion, profile, hashtag, URL clicks, and new followers gained for every tweet, and Impressions is the total number of times a tweet has been seen on Twitter, such as through a follower’s timeline, Twitter search, or as a result of someone liking your tweet.”
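Read together with the definitions quoted above, Equation (1) amounts to the ratio of engagement to impressions. The following is a reconstruction from that description rather than a copy of the published formula; some presentations of this rate also multiply by 100 to express it as a percentage.

```latex
\text{Engagement rate} = \frac{\text{Engagement}}{\text{Impressions}} \tag{1}
```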

Researchers have analyzed the impact (popularity) of Twitter handles by proposing heuristic and neural network–based models [57-59]. We defined user impact as a function of followers, following, the total number of tweets, and the profile age and calculated it using Equation (2):

where listedCount is the number of public lists of which this user is a member.

The total number of tweets produced by a user was considered inversely proportional to the user’s impact, because a user tweeting occasionally and receiving higher engagement is more impactful than a user tweeting regularly with lower engagement.

Engagement analysis was performed to quantify the popularity of a topic generated. The engagement for each user was defined as the product of average engagement per day and their impact, as described in Equation (3). The average engagement per day was calculated as the sum of the count of likes, replies, retweets, and quotes per day. These reactions were aggregated from January 1, 2017, to December 31, 2021.
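Based on the description above, Equation (3) can be written out as follows. The symbols are assumptions introduced for illustration: $E_u(d)$ denotes the engagement on the tweets of user $u$ on day $d$, and $\text{Impact}_u$ is the user impact from Equation (2).

```latex
E_u(d) = \text{likes}_u(d) + \text{replies}_u(d) + \text{retweets}_u(d) + \text{quotes}_u(d)

\text{Engagement}_u(d) = E_u(d) \times \text{Impact}_u \tag{3}
```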

The exponential moving average (EMA) was calculated with a window span of 151 days for every user, and outliers were removed using the z-score. The average engagement per day was then smoothed with an eighth-degree Savitzky-Golay filter [60].
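A minimal sketch of the first two smoothing steps is shown below. It assumes the widely used span-to-weight convention alpha = 2 / (span + 1) and a z-score cutoff of 3; neither value is stated in the text, so both are illustrative assumptions. The Savitzky-Golay step is omitted because it is usually delegated to a numerical library.

```python
from statistics import mean, pstdev
from typing import List


def exponential_moving_average(values: List[float], span: int = 151) -> List[float]:
    """Recursive EMA; alpha = 2 / (span + 1) is an assumed convention."""
    alpha = 2.0 / (span + 1)
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def remove_outliers(values: List[float], z_threshold: float = 3.0) -> List[float]:
    """Drop points whose z-score magnitude exceeds the (assumed) threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs((v - mu) / sigma) <= z_threshold]


daily_engagement = [1.0] * 20 + [100.0]  # one extreme day among otherwise flat values
print(len(remove_outliers(daily_engagement)))
print(exponential_moving_average([0.0, 2.0], span=3))
```

A long span such as 151 days gives each new observation a small weight, so the resulting curve reacts slowly to single-day spikes.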

Sentiment Forecasting

To forecast the sentiment per day, we first needed to quantify the overall sentiment of the tweets from each user every day. We used CardiffNLP’s twitter-roberta-base-sentiment model [55] to label the sentiment of all the tweets collected for our analysis and then calculated the daily sentiment score, as described in Equation (4). The score was based on the sentiment category with the maximum number of tweets for that day: 0 if the dominant sentiment was neutral, the ratio of the count of positive tweets to the total tweets if it was positive, and the negative of the ratio of the count of negative tweets to the total tweets if it was negative.
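The scoring rule described for Equation (4) can be expressed as a short function. The text does not specify how ties between categories are broken, so this sketch assumes ties resolve in the order neutral, positive, negative; that ordering and the function name are illustrative assumptions.

```python
from collections import Counter
from typing import Sequence


def daily_sentiment_score(labels: Sequence[str]) -> float:
    """Score one day's tweets from their sentiment labels ('positive', 'negative', 'neutral')."""
    if not labels:
        raise ValueError("at least one tweet label is required")
    counts = Counter(labels)
    total = len(labels)
    # max() keeps the first maximal item, so ties fall back to this (assumed) order.
    dominant = max(("neutral", "positive", "negative"), key=lambda label: counts[label])
    if dominant == "positive":
        return counts["positive"] / total
    if dominant == "negative":
        return -counts["negative"] / total
    return 0.0


print(daily_sentiment_score(["positive", "positive", "neutral", "negative"]))
print(daily_sentiment_score(["negative", "negative", "negative", "positive"]))
```

The score therefore lies in the range –1 to 1, with its magnitude reflecting how strongly the dominant sentiment prevailed on that day.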

The daily sentiment scores were then resampled to a monthly mean sentiment score, which also helped us in handling missing values, if any. The complete timeline was divided into 2 phases (ie, before and during COVID-19), as discussed before, and the sentiment score was forecasted on 20% of the data set in each period for all user groups.
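The resampling and hold-out steps can be sketched as follows, assuming the 20% test portion is taken from the end of each period so that chronological order is preserved. The helper names are hypothetical. For the period before COVID-19 (January 2017 to February 2020, 38 monthly points), this rule yields an 8-month test window, which is consistent with the July 2019 to February 2020 test span reported in the Results.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Sequence, Tuple


def monthly_mean(daily: Dict[date, float]) -> List[Tuple[Tuple[int, int], float]]:
    """Average daily scores per (year, month); days without tweets are simply absent."""
    buckets = defaultdict(list)
    for day, score in daily.items():
        buckets[(day.year, day.month)].append(score)
    return [(month, sum(scores) / len(scores)) for month, scores in sorted(buckets.items())]


def chronological_split(series: Sequence, test_fraction: float = 0.2):
    """Hold out the last test_fraction of the series (assumed to be the test portion)."""
    n_test = max(1, round(len(series) * test_fraction))
    return series[:-n_test], series[-n_test:]


daily_scores = {date(2019, 1, 1): 0.2, date(2019, 1, 3): 0.4, date(2019, 2, 1): -0.5}
print(monthly_mean(daily_scores))

train, test = chronological_split(list(range(38)))
print(len(train), len(test))
```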

A grid search was used to find optimal hyperparameters, and 5-fold cross-validation was performed for every model. The statsmodels library [61] was used for ARIMA [62] and SARIMAX [63] models, and PyCaret [64] was used for regression-based models. We also reported the performance of the Prophet [65] model on the data set.

Three metrics, the mean absolute error (MAE), the mean square error (MSE), and the root-mean-square error (RMSE), were selected to evaluate the forecasting accuracy of the models. We considered 1-step-ahead forecasting for this study as it helped avoid problems related to cumulative errors from the preceding period.
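The three metrics and the 1-step-ahead procedure can be sketched as below. The predictor shown (mean of the last 3 observations) is a hypothetical placeholder chosen only to keep the example self-contained; it is not ARIMA or any model from the study. The point of the loop is that each forecast sees all observations before the target month, including earlier test months, so errors do not accumulate across the horizon.

```python
import math
from typing import Callable, List, Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return math.sqrt(mse(actual, predicted))


def one_step_ahead(
    series: Sequence[float],
    n_test: int,
    predict_next: Callable[[Sequence[float]], float],
) -> List[float]:
    """Forecast each test point from all observations that precede it."""
    start = len(series) - n_test
    return [predict_next(series[:t]) for t in range(start, len(series))]


def last_three_mean(history: Sequence[float]) -> float:
    """Hypothetical placeholder predictor, used here instead of a fitted model."""
    window = history[-3:]
    return sum(window) / len(window)


scores = [0.1, 0.2, 0.3, 0.4, 0.5]
forecasts = one_step_ahead(scores, n_test=2, predict_next=last_three_mean)
observed = scores[-2:]
print(mae(observed, forecasts), mse(observed, forecasts), rmse(observed, forecasts))
```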

Computational Resources

The study was performed using Compute Canada (now called the Digital Research Alliance of Canada) resources, which provide access to advanced research computing (ARC), research data management (RDM), and research software (RS). The following is a list of the computing resources offered by one of the clusters from National Services (Digital Research Alliance), Graham:

  • Central processing unit (CPU): 2x Intel E5-2683 v4 Broadwell@2.1 GHz
  • Memory (RAM): 30 GB

Results

Content Analysis

The details of the parameters used for each model are discussed in Multimedia Appendix 1, Table S1. Table 2 shows the mean coherence scores (c_v and c_umass) for each clustering algorithm. Although the HDP had the highest c_v scores in both time phases (ie, 0.696 and 0.650 before and during COVID-19, respectively), NMF had the best c_umass scores (–3.653 and –3.794, respectively) and generated the most meaningful topics for the data set (see Multimedia Appendix 1, Tables S2 and S3). Therefore, the keywords of the top 5 topics generated by NMF were searched on Google, and the contents on the first page of the search results were retrieved to interpret the keywords and propose a suitable name for each topic. For example, for the set of keywords yielded by the topic model “community health, care, community health services, health center, family health centers, community plan, community clinic, family health care, qualified health centers, health services,” we assigned the topic community health care.

The scaled heatmaps showing the topic distribution for different Twitter handles are shown in Figure 2. Prior to COVID-19, chronic diseases were the most active topic, with a total of 9488 tweets from pharmaceutical companies and WHO (see Figure 2a). However, during COVID-19, we observed that COVID-19, health research, and chronic diseases were the most-discussed topics, with 52,148 tweets from all data sets combined (see Multimedia Appendix 1, Figures S1b and S1d).

This shift in the tweets’ content was observed across the complete data set, and we further made the following inferences:

  • Before COVID-19: Chronic diseases were the most talked about topic for pharmaceutical companies (AstraZeneca, 1729, 49.9%, tweets; Pfizer, 1168, 32.1%, tweets) and for WHO (4831, 19.5%, tweets), followed by tweets on health research (WHO, 1703, 6.9%, tweets; AstraZeneca, 1037, 29.9%, tweets). This is supported by Figure 3a, which shows #cancer, #lungcancer, #alzheimers, #hiv, and #ms to be prominently used in tweets. Among public health agencies, the NIH’s and the CDC’s Twitter handles were the most active, with 1840 (31.6%) and 1742 (20.6%) tweets discussing health research and chronic diseases, respectively, strongly supported by the most used hashtags #nativehealth and #foodsafety (refer to Multimedia Appendix 1, Figures S2a and S2c).
  • During COVID-19: Chronic diseases and health research were the most active topics for AstraZeneca (680, 70.6%, tweets) and Glaxo SmithKline (GSK, 655, 35.2%, tweets), respectively. In addition, COVID-19 and vaccination were most talked about by GSK (398, 21.4%, tweets) and Pfizer (396, 19.4%, tweets). Figure 3b shows the hashtags supporting this: #covid19, #alzheimers, #cancer, #multiplesclerosis, and #vaccine. GovCanHealth was by far the most active public health agency on Twitter, with 16,832 (87.2%) tweets on health research, 16,449 (85.2%) tweets on vaccination, and 14,260 (73.8%) tweets on COVID-19, having #covid19, #coronavirus, and #covidvaccine as trending hashtags. The majority of the tweets by WHO were on COVID-19 (8911 tweets) and vaccination (2131 tweets), with #covid19, #coronavirus, and #vaccineequity appearing frequently in the tweets (refer to Multimedia Appendix 1, Figure S2d).

Table 2. Mean coherence scores and CPU^a time for different clustering algorithms.

| Clustering algorithm | c_v | c_umass | Time taken (minutes:seconds) |
| --- | --- | --- | --- |
| Before COVID-19 | | | |
| LDA^b | 0.352 | –5.526 | 17:11 |
| Parallel LDA | 0.396 | –3.709 | 5:48 |
| NMF^c | 0.493 | –3.653 | 7:38 |
| LSI^d | 0.316 | –5.921 | 0:16 |
| HDP^e | 0.696 | –18.668 | 3:24 |
| During COVID-19 | | | |
| LDA | 0.456 | –5.688 | 14:01 |
| Parallel LDA | 0.446 | –3.990 | 6:08 |
| NMF | 0.567 | –3.794 | 7:04 |
| LSI | 0.381 | –5.356 | 0:16 |
| HDP | 0.650 | –17.610 | 3:01 |

^a CPU: central processing unit.

^b LDA: latent Dirichlet allocation.

^c NMF: nonnegative matrix factorization.

^d LSI: latent semantic indexing.

^e HDP: hierarchical Dirichlet process.

Figure 2. Scaled heatmaps showing topic distribution for pharmaceutical companies before and during COVID-19.

Figure 3. Top hashtags of pharmaceutical companies before and during COVID-19.

Engagement Analysis

WHO (user impact=4171.24) had the highest impact overall, followed by public health agencies (CDC user impact=2895.87; NIH user impact=891.06). Among pharmaceutical companies, Pfizer’s user impact was the highest at 97.79. The user impact was normalized to the range of 0 to 1 and is shown in Figure 4.
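The text does not state which normalization was used; a common choice is min-max scaling, sketched here under that assumption. Only the 4 impact values reported above are used, so the scaled numbers are illustrative and will not match Figure 4, which covers all 16 handles (the true minimum belongs to a handle whose impact is not reported here).

```python
from typing import Dict


def min_max_scale(values: Dict[str, float]) -> Dict[str, float]:
    """Linearly rescale values so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        return {name: 0.0 for name in values}
    return {name: (value - lowest) / (highest - lowest) for name, value in values.items()}


# Only the user impact values reported in the text; other handles are not available here.
reported_impact = {"WHO": 4171.24, "CDCgov": 2895.87, "NIH": 891.06, "pfizer": 97.79}
scaled = min_max_scale(reported_impact)
for handle, value in scaled.items():
    print(handle, round(value, 3))
```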

Among pharmaceutical companies, Pfizer’s user engagement was far higher than that of others (Figure 5), both before and during COVID-19, with the highest engagement observed at the time of its COVID-19 vaccine’s success in November 2020. A jump in engagement was also observed in May 2021, when Pfizer announced its plan for helping India fight the second wave of coronavirus (refer to Multimedia Appendix 1, Table S4).

A similar trend was observed in public health agencies, with the CDC’s account showing the highest user engagement between March and June 2020, the early months of the COVID-19 pandemic. A sharp rise in user engagement was observed in May 2021, when the CDC announced a relaxation of social distancing and masking rules for fully vaccinated individuals. The user engagement on WHO’s account varied significantly over time. Its engagement was the highest in the time frame of February-April 2020, the early months of the pandemic, similar to what was observed for public health agencies. A sharp increase was seen in October 2020 following the announcement of World Mental Health Day and in late 2020, when WHO made an announcement about COVID-19 vaccine development (refer to Multimedia Appendix 1, Figure S3).

Figure 4. User impact of all Twitter handles scaled between 0 and 1. CDC: Centers for Disease Control and Prevention; NIH: National Institutes of Health; WHO: World Health Organization.

Figure 5. User engagement on Twitter accounts of pharmaceutical companies from January 1, 2017, to December 31, 2021.

Sentiment Forecasting

Table 3 shows the MAE, MSE, and RMSE for the 16 models used on the data sets. Overall, ARIMA (univariate) and SARIMAX models performed best on the majority of the subsets of the data (divided as per the organization and period), and we further made the following inferences:

  • Before COVID-19: ARIMA and SARIMAX models generated the lowest MSE (0.005) and RMSE (0.072) for pharmaceutical companies. When measuring the model performance through the MAE, ARIMA performed better than all other models (0.063). A similar trend was observed for public health agencies, with ARIMA having the lowest MAE (0.027) and SARIMAX having the lowest RMSE (0.031) and a tie between them for the MSE (0.001). SARIMAX had the lowest MAE (0.054), MSE (0.004), and RMSE (0.080) on the WHO data set.
  • During COVID-19: For pharmaceutical companies, the CatBoost regressor gave the lowest MAE (0.072) and RMSE (0.086), while the K-neighbors regressor yielded the lowest MSE (0.008). For public health agencies, the AdaBoost regressor generated the lowest MAE (0.084) and RMSE (0.105) among all models used, and SARIMAX had the lowest MSE (0.011). For WHO, the elastic net, lasso regression, and light gradient boosting performed equally well, with all 3 models having the same MAE (0.046) and RMSE (0.059), and SARIMAX had the lowest MSE (0.004).

Figure 6a shows the 1-step-ahead forecast for pharmaceutical companies before COVID-19 using ARIMA. The model was trained on sentiment scores from January 2017 to June 2019 and tested on data from July 2019 to February 2020 for tweets before COVID-19. The 1-step-ahead forecasting aligned well with the observed sentiment scores, and we obtained similar results for public health agencies and WHO. The organizations showed some deviations from observed sentiments while conducting 1-step-ahead forecasting during COVID-19, making it difficult to predict their sentiment accurately, as seen in Multimedia Appendix 1, Figure S4.

To verify the forecasting performance of these models, we checked the nature of their residual errors (ie, whether the residuals of the models were uncorrelated and normally distributed with mean 0 and SD 1). Using plot_diagnostics, we confirmed this for public health agencies before COVID-19 with ARIMA (Multimedia Appendix 1, Figure S5). The green kernel density estimation (KDE) line closely followed the standard normal distribution, N(0,1), line in the top-right corner of Multimedia Appendix 1, Figure S5, which is a positive indicator that the residuals were normally distributed. The quantile-quantile (Q-Q) plot on the bottom left shows that the distribution of residuals (blue dots) approximately followed the linear trend of samples drawn from a standard normal distribution, N(0,1). This confirms again that the residuals were normally distributed. The residuals over time (top left in Multimedia Appendix 1, Figure S5) showed no apparent seasonality and had 0 mean. The autocorrelation plot (ie, correlogram) attested to this, indicating that the time series residuals exhibited minimal correlation with lagged versions of themselves. Thus, these findings suggest that our models provide an adequate fit, which might aid in understanding the sentiments of the organizations and forecasting their values without overburdening our hardware with computationally heavy models.

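Table 3. Results of time series sentiment forecasting using different ML^a models (all metrics are 5-fold cross-validation).

| Models | Pharmaceutical companies (before COVID-19): MAE^c | Pharmaceutical companies (before COVID-19): MSE^d | Pharmaceutical companies (before COVID-19): RMSE^e | Pharmaceutical companies (during COVID-19): MAE | Pharmaceutical companies (during COVID-19): MSE | Pharmaceutical companies (during COVID-19): RMSE | Public health agencies (before COVID-19): MAE | Public health agencies (before COVID-19): MSE | Public health agencies (before COVID-19): RMSE | Public health agencies (during COVID-19): MAE | Public health agencies (during COVID-19): MSE | Public health agencies (during COVID-19): RMSE | WHO^b (before COVID-19): MAE | WHO (before COVID-19): MSE | WHO (before COVID-19): RMSE | WHO (during COVID-19): MAE | WHO (during COVID-19): MSE | WHO (during COVID-19): RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ARIMA^f | 0.063^g | 0.005^g | 0.072^g | 0.098 | 0.013 | 0.112 | 0.027^g | 0.001^g | 0.032^h | 0.240 | 0.082 | 0.286 | 0.066^h | 0.006^h | 0.080^h | 0.106 | 0.012 | 0.111 |
| SARIMAX^i | 0.065^h | 0.005^g | 0.072^g | 0.084 | 0.011 | 0.104 | 0.028^j | 0.001^g | 0.031^g | 0.709 | 0.011^g | 0.106^h | 0.054^g | 0.004^g | 0.061^g | 0.047^h | 0.004^g | 0.066 |
| Bayesian ridge | 0.083 | 0.010 | 0.100 | 0.102 | 0.018 | 0.119 | 0.031 | 0.001 | 0.037 | 0.141 | 0.037 | 0.163 | 0.075^j | 0.009^j | 0.087^j | 0.061 | 0.008 | 0.075 |
| Ridge regression | 0.069 | 0.008 | 0.085 | 0.079 | 0.011 | 0.094 | 0.030 | 0.002 | 0.038 | 0.124 | 0.029 | 0.147 | 0.076 | 0.009 | 0.091 | 0.056 | 0.007 | 0.068 |
| CatBoost regressor | 0.066 | 0.007^j | 0.080^h | 0.072^g | 0.008^h | 0.086^g | 0.027^h | 0.001^h | 0.035 | 0.104 | 0.023 | 0.127 | 0.079 | 0.009 | 0.089 | 0.052 | 0.007 | 0.065 |
| K-neighbors regressor | 0.070 | 0.009 | 0.087 | 0.075^h | 0.008^g | 0.087^h | 0.030 | 0.001 | 0.036 | 0.093^j | 0.022 | 0.113 | 0.081 | 0.011 | 0.100 | 0.050 | 0.007 | 0.061^j |
| Elastic net | 0.070 | 0.008 | 0.088 | 0.080 | 0.009^j | 0.093^j | 0.029 | 0.001^h | 0.035 | 0.087^h | 0.021^j | 0.109^j | 0.082 | 0.011 | 0.100 | 0.046^g | 0.006^h | 0.059^g |
| Lasso regression | 0.070 | 0.008 | 0.088 | 0.080 | 0.009^j | 0.093^j | 0.029 | 0.001 | 0.035 | 0.087^h | 0.021^j | 0.109^j | 0.082 | 0.011 | 0.100 | 0.046^g | 0.006^h | 0.059^g |
| Random forest regressor | 0.065^j | 0.007^h | 0.081^j | 0.080 | 0.010 | 0.093 | 0.028 | 0.001^h | 0.034^j | 0.110 | 0.024 | 0.134 | 0.082 | 0.009 | 0.090 | 0.047^j | 0.006^j | 0.060^h |
| Light gradient boosting machine | 0.070 | 0.008 | 0.088 | 0.080 | 0.009^j | 0.093^j | 0.029 | 0.001^h | 0.035 | 0.087^h | 0.021^j | 0.109^j | 0.082 | 0.011 | 0.100 | 0.046^g | 0.006^h | 0.059^g |
| Gradient boosting regressor | 0.075 | 0.008 | 0.086 | 0.079 | 0.010 | 0.094 | 0.029 | 0.001^j | 0.036 | 0.141 | 0.034 | 0.168 | 0.082 | 0.010 | 0.094 | 0.051 | 0.008 | 0.064 |
| AdaBoost regressor | 0.070 | 0.007 | 0.082 | 0.080 | 0.010 | 0.091 | 0.029 | 0.001 | 0.037 | 0.084^g | 0.020^h | 0.105^g | 0.087 | 0.010 | 0.096 | 0.057 | 0.007 | 0.072 |
| Extreme gradient boosting | 0.068 | 0.009 | 0.087 | 0.080 | 0.011 | 0.098 | 0.031 | 0.002 | 0.040 | 0.151 | 0.045 | 0.171 | 0.087 | 0.011 | 0.098 | 0.055 | 0.007 | 0.065 |
| Decision tree regressor | 0.076 | 0.009 | 0.086 | 0.087 | 0.013 | 0.106 | 0.029 | 0.001 | 0.037 | 0.112 | 0.030 | 0.142 | 0.098 | 0.014 | 0.111 | 0.048 | 0.006^j | 0.061 |
| Linear regression | 0.245 | 0.312 | 0.314 | 0.094 | 0.017 | 0.114 | 0.157 | 0.164 | 0.216 | 0.124 | 0.029 | 0.148 | 2.367 | 52.719 | 3.334 | 0.062 | 0.008 | 0.076 |
| Prophet | 0.108 | 0.016 | 0.126 | 0.089 | 0.011 | 0.104 | 0.040 | 0.002 | 0.049 | 0.120 | 0.015 | 0.124 | 0.114 | 0.020 | 0.143 | 0.086 | 0.011 | 0.106 |

^a ML: machine learning.

^b WHO: World Health Organization.

^c MAE: mean absolute error.

^d MSE: mean squared error.

^e RMSE: root-mean-square error.

^f ARIMA: autoregressive integrated moving average.

^g The highest-performing forecasting method.

^h The second-highest-performing forecasting method.

^i SARIMAX: seasonal autoregressive integrated moving average with exogenous factors.

^j The third-highest-performing forecasting method.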
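As a rough illustration of how the three error metrics in Table 3 relate to one another, the sketch below computes MAE, MSE, and RMSE per fold and averages them over 5 folds. The expanding-window split, the placeholder "training mean" forecaster, and the synthetic series are all assumptions made for illustration; the source does not specify how the folds were constructed.

```python
import numpy as np


def forecast_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for one set of forecasts."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mae = float(np.mean(np.abs(err)))
    mse = float(np.mean(err ** 2))
    return mae, mse, float(np.sqrt(mse))


def expanding_window_folds(n_obs, n_folds=5):
    """Yield (train_idx, test_idx); each fold trains on all earlier points."""
    fold_size = n_obs // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover observations.
        test_end = n_obs if k == n_folds else train_end + fold_size
        yield np.arange(train_end), np.arange(train_end, test_end)


def cross_validated_errors(y, n_folds=5):
    """Average the per-fold MAE, MSE, and RMSE of a placeholder forecaster."""
    y = np.asarray(y, dtype=float)
    per_fold = []
    for train_idx, test_idx in expanding_window_folds(len(y), n_folds):
        # Placeholder model (assumption): predict the training-window mean.
        y_pred = np.full(len(test_idx), y[train_idx].mean())
        per_fold.append(forecast_errors(y[test_idx], y_pred))
    return tuple(float(v) for v in np.mean(per_fold, axis=0))


# Synthetic monthly sentiment scores (illustrative only).
rng = np.random.default_rng(1)
toy_scores = 0.2 + rng.normal(0.0, 0.05, 60)
mae_cv, mse_cv, rmse_cv = cross_validated_errors(toy_scores)
print(f"MAE={mae_cv:.3f}  MSE={mse_cv:.4f}  RMSE={rmse_cv:.3f}")
```

If metrics are averaged per fold in this way, the averaged RMSE is at most the square root of the averaged MSE and need not equal it, so a row's RMSE and MSE columns are not expected to match exactly.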

Figure 6. One-step-ahead forecast for all pharmaceutical companies before and during COVID-19 using the best-performing models from Table S1 (Multimedia Appendix 1). ARIMA: autoregressive integrated moving average.

Principal Findings

In this paper, we proposed a framework for using NLP-based text-mining techniques for performing comprehensive social media content analysis of various health care organizations. We processed reasonably large amounts of textual data for topic modeling, sentiment and engagement analysis, and sentiment forecasting. Our study revealed the following key findings:

  • Being the most active organization on social media does not translate to more user impact. WHO and the US public health agency CDC generated far more user impact than the Public Health Agency of Canada, even though the latter had a high number of relevant tweets when analyzed topicwise. People are more likely to engage with neutral tweets, which usually consist of a public health announcement, than with exclusively positive or negative tweets. Organizations might leverage this knowledge when creating content for future social media posts to increase their visibility in the online sphere.
  • Certain topics normally translate to more user engagement. Although content on chronic diseases and health research accounted for most of the tweets posted over the study period, there was a marked shift toward discussion of COVID-19 and vaccination among public health agencies, more so than among pharmaceutical companies. Tweets on COVID-19 and chronic diseases generate more interest among the public. Perhaps surprisingly, we found that people are not very receptive to content on medical trials, often shared by pharmaceutical companies, unless it concerns a public health emergency, such as the COVID-19 pandemic. Using particular hashtags helps in generating engagement, as we found that most user engagement was highly skewed toward tweets concerning COVID-19. Moreover, our study revealed that, compared to the user engagement patterns found in the majority of health care organizations (ie, with peaks observed around major events or announcements), user engagement for WHO varies widely. This could be due to the global presence of WHO, implying that it might not be the same set of followers engaging with its content every time, but rather only those who are impacted by or interested in the content in some way.
  • When the content is structured, results tend to exceed expectations. We conducted sentiment forecasting on the data sets using moving-average models and various univariate ML models. Surprisingly, we observed that when the content is structured, as is normally the case for content on official Twitter accounts, results tend to exceed expectations, more so before COVID-19 than during COVID-19. The models used in this research were able to predict monthwise tweet sentiment with low errors. This allowed us to analyze our work in depth without creating any multivariate ML models. Results show that the commonly used ARIMA and SARIMAX models work well, and they can be used for predicting tweet sentiments on live data. This could also help organizations correlate tweet sentiment with user engagement. For example, the highest engagement on Pfizer’s tweets was for the ones labeled neutral, implying that the organization should structure the content of its future tweets in a similar manner to maintain higher levels of engagement. Furthermore, tweets that mention more news-relevant content might translate into more user engagement.
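The comparison of engagement across sentiment labels mentioned in these findings could be carried out along the following lines. This is only a sketch: the column names, the toy values, and the definition of engagement as likes plus retweets are hypothetical and are not taken from the study's data or its engagement metric.

```python
import pandas as pd

# Hypothetical toy data: one row per tweet with a sentiment label and counts.
tweets = pd.DataFrame(
    {
        "sentiment": ["neutral", "positive", "neutral", "negative", "positive", "neutral"],
        "likes": [120, 40, 300, 15, 60, 90],
        "retweets": [80, 10, 150, 5, 20, 30],
    }
)

# Assumption: engagement is approximated as likes + retweets.
tweets["engagement"] = tweets["likes"] + tweets["retweets"]

# Summarize engagement per sentiment label, highest mean first.
summary = (
    tweets.groupby("sentiment")["engagement"]
    .agg(["count", "mean", "median"])
    .sort_values("mean", ascending=False)
)
top_label = summary.index[0]

print(summary)
print(f"label with the highest mean engagement in the toy data: {top_label}")
```

Reporting the median alongside the mean helps when engagement is heavily skewed by a few widely shared tweets.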

Limitations and Future Work

There are 3 limitations of this study that could be addressed in future research. First, this work focused on dividing the tweets into 2 phases, before and during COVID-19. In the future, researchers can pursue other methods of structuring the analysis timeline. Second, this study dealt with only the structured textual content of tweets. It would be interesting to also incorporate the presence of image attributes in future studies. Finally, as the scope of this study was limited to health care organizations, we did not account for public demographics. Understanding the demographic background of the public engaging with this content is another area that can be explored in future studies.

Conclusion

This study examined the online activity of US and Canadian health care organizations on Twitter. The NLP-based analysis of social media presented here can be incorporated to gauge engagement on the previously published tweets and to generate tweets that create an impact on people accessing health information via SMPs. As organizations continue to leverage SMPs by providing the latest information to the community, predicting a tweet’s sentiment before publishing can boost an organization’s perception by the public. In conclusion, we found that performing content analysis and sentiment forecasting on an organization’s social media usage provides a comprehensive view of how it resonates with society.

Acknowledgments

The authors thank members of the DaTALab at Lakehead University for valuable discussions, along with Andy Pan, Chandreen Ravihari Liyanage, and Lakshmi Preethi Kamak for annotating the sampled tweets to evaluate the tweet sentiment. This study was conducted using Digital Research Alliance of Canada computing resources. AS and MKB were supported by Vector Scholarships in artificial intelligence (AI) from Vector Institute, Toronto, Canada, and a Natural Sciences and Engineering Research Council (NSERC) Discovery Grant (#RGPIN-2017-05377) held by VM.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Topics and user engagement.

PDF File (Adobe PDF File), 666 KB

  1. Ventola CL. Social media and health care professionals: benefits, risks, and best practices. P T 2014 Jul;39(7):491-520 [FREE Full text] [Medline]
  2. Househ M. The use of social media in healthcare: organizational, clinical, and patient perspectives. Stud Health Technol Inform 2013;183:244-248. [Medline]
  3. Zhou L, Zhang D, Yang CC, Wang Y. Harnessing social media for health information management. Electron Commer Res Appl 2018 Jan;27:139-151 [FREE Full text] [CrossRef] [Medline]
  4. Xue J, Chen J, Hu R, Chen C, Zheng C, Su Y, et al. Twitter discussions and emotions about the COVID-19 pandemic: machine learning approach. J Med Internet Res 2020 Nov 25;22(11):e20550 [FREE Full text] [CrossRef] [Medline]
  5. Benetoli A, Chen T, Aslani P. How patients' use of social media impacts their interactions with healthcare professionals. Patient Educ Couns 2018 Mar;101(3):439-444. [CrossRef] [Medline]
  6. Li H, Sakamoto Y. Social impacts in social media: an examination of perceived truthfulness and sharing of information. Comput Hum Behav 2014 Dec;41:278-287. [CrossRef]
  7. Lu Y, Wu Y, Liu J, Li J, Zhang P. Understanding health care social media use from different stakeholder perspectives: a content analysis of an online health community. J Med Internet Res 2017 Apr 07;19(4):e109 [FREE Full text] [CrossRef] [Medline]
  8. Tyrawski J, DeAndrea DC. Pharmaceutical companies and their drugs on social media: a content analysis of drug information on popular social media sites. J Med Internet Res 2015 Jun 01;17(6):e130 [FREE Full text] [CrossRef] [Medline]
  9. Abualigah L, Alfar H, Shehab M. Sentiment analysis in healthcare: a brief review. In: Abd Elaziz M, Al-qaness MAA, Ewees AA, editors. Recent Advances in NLP: The Case of Arabic Language. Cham: Springer International; 2020:129-141.
  10. Chandrasekaran R, Mehta V, Valkunde T, Moustakas E. Topics, trends, and sentiments of tweets about the COVID-19 pandemic: temporal infoveillance study. J Med Internet Res 2020 Oct 23;22(10):e22624 [FREE Full text] [CrossRef] [Medline]
  11. Poddar S, Mondal M, Misra J. Winds of Change: Impact of COVID-19 on Vaccine-Related Opinions of Twitter Users.   URL: https://ojs.aaai.org/index.php/ICWSM/article/view/19334 [accessed 2022-06-29]
  12. Rufai S, Bunce C. World leaders' usage of Twitter in response to the COVID-19 pandemic: a content analysis. J Public Health (Oxf) 2020 Aug 18;42(3):510-516 [FREE Full text] [CrossRef] [Medline]
  13. Haman M. The use of Twitter by state leaders and its impact on the public during the COVID-19 pandemic. Heliyon 2020 Nov;6(11):e05540 [FREE Full text] [CrossRef] [Medline]
  14. Rosenberg H, Syed S, Rezaie S. The Twitter pandemic: the critical role of Twitter in the dissemination of medical information and misinformation during the COVID-19 pandemic. CJEM 2020 Jul 06;22(4):418-421 [FREE Full text] [CrossRef] [Medline]
  15. Park HW, Park S, Chong M. Conversations and medical news frames on Twitter: infodemiological study on COVID-19 in South Korea. J Med Internet Res 2020 May 05;22(5):e18897 [FREE Full text] [CrossRef] [Medline]
  16. Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, et al. Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: observational study. J Med Internet Res 2021 Apr 05;23(4):e26627 [FREE Full text] [CrossRef] [Medline]
  17. Lwin MO, Lu J, Sheldenkar A, Schulz PJ, Shin W, Gupta R, et al. Global sentiments surrounding the COVID-19 pandemic on Twitter: analysis of Twitter trends. JMIR Public Health Surveill 2020 May 22;6(2):e19447 [FREE Full text] [CrossRef] [Medline]
  18. Dubey AD. Twitter sentiment analysis during COVID19 Outbreak. SSRN Electron J 2020:1-9. [CrossRef]
  19. Gao S, He L, Chen Y, Li D, Lai K. Public perception of artificial intelligence in medical care: content analysis of social media. J Med Internet Res 2020 Jul 13;22(7):e16649 [FREE Full text] [CrossRef] [Medline]
  20. Jang H, Rempel E, Roth D, Carenini G, Janjua NZ. Tracking COVID-19 discourse on Twitter in North America: infodemiology study using topic modeling and aspect-based sentiment analysis. J Med Internet Res 2021 Feb 10;23(2):e25431 [FREE Full text] [CrossRef] [Medline]
  21. Tang L, Liu W, Thomas B, Tran HTN, Zou W, Zhang X, et al. Texas public agencies' tweets and public engagement during the COVID-19 pandemic: natural language processing approach. JMIR Public Health Surveill 2021 Apr 26;7(4):e26720 [FREE Full text] [CrossRef] [Medline]
  22. Koumpouros Y, Toulias TL, Koumpouros N. The importance of patient engagement and the use of social media marketing in healthcare. Technol Health Care 2015 Jul 21;23(4):495-507. [CrossRef]
  23. Slavik CE, Buttle C, Sturrock SL, Darlington JC, Yiannakoulias N. Examining tweet content and engagement of Canadian public health agencies and decision makers during COVID-19: mixed methods analysis. J Med Internet Res 2021 Mar 11;23(3):e24883 [FREE Full text] [CrossRef] [Medline]
  24. Tommasel A, Diaz-Pace A, Rodriguez JM, Godoy D. Forecasting mental health and emotions based on social media expressions during the COVID-19 pandemic. Inf Discov Deliv 2021 Jun 03;49(3):259-268. [CrossRef]
  25. McClellan C, Ali MM, Mutter R, Kroutil L, Landwehr J. Using social media to monitor mental health discussions - evidence from Twitter. J Am Med Inform Assoc 2017 May 01;24(3):496-502 [FREE Full text] [CrossRef] [Medline]
  26. Miliou I, Pavlopoulos J, Papapetrou P. Sentiment nowcasting during the COVID-19 pandemic. In: Discovery Science. Cham: Springer International; 2021:218-228.
  27. Harper R, Southern J. A Bayesian deep learning framework for end-to-end prediction of emotion from heartbeat. IEEE Trans Affective Comput 2022 Apr 1;13(2):985-991 [FREE Full text] [CrossRef]
  28. Deepa N, Prabadevi B, Maddikunta PK, Gadekallu TR, Baker T, Khan MA, et al. An AI-based intelligent system for healthcare analysis using Ridge-Adaline stochastic gradient descent classifier. J Supercomput 2020 May 30;77(2):1998-2017. [CrossRef]
  29. Barrera Ferro D, Brailsford S, Bravo C, Smith H. Improving healthcare access management by predicting patient no-show behaviour. Decis Support Syst 2020 Nov;138:113398. [CrossRef]
  30. Li Y, Vinzamuri B, Reddy CK. Constrained elastic net based knowledge transfer for healthcare information exchange. Data Min Knowl Disc 2014 Dec 23;29(4):1094-1112. [CrossRef]
  31. Singh R, Singh R. Applications of sentiment analysis and machine learning techniques in disease outbreak prediction – A review. Mater Today 2021 May:1-6 [FREE Full text] [CrossRef]
  32. Mengistie T. COVID-19 outbreak data analysis and prediction modeling using data mining technique. Int J Comput 2020;38:37-60 [FREE Full text]
  33. Denecke K, Nejdl W. How valuable is medical social media data? Content analysis of the medical web. Inf Sci 2009 May 30;179(12):1870-1880. [CrossRef]
  34. Nawaz MS, Bilal M, Lali MI, Ul Mustafa R, Aslam W, Jajja S. Effectiveness of social media data in healthcare communication. J Med Imaging Health Inform 2017 Oct 01;7(6):1365-1371. [CrossRef]
  35. Twitter API: Academic Research Access.   URL: https://developer.twitter.com/en/products/twitter-api/academic-research [accessed 2022-07-05]
  36. Kangley M. HCPs Discuss ‘Booster Shot’ to Decrease the High Spread of the Delta Variant.   URL: https:/​/creation.​co/​knowledge/​hcps-discuss-booster-shot-to-decrease-the-high-spread-of-the-delta-variant/​ [accessed 2022-07-05]
  37. CDC COVID-19 Response Team, Jorden MA, Rudman SL, Villarino E, Hoferka S, Patel MT, et al. Evidence for limited early spread of COVID-19 within the United States, January-February 2020. Morb Mortal Wkly Rep 2020 Jun 05;69(22):680-684 [FREE Full text] [CrossRef] [Medline]
  38. PyPI. regex 2022.7.9.   URL: https://pypi.org/project/regex/ [accessed 2022-07-05]
  39. PyPI. nltk 3.7.   URL: https://pypi.org/project/nltk/ [accessed 2022-07-05]
  40. Lilleberg J, Zhu Y, Zhang Y. Support vector machines and Word2vec for text classification with semantic features. 2015 Presented at: IEEE 14th International Conference on Cognitive Informatics Cognitive Computing (ICCI*CC); July 6-8, 2015; Beijing, China. [CrossRef]
  41. Newman D, Lau J, Grieser K. Automatic evaluation of topic coherence. 2010 Presented at: Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics; June 2-4, 2010; Los Angeles   URL: https://aclanthology.org/N10-1012
  42. Röder M, Both A, Hinneburg A. Exploring the space of topic coherence measures. 2015 Presented at: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining; 2015; New York, NY. [CrossRef]
  43. Gemsim. Latent Dirichlet Allocation.   URL: https://radimrehurek.com/gensim/models/ldamodel.html [accessed 2022-07-05]
  44. Gensim. Parallelized Latent Dirichlet Allocation.   URL: https://radimrehurek.com/gensim/models/ldamulticore.html [accessed 2022-07-05]
  45. Gensim. Latent Semantic Indexing.   URL: https://radimrehurek.com/gensim/models/lsimodel.html [accessed 2022-07-05]
  46. Gensim. Non-Negative Matrix Factorization.   URL: https://radimrehurek.com/gensim/models/nmf.html [accessed 2022-07-05]
  47. Gensim. Hierarchical Dirichlet Process.   URL: https://radimrehurek.com/gensim/models/hdpmodel.html [accessed 2022-07-05]
  48. PyPI. advertools 0.13.1.   URL: https://pypi.org/project/advertools/ [accessed 2022-07-05]
  49. Alomari K, ElSherif H, Shaalan K. Arabic tweets sentimental analysis using machine learning. In: Advances in Artificial Intelligence: From Theory to Practice. Cham: Springer International; 2017:602-610.
  50. Peisenieks J, Skadins R. Uses of machine translation in the sentiment analysis of tweets. 2014 Presented at: Human Language Technologies – The Baltic Perspective - Proceedings of the Sixth International Conference; 2014; Kaunas, Lithuania p. 2014. [CrossRef]
  51. Şaşmaz E, Tek F. Tweet sentiment analysis for cryptocurrencies. 2021 Presented at: 6th International Conference on Computer Science and Engineering (UBMK); September 15-17, 2021; Ankara, Turkey p. 613-618. [CrossRef]
  52. Golubev A, Loukachevitch N. Improving results on Russian sentiment datasets. In: Communications in Computer and Information Science. Cham: Springer International; 2020:109-121.
  53. Nabil M, Aly M, Atiya A. ASTD: Arabic Sentiment Tweets Dataset. 2015 Presented at: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing; 2015; Lisbon, Portugal. [CrossRef]
  54. Rustam F, Khalid M, Aslam W, Rupapara V, Mehmood A, Choi GS. A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis. PLoS One 2021 Feb 25;16(2):e0245909 [FREE Full text] [CrossRef] [Medline]
  55. Hugging Face. cardiffnlp / twitter-roberta-base-sentiment.   URL: https://huggingface.co/cardiffnlp/twitter-roberta-base-sen timent [accessed 2022-07-19]
  56. About Your Activity Dashboard.   URL: https://help.twitter.com/en/managing-your-account/using-the-tweet-activity-dashboard [accessed 2022-07-05]
  57. Daniluk M, Dabrowski J, Rychalska B. Synerise at RecSys 2021: Twitter user engagement prediction with a fast neural model. 2021 Presented at: RecSysChallenge ’21: Proceedings of the Recommender Systems Challenge 2021; 2021; New York, NY. [CrossRef]
  58. Razis G, Anagnostopoulos I. InfluenceTracker: rating the impact of a Twitter account. 2014 Presented at: IFIP International Conference on Artificial Intelligence Applications and Innovations; September 19-21, 2014; Rhodes, Greece. [CrossRef]
  59. Son J, Lee J, Oh O, Lee HK, Woo J. Using a heuristic-systematic model to assess the Twitter user profile’s impact on disaster tweet credibility. Int J Inf Manag 2020 Oct;54:102176. [CrossRef]
  60. Marinai S, Dengel A. Document Analysis Systems VI: 6th International Workshop, DAS 2004, Florence, Italy, September 8-10, 2004, Proceedings. Berlin, Heidelberg: Springer; 2004.
  61. statsmodels.   URL: https://www.statsmodels.org/stable/index.html [accessed 2022-07-05]
  62. statsmodels.tsa.arima.model.ARIMA.   URL: https://www.statsmodels.org/devel/generated/statsmodels.tsa.arima.model.ARIMA.html [accessed 2022-07-05]
  63. statsmodels.tsa.statespace.sarimax.SARIMAX.   URL: https://www.statsmodels.org/devel/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html [accessed 2022-07-05]
  64. PyPI. pycaret.   URL: https://pypi.org/project/pycaret/ [accessed 2022-07-05]
  65. PyPI. prophet.   URL: https://pypi.org/project/prophet/ [accessed 2022-07-05]


ARC: advanced research computing
ARIMA: autoregressive integrated moving average
CDC: Centers for Disease Control and Prevention
CPU: central processing unit
HCP: health care professional
HDP: hierarchical Dirichlet process
LDA: latent Dirichlet allocation
LSI: latent semantic indexing
MAE: mean absolute error
ML: machine learning
MSE: mean squared error
NGO: nongovernment organization
NIH: National Institutes of Health
NLP: natural language processing
NMF: nonnegative matrix factorization
RMSE: root-mean-square error
SARIMAX: seasonal autoregressive integrated moving average with exogenous factors
SMP: social media platform
TF-IDF: term frequency–inverse document frequency
WHO: World Health Organization


Edited by T Hao; submitted 09.03.22; peer-reviewed by S Doan, A Benis; comments to author 27.06.22; revised version received 08.07.22; accepted 15.07.22; published 18.08.22

Copyright

©Aditya Singhal, Manmeet Kaur Baxi, Vijay Mago. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 18.08.2022.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.