Published on in Vol 11 (2023)

This is a member publication of Lancaster University (Jisc)

Preprints (earlier versions) of this paper are available at, first published .
How People With a Bipolar Disorder Diagnosis Talk About Personal Recovery in Peer Online Support Forums: Corpus Framework Analysis Using the POETIC Framework

How People With a Bipolar Disorder Diagnosis Talk About Personal Recovery in Peer Online Support Forums: Corpus Framework Analysis Using the POETIC Framework

How People With a Bipolar Disorder Diagnosis Talk About Personal Recovery in Peer Online Support Forums: Corpus Framework Analysis Using the POETIC Framework

1Division of Health Research, Spectrum Centre for Mental Health Research, Lancaster University, , Lancaster, , United Kingdom

2UCREL Research Centre, School of Computing and Communications, Lancaster University, , Lancaster, , United Kingdom

3Faculty of Arts and Social Sciences, Department of Linguistics and English Language, Lancaster University, , Lancaster, , United Kingdom

Corresponding Author:

Steven Huntley Jones, PhD

Background: Personal recovery is of particular value in bipolar disorder, where symptoms often persist despite treatment. We previously defined the POETIC (Purpose and Meaning, Optimism and Hope, Empowerment, Tensions, Identity, Connectedness) framework for personal recovery in bipolar disorder. So far, personal recovery has only been studied in researcher-constructed environments (eg, interviews and focus groups). Support forum posts can serve as a complementary naturalistic data resource to understand the lived experience of personal recovery.

Objective: This study aimed to answer the question “What can online support forum posts reveal about the experience of personal recovery in bipolar disorder in relation to the POETIC framework?”

Methods: By integrating natural language processing, corpus linguistics, and health research methods, this study analyzed public, bipolar disorder support forum posts relevant to the lived experience of personal recovery. By comparing 4462 personal recovery–relevant posts by 1982 users to 25,197 posts not relevant to personal recovery, we identified 130 significantly overused key lemmas. Key lemmas were coded according to the POETIC framework.

Results: Personal recovery–related discussions primarily focused on 3 domains: “Purpose and meaning” (particularly reproductive decisions and work), “Connectedness” (romantic relationships and social support), and “Empowerment” (self-management and personal responsibility). This study confirmed the validity of the POETIC framework to capture personal recovery experiences shared on the web and highlighted new aspects beyond previous studies using interviews and focus groups.

Conclusions: This study is the first to analyze naturalistic data on personal recovery in bipolar disorder. By indicating the key areas that people focus on in personal recovery when posting freely and the language they use, this study provides helpful starting points for formal and informal carers to understand the concerns of people diagnosed with a bipolar disorder and to consider how to best offer support.

JMIR Med Inform 2023;11:e46544



Bipolar disorder (BD) is a severe mental health (MH) problem characterized by recurring episodes of depressed and elevated mood [1]. Its lifetime prevalence ranges from 0.1% to 2.6% internationally [2]. BD is associated with lower quality of life [3] and high suicide risk [4]. Therefore, fostering recovery and living well with BD are important societal tasks.

MH care agendas increasingly focus on enhancing personal recovery (PR), defined as “a way of living a satisfying, hopeful life even with the limitations caused by the illness” [5]. This contrasts with a previously narrower focus on reducing symptoms (clinical recovery). PR might be of particular value in BD [6], where symptoms often persist despite treatment, but has been underresearched to date [7]. Jagfeld et al [8] (hereafter the POETIC review) recently synthesized 12 qualitative studies to develop the first conceptual framework for PR in BD. The POETIC (Purpose and Meaning, Optimism and Hope, Empowerment, Tensions, Identity, Connectedness) framework, based on the CHIME (Connectedness, Hope and Optimism, Identity, Meaning and Purpose, Empowerment) framework [9], comprises the following processes: “Purpose and meaning,” “Optimism and hope,” “Empowerment,” “Tensions,” “Identity,” and “Connectedness” (see Table 1).

Table 1. The POETICa framework [8]: lived experience of personal recovery in bipolar disorder.
First-level domainsSecond-level categories
Purpose and meaning
  • Meaning of mental illness experiences
  • Paid or voluntary work
  • Quality of life
  • Meaningful life and social roles
Optimism and hope
  • Belief in possibility of recovery
  • Positive thinking and valuing success
  • Hope-inspiring relationships
  • Having dreams and aspirations
  • Self-management and personal responsibility
  • Controversial role of medication
  • Control over life
  • Balancing acceptance with ambitions
  • Openness enables support but also stigmatization
  • Ambivalence around (hypo)mania
  • Rebuilding positive sense of self
  • Overcoming stigma
  • Dimensions of identity
  • Support from others
  • Relationships
  • Peer support and support groups
  • Being part of the community

aPOETIC: Purpose and Meaning, Optimism and Hope, Empowerment, Tensions, Identify, Connectedness.

Current research on PR in BD has several limitations. First, it is mainly based on qualitative studies with few participants [10] and expert opinions, lacking quantitative evidence from larger samples [11]. Second, data collection is limited to structured settings (semistructured interviews, focus groups, and structured measures), which are not naturalistic and are subject to either interviewer bias [12] or constrained responses in structured measures. Third, recruitment is biased toward people who want to talk about PR and are in contact with services or researchers [8].

Naturalistic data collection, where “participants are not aware that they are being studied” [13], overcomes many of these limitations. Online forum posts are a source of naturalistic data, which can offer potential insights into “an experience as it is lived rather than as it is enacted in the researcher constructed environment” [14]. Some natural language processing (NLP) studies have analyzed large numbers of BD online forum posts via automatic quantitative methods such as content analysis [15] or emotion analysis [16,17] to identify forum topics or language differences between people with different or no MH diagnoses. Qualitative studies have applied conversation analysis [18], thematic analysis [19], grounded theory [20], and content analysis [21] to BD online forum posts. Such studies offer rich nuanced accounts of web-based discussions on BD but include only few, often handpicked, posts.

Corpus linguistics [22] provides a mix of quantitative and qualitative methods informed by linguistic theory for analyzing large amounts of text data with depth and richness that can overcome some of the shortcomings of previous NLP and qualitative studies. Semino et al [23] analyzed interviews and online forum posts of patients with cancer and their carers to learn about their lived experience and the metaphors they use for dealing with cancer. Hunt and Brookes [24] applied a combination of corpus linguistics and discourse analysis [25] to MH forum posts. Two corpus-linguistics studies have focused on BD specifically: Abdo et al [26] studied linguistic types of judgments, and McDonald and Woodward-Kron [27] studied forum users’ roles and identities.

A systematic review strongly recommended considering web-based content from individuals with lived experience in PR research [10], which has not yet been done. Therefore, the main aim of this paper was to gain further insights into the experience of PR in BD from online forum posts via a combination of NLP, corpus linguistics, and qualitative health research methods. Furthermore, the POETIC framework, synthesized from data collected via interviews or focus groups, has not been applied to new data yet. Hence, the secondary aim of this paper was to validate the framework by exploring to what extent it captures experiences shared on the web. The research question covering both aims is “What can online support forum posts reveal about the experience of PR in BD in relation to the POETIC framework?”

Data Source

This study analyzed posts from the international web-based discussion platform Reddit [28], which hosts subforums (subreddits) for various topics, including BD. Several reasons motivated the choice of this site: Reddit is one of the most visited internet sites worldwide with an international user base [29]; in contrast to other online support communities, everyone can read all public posts without a user account; and Reddit allows data analysis by third parties.

Reddit users with a self-reported BD diagnosis (S-BiDD) were automatically identified by matching phrases such as “I was diagnosed with bipolar” in all posts between January 2005 (the inception of Reddit) and March 2019 (see Jagfeld et al [30]). All posts of the identified users form the S-BiDD data set. Naturalistic data collection required subsequent filtering for content relevant for PR in BD, as an exploratory study revealed that the posts in the S-BiDD data set covered many other topics (see Report S1 in Multimedia Appendix 1). Figure 1 displays a flow chart for construction of the PR-BD corpus. In linguistics, a corpus is a sampled collection of texts representing a particular language variety [31]. The basis for the corpus was only posts in BD subreddits [32] (fourth level=“bipolar”), because a second exploratory study found that references to “recovery” and associated word forms were almost exclusively in relation to BD in BD subreddits (see Report S2 in Multimedia Appendix 1). Furthermore, only posts mentioning BD [33] were selected because only two-thirds (66%) of MH-related “recovery” mentions in BD subreddits referred to BD.

Figure 1. Flowchart of the 4 steps to create the PR-BD corpus and reference corpus. BD: bipolar disorder; MH: mental health; PR: personal recovery; S-BiDD: self-reported bipolar disorder diagnosis.

To select PR-relevant posts, a list of PR terms (comprising both single words and multiword phrases; n=562) [34] was compiled using corpus-linguistics methods (Document S2 in Multimedia Appendix 1). BD subreddit posts that mentioned BD were ranked according to their similarity with the PR terms list via term frequency–inverse document frequency–weighted cosine similarity, a standard information retrieval approach [35,36] (see Document S3 in Multimedia Appendix 1). To determine the cosine similarity cutoff, GJ coded whether 90 posts pertained to PR in BD using a preliminary codebook based on the second exploratory study and the POETIC codebook. SHJ audited the coding. To select the 90 posts, 10 posts were randomly sampled from every 10% quantile of the cosine similarity scores, taking only 10 posts from the first 2 quantiles that all scored 0. Following this, a minimum length of 94 words was set, as 5 posts shorter than this length lacked context to decide on their PR relevance (see Figure S5 in Multimedia Appendix 1). The codebook was refined to its final version (Document S4 in Multimedia Appendix 1). GJ and CH blindly coded 120 additional posts, again randomly sampled from each quantile of the cosine scores.

Ethical Considerations

The Lancaster University Faculty of Health and Medicine research ethics committee approved this research in May 2019 (reference FHMREC18066), which follows ethics guidelines for internet-mediated research [37]. It was infeasible to seek individual informed consent from the large number of included forum users, but quotes were paraphrased to protect users’ anonymity (see Document S5 in Multimedia Appendix 1). We recognize that some people may object to the use of web-based posts as research data without individual consent (eg, [38]). Users generally post to share information or seek support and do not directly provide their content for research. However, we believe that on balance, the benefits of this research to better understand PR makes it worthwhile while acknowledging these potential objections.

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

Involvement of People With Lived Experience

In all, 4 volunteers with lived experience of BD, who use online forums, and were recruited via People in Research [39] provided input on the study design, results, and subsequent plans in individual web-based meetings. All 4 volunteers—1 man and 3 women—were UK-based and in their 30s to 40s; additionally, 3 (75%) reported a bipolar II disorder diagnosis, 1 (25%) did not further specify their BD diagnosis, and at least 2 (50%) had a migrant background. Importantly, all volunteers were very supportive of the project, and none raised ethical concerns. After study completion, the volunteers were reinvited to provide feedback on our interpretations of the results.


Reflexivity is important to highlight how subjectivity may have impacted on research findings [40]. The research team embraces a PR approach in BD. GJ, FL, and SHJ previously developed the POETIC framework for PR in BD. They anticipated that it would capture many aspects shared on the web, but data analysis would reveal new aspects and deeper insights into the experience of PR in everyday life.

Corpus Framework Analysis

Data analysis drew on methods from corpus linguistics [22] and qualitative framework analysis [41], which we call corpus framework analysis. Quantitative corpus-linguistics methods derive frequency lists of the words in the corpus; identify keywords that occur statistically significantly more frequently in the corpus compared to other language samples; and find collocations, that is, words a target word co-occurs with more frequently than by chance. The main qualitative method is to analyze the context of specific words or phrases in so called concordances. Key lemmas in the PR-BD corpus were identified by comparing it to a reference corpus of posts with low similarity to the PR terms list via #LancsBox (version 6.0; Lancaster University) [42]. A lemma is the dictionary form of a word; for example, “recovering” and “recovered” are word forms of the lemma “recover.” To focus on the most salient topics of the PR-BD corpus, key lemmas overused at least twice at a significance level of P<.0001 [43] and used by at least 5% of users were analyzed. See Document S6 in Multimedia Appendix 1 for methodological details.

The key lemmas were coded into the POETIC framework via concordance analysis. First, overall impressions of all concordances were noted after sorting them according to the lemma, left, and right context (20 words each) in #LancsBox. Subsequently, 30 randomly sampled concordances for each key lemma were coded into the second-level POETIC categories (see the codebook in Appendix B of Jagfeld et al [8]). The coders read the full post if the 40 words did not provide enough context and noted impressions for every key lemma again. Finally, concordances that did not fit into an existing POETIC category were coded inductively. GJ coded all key lemmas, and SHJ and FL audited 6 key lemmas each.

Finally, new key lemmas that were not in the PR terms list and absent PR terms [44] were analyzed. Absence was defined as 0 frequency in the PR-BD corpus or a lower relative frequency than in the reference corpus. Additionally, collocations were analyzed via the #LancsBox GraphColl tool for some key lemmas. To do so, content words (noun, verb, adjective, and adverb) within a context of 5 words left and right of the target term and a minimum collocation frequency of 5 were ranked according to cubed mutual information [24].

The S-BiDD data set [45] contains 21,407,595 posts by 19,685 users (available for noncommercial research after signing a data usage agreement). The programming code is publicly available [46].

Coding the PR Relevance of Posts and Constructing the PR-BD Corpus

Following a blind trial of coding the PR relevance of 20 posts and a subsequent discussion, GJ and CH achieved moderate agreement (Cohen κ=0.51; 77/100, 77% observed agreement) in coding the remaining 100 posts (see Table S20 in Multimedia Appendix 1). The team resolved all disagreements. In total, 66 (31%) of the 210 posts were coded as PR relevant (see Table S21 in Multimedia Appendix 1). Based on this, the PR-BD corpus comprises posts with a PR score above 0.025 to balance precision in selecting PR-relevant posts and corpus size (see Table S22 in Multimedia Appendix 1). The PR-BD corpus has 4462 posts with 1,337,080 words by 1982 users. The reference corpus of posts with a PR score below 0.013 (see Table S23 in Multimedia Appendix 1) comprises 25,197 posts with 4,700,834 words by 6075 users.

Concordance Analysis With the POETIC Framework

In all, 130 lemmas met the prespecified keyness criteria. Figure 2 shows the domain and category frequencies, extrapolated from the 30 concordance lines coded for each key lemma and color coded according to Tol’s [47] light scheme for color-blindness accessibility. Table S28 in Multimedia Appendix 1 lists the key lemmas coded into each category, and Table S29 in Multimedia Appendix 1 lists the categories that each key lemma was coded into. Overall, the POETIC framework captured the experiences in the PR-BD corpus very well: there was evidence for all categories. Only 16% (9303/59,199) of key lemma instances fell into the new “Not POETIC” domain rather than the existing framework. The text below briefly reviews each domain with key lemmas in italics, highlighting the differences between the original framework and the web-based data. Tables S30 and S31 in Multimedia Appendix 1 provide illustrative quotes for all categories.

Figure 2. Frequency of POETIC domains and categories and new categories. MH: mental health; POETIC: Purpose and Meaning, Optimism and Hope, Empowerment, Tensions, Identity, Connectedness.

Purpose and Meaning

“Purpose and meaning” was the most frequent domain and contained the most salient topic differences between the POETIC review and web-based data. Although participants in the POETIC review mainly discussed the meaningful life and social role of being a parent, web-based discussions focused on reproductive decisions. Participants discussed perceived risks they might be responsible for, for example, the possibility for their child to develop MH issues that affected their decision or choice to bear and raise kids. In the “Work” category, extended to include formal education, many discussions focused on struggles around studying and graduating college. No participant in the POETIC review reported financial or housing issues, whereas several web-based users complained about a low quality of life due to money problems, causing homelessness or inability to afford treatment. Spirituality was discussed more frequently and richer than in the POETIC review. Users often wondered whether to regard their experiences as truly spiritual or rather as (hypo)manic symptoms.

Optimism and Hope

Reddit users differed in their “Belief” in the possibility of recovery. The mainstream opinion on Reddit was that BD is a “chronic condition that cannot be cured, only managed.” Users questioned whether feeling fully recovered was not just a temporary experience caused by (hypo)mania. In the “Positive thinking and valuing success” category, many users were grateful for aspects of their BD experiences; for example, challenges provide opportunities for growth and demonstrating strength.


As in the POETIC review, “Self-management and personal responsibility” was the most frequent and richest category. Forum users generally considered (taking steps towards) maintaining a healthy lifestyle (including routines or schedules, diet, exercise, and coping skills) as an individual’s responsibility to reach recovery. In contrast, experiencing MH symptoms or feeling stuck in their recovery was regarded outside of someone’s responsibility if they followed professional or mainstream forum advice. The “Controversial role of medication” category included concerns about drug effects on the baby during pregnancy or nursing and alternative non–evidence-based treatments such as the keto diet or cannabis, which were not present in the POETIC review.


Experiences coded in the “Tensions” domain were similar to the POETIC review. Several participants shared feeling more comfortable to discuss “Ambivalence” around (hypo)mania on the web. Some asked if there was a possibility to enjoy increased motivation and confidence to make progress in their goals without the hypomania getting out of control.


Some participants shared rich success stories in the “Rebuilding positive sense of self” category, in which they moved away from shame and guilt by forgiving themselves for past behaviors and toward accepting themselves, whereas others were struggling with this process. Shame associated with stigma in the society was another focus of discussions, and some participants shared creative ways for overcoming stigma.


Regarding “Connectedness,” users mainly discussed relationships and support from others. Although there were positive accounts, participants often discussed struggles with romantic relationships or marriage and friendships and complained about issues with professional and family support, similar to the POETIC review. However, the web-based accounts, particularly those of relationship and family problems, appeared more candid, for example, with them also discussing sexual issues, trauma, and shame.


Inductive coding of the 645 concordance lines that did not fit into the POETIC framework revealed that they were unrelated to individuals’ PR or lived experience. Most quotes discussed other MH issues without PR relevance (symptoms, genetics and heredity, treatment, diagnosis, societal issues, and scientific research), followed by storytelling of their own or others’ situation without PR relevance; direct interactions between forum users, for example, giving advice or congratulating; and discussions of non-MH issues.

New PR Terms

Although 99 (76%) of 130 key lemmas were PR terms, 31 (24%) key lemmas were new. Of these 31 lemmas, 15 (48%) conveyed similar meanings to PR terms; for example, brother likened other family members in the PR terms list such as son or nephew. Another 7 (23%) new key lemmas introduced aspects not covered by PR terms. For example, baby, raise, and bear were related to reproductive decision-making; childhood was related to making sense of MH issues via early traumatic experiences; and environment was related to a focus on structural or societal circumstances rather than the individual (see Table S32 in Multimedia Appendix 1).

Absent PR Terms

Only 13% (n=54) of the 416 unique PR terms (after removing spelling and phraseological variants) were absent: 46 (11%) did not appear in the PR-BD corpus and 8 (2%) were underused compared to the reference corpus. The underused PR terms referred to symptoms (high mood, mania, manic, and sleep) or medical MH professionals (doctor, pdoc [psychiatrist], and psychiatrist; see Table S33 in Multimedia Appendix 1). These terms were relevant for some PR domains but also were strongly associated with clinical recovery. All PR terms missing in the PR-BD corpus were also missing in the reference corpus. They were mostly complex phrases, for example, brush yourself off, and none indicated aspects that were not covered by other key lemmas (see Table S34 in Multimedia Appendix 1).

Feedback From People With Lived Experience

Two volunteers who had commented on the first exploratory study provided feedback on the main study results. Overall, they valued the results and agreed with our findings but indicated limitations of the data, as reflected in the Discussion section. One volunteer argued that categorizations of experiences can be problematic for masking individual differences. Conversely, the other volunteer had found it particularly helpful to align some of her behaviors with CHIME categories because this gave her a sense of being on the right track.

This study analyzed Reddit posts of people with a BD diagnosis via corpus framework analysis to learn about the lived experience of PR in BD and validate the POETIC framework.

Key Findings in Relationship to Previous Work

The primary study aim was to provide new insights on PR in BD. Indeed, the web-based data contained candid, in-the-moment experiences that traditional qualitative data collection is unlikely to retrieve. For example, 1 user posted about their experiences in a current manic episode on 2 subsequent days: “Yesterday I posted here about the realization that I’ve entered a manic episode.” Other users shared things on the web that they had not shared elsewhere: ‘Talking about this part of my inner world to a psychiatrist would require a lot of trust for me.” The users had different interpretations of elated mood as signs of recovery, spiritual experiences, helpful motivational boosts, or dangerous MH symptoms to avoid. Quantitative [48,49] and qualitative [50,51] evidence shows that web-based anonymity affords personal self-disclosures and discussions of sensitive and stigmatized issues.

The results show that 3 POETIC domains were featured the most in Reddit discussions: “Purpose and meaning” (particularly reproductive decision making, work, and formal education), “Connectedness” (romantic relationships and social support), and “Empowerment” (self-management and personal responsibility). In line with a recent quantitative review [52], the concerns raised on Reddit pointed to a wide range of social and occupational functioning among people with a BD diagnosis: some were not working or leaving their house and therefore sought support on the web, whereas others asked for specific advice to further improve their already functional lifestyle. The popularity of the “Self-management and personal responsibility” category agrees with recent quantitative findings. A review by Mezes et al [53] found positive associations between PR and psychological characteristics focusing on control and personal agency, and a longitudinal study identified positive impacts of adaptive coping and balanced risk-taking on PR [54].

Importantly, the analysis highlighted PR issues that exclusively or more frequently came up on the web. This might be due to differences in sample demographics and data collection methods between this study and those included in the POETIC review. Users in the S-BiDD data set were younger than those in the studies included in the POETIC review: the S-BiDD data set users had a mean age of 32 years versus 45 years in the POETIC review, 30% (5866/19,685) versus 17% (18/163) of participants were aged between 18-29 years, and 7% (1299/19,685) versus 34% (36/163) were aged between 50-64 years [8,30]. This might explain why perspectives on transitioning into adulthood with BD, challenges of college education, and reproductive decision-making exclusively surfaced in the web-based data. Sahota and Sankar [20] summarized their qualitative analysis of discussions of genetic risk and reproductive decision-making in 2 BD subreddits as centering around the manageability of parenting a child for people with a BD diagnosis, which aligns well with the experiences found in this study.

Moreover, users in the S-BiDD data set were overwhelmingly from the United States [30], whereas all POETIC review studies stemmed from countries that provide at least a basic level of free public MH care and social security (the United Kingdom, Norway, Australia, Canada, China, Spain, and Turkey). This may explain why existential financial issues such as (threat of) homelessness and the inability to afford treatment surfaced only in the web-based data. Since health insurance in the United States (except for Medicare for those aged 65+ y) is either employer provided or privately paid, individuals who cannot work due to their MH issues lose their insurance and in turn access to professional support, often causing MH issues to exacerbate, for example, by abruptly stopping medication. One Reddit user described this as a “vicious cycle.” It also appears plausible that Reddit users stem from a different socioeconomic group than the participants recruited into the POETIC review studies.

The secondary aim of this study was to validate the POETIC framework. Results confirmed that the framework usefully captured PR experiences shared on the web. Web-based users discussed all second-level POETIC categories, and only 645 of the 3900 analyzed concordance lines could not be accommodated in the framework, demonstrating its comprehensiveness.

Strengths and Limitations

Three aspects of this study constitute both strengths and limitations. First, using online forums as a data source provided rich, candid, and in-the moment experiences. However, there is limited background and demographic information on the online forum users (but see Jagfeld et al [30] for an analysis of these properties in the users in the S-BiDD data set), and they are not representative of the general population with a BD diagnosis. One user in the PR-BD corpus posted “My hunch is that r/bipolarreddit overrepresents those who are struggling, who, understandably, may be more pessimistic about everything.” One volunteer shared his experience that discussions on Reddit MH forums mainly followed a mainstream opinion and that deviant opinions were ignored or suppressed. McDonald and Woodward-Kron [27] support this with corpus-linguistics evidence that BD forum users over time shifted from advice seeking to giving and used more medicalized language. Similarly, Vayreda and Antaki [18] showed that established BD forum users urged new members to seek a formal diagnosis and reinforced a biomedical view of BD. Our Reddit study provides one lens on the lived experience of some people that can complement studies of other MH forums and other sources, such as one-on-one interviews.

Second, the list of PR terms facilitated focusing on the concept of interest among the wealth of data, yet it arguably biased the data selection. Nevertheless, 52% (16/31) of the key lemmas that were not PR terms contributed new PR aspects. Moreover, explicitly stating our expectations of PR aspects via the terms list enabled us to identify absent aspects in the data.

Third, corpus-linguistics methods, particularly the coding of key lemmas, allowed the analysis of more data than traditional qualitative methods. However, single words probably more readily capture topic-like (eg, “Relationships”) rather than theme-like (eg, “Balancing acceptance with ambitions”) categories. Therefore, the relative category frequencies should be interpreted with some caution.

Research Implications

This study has at least 4 research implications. First, it demonstrates the usefulness of analyzing online forum posts to tap into authentic and candid accounts of lived experience of MH issues. Second, this study serves as the first validation of the POETIC framework. Ideally, this encourages other researchers to apply it in their research. Third, the combination of corpus linguistics and qualitative framework analysis allowed the analysis of large amounts of data. Hence, corpus framework analysis may also be useful for future studies of text data, such as therapy transcripts (eg, [55]). Lastly, the S-BiDD data set and derived corpora are available for future research, for example, on other aspects of the lived experience of BD.

Clinical Implications

This study identifies the key issues relevant to PR in BD shared by people with lived experience on the web and extends previous knowledge from interviews and focus groups. These findings, including the quotes in Tables S30 and S31 in Multimedia Appendix 1, are a rich resource for understanding more about the experience of PR in BD for individuals living with BD, their loved ones and informal carers, and MH professionals. This is also relevant for recent initiatives to educate MH professionals on the lived experience of severe MH issues, such as the current “Understanding psychosis and BD” training for the UK National Health Service [56]. Subsequently, issues identified in this study may provide helpful starting points for therapists to collaboratively consider them with their clients, for example, in recovery-focused therapy [57,58].

Individuals discussed issues on the web that they considered contentious and personal and were not comfortable sharing offline, such as sexuality, spirituality, and (hypo)mania. Recovery-focused therapies that are free to work with whatever model the clients bring for their BD experiences [58] may be particularly suitable to create a therapeutic environment where clients feel comfortable to discuss such sensitive issues. Moreover, Jones et al [59] showed that recovery-focused therapy reduces the positive self-appraisal of hypomanic experiences.

Reproductive decision-making surfaced as a major issue for young adults living with BD, and dedicated counseling on this topic may be advisable. Although understanding genetic vulnerability and risk data in MH is challenging, there is evidence that genetic counseling can offer effective support [60].


This study analyzed 4462 Reddit posts by 1982 people with an S-BiDD within the POETIC framework [8] for PR in BD. It is the first to analyze online forum data on PR. This study confirmed the validity of the POETIC framework to also capture PR experiences shared on the web and highlighted new aspects in PR that did not come up in previous studies using interviews and focus groups. It also demonstrated the utility of integrating corpus linguistics and qualitative framework analysis to identify key themes within large text data sets. By indicating the key areas that people focus on when posting freely, this study provides rich insights into the lived experience of PR in BD for formal and informal carers of people with a BD diagnosis.


The authors wish to thank the volunteers with lived experience who provided helpful insights for conducting and writing up this research. GJ is grateful to Enrica Troiano for insightful discussions on designing and evaluating the coding for personal recovery relevance. She would also like to thank Gavin Brookes and the participants of the 6th Corpora & Discourse International Conference for helpful comments on the corpus construction and analysis. The authors also thank Matthew Coole for testing the code release. This study was completed as part of a PhD studentship for GJ, which was funded by the Faculty of Health and Medicine at Lancaster University, United Kingdom.

Authors' Contributions

GJ led on the study design, supported by SHJ, FL, and PR. GJ collected the data, conducted the analyses, and drafted the manuscript. SHJ and FL audited the analyses (personal recovery relevance of posts and framework analysis of concordance lines). CH double coded the personal recovery relevance of posts and reviewed the paraphrasing of selected quotes. All coauthors discussed and agreed on the results, commented on the draft manuscript, and approved the final version. SHJ, FL, and PR obtained the funding for this study.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Supplementary reports and documents.

DOCX File, 513 KB

  1. Jones SH, Lobban F, Cook A, Division of Clinical Psychology. Understanding Bipolar Disorder: Why Some People Experience Extreme Mood States and What Can Help: The British Psychological Society; 2010. [CrossRef]
  2. Rowland TA, Marwaha S. Epidemiology and risk factors for bipolar disorder. Ther Adv Psychopharmacol 2018 Sep;8(9):251-269 [CrossRef] [Medline]
  3. Michalak EE, Yatham LN, Lam RW. Quality of life in bipolar disorder: a review of the literature. Health Qual Life Outcomes 2005 Nov 15;3:72 [CrossRef] [Medline]
  4. Novick DM, Swartz HA, Frank E. Suicide attempts in bipolar I and bipolar II disorder: a review and meta-analysis of the evidence. Bipolar Disord 2010 Feb;12(1):1-9 [CrossRef] [Medline]
  5. Anthony WA. Recovery from mental illness: the guiding vision of the mental health service system in the 1990s. Psychosoc Rehabil J 1993;16(4):11-23 [CrossRef]
  6. Murray G, Leitan ND, Thomas N, et al. Towards recovery-oriented Psychosocial interventions for bipolar disorder: quality of life outcomes, stage-sensitive treatments, and Mindfulness mechanisms. Clin Psychol Rev 2017 Mar;52:148-163 S0272-7358(16)30220-3. [CrossRef] [Medline]
  7. van Weeghel J, van Zelst C, Boertien D, Hasson-Ohayon I. Conceptualizations, assessments, and implications of personal recovery in mental illness: a scoping review of systematic reviews and meta-analyses. Psychiatr Rehabil J 2019 Jun;42(2):169-181 [CrossRef] [Medline]
  8. Jagfeld G, Lobban F, Marshall P, Jones SH. Personal recovery in bipolar disorder: systematic review and "best fit" framework synthesis of qualitative evidence - a POETIC adaptation of CHIME. J Affect Disord 2021 Sep 1;292:375-385 [CrossRef] [Medline]
  9. Leamy M, Bird V, Le Boutillier C, Williams J, Slade M. Conceptual framework for personal recovery in mental health: systematic review and narrative synthesis. Br J Psychiatry 2011 Dec;199(6):445-452 [CrossRef] [Medline]
  10. Stuart SR, Tansey L, Quayle E. What we talk about when we talk about recovery: a systematic review and best-fit framework synthesis of qualitative literature. J Ment Health 2017 Jun;26(3):291-304 [CrossRef] [Medline]
  11. Slade M, Leamy M, Bacon F, et al. International differences in understanding recovery: systematic review. Epidemiol Psychiatr Sci 2012 Dec;21(4):353-364 [CrossRef] [Medline]
  12. Briggs CL. Learning How to Ask: A Sociolinguistic Appraisal of the Role of the Interview in Social Science Research: Cambridge University Press; 1986. [CrossRef]
  13. Janetzko D. Nonreactive data collection online. In: Fielding NG, Lee RM, Blank G, editors. The SAGE Handbook of Online Research Methods: SAGE Publications Ltd; 2016:76-91 [CrossRef]
  14. Seale C, Charteris-Black J, MacFarlane A, McPherson A. Interviews and Internet forums: a comparison of two sources of qualitative data. Qual Health Res 2010 May;20(5):595-606 [CrossRef] [Medline]
  15. Kramer ADI, Fussell SR, Setlock LD. Text analysis as a tool for analyzing conversation in online support groups. Presented at: CHI EA ’04: CHI ’04 Extended Abstracts on Human Factors in Computing Systems; Apr 24-29, 2004; Vienna Austria p. 1485-1488 [CrossRef]
  16. Gkotsis G, Oellrich A, Hubbard T, et al. The language of mental health problems in social media. Presented at: Proceedings of the Third Workshop on Computational Lingusitics and Clinical Psychology; Jun 16, 2016; San Diego, CA, USA p. 63-73 [CrossRef]
  17. Coppersmith G, Dredze M, Harman C, Hollingshead K. From ADHD to SAD: analyzing the language of mental health on Twitter through self-reported diagnoses. Presented at: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology; Jun 5, 2015; Denver, Colorado p. 1-10 [CrossRef]
  18. Vayreda A, Antaki C. Social support and unsolicited advice in a bipolar disorder online forum. Qual Health Res 2009 Jul;19(7):931-942 [CrossRef] [Medline]
  19. Mandla A, Billings J, Moncrieff J. ”Being bipolar”: a qualitative analysis of the experience of bipolar disorder as described in internet blogs. Issues Ment Health Nurs 2017 Oct;38(10):858-864 [CrossRef] [Medline]
  20. Sahota PKC, Sankar PL. Bipolar disorder, genetic risk, and reproductive decision-making: a qualitative study of social media discussion boards. Qual Health Res 2020 Jan;30(2):293-302 [CrossRef] [Medline]
  21. Bauer R, Bauer M, Spiessl H, Kagerbauer T. Cyber-support: an analysis of online self-help forums (online self-help forums in bipolar disorder). Nord J Psychiatry 2013 Jun;67(3):185-190 [CrossRef] [Medline]
  22. McEnery T, Hardie A. Corpus Linguistics: Method, Theory and Practice: Cambridge University Press; 2011. [CrossRef]
  23. Semino E, Demjén Z, Hardie A, Payne S, Rayson P. Cancer and the End of Life: A Corpus-Based Study: Routledge, Taylor & Francis; 2017. [CrossRef]
  24. Hunt D, Brookes G. Corpus, Discourse and Mental Health: Bloomsbury Academic; 2020. [CrossRef]
  25. Baker P. Using Corpora in Discourse Analysis: Continuum; 2006. [CrossRef]
  26. Abdo M, Ali A, Sarhan N. Analyzing judgment in bipolar depression patients’ narratives using syntactic patterns: a corpus-based study. Egyptian Journal of Language Engineering 2019 Apr 1;6(1):1-11 [CrossRef]
  27. McDonald D, Woodward-Kron R. Member roles and identities in online support groups: perspectives from corpus and systemic functional linguistics. Discourse & Communication 2016 Apr;10(2):157-175 [CrossRef]
  28. Reddit. URL: [accessed 2023-10-02]
  29. Reddit. Alexa. 2022. URL: [accessed 2020-02-27]
  30. Jagfeld G, Lobban F, Rayson P, Jones SH. Understanding who uses Reddit: profiling individuals with a self-reported bipolar disorder diagnosis. Presented at: Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology; Jun 11, 2021; Online p. 1-14 [CrossRef]
  31. McEnery T, Xiao R, Tono Y. Corpus-Based Language Studies: An Advanced Resource Book: Routledge, Taylor & Francis; 2006.
  32. Jagfeld G. Subreddit topics. GitHub. 2022. URL: [accessed 2023-02-23]
  33. Jagfeld G. List of bipolar disorder terms. GitHub. 2021. URL:​glorisonne/​reddit_bd_user_characteristics/​blob/​master/​disclosure-patterns/​condition-terms/​bipolar-filter-terms.​txt [accessed 2023-02-23]
  34. Jagfeld G, Lobban F, Humphreys C, Rayson P, Jones SH. List of personal recovery terms. GitHub. 2022. URL: [accessed 2023-02-23]
  35. Spärck Jones K. A statistical interpretation of term specificity and its application in retrieval. J Doc 2004 Oct 1;60(5):493-502 [CrossRef]
  36. Robertson S. Understanding inverse document frequency: on theoretical arguments for IDF. J Doc 2004 Oct;60(5):503-520 [CrossRef]
  37. Research Board. Ethics guidelines for internet-mediated research. The British Psychological Society. 2021 Jun 7. URL: [accessed 2023-10-02]
  38. Morant N, Chilman N, Lloyd-Evans B, Wackett J, Johnson S. Acceptability of using social media content in mental health research: a reflection. comment on “Twitter users” views on mental health crisis resolution team care compared with stakeholder interviews and focus groups: qualitative analysis. JMIR Ment Health 2021 Aug 17;8(8):e32475 [CrossRef] [Medline]
  39. People in Research. URL: [accessed 2023-10-02]
  40. Finlay L, Gough G, editors. Reflexivity: A Practical Guide for Researchers in Health and Social Sciences: John Wiley & Sons; 2008.
  41. Pope C, Ziebland S, Mays N. Qualitative research in health care. analysing qualitative data. BMJ 2000 Jan 8;320(7227):114-116 [CrossRef] [Medline]
  42. Brezina V, Platt W, McEnery T. #Lancsbox v. 6.0. Lancaster University. 2021. URL: [accessed 2023-10-02]
  43. Rayson P, Berridge D, Francis B. Extending the Cochran rule for the comparison of word frequencies between Corpora. Presented at: 7th International Conference on Statistical Analysis of Textual Data (JADT 2004); Mar 10-12, 2004; Louvain-la-Neuve, Belgium p. 926-936
  44. Duguid A, Partington A. Absence: you don’t know what you’re missing. Or do you? In: Taylor C, Marchi A, editors. Corpus Approaches to Discourse: A Critical Review: Routledge, Taylor & Francis; 2018:38-59 [CrossRef]
  45. Jagfeld G, Jones SH, Rayson P, Lobban F. Self-reported BD diagnosis (S-BiDD) data set. Lancaster University. 2022. URL: 10.17635/lancaster/researchdata/589 [accessed 2023-10-02]
  46. Jagfeld G. Code accompanying the paper “how people with a bipolar disorder diagnosis talk about personal recovery in peer online support forums: corpus framework analysis using POETIC.”. GitHub. 2022. URL: [accessed 2023-02-23]
  47. Tol P. Colour schemes. Netherlands Institute for Space Research. 2021 Aug 18. URL: [accessed 2023-10-02]
  48. De Choudhury M, De S. Mental health discourse on Reddit: self-disclosure, social support, and anonymity. Presented at: Eighth International AAAI Conference on Weblogs and Social Media; Jun 1-4, 2014; Ann Arbor, MI p. 71-80 [CrossRef]
  49. Pavalanathan U, de Choudhury M. Identity management and mental health discourse in social media. Proc Int World Wide Web Conf 2015 May;2015(Companion):315-321 [CrossRef] [Medline]
  50. Wright K. Communication in health-related online social support groups/communities: a review of research on predictors of participation, applications of social support theory, and health outcomes. Review of Communication Research 2016;4:65-87 [CrossRef]
  51. Smith-Merry J, Goggin G, Campbell A, McKenzie K, Ridout B, Baylosis C. Social connection and online engagement: insights from interviews with users of a mental health online forum. JMIR Ment Health 2019 Mar 26;6(3):e11084 [CrossRef] [Medline]
  52. Akers N, Lobban F, Hilton C, Panagaki K, Jones SH. Measuring social and occupational functioning of people with bipolar disorder: a systematic review. Clin Psychol Rev 2019 Dec;74 [CrossRef] [Medline]
  53. Mezes B, Lobban F, Costain D, et al. Recovery beyond clinical improvement - recovery outcomes measured for people with bipolar disorder between 1980 and 2020. J Affect Disord 2022 Jul 15;309:375-392 S0165-0327(22)00412-8. [CrossRef] [Medline]
  54. Mezes B, Lobban F, Costain D, Longson D, Jones SH. Psychological factors in personal and clinical recovery in bipolar disorder. J Affect Disord 2021 Feb 1;280(Pt A):326-337 [CrossRef] [Medline]
  55. Tay D, Qiu H. Modeling linguistic (a)synchrony: a case study of therapist-client interaction. Front Psychol 2022 May 23;13:903227 [CrossRef] [Medline]
  56. Understanding psychosis and bipolar disorder. Health Education England. 2020 Mar. URL: psychosis and bipolar disorder.pdf [accessed 2023-10-02]
  57. Tyler E, Lobban F, Sutton C, et al. A pilot randomised controlled trial to assess the feasibility and acceptability of recovery-focused therapy for older adults with bipolar disorder. BJPsych Open 2022 Oct 24;8(6):e191 [CrossRef] [Medline]
  58. Jones SH, Smith G, Mulligan LD, et al. Recovery-focused cognitive-behavioural therapy for recent-onset bipolar disorder: randomised controlled pilot trial. Br J Psychiatry 2015 Jan;206(1):58-66 [CrossRef] [Medline]
  59. Jones SH, Knowles D, Howarth E, Lobban F, Emsley R. Mediation analysis of recovery-focused therapy for recent-onset bipolar disorder. J Affect Disord Reports 2021 Jul;5:100175 [CrossRef]
  60. Hippman C, Ringrose A, Inglis A, et al. A pilot randomized clinical trial evaluating the impact of genetic counseling for serious mental illnesses. J Clin Psychiatry 2016 Feb;77(2):e190-e198 [CrossRef] [Medline]

BD: bipolar disorder
CHIME: Connectedness, Hope and Optimism, Identity, Meaning and Purpose, Empowerment
MH: mental health
NLP: natural language processing
POETIC: Purpose and Meaning, Optimism and Hope, Empowerment, Tensions, Identity, Connectedness
PR: personal recovery
S-BiDD: self-reported bipolar disorder diagnosis

Edited by Christian Lovis; submitted 23.02.23; peer-reviewed by Aude Molinard-Chenu, Eman Jaafar; final revised version received 12.09.23; accepted 13.09.23; published 08.11.23


© Glorianna Jagfeld, Fiona Lobban, Chloe Humphreys, Paul Rayson, Steven Huntley Jones. Originally published in JMIR Medical Informatics (, 8.11.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on, as well as this copyright and license information must be included.