This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
As a common mental disease, depression seriously affects people’s physical and mental health. According to the statistics of the World Health Organization, depression is one of the main reasons for suicide and self-harm events in the world. Therefore, strengthening depression detection can effectively reduce the occurrence of suicide or self-harm events so as to save more people and families. With the development of computer technology, some researchers are trying to apply natural language processing techniques to detect people who are depressed automatically. Many existing feature engineering methods for depression detection are based on emotional characteristics, but these methods do not consider high-level emotional semantic information. The current deep learning methods for depression detection cannot accurately extract effective emotional semantic information.
In this paper, we propose an emotion-based attention network, including a semantic understanding network and an emotion understanding network, which can capture the high-level emotional semantic information effectively to improve the depression detection task.
The semantic understanding network module is used to capture the contextual semantic information. The emotion understanding network module is used to capture the emotional semantic information. There are two units in the emotion understanding network module, including a positive emotion understanding unit and a negative emotion understanding unit, which are used to capture the positive emotional information and the negative emotional information, respectively. We further proposed a dynamic fusion strategy in the emotion understanding network module to fuse the positive emotional information and the negative emotional information.
We evaluated our method on the Reddit data set. The proposed emotion-based attention network model achieved an accuracy, precision, recall, and F-measure of 91.30%, 91.91%, 96.15%, and 93.98%, respectively, which are comparable with the results of state-of-the-art methods.
The experimental results showed that our model is competitive with the state-of-the-art models. The semantic understanding network module, the emotion understanding network module, and the dynamic fusion strategy are effective modules for depression detection. In addition, the experimental results verified that the emotional semantic information was effective in depression detection.
As defined in the free dictionary, depression refers to the act of depressing or state of being depressed. Depression is usually regarded as one type of mood disorder; the main clinical feature of depression is the significant and persistent mood depression. The depressed patients’ emotion can range from gloomy to grief, low self-esteem, and even to pessimism, which may cause suicidal attempts or behaviors [
With the development of the internet in people’s daily life, people began to share their feelings and problems on social media [
The goal of depression detection is to classify a person or a post as depressed or not. Accurate depression detection on social media can support the clinical treatment of depression, so the problem merits attention. The posts of patients with depression usually contain strong emotions. We give three examples of textual posts left on Reddit, two depression-indicative posts and one standard post, as follows.
Example 1: “Today, I feel so horrible, it makes me want to die I made a fool of myself at work, felt so stupid after the meeting so I left work, told the boss I’m sick. Spent the remaining afternoon in bed.” Label: depression
Example 2: “That feeling when you hate who you are as a person but can’t get yourself to change because you are so used to being like this for the past several years. I’ve become a shitty person. The thought of change seems impossible to me at this point.” Label: depression
Example 3: “Looking for cool ways to tell parents my wife is pregnant.” Label: nondepression
Examples 1 and 2 contain strong emotional information made by the patients with depression. From example 1, the words, including
We also counted the proportion of the positive words and the negative words that appeared in the depression-indicative posts and the standard posts of the Reddit data set [
Percentage of emotion words in posts.
Categories | Depression-indicative posts (%) | Standard posts (%) |
Positive emotion words | 8.62 | 9.41 |
Negative emotion words | 6.70 | 4.85 |
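As a rough illustration of how such percentages can be computed, the sketch below measures the share of tokens in a set of posts that appear in a positive or a negative word list. The toy lexicons, the tokenizer, and the function name `emotion_word_percentages` are placeholders assumed for illustration; they are not the lexicon or the preprocessing used in this study.

```python
import re

# Hypothetical toy lexicons, assumed for illustration only.
POSITIVE_WORDS = {"good", "happy", "love", "hope"}
NEGATIVE_WORDS = {"horrible", "hate", "stupid", "sick", "die"}


def tokenize(text):
    """Lowercase a post and split it into word tokens (ASCII apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_word_percentages(posts):
    """Return (positive %, negative %) of all tokens across the given posts."""
    total = positive = negative = 0
    for post in posts:
        for token in tokenize(post):
            total += 1
            if token in POSITIVE_WORDS:
                positive += 1
            elif token in NEGATIVE_WORDS:
                negative += 1
    if total == 0:
        # No tokens at all: report zero rather than dividing by zero.
        return 0.0, 0.0
    return 100.0 * positive / total, 100.0 * negative / total
```

Running the function separately on the depression-indicative posts and on the standard posts would give one row pair of the kind shown in the table, although the exact values depend on the lexicon and tokenization chosen.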
Detecting depression automatically has made some progress. Many existing models detect depression based on the feature engineering such as bag of words [
Before introducing our model, we define several concepts used throughout the paper: high-level emotional semantic information, semantic understanding network (SUN), emotion understanding network (EUN), and dynamic fusion strategy.
High-level emotional semantic information denotes the emotional semantic information that is captured by deep learning.
SUN is a deep learning method that is used to capture the contextual semantic information in the text for depression detection.
EUN is a deep learning method that is used to capture the emotional semantic information in the text for depression detection.
Dynamic fusion strategy denotes a fusion strategy that can fuse positive emotional information and negative emotional information dynamically.
To extract the emotional information effectively, we propose an emotion-based attention network (EAN) for depression detection. The EAN model contains two modules: a SUN and an EUN. The SUN module captures the contextual semantic information, an approach that has been widely used in NLP. The EUN module captures the emotional information, which, as previously mentioned, plays an important role in depression detection. As shown in
The main contributions of this paper can be summarized as follows:
We propose a new deep learning framework for depression detection. We also design a special module to explicitly extract the high-level emotion information for depression detection in our framework.
We take into consideration the positive emotion information and the negative emotion information simultaneously. At the same time, we apply a dynamic fusion strategy to fuse the positive emotion information and the negative emotion information.
We conduct experiments on the Reddit data set for depression detection. The experiments show that our model achieves state-of-the-art or comparable performance. The ablation study also verifies the effectiveness of the components proposed in our model.
In this section, we review the related work about depression detection on social media.
In recent years, with the development of social media, more and more people are willing to post their thoughts, emotions, or life details on social media, including Reddit, Twitter, and so on. Park et al [
De Choudhury et al [
Most of the existing methods for depression detection are based on feature engineering. LIWC is usually used to extract individual psychological states, such as positive and negative emotions, pronouns, and so on. Therefore, LIWC was often used for the depression detection task [
Shneidman [
Different from traditional feature engineering-based methods, deep learning methods mostly apply end-to-end models. Yates et al [
Previous research on depression detection indicates that emotional information is important for the task. Deep learning can take high-level semantic information into account, but current deep learning methods for depression detection still lack effective extraction of emotional semantic information. We therefore propose a deep learning model, the EAN, that captures high-level emotional information for depression detection.
This paper is organized as follows. The Introduction section presents the background and related work. The Methods section describes the proposed model in detail. The Results section reports the experiments. The Discussion section presents the conclusions and future work.
As a newly developed social media, Reddit has become a widely popular web-based discussion forum. Reddit users can discuss a variety of topics on this web-based platform anonymously. The topics discussed on the platform can be arranged in more than a million discussion groups. Due to the large amount of discussion text, Reddit attracts many researchers to conduct their studies with the data on the Reddit platform. Pirina and Çöltekin [
We preprocessed the Reddit data set, for example by removing stop words. We then counted the occurrences of each word in the depression-indicative posts and in the standard posts, sorted the words by count, and show the top of the word lists in
As shown in
The architecture of the emotion-based attention network model. There are two parts in our model, including a SUN and an EUN. bi-LSTM: bidirectional long short-term memory.
Depression-indicative posts
All text: i’m, like, feel, want, get, know, even, really, people, life, i’ve, one, time, think, would, never, depression, me, can’t, go, going, things, don’t, much, friends, make, good, it, still, could, back, anyone, years, anything, always, every, got, someone, fucking, help, day, see, something, work, ever, need, feeling, everything, talk, year
Positive: friends, good, work, help, better, happy, job, love, hard, friend, family, care, wanted, best, sleep, sure, self, mind, understand, new, mental, hope, social, money, high, remember, working, reason, okay, close, real, together, great, normal, deal, believe, change, enjoy, birthday, honestly, nice, motivation, advice, loved, therapist, happiness, fun, boyfriend, saying, big
Negative: depression, depressed, bad, fucking, nothing, alone, hate, shit, stop, lost, worse, anxiety, fuck, tired, sad, die, suicide, kill, relationship, wrong, pain, suicidal, problems, old, sorry, cry, lonely, therapy, hurt, stupid, constantly, issues, sick, crying, problem, afraid, weird, reddit, hospital, worst, hang, illness, dead, scared, dark, broken, shitty, broke, miserable, died
Standard posts
All text: like, i’m, know, friend, would, feel, really, friends, want, time, get, one, even, said, always, never, told, got, family, go, things, me, think, best, make, mom, going, people, years, talk, also, still, back, something, much, see, say, could, i’ve, dad, tell, since, don’t, started, us, me, it, made, help, parents
Positive: friend, friends, family, best, sister, help, friendship, work, brother, good, new, sure, love, wanted, saying, together, advice, father, close, money, boyfriend, kids, care, hard, better, mad, understand, job, basically, happy, great, deal, child, high, moved, believe, fun, social, mind, baby, conversation, eventually, reason, married, big, change, spend, real, normal, nice
Negative: bad, wrong, nothing, old, hang, problem, stop, hurt, upset, sorry, shit, issues, lost, alone, cut, angry, hate, problems, worse, depression, weird, sick, constantly, anxiety, sad, tired, annoyed, broke, bitch, scared, died, hell, afraid, crying, cancer, toxic, ignore, pregnant, lose, difficult, wait, fault, depressed, horrible, awkward, selfish, reply, fuck, confused, reddit
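Word lists of this kind can be produced with a simple frequency count after stop word removal. The following is a minimal sketch under stated assumptions: the short `STOP_WORDS` set, the regular-expression tokenizer, and the function name `top_words` are hypothetical and are not the preprocessing pipeline used in this study.

```python
from collections import Counter
import re

# Hypothetical, deliberately short stop word list; a real one would be larger.
STOP_WORDS = {"a", "an", "the", "and", "to", "of", "is", "in", "so", "my", "i"}


def top_words(posts, k=50, stop_words=STOP_WORDS):
    """Return the k most frequent non-stop-word tokens as (word, count) pairs."""
    counts = Counter()
    for post in posts:
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(k)
```

Calling `top_words` once on the depression-indicative posts and once on the standard posts, and then filtering each result by a positive or negative word list, would yield lists with the same layout as those above.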
In this section, we introduce the proposed model for depression detection briefly, which is called the EAN, as shown in
The SUN was used to capture the contextual semantic information in the text for depression detection. There are three layers in the SUN module: the word encoding layer, the context encoding layer, and the attention mechanism (Att) layer. We introduce these three layers in more detail below.
We will introduce the word encoding layer in the SUN module briefly. The input of our task is text. The text can be denoted as w = {
The context encoding layer was used to obtain contextual information. Bidirectional long short-term memory (Bi-LSTM) [
LSTM was proposed by Hochreiter and Schmidhuber [
Where
The input of the Att layer is H = [
Where
Many research papers [
In this section, we introduce the inputs of the EUN module. The inputs include a positive emotion part and a negative emotion part. We applied the SenticNet application programming interface to divide the original texts into a positive emotional part and a negative emotional part. These two emotional parts are also mapped into a matrix of word vectors as in the word encoding layer in the SUN module, named
The emotion encoding layer encodes the positive emotional information and the negative emotional information.
The goal of the emotion fusion layer is to fuse the positive emotional information and the negative emotional information for depression detection. We get the positive emotional information
As previously described, we get the contextual semantic information
Accordingly, the final classification decision for depression detection is formulated by the softmax function:
The cross-entropy loss was used for depression detection in our model. The training goal was to minimize the loss.
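The softmax decision and the cross-entropy objective can be illustrated with a small, framework-free sketch. It is a generic two-class example, not the authors' implementation; the function names and the convention that class index 0 means nondepression and 1 means depression are assumptions made here for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def mean_loss(batch_logits, labels):
    """Average cross-entropy over a batch; training would minimize this."""
    losses = [cross_entropy(softmax(z), y) for z, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)
```

The loss is small when the model assigns high probability to the correct label and grows as that probability approaches zero, which is why minimizing it pushes the softmax output toward the true class.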
The unit size of Bi-LSTM in our experiments was 64. We applied the pretrained 300-dimension word embedding (GloVe) in the word encoding layer. In addition, the optimization function was Adam, and the batch size was 128. Following Tadesse et al [
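We applied the standard metrics, including accuracy, precision, recall, and F1-score, to evaluate the effectiveness of our model for depression detection. F1 is defined as the harmonic mean of precision and recall:

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$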
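A minimal sketch of this computation (the harmonic mean of precision and recall) is given below; the function name `f1_score` is our own choice for illustration. Applied to the precision and recall reported for the EAN model in this paper, it reproduces the reported F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) reported for the EAN model in this paper.
ean_f1 = f1_score(91.91, 96.15)
print(round(ean_f1, 2))  # → 93.98
```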
We compared the results of our model with many state-of-the-art methods on the Reddit data set. We compared it with the baselines, including LIWC, LDA, unigram, bigram, LIWC + LDA + unigram, LIWC + LDA + bigram [
LIWC: Tadesse et al [
LDA: Tadesse et al [
Unigram: Tadesse et al [
Bigram: Tadesse et al [
LIWC + LDA + unigram: The model is based on the aforementioned characteristics, including LIWC, LDA, and unigram, for depression detection.
LIWC + LDA + bigram: The model is based on the aforementioned characteristics, including LIWC, LDA, and bigram, for depression detection.
LSTM: LSTM was proposed by Hochreiter and Schmidhuber [
Bi-LSTM: The Bi-LSTM was proposed by Graves et al [
Bi-LSTM + Att: The model is based on Bi-LSTM and the Att.
EAN: The model proposed in this paper, which considers emotional semantic information based on deep learning.
As shown in
The bigram-based models generally outperformed their unigram counterparts: LIWC + LDA + bigram was higher than LIWC + LDA + unigram on all four metrics, and bigram was higher than unigram in accuracy and precision, although unigram had the higher recall and F1. This suggests that contextual information can improve the results of the model. Bi-LSTM achieved a higher recall and F1 than LSTM, although its accuracy and precision were slightly lower, which suggests that considering bidirectional contextual semantic information is useful. Bi-LSTM + Att was higher than Bi-LSTM in accuracy, precision, and F1, with nearly identical recall, which indicates that the Att is effective for the depression detection task. The proposed EAN model achieved the highest results on all four metrics because it takes into consideration both the contextual semantic information and the emotional semantic information.
Results compared with the existing models.
Model | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) |
LIWC^a,b | 70 | 74 | 71 | 72 |
LDA^b,c | 75 | 75 | 72 | 74 |
Unigram^b | 70 | 71 | 95 | 81 |
Bigram^b | 79 | 80 | 76 | 78 |
LIWC + LDA + unigram^b | 78 | 84 | 79 | 81 |
LIWC + LDA + bigram^b | 91 | 90 | 92 | 91 |
LSTM^d | 87.03 | 90.30 | 91.67 | 90.98 |
Bi-LSTM^e | 86.46 | 88.08 | 95 | 91.41 |
Bi-LSTM + Att^f | 88.59 | 90.41 | 94.96 | 92.63 |
EAN^g (our model) | 91.30 | 91.91 | 96.15 | 93.98 |
^a LIWC: Linguistic Inquiry and Word Count.
^b Indicates that the results are shown in the literature [
^c LDA: latent Dirichlet allocation.
^d LSTM: long short-term memory.
^e Bi-LSTM: bidirectional long short-term memory.
^f Att: attention mechanism.
^g EAN: emotion-based attention network.
In this section, we analyze the effectiveness of the two modules (SUN and EUN), the effectiveness of different emotional semantic information, and the effectiveness of the dynamic fusion strategy.
To verify the effectiveness of SUN and EUN, we designed a series of experiments. SUN means the proposed EAN model without the EUN module. EUN means the proposed EAN model without the SUN module. As shown in
The effectiveness of the SUN and EUN. EAN: emotion-based attention network; EUN: emotion understanding network; SUN: semantic understanding network.
To verify the effectiveness of different emotional semantic information, we designed a series of experiments, including without emotion (SUN), without positive emotion (SUN + negative), and without negative emotion (SUN + positive). As shown in
The effectiveness of different emotional semantic information. Acc: accuracy; EAN: emotion-based attention network; P: precision; R: recall; SUN: semantic understanding network.
To verify the effectiveness of the dynamic fusion strategy, we compared three variants of the EAN model: one with the concatenate fusion strategy, one with the fixed fusion strategy, and one with the dynamic fusion strategy. The EAN (concatenate fusion) model applies the concatenate operation in the emotion fusion layer. The EAN (fixed fusion) model applies the fixed fusion operation in the emotion fusion layer, with the θ in equation 10 fixed at 0.5. The EAN (dynamic fusion) model is the model proposed in this paper. As shown in
In this section, we designed a series of experiments to verify the effectiveness of the proposed EAN model, including the two modules in the EAN model, the different emotional semantic information, and the dynamic fusion method.
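The three fusion variants compared above can be contrasted in a short sketch. Equation 10 is not reproduced in this text, so the forms below are assumptions made for illustration: the fused vector is taken to be a convex combination θ · positive + (1 − θ) · negative, and the dynamic θ is taken to be a sigmoid gate over the two emotion vectors. The names `weights` and `bias` are hypothetical stand-ins for learned parameters, and the exact formulation in the EAN model may differ.

```python
import math


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def concatenate_fusion(pos, neg):
    """Keep both emotion vectors side by side (output is twice as long)."""
    return pos + neg


def weighted_fusion(pos, neg, theta):
    """Convex combination: theta weights positive, (1 - theta) negative."""
    return [theta * p + (1.0 - theta) * n for p, n in zip(pos, neg)]


def fixed_fusion(pos, neg):
    """Fixed variant: theta is held at 0.5, as in the comparison above."""
    return weighted_fusion(pos, neg, 0.5)


def dynamic_fusion(pos, neg, weights, bias):
    """Assumed gate: theta = sigmoid(weights . [pos; neg] + bias).

    `weights` and `bias` stand in for learned parameters (hypothetical).
    Returns the fused vector together with theta for inspection.
    """
    features = pos + neg
    score = sum(w * f for w, f in zip(weights, features)) + bias
    theta = sigmoid(score)
    return weighted_fusion(pos, neg, theta), theta
```

Under these assumptions, the only difference between the fixed and dynamic variants is whether θ is a constant or is computed from the input, so a per-post θ can be inspected to see which emotion polarity the model weights more heavily.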
Some visualization results of the θ to illustrate the effectiveness of the proposed dynamic fusion strategy intuitively are shown in
The effectiveness of the dynamic fusion strategy. Acc: accuracy; EAN: emotion-based attention network; P: precision; R: recall.
The visualization of the θ in the dynamic fusion strategy. GF: girlfriend.
Depression is attracting increasing attention from people and organizations. With the development of computer technology, some researchers are trying to use computers to automatically identify people who are depressed. In this paper, we proposed an EAN model to explicitly extract the high-level emotion information for the depression detection task. The proposed EAN model consists of the SUN and the EUN. The model considers the positive emotion information and the negative emotion information simultaneously and applies a dynamic fusion strategy to fuse them. The experimental results verified that the emotional semantic information is effective in depression detection.
According to WHO statistics, depression is one of the main causes of suicide in the world. In future work, we will focus on the relationship between depression and suicide and will try to combine suicide detection with depression detection to improve the performance of both tasks by multitask learning. In addition, future work will incorporate self-reported depressive symptoms or clinical diagnoses. We hope that our study can provide technical support in the field of health care.
Att: attention mechanism
Bi-LSTM: bidirectional long short-term memory
CLEF eRisk: Conference and Labs of Evaluation Forum for Early Risk Prediction
CNN: convolutional neural network
EAN: emotion-based attention network
EUN: emotion understanding network
GloVe: Global Vectors for Word Representation
LDA: latent Dirichlet allocation
LIWC: Linguistic Inquiry and Word Count
LSTM: long short-term memory
NLP: natural language processing
RNN: recurrent neural network
SUN: semantic understanding network
TF-IDF: term frequency–inverse document frequency
WHO: World Health Organization
This study was partially supported by a grant from the Natural Science Foundation of China (No. 62076046, 61632011, 62006034, 61876031), the Ministry of Education Humanities and Social Science Project (No. 19YJCZH199), State Key Laboratory of Novel Software Technology (Nanjing University; No. KFKT2021B07), and the Fundamental Research Funds for the Central Universities (No. DUT21RC(3)015).
None declared.