Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences

Bibliographic Details
Published in: IEEE transactions on knowledge and data engineering Vol. 32; no. 3; pp. 588-601
Main Authors: Trotzek, Marcel; Koitka, Sven; Friedrich, Christoph M.
Format: Journal Article
Language: English
Published: New York IEEE 01.03.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as a metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2018.2885515