Prediction of depression cases, incidence, and chronicity in a large occupational cohort using machine learning techniques: an analysis of the ELSA-Brasil study

Bibliographic Details
Published in Psychological Medicine Vol. 51; no. 16; pp. 2895-2903
Main Authors Librenza-Garcia, Diego, Passos, Ives Cavalcante, Feiten, Jacson Gabriel, Lotufo, Paulo A., Goulart, Alessandra C., de Souza Santos, Itamar, Viana, Maria Carmen, Benseñor, Isabela M., Brunoni, Andre Russowsky
Format Journal Article
Language English
Published Cambridge, UK Cambridge University Press 01.12.2021
ISSN 0033-2917, 1469-8978
DOI 10.1017/S0033291720001579

Summary: Depression is highly prevalent and marked by a chronic and recurrent course. Despite being a major cause of disability worldwide, little is known regarding the determinants of its heterogeneous course. Machine learning techniques present an opportunity to develop tools to predict diagnosis and prognosis at an individual level. We examined baseline (2008-2010) and follow-up (2012-2014) data of the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil), a large occupational cohort study. We implemented an elastic net regularization analysis with a 10-fold cross-validation procedure using socioeconomic and clinical factors as predictors to distinguish at follow-up: (1) depressed from non-depressed participants, (2) participants with incident depression from those who did not develop depression, and (3) participants with chronic (persistent or recurrent) depression from those without depression. We assessed 15 105 and 13 922 participants at waves 1 and 2, respectively. The elastic net regularization model distinguished outcome levels in the test dataset with an area under the curve of 0.79 (95% CI 0.76-0.82), 0.71 (95% CI 0.66-0.77), and 0.90 (95% CI 0.86-0.95) for analyses 1, 2, and 3, respectively. Diagnosis and prognosis related to depression can be predicted at an individual subject level by integrating low-cost variables, such as demographic and clinical data. Future studies should assess longer follow-up periods and combine biological predictors, such as genetics and blood biomarkers, to build more accurate tools to predict depression course.
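
The area under the curve reported in the summary can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The following is a minimal sketch of that calculation, not the authors' code: the function name `auc` and the toy labels and scores are hypothetical and are not ELSA-Brasil data. It uses the pairwise (Mann-Whitney) formulation, with ties counted as half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise case/non-case comparisons."""
    # Split predicted scores by true outcome (1 = case, 0 = non-case).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 cases and 4 non-cases with made-up scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]

toy_auc = auc(labels, scores)
print(round(toy_auc, 3))  # 11 of 12 pairs are ranked correctly -> 0.917
```

The pairwise loop is quadratic in the number of participants, so for a cohort of this size a rank-based implementation from a statistics library would be the more practical choice. The sketch does not cover how the 95% confidence intervals were obtained, since the summary does not state the method.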