Natural language processing for mental health interventions: a systematic review and research framework

Bibliographic Details
Published in: Translational Psychiatry, Vol. 13, No. 1, pp. 309-17
Main Authors: Malgaroli, Matteo; Hull, Thomas D.; Zech, James M.; Althoff, Tim
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 06.10.2023

More Information
Summary: Neuropsychiatric disorders pose a high societal cost, but their treatment is hindered by a lack of objective outcomes and fidelity metrics. AI technologies, and specifically Natural Language Processing (NLP), have emerged as tools to study mental health interventions (MHI) at the level of their constituent conversations. However, NLP's potential to address clinical and research challenges remains unclear. We therefore conducted a pre-registered systematic review of NLP-MHI studies using PRISMA guidelines (osf.io/s52jh) to evaluate their models and clinical applications, and to identify biases and gaps. Candidate studies (n = 19,756), including peer-reviewed AI conference manuscripts, were collected up to January 2023 through PubMed, PsycINFO, Scopus, Google Scholar, and ArXiv. A total of 102 articles were included to investigate their computational characteristics (NLP algorithms, audio features, machine learning pipelines, outcome metrics), clinical characteristics (clinical ground truths, study samples, clinical focus), and limitations. Results indicate rapid growth of NLP-MHI studies since 2019, characterized by increased sample sizes and use of large language models. Digital health platforms were the largest providers of MHI data. Ground truth for supervised learning models was based on clinician ratings (n = 31), patient self-report (n = 29), and annotations by raters (n = 26). Text-based features contributed more to model accuracy than audio markers. Patients' clinical presentation (n = 34), response to intervention (n = 11), intervention monitoring (n = 20), providers' characteristics (n = 12), relational dynamics (n = 14), and data preparation (n = 4) were the most commonly investigated clinical categories. Limitations of the reviewed studies included lack of linguistic diversity, limited reproducibility, and population bias. A research framework (NLPxMHI) is developed and validated to assist computational and clinical researchers in addressing the remaining gaps in applying NLP to MHI, with the goal of improving clinical utility, data access, and fairness.
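To make the study design described in the summary concrete, the sketch below illustrates the general kind of supervised NLP pipeline the review catalogs: text features from session transcripts predicting a clinician-rated outcome, evaluated with cross-validation. It is not taken from the paper; the transcripts, labels, and parameters are hypothetical placeholders, and scikit-learn is assumed only for illustration.

```python
# Illustrative sketch only: a minimal supervised text-classification pipeline of the
# general kind surveyed in the review. All data and labels below are hypothetical.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical session transcripts and clinician-assigned outcome labels (1 = improved).
transcripts = [
    "I felt less anxious this week and slept better.",
    "The panic attacks are still happening most days.",
    "Work stress is down and I used the breathing exercises.",
    "I could not get out of bed for most of the week.",
]
clinician_labels = [1, 0, 1, 0]

# Text features (TF-IDF) feeding a simple linear classifier.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# With real data, cross-validated accuracy (or AUC) would serve as the outcome metric.
scores = cross_val_score(pipeline, transcripts, clinician_labels, cv=2)
print("Cross-validated accuracy:", scores.mean())
```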
ISSN: 2158-3188
DOI: 10.1038/s41398-023-02592-2