eCBT-I dialogue system: a comparative evaluation of large language models and adaptation strategies for insomnia treatment

Bibliographic Details
Published in: Journal of Translational Medicine, Vol. 23, No. 1, Article 862 (15 pages)
Main Authors: Bao, Xueying, Zhu, Xingyu, Yang, Dongren, Lou, Hao, Wang, Ruoyun, Wu, Yutong, Li, Wenhui, Xia, Yu, Zeng, Li, Pan, Yingying, Wang, Xiqin, Zhang, Xian, Ling, Cheng, Ling, Youhui, Zhang, Yan, Zhao, Qi, Yang, Mei
Format: Journal Article
Language: English
Published: England: BioMed Central Ltd, 05.08.2025

Summary: Traditional face-to-face mental health treatments are often limited by time and space. Thanks to the development of advanced large language models (LLMs), digital mental health treatments can provide personalized advice to patients and improve compliance. However, in the field of CBT-I (cognitive behavioral therapy for insomnia), specialized, real-time interactive dialogue platforms have not been fully developed. Our research team constructed an eCBT-I intelligent dialogue system based on a retrieval-augmented generation (RAG) architecture, aiming to provide an example of the deep integration of CBT-I knowledge graphs and large language models. To optimize the performance of the system's core language generation module on the insomnia dialogue dataset, we systematically evaluated eight mainstream large language models (ChatGLM2-6b, ChatGLM3-6b, Baichuan-7b, Baichuan-13b, Qwen-7b, Qwen2-7b, Llama-2-7b-chat-hf, and Llama-2-13b-chat-hf) and three adaptation strategies (LoRA, QLoRA, and Freeze). We screened the suitability of the three adaptation strategies for each of the eight language models and thus determined the best adaptation method for each model to maximize its performance gain. The eight best-adapted models were then evaluated along three dimensions to compare their performance on the small-sample sleep dialogue dataset and the C-eval dataset. All subjects evaluated under experimental conditions were drawn from historical medical records of patients who did not exhibit delirium and had normal language expression abilities. By matching model characteristics to adaptation strategies and conducting a horizontal evaluation across multiple models, we compared the contribution of the different fine-tuning strategies to the performance of the different language models on the small insomnia dialogue dataset, and determined that Qwen2-7b (Freeze) performs best on the insomnia dialogue dataset. This study integrates the CBT-I knowledge graph with a large language model through the RAG architecture, improving the professionalism of the eCBT-I intelligent dialogue system. The systematic fine-tuning method selection process and the confirmation of the optimal model not only improve the adaptability of large language models to the CBT-I task, but also provide a useful paradigm for AI applications in medical subfields constrained by limited resources and difficult data collection, laying a solid foundation for more accurate and efficient digital CBT-I clinical practice in the future.
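
The abstract describes a RAG pipeline in which CBT-I knowledge is retrieved and injected into the language model's prompt before generation. The paper's code is not reproduced here; the sketch below only illustrates that general pattern, and the model checkpoint, the toy in-memory knowledge store, and the plain-text prompt format are assumptions rather than the authors' implementation.

```python
# Illustrative sketch only: checkpoint name, toy knowledge store, and prompt
# format are assumptions, not the authors' implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2-7B-Instruct"  # assumed checkpoint; the study's best model is Qwen2-7b

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

# Toy stand-in for the CBT-I knowledge graph; a real system would query a graph store.
CBTI_FACTS = [
    "Stimulus control: go to bed only when sleepy and leave the bed if unable to sleep.",
    "Sleep restriction: limit time in bed to roughly the average actual sleep time.",
    "Sleep hygiene: keep a consistent wake-up time every day, including weekends.",
]

def retrieve_cbti_facts(question: str, top_k: int = 2) -> list[str]:
    """Rank facts by naive keyword overlap; a real RAG retriever would use embeddings."""
    q_words = set(question.lower().split())
    ranked = sorted(CBTI_FACTS, key=lambda f: -len(q_words & set(f.lower().split())))
    return ranked[:top_k]

def answer(question: str) -> str:
    """Retrieve CBT-I knowledge and prepend it to the prompt before generation."""
    facts = retrieve_cbti_facts(question)
    prompt = (
        "You are a CBT-I assistant. Use the facts below when replying.\n"
        + "\n".join(f"- {f}" for f in facts)
        + f"\n\nPatient: {question}\nAssistant:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(answer("I lie awake in bed for hours. What should I do?"))
```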
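
The three adaptation strategies named in the abstract (LoRA, QLoRA, and Freeze) differ mainly in which parameters are trained. The sketch below illustrates that difference with the Hugging Face transformers/peft/bitsandbytes stack; the rank, target modules, quantization settings, and number of unfrozen layers are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch of the three adaptation strategies; hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

MODEL_NAME = "Qwen/Qwen2-7B"  # assumed base checkpoint

def load_lora():
    # LoRA: freeze the base weights and train small low-rank adapter matrices.
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
    cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                     target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    return get_peft_model(model, cfg)

def load_qlora():
    # QLoRA: same adapters, but on top of a 4-bit quantized base model to save memory.
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                             bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, quantization_config=bnb)
    cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                     target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    return get_peft_model(model, cfg)

def load_freeze(trainable_last_layers: int = 2):
    # Freeze tuning: update only the last few transformer blocks, freeze everything else.
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
    for param in model.parameters():
        param.requires_grad = False
    for layer in model.model.layers[-trainable_last_layers:]:  # Qwen2/Llama-style decoder stack
        for param in layer.parameters():
            param.requires_grad = True
    return model
```

Each loader returns a model whose trainable-parameter set reflects one strategy, so the same training loop can be reused when screening which strategy suits which base model.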
ISSN: 1479-5876
DOI: 10.1186/s12967-025-06871-y