Time–Frequency Causal Hidden Markov Model for speech-based Alzheimer’s disease longitudinal detection
Speech deterioration is an early indicator in individuals with Alzheimer’s disease (AD), with progression influenced by various factors, leading to unique trajectories for each individual. To facilitate automated longitudinal detection of AD using speech, we propose an enhanced Hidden Markov Model (...
Saved in:
Published in | Computer speech & language Vol. 95; p. 101862 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier Ltd
01.01.2026
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Speech deterioration is an early indicator in individuals with Alzheimer’s disease (AD), with progression influenced by various factors, leading to unique trajectories for each individual. To facilitate automated longitudinal detection of AD using speech, we propose an enhanced Hidden Markov Model (HMM), termed the Time-Frequency Causal HMM (TF-CHMM), which models disease-causative acoustic features over time under the Markov property. The TF-CHMM integrates a parallel convolutional neural network as an encoder for spectrograms, extracting both time-domain and frequency-domain features from audio recordings linked to AD. Additionally, it incorporates personal attributes (e.g., age) and clinical diagnosis data (e.g., MMSE scores) as supplementary inputs, disentangling disease-related features from unrelated components through a sequential variational auto-encoder with causal inference. The TF-CHMM is evaluated using the Pitt Corpus, which includes annual visits for each subject with a variable number of longitudinal samples, comprising audio recordings, manual transcriptions, MMSE scores, and age information. Experimental results demonstrated the effectiveness of our designed system, achieving a competitive accuracy of 90.24% and an F1 score of 90.00%. An ablation study further highlighted the efficiency of the parallel convolutional kernels in extracting time–frequency information and emphasized the effectiveness of our longitudinal experimental setup in the AD detection system.
•A parallel convolutional block is designed to extract time-frequency AD features.•Disease-related and disease-unrelated information is disentangled causally.•Designed longitudinally modeling system achieves SOTA performance on the Pitt Corpus. |
---|---|
ISSN: | 0885-2308 |
DOI: | 10.1016/j.csl.2025.101862 |