Exploring the Topics of Audio Words for Detecting Alzheimer's Disease From Spontaneous Speech

Bibliographic Details
Published in IEEE Signal Processing Letters, Vol. 30, pp. 1-5
Main Authors Guo, Zhiqiang; Ling, Zhenhua
Format Journal Article
Language English
Published New York IEEE 01.01.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Summary: Many studies have used speech data to automatically detect Alzheimer's Disease (AD). However, most of them treat the speech of a subject performing a task (e.g., picture description) as a single audio sequence, and do not consider that separate utterances may relate to different topics in the picture and differ in importance for discriminating AD patients from healthy controls. To this end, this paper proposes an AD detection method with topic modeling for utterances composed of audio words. First, an audio word discovery algorithm using Byte Pair Encoding (BPE) is designed to tokenize speech data without relying on text transcriptions. Then, a topic model is built that assigns each utterance a topic label in an unsupervised way. Finally, an Audio-Word HuBERT (AW-HuBERT) model integrating utterance-level topic labels is constructed for AD detection. This model is pretrained by training the existing HuBERT model with audio word sequences. The final decision for a recording is made by a weighted sum of utterance-level classification probabilities, with topic-dependent Area-Under-Curve (AUC) values used as the weights. Experimental results on the DementiaBank dataset show that the proposed method achieves better AD detection accuracy than state-of-the-art methods.
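Two steps named in the summary can be illustrated with a minimal sketch: BPE-style merging over discrete speech units, and the AUC-weighted fusion of utterance-level scores. This is not the authors' implementation. It assumes that each recording has already been converted to a sequence of integer unit IDs (for example, frame-level cluster indices), that merging stops when no adjacent pair occurs at least twice, and that the weighted sum is normalized by the total weight. The function names `learn_audio_words`, `merge_pair`, and `fuse_recording_score`, and all numbers in the example, are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Token = Tuple[int, ...]  # an "audio word": a tuple of discrete unit IDs


def merge_pair(seq: List[Token], a: Token, b: Token) -> List[Token]:
    """Replace every adjacent occurrence of (a, b) in seq with the joined token."""
    out: List[Token] = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
            out.append(a + b)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_audio_words(unit_seqs: List[List[int]], num_merges: int):
    """Byte Pair Encoding over unit-ID sequences (assumed input format)."""
    seqs: List[List[Token]] = [[(u,) for u in s] for s in unit_seqs]
    merges: List[Tuple[Token, Token]] = []
    for _ in range(num_merges):
        pairs: Counter = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:  # assumed stopping rule: nothing left worth merging
            break
        merges.append((a, b))
        seqs = [merge_pair(seq, a, b) for seq in seqs]
    return merges, seqs


def fuse_recording_score(
    utterances: List[Tuple[int, float]], topic_auc: Dict[int, float]
) -> float:
    """Weighted sum of utterance-level AD probabilities.

    Each utterance is (topic_label, probability); the weight of an utterance
    is the AUC of its topic. Dividing by the total weight is an assumption
    that keeps the fused score in [0, 1].
    """
    if not utterances:
        raise ValueError("recording has no utterances")
    weights = [topic_auc[topic] for topic, _ in utterances]
    weighted = sum(w * p for w, (_, p) in zip(weights, utterances))
    return weighted / sum(weights)


# Hypothetical toy data, for illustration only.
merges, tokenized = learn_audio_words([[1, 2, 3, 1, 2], [1, 2, 4]], num_merges=5)
score = fuse_recording_score([(0, 0.9), (1, 0.4), (0, 0.7)], {0: 0.8, 1: 0.6})
```

In the toy data, the pair of units 1 and 2 is the only one that repeats, so it becomes a single audio word. In the fusion step, utterances from the topic with the higher AUC contribute more to the recording-level score.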
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/LSP.2023.3334696