Deep convolution neural network based Parkinson’s disease detection using line spectral frequency spectrum of running speech
Published in | Journal of intelligent & fuzzy systems Vol. 45; no. 3; pp. 4599 - 4615 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | London, England; SAGE Publications; 24.08.2023; Sage Publications Ltd |
Subjects | |
Summary: | The deformation of speech caused by the glottic vocal tract is an early biomarker for Parkinson’s disease. This work proposes a novel Line Spectral Frequency (LSF) trajectory spectrum image representation of the subjects’ speech signals as input to a Deep Convolution Neural Network for Parkinson’s disease classification; the convolution layers automatically learn features from the input images, so no separate feature calculation stage is required. The human vocal tract producing a short phonetic segment is modeled as an all-pole infinite impulse response system; the LSF trajectory spectrum images represent the poles of this system and reflect the voice defects caused by Parkinson’s disease. The proposed method is shown to outperform existing state-of-the-art work on two different utterance tasks: sustained phonation and natural running speech. The Deep Convolution Neural Network achieves a training accuracy of 92.5% on the sustained phonation dataset and 99.18% on the King’s College running speech dataset. The validation accuracy is 100% for both datasets. The proposed method substantially outperforms a recent benchmark that uses Mel Frequency Cepstral Coefficient parameters with machine learning for Parkinson’s disease detection in running speech. The high performance on the King’s College running speech dataset, which was collected through mobile device voice recordings, is noteworthy. A rigorous performance analysis on the running speech dataset, using a separate isolated test set over 50 repeated trials, yields an F1 score of 99.37%, sensitivity of 100%, precision of 98.75%, and specificity of 99.27%. |
---|---|
ISSN: | 1064-1246, 1875-8967 |
DOI: | 10.3233/JIFS-230183 |
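
The summary describes the vocal tract as an all-pole system whose poles are represented by line spectral frequencies. As a minimal sketch of that idea, and not the authors' implementation, the following shows the standard textbook conversion from all-pole (LPC) coefficients to LSFs using NumPy. The function name `lpc_to_lsf`, the example pole locations, and the model order of 4 are assumptions made for illustration; the abstract does not state the model order or how the trajectory spectrum images are built from the per-frame LSFs.

```python
import numpy as np

def lpc_to_lsf(a, tol=1e-6):
    """Convert all-pole coefficients [1, a1, ..., ap] to line spectral
    frequencies in radians (hypothetical helper, textbook method)."""
    a = np.asarray(a, dtype=float)
    a_ext = np.concatenate([a, [0.0]])   # A(z) padded to degree p + 1
    a_rev = a_ext[::-1]                  # z^-(p+1) * A(1/z)
    p_poly = a_ext + a_rev               # symmetric polynomial P(z)
    q_poly = a_ext - a_rev               # antisymmetric polynomial Q(z)
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair and drop the trivial roots at z = +1 and z = -1.
    keep = (angles > tol) & (angles < np.pi - tol)
    return np.sort(angles[keep])

# Illustrative stable all-pole system: two resonances chosen arbitrarily.
poles = [0.9 * np.exp(1j * 0.5), 0.9 * np.exp(-1j * 0.5),
         0.8 * np.exp(1j * 1.5), 0.8 * np.exp(-1j * 1.5)]
a = np.real(np.poly(poles))              # coefficients [1, a1, a2, a3, a4]
lsf = lpc_to_lsf(a)
print(lsf)
```

For a stable all-pole model of order p, this yields p frequencies strictly between 0 and π. Stacking such vectors over successive speech frames would give an LSF trajectory over time, which is presumably the kind of data the paper renders as images; that step is not specified in the abstract and is left out here.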