Deep convolution neural network based Parkinson’s disease detection using line spectral frequency spectrum of running speech

Bibliographic Details
Published in	Journal of intelligent & fuzzy systems Vol. 45; no. 3; pp. 4599 - 4615
Main Authors	Kumari, Rani; Ramachandran, Prakash
Format	Journal Article
Language	English
Published	London, England: SAGE Publications Ltd, 24.08.2023
Subjects	Artificial neural networks; Datasets; Frequency spectrum; Impulse response; Line spectra; Machine learning; Mathematical analysis; Mobile phones; Neural networks; Parkinson's disease; Performance measurement; Phonation; Phonetics; Speech; Training; Vocal tract; Voice recognition; sustained phonation; Deep convolution neural network; running speech; line spectral frequency
Summary:	The deformation of speech caused by the glottic vocal tract is an early biomarker for Parkinson’s disease. A novel Line Spectral Frequency trajectory spectrum image representation of the subjects’ speech signals is proposed as input to a Deep Convolution Neural Network for Parkinson’s disease classification; the convolution layers automatically learn the features from the input images, so no separate feature calculation stage is required. The human vocal tract producing a short phonetic sound is assumed to be an all-pole infinite impulse response system; the Line Spectral Frequency trajectory spectrum images represent the poles of this system and reflect the voice defects due to Parkinson’s disease. The proposed method is shown to outperform existing state-of-the-art work on two different utterance tasks, one a sustained phonation dataset and the other a natural running speech dataset. The Deep Convolution Neural Network achieves a training accuracy of 92.5% on the sustained phonation dataset and 99.18% on the King’s College running speech dataset. The validation accuracy is 100% for both datasets. The proposed method also performs much better than a recent benchmark work that uses Mel Frequency Cepstral Coefficient parameters with machine learning for Parkinson’s disease detection in running speech. The high performance on the King’s College running speech dataset is noteworthy because that dataset was collected through mobile device voice recordings. A rigorous performance analysis on the running speech dataset, using a separate isolated test set over 50 repeated trials, gives an F1 score of 99.37%, sensitivity of 100%, precision of 98.75% and specificity of 99.27%.
ISSN:	1064-1246 1875-8967
DOI:	10.3233/JIFS-230183
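
Illustrative note: the summary models the vocal tract as an all-pole system and describes it through line spectral frequencies. The sketch below shows one standard way to obtain line spectral frequencies from all-pole (linear prediction) coefficients, by splitting the denominator polynomial into a symmetric and an antisymmetric part and taking the angles of their unit-circle roots. It is not taken from the article: the function name `lpc_to_lsf`, the 4th-order example filter and the tolerance `eps` are assumptions made for illustration, and the article's own construction of trajectory spectrum images is not described in this record.

```python
import numpy as np


def lpc_to_lsf(a, eps=1e-6):
    """Convert all-pole denominator coefficients [1, a1, ..., ap]
    into p line spectral frequencies in radians, sorted ascending.

    Assumes A(z) is minimum phase (all poles inside the unit circle).
    """
    a = np.asarray(a, dtype=float)
    # Extend A(z) by one order so it can be added to its reversed copy.
    a_ext = np.concatenate([a, [0.0]])
    p_poly = a_ext + a_ext[::-1]  # symmetric polynomial P(z)
    q_poly = a_ext - a_ext[::-1]  # antisymmetric polynomial Q(z)
    angles = np.concatenate([
        np.angle(np.roots(p_poly)),
        np.angle(np.roots(q_poly)),
    ])
    # Keep one angle per conjugate pair and drop the trivial roots at 0 and pi.
    keep = (angles > eps) & (angles < np.pi - eps)
    return np.sort(angles[keep])


# Hypothetical 4th-order all-pole model: two resonances given as pole pairs.
poles = np.array([0.9 * np.exp(1j * 0.5), 0.8 * np.exp(1j * 1.5)])
a = np.real(np.poly(np.concatenate([poles, poles.conj()])))
lsf = lpc_to_lsf(a)
print(lsf)
```

For a stable model of order p this yields p frequencies strictly between 0 and π. Computing them frame by frame over a recording would give a trajectory over time, which is one plausible reading of the "trajectory" in the summary, though the record does not confirm the details.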