Detecting pertussis in the pediatric population using respiratory sound events and CNN
Published in: Biomedical Signal Processing and Control, Vol. 68, p. 102722
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.07.2021
Summary:
• Classification of pertussis and non-pertussis subjects based on respiratory sound events (cough and whooping) and deep learning.
• Convolutional neural network models trained on three time-frequency image-like representations: mel-spectrogram, wavelet scalogram, and cochleagram.
• Time-frequency image augmentation during training using mixup, and late fusion to combine learning from different time-frequency representations.
• Achieved an overall accuracy of 90.48% (AUC = 0.9501), outperforming various baseline methods.
• Promising results demonstrate that automated respiratory sound analysis may be useful in the non-invasive detection of pertussis.
Pertussis (whooping cough), a respiratory tract infection, is a significant cause of morbidity and mortality in children. The classic presentation of pertussis includes paroxysmal coughs followed by a high-pitched intake of air that sounds like a whoop. Although these respiratory sounds can be useful in making a diagnosis in clinical practice, the distinction of these sounds by humans can be subjective. This work aims to objectively analyze these respiratory sounds using signal processing and deep learning techniques to detect pertussis in the pediatric population.
Various time-frequency representations of the respiratory sound signals are formed and used as direct input to convolutional neural networks, without the need for feature engineering. In particular, we consider the mel-spectrogram, wavelet scalogram, and cochleagram representations, which reveal spectral characteristics at different frequencies. The method is evaluated on a dataset of 42 recordings, containing 542 respiratory sound events, from children with and without pertussis. We use data augmentation to prevent model overfitting on the relatively small dataset, and late fusion to combine the learning from the different time-frequency representations for more informed predictions.
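The mixup augmentation mentioned in the highlights blends pairs of training examples and their labels with a Beta-distributed mixing weight. A minimal NumPy sketch of this idea applied to time-frequency images is shown below; the `alpha=0.2` value and the toy 64×64 inputs are illustrative assumptions, not parameters reported in the paper.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two time-frequency images and their one-hot labels.

    The mixing weight lam is drawn from a Beta(alpha, alpha)
    distribution, following the standard mixup formulation.
    alpha=0.2 here is an illustrative choice.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2  # pixel-wise blend of the two images
    y = lam * y1 + (1.0 - lam) * y2  # same weight applied to the labels
    return x, y

# Two toy 64x64 "spectrograms" with one-hot labels (pertussis vs. non-pertussis).
a, b = np.ones((64, 64)), np.zeros((64, 64))
xa, ya = mixup(a, np.array([1.0, 0.0]), b, np.array([0.0, 1.0]))
```

Because the blended label is a convex combination of the two one-hot labels, it always sums to 1, which keeps it a valid soft target for cross-entropy training.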
The proposed method achieves an accuracy of 90.48% (AUC = 0.9501) in distinguishing pertussis subjects from non-pertussis subjects, outperforming several baseline techniques.
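The late fusion described above combines the outputs of the per-representation networks at decision time. The abstract does not specify the fusion rule, so the sketch below assumes a common choice, averaging the class probabilities of the three models; the probability values are made up for illustration.

```python
import numpy as np

# Hypothetical class probabilities [pertussis, non-pertussis] from three
# CNNs, one per representation (mel-spectrogram, scalogram, cochleagram).
probs = np.array([
    [0.80, 0.20],
    [0.70, 0.30],
    [0.95, 0.05],
])

fused = probs.mean(axis=0)          # average the softmax outputs (late fusion)
prediction = int(np.argmax(fused))  # 0 = pertussis in this toy labeling
```

Averaging is only one possibility; weighted averaging or majority voting over the three models would slot into the same place.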
Our results suggest that detecting pertussis using automated respiratory sound analysis is feasible. It could potentially be implemented as a non-invasive screening tool, for example in a smartphone app usable by parents and carers in the community, to increase diagnostic utility for this disease.
ISSN: 1746-8094, 1746-8108
DOI: 10.1016/j.bspc.2021.102722