Exploring classical machine learning for identification of pathological lung auscultations


Bibliographic Details
Published in Computers in biology and medicine Vol. 168; p. 107784
Main Authors Razvadauskas, Haroldas, Vaičiukynas, Evaldas, Buškus, Kazimieras, Arlauskas, Lukas, Nowaczyk, Sławomir, Sadauskas, Saulius, Naudžiūnas, Albinas
Format Journal Article
Language English
Published United States Elsevier Ltd 01.01.2024
Summary: The use of machine learning in biomedical research has surged in recent years thanks to advances in devices and artificial intelligence. Our aim is to expand this body of knowledge by applying machine learning to pulmonary auscultation signals. Despite improvements in digital stethoscopes and attempts to find synergy between them and artificial intelligence, solutions for their use in clinical settings remain scarce. Physicians continue to infer initial diagnoses with less sophisticated means, which yields low accuracy and suboptimal patient care. To arrive at a correct preliminary diagnosis, auscultation diagnostics need to be highly accurate. Because auscultations are performed in large numbers, the resulting data open up opportunities for more effective sound analysis. In this study, digital 6-channel auscultations of 45 patients were used in various machine learning scenarios, with the aim of distinguishing between normal and abnormal pulmonary sounds. Audio features (such as fundamental frequencies F0-4, loudness, HNR, DFA, as well as descriptive statistics of log energy, RMS and MFCC) were extracted using the Python library Surfboard. Windowing, feature aggregation, and concatenation strategies were used to prepare data for machine learning algorithms in unsupervised (fair-cut forest, outlier forest) and supervised (random forest, regularized logistic regression) settings. The evaluation was carried out using 9-fold stratified cross-validation repeated 30 times. Decision fusion by averaging the outputs for a subject was also tested and found to be helpful. Supervised models showed a consistent advantage over unsupervised ones, with random forest achieving a mean AUC ROC of 0.691 (accuracy 71.11%, Kappa 0.416, F1-score 0.675) in side-based detection and a mean AUC ROC of 0.721 (accuracy 68.89%, Kappa 0.371, F1-score 0.650) in patient-based detection.
• Digital stethoscopes enabled quality recordings of pulmonary sounds.
• The study data contains sequential 6-channel auscultations of 45 subjects.
• Various schemes for the use of these biomedical audio signals were tested.
• Supervised machine learning methods outperformed unsupervised ones.
• The random forest had 71.11% accuracy in the side-based detection task.
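
The summary mentions decision fusion by averaging a model's outputs for each subject, scored with AUC ROC and Kappa. This record does not include the study's code, so the following is only a minimal sketch of that general idea in plain Python. The function names, the 0.5 decision threshold, and the toy scores and labels are all assumptions made for illustration. They are not the authors' implementation or data.

```python
from collections import defaultdict
from statistics import mean


def fuse_by_subject(subject_ids, scores):
    """Average window-level scores into a single score per subject."""
    grouped = defaultdict(list)
    for sid, score in zip(subject_ids, scores):
        grouped[sid].append(score)
    return {sid: mean(vals) for sid, vals in grouped.items()}


def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def cohen_kappa(labels, preds):
    """Cohen's kappa for binary labels coded as 0/1."""
    n = len(labels)
    observed = sum(y == p for y, p in zip(labels, preds)) / n
    expected = sum(
        (labels.count(c) / n) * (preds.count(c) / n) for c in (0, 1)
    )
    if expected == 1.0:
        return 0.0  # degenerate case: only one class on both sides
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: abnormality scores for short audio windows,
# each window tagged with the subject it came from.
window_subjects = ["s1", "s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4", "s4"]
window_scores = [0.9, 0.4, 0.8, 0.2, 0.3, 0.6, 0.7, 0.5, 0.1, 0.3]
subject_labels = {"s1": 1, "s2": 0, "s3": 1, "s4": 0}  # 1 = abnormal

# Window-level evaluation: every window inherits its subject's label.
window_labels = [subject_labels[s] for s in window_subjects]
window_auc = auc_roc(window_labels, window_scores)

# Subject-level evaluation after decision fusion by averaging.
fused = fuse_by_subject(window_subjects, window_scores)
subjects = sorted(fused)
y_true = [subject_labels[s] for s in subjects]
y_score = [fused[s] for s in subjects]
y_pred = [int(score >= 0.5) for score in y_score]  # assumed threshold

print("window-level AUC:", window_auc)
print("subject-level AUC:", auc_roc(y_true, y_score))
print("subject-level kappa:", cohen_kappa(y_true, y_pred))
```

In this toy example one noisy window of an abnormal subject scores below a window from a normal subject, and averaging within each subject smooths that out. Real recordings may not behave this way.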
ISSN: 0010-4825, 1879-0534
DOI: 10.1016/j.compbiomed.2023.107784