Exploring classical machine learning for identification of pathological lung auscultations
The use of machine learning in biomedical research has surged in recent years thanks to advances in devices and artificial intelligence. Our aim is to expand this body of knowledge by applying machine learning to pulmonary auscultation signals. Despite improvements in digital stethoscopes and attemp...
Saved in:
Published in | Computers in biology and medicine Vol. 168; p. 107784 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published |
United States
Elsevier Ltd
01.01.2024
Elsevier Limited |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | The use of machine learning in biomedical research has surged in recent years thanks to advances in devices and artificial intelligence. Our aim is to expand this body of knowledge by applying machine learning to pulmonary auscultation signals. Despite improvements in digital stethoscopes and attempts to find synergy between them and artificial intelligence, solutions for their use in clinical settings remain scarce. Physicians continue to infer initial diagnoses with less sophisticated means, resulting in low accuracy, leading to suboptimal patient care. To arrive at a correct preliminary diagnosis, the auscultation diagnostics need to be of high accuracy. Due to the large number of auscultations performed, data availability opens up opportunities for more effective sound analysis. In this study, digital 6-channel auscultations of 45 patients were used in various machine learning scenarios, with the aim of distinguishing between normal and abnormal pulmonary sounds. Audio features (such as fundamental frequencies F0-4, loudness, HNR, DFA, as well as descriptive statistics of log energy, RMS and MFCC) were extracted using the Python library Surfboard. Windowing, feature aggregation, and concatenation strategies were used to prepare data for machine learning algorithms in unsupervised (fair-cut forest, outlier forest) and supervised (random forest, regularized logistic regression) settings. The evaluation was carried out using 9-fold stratified cross-validation repeated 30 times. Decision fusion by averaging the outputs for a subject was also tested and found to be helpful. Supervised models showed a consistent advantage over unsupervised ones, with random forest achieving a mean AUC ROC of 0.691 (accuracy 71.11%, Kappa 0.416, F1-score 0.675) in side-based detection and a mean AUC ROC of 0.721 (accuracy 68.89%, Kappa 0.371, F1-score 0.650) in patient-based detection.
•Digital stethoscopes enabled quality recordings of pulmonary sounds.•The study data contains sequential 6-channel auscultations of 45 subjects.•Various schemes for the use of these biomedical audio signals were tested.•Supervised machine learning methods outperformed unsupervised ones.•The random forest had 71.11% accuracy in the side-based detection task. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
ISSN: | 0010-4825 1879-0534 1879-0534 |
DOI: | 10.1016/j.compbiomed.2023.107784 |