Wisdom of artificial crowds feature selection in untargeted metabolomics: An application to the development of a blood-based diagnostic test for thrombotic myocardial infarction
Published in | Journal of biomedical informatics Vol. 81; pp. 53 - 60 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | United States Elsevier Inc 01.05.2018 |
ISSN | 1532-0464, 1532-0480 |
DOI | 10.1016/j.jbi.2018.03.007 |
Summary: | Graphical Abstract. Flow chart diagram of the process used to determine a diagnostic classifier for Acute Myocardial Infarction (AMI) from the abundance of circulating metabolites in plasma. Blood samples were drawn from human subjects presenting with Thrombotic MI, Non-thrombotic MI, and Stable Coronary Artery Disease (CAD). Abundances of plasma metabolites were quantified via a non-targeted approach using ultra performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) and gas chromatography mass spectrometry (GC-MS). Feature selection was conducted by Wisdom of Artificial Crowds with other approaches (the Lasso and Random Forest Variable Importance) employed for comparison. A small fixed number of metabolites was selected, and classifiers were trained and evaluated.
• Heart disease remains a leading cause of global mortality.
• Etiology of Myocardial Infarction (MI) cannot be determined by a blood-based diagnostic.
• Blood plasma was analyzed from human subjects of varying MI type and stable CAD.
• 1032 metabolites were detected and quantified by untargeted mass spectrometry.
• We compared the “Wisdom of Artificial Crowds” algorithm to traditional approaches for feature selection.
• We developed a diagnostic model for discriminating thrombotic MI, an important type.
Heart disease remains a leading cause of global mortality. While acute myocardial infarction (colloquially, heart attack) has multiple proximate causes, the proximate etiology cannot be determined by a blood-based diagnostic test. We enrolled a suitable patient cohort and conducted a non-targeted quantification of plasma metabolites by mass spectrometry to develop a test that can differentiate between thrombotic MI, non-thrombotic MI, and stable disease. A significant challenge in developing such a diagnostic test is solving the NP-hard problem of feature selection for constructing an optimal statistical classifier.
We employed a Wisdom of Artificial Crowds (WoAC) strategy for solving the feature selection problem and evaluated the accuracy and parsimony of downstream classifiers in comparison with traditional feature selection techniques including the Lasso and selection using Random Forest variable importance criteria.
Artificial Crowd Wisdom was generated via aggregation of the best solutions from independent and diverse genetic algorithm populations that were initialized with bootstrapping and a random subspaces constraint.
Strong evidence was observed that a statistical classifier utilizing WoAC feature selection can discriminate between human subjects presenting with thrombotic MI, non-thrombotic MI, and stable Coronary Artery Disease given abundances of selected plasma metabolites. Using the abundances of twenty selected metabolites, a misclassification rate of 2.6% was estimated by leave-one-out cross-validation. However, the WoAC feature selection strategy did not perform better than the Lasso in the current study. |
---|---|
Bibliography: | Equally contributing authors |
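
The summary describes WoAC feature selection as aggregating the best solutions from independent genetic algorithm populations, each initialized with bootstrapping and a random subspaces constraint. The sketch below is a minimal illustration of that idea only, not the authors' implementation. Everything specific in it is an assumption made for illustration: the nearest-centroid classifier, the crossover and mutation operators, aggregation by vote counting, the function names (`centroid_predict`, `loocv_error`, `run_population`, `woac_select`), all parameter values, and the synthetic data.

```python
import random
from collections import Counter

def centroid_predict(train_X, train_y, x, feats):
    """Nearest-centroid prediction using only the columns in feats (assumed classifier)."""
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    def dist(label):
        return sum((x[f] - sums[label][j] / counts[label]) ** 2
                   for j, f in enumerate(feats))
    return min(sorted(sums), key=dist)

def loocv_error(X, y, feats):
    """Leave-one-out misclassification rate for one feature subset."""
    errors = 0
    for i in range(len(X)):
        pred = centroid_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats)
        errors += pred != y[i]
    return errors / len(X)

def run_population(X, y, subspace, k, rng, pop_size=12, generations=8):
    """One small genetic algorithm restricted to a random feature subspace."""
    pop = [rng.sample(subspace, k) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: loocv_error(X, y, s))
        parents = pop[: pop_size // 2]            # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = rng.sample(sorted(set(a) | set(b)), k)   # crossover
            spare = [f for f in subspace if f not in child]
            if spare and rng.random() < 0.3:                 # mutation
                child[rng.randrange(k)] = rng.choice(spare)
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda s: loocv_error(X, y, s))

def woac_select(X, y, k, n_crowd=10, subspace_size=8, seed=0):
    """Aggregate the best subset of each population by counting feature votes."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = Counter()
    for _ in range(n_crowd):
        idx = [rng.randrange(n) for _ in range(n)]       # bootstrap the subjects
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        subspace = rng.sample(range(p), subspace_size)   # random subspace
        votes.update(run_population(Xb, yb, subspace, k, rng))
    return sorted(f for f, _ in votes.most_common(k))

if __name__ == "__main__":
    # Synthetic stand-in data: 3 classes, 12 features, features 0 and 1 informative.
    rng = random.Random(1)
    X, y = [], []
    for label in range(3):
        for _ in range(10):
            row = [rng.gauss(0, 1) for _ in range(12)]
            row[0] += 4 * label
            row[1] -= 4 * label
            X.append(row)
            y.append(label)
    chosen = woac_select(X, y, k=2)
    print("selected features:", chosen)
    print("LOOCV misclassification rate:", loocv_error(X, y, chosen))
```

Two simplifications are worth noting. Fitness is computed by leave-one-out error on the bootstrap sample itself, so duplicated subjects can leak into the training folds; an out-of-bag estimate would be a natural alternative. The final error is also measured on the same data used for selection, so it is optimistic unless selection is nested inside the cross-validation loop.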