Text-mining-based feature selection for anticancer drug response prediction


Bibliographic Details
Published in Bioinformatics advances Vol. 4; no. 1; p. vbae047
Main Authors Wu, Grace; Zaker, Arvin; Ebrahimi, Amirhosein; Tripathi, Shivanshi; Mer, Arvind Singh
Format Journal Article
Language English
Published England Oxford University Press 2024
Summary: Motivation: Predicting anticancer treatment response from baseline genomic data is a critical obstacle in personalized medicine. Machine learning methods are commonly used for predicting drug response from gene expression data. In constructing these machine learning models, one of the most significant challenges is identifying appropriate features among a massive number of genes. Results: In this study, we utilize features (genes) extracted by text mining of the scientific literature. Using two independent cancer pharmacogenomic datasets, we demonstrate that text-mining-based features outperform traditional feature selection techniques in machine learning tasks. In addition, our analysis reveals that machine learning models built on text-mining-based features and trained on in vitro data also perform well when predicting the response of in vivo cancer models. Our results demonstrate that text-mining-based feature selection is an easy-to-implement approach that is suitable for building machine learning models for anticancer drug response prediction. Availability and implementation: https://github.com/merlab/text_features.
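
The feature selection idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above). It only shows what restricting an expression matrix to a literature-derived gene list might look like, next to a simple variance-based baseline. The function names, the toy expression values, and the gene list are all hypothetical placeholders chosen for illustration and carry no biological claim.

```python
from statistics import pvariance


def select_literature_features(expression, gene_names, literature_genes):
    """Keep only the columns whose gene symbol appears in the literature-derived set."""
    wanted = set(literature_genes)
    keep = [i for i, gene in enumerate(gene_names) if gene in wanted]
    kept_names = [gene_names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in expression]
    return kept_names, kept_rows


def select_top_variance_features(expression, gene_names, k):
    """Baseline for comparison: keep the k genes with the highest variance across samples."""
    variances = [
        pvariance([row[i] for row in expression]) for i in range(len(gene_names))
    ]
    ranked = sorted(range(len(gene_names)), key=lambda i: variances[i], reverse=True)
    keep = sorted(ranked[:k])  # restore the original column order
    kept_names = [gene_names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in expression]
    return kept_names, kept_rows


# Toy data (hypothetical): rows are samples, columns are genes.
gene_names = ["EGFR", "TP53", "ACTB", "ERBB2"]
expression = [
    [5.1, 2.0, 9.0, 1.2],
    [7.3, 2.1, 12.0, 3.4],
    [6.2, 1.9, 6.0, 0.5],
]

# Hypothetical gene list standing in for the output of a text-mining step.
# Genes absent from the expression matrix (here "KRAS") are simply skipped.
literature_genes = {"EGFR", "ERBB2", "KRAS"}

lit_names, lit_rows = select_literature_features(expression, gene_names, literature_genes)
var_names, var_rows = select_top_variance_features(expression, gene_names, k=2)

print("literature-based features:", lit_names)
print("variance-based features:", var_names)
```

In this sketch, the reduced matrices would then be passed to whatever regression or classification model is being trained. The two selectors return the same shape of output so that they can be swapped when comparing approaches.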
Grace Wu and Arvin Zaker contributed equally.
ISSN: 2635-0041
DOI: 10.1093/bioadv/vbae047