Addition of Pathway-Based Information to Improve Predictions in Transcriptomics
The diagnosis and prognosis of cancer are among the more critical challenges that modern medicine confronts. In this sense, personalized medicine aims to use data from heterogeneous sources to estimate the evolution of the disease for each specific patient in order to fit the more appropriate treatm...
Saved in:
Published in | Bioinformatics and Biomedical Engineering pp. 200 - 208 |
---|---|
Main Authors | , , , |
Format | Book Chapter |
Language | English |
Published |
Cham
Springer International Publishing
|
Series | Lecture Notes in Computer Science |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | The diagnosis and prognosis of cancer are among the more critical challenges that modern medicine confronts. In this sense, personalized medicine aims to use data from heterogeneous sources to estimate the evolution of the disease for each specific patient in order to fit the more appropriate treatments. In recent years, DNA sequencing data have boosted cancer prediction and treatment by supplying genetic information that has been used to design genetic signatures or biomarkers that led to a better classification of the different subtypes of cancer as well as to a better estimation of the evolution of the disease and the response to diverse treatments. Several machine learning models have been proposed in the literature for cancer prediction. However, the efficacy of these models can be seriously affected by the existing imbalance between the high dimensionality of the gene expression feature sets and the number of samples available, what is known as the curse of dimensionality. Although linear predictive models could give worse performance rates when compared to more sophisticated non-linear models, they have the main advantage of being interpretable. However, the use of domain-specific information has been proved useful to boost the performance of multivariate linear predictors in high dimensional settings. In this work, we design a set of linear predictive models that incorporate domain-specific information from genetic pathways for effective feature selection. By combining these linear model with other classical machine learning models, we get state-of-art performance rates in the prediction of vital status on a public cancer dataset. |
---|---|
ISBN: | 3030179346 9783030179342 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-17935-9_19 |