FeSTwo, a two-step feature selection algorithm based on feature engineering and sampling for the chronological age regression problem
Accurate determination of the sample's chronological age is an important forensic problem. This regression problem may be improved by selecting appropriate methylomic features. Most of the existing feature selection algorithms, however, optimize the regression performance by considering only th...
Saved in:
Published in | Computers in biology and medicine Vol. 125; p. 104008 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
United States
Elsevier Ltd
01.10.2020
Elsevier Limited |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Accurate determination of the sample's chronological age is an important forensic problem. This regression problem may be improved by selecting appropriate methylomic features. Most of the existing feature selection algorithms, however, optimize the regression performance by considering only the original features. This study proposed four feature engineering strategies to transform the original methylomic features. The regression performance of the age regression model was improved by the resampling-based feature selection algorithm FeSTwo proposed in this study. FeSTwo outperformed the parallel algorithms used in the previous studies even with the electronic health record data. The age prediction performance of the FeSTwo-detected features was also confirmed for another independent dataset. The study results demonstrated that the proposed model, FeSTwo, led to a more than 8% reduction in root-mean-square error (RMSE) on the test dataset with only 70 features.
•A two-step feature selection algorithm for methylomic biomarkers of age prediction.•Optimization of age prediction using feature engineering and sampling.•Gender variation, feature interaction, and squaring for engineering features. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 0010-4825 1879-0534 |
DOI: | 10.1016/j.compbiomed.2020.104008 |