An Efficient Boruta-Based Feature Selection and Classification of Gene Expression Data
Published in | 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT) pp. 1 - 6 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 07.10.2022 |
Subjects | |
Summary: | Gene expression data is biological data on the quantities of various transcription factors and other chemicals inside a cell at a given time. It is obtained from DNA microarray studies. The levels of these chemical components, as captured by gene expression data, reveal a range of facts about the cell's health. The difficulty with gene expression data is that it contains noise and missing values and has extremely high dimensionality: each of the thousands of genes in an organism's genome contributes a value, while the number of samples is considerably smaller. This leads to errors in computational analysis due to the curse of dimensionality. We use feature selection to address these issues. Feature selection chooses the genes most relevant to the subject being studied from the large number of genes whose values are provided. Our idea is to use the Boruta feature selection algorithm, a wrapper method built around random forests, to select a set of features from the many samples produced by gene expression profiles. |
---|---|
ISBN: | 9781665468534 166546853X |
DOI: | 10.1109/GCAT55367.2022.9971894 |
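
The summary describes Boruta as a wrapper around random forests. The core idea can be illustrated with a minimal sketch, which is not the authors' implementation: each real feature competes against "shadow" copies whose values have been shuffled, and a feature is kept only if its random forest importance repeatedly exceeds that of the best shadow. The sketch assumes scikit-learn and NumPy are available. The function name `boruta_sketch`, its parameters, the fixed hit-fraction threshold and the synthetic data are all hypothetical. The full Boruta algorithm instead uses a statistical test on the hit counts and removes rejected features between rounds.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def boruta_sketch(X, y, n_rounds=10, min_hit_frac=0.8, seed=0):
    """Simplified Boruta-style selection (illustrative only).

    Returns the indices of features whose importance beats the best
    shadow feature in at least `min_hit_frac` of the rounds.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    hits = np.zeros(n_features, dtype=int)

    for r in range(n_rounds):
        # Shadow features: permute each column independently, which
        # destroys any relationship between that column and the labels.
        shadows = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(n_features)]
        )
        forest = RandomForestClassifier(n_estimators=100, random_state=seed + r)
        forest.fit(np.hstack([X, shadows]), y)

        importances = forest.feature_importances_
        real = importances[:n_features]
        shadow = importances[n_features:]
        # A "hit" means the real feature outscored every shadow feature.
        hits += real > shadow.max()

    return np.flatnonzero(hits >= min_hit_frac * n_rounds)


# Synthetic stand-in for expression data (assumption: 100 samples,
# 20 "genes", of which only the first two determine the class label).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

selected = boruta_sketch(X, y)
print("Selected feature indices:", selected)
```

On real microarray data, `X` would hold one row per sample and one column per gene, so the number of columns would far exceed the number of rows. That imbalance is the setting in which filtering against shadow features is meant to help.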