Feature selection with ensemble learning for prostate cancer diagnosis from microarray gene expression

Bibliographic Details
Published in Health informatics journal Vol. 27; no. 1; p. 1460458221989402
Main Authors Gumaei, Abdu; Sammouda, Rachid; Al-Rakhami, Mabrook; AlSalman, Hussain; El-Zaart, Ali
Format Journal Article
Language English
Published London, England SAGE Publications 01.01.2021
SAGE PUBLICATIONS, INC
Summary: Cancer diagnosis using machine learning algorithms is one of the main research topics in computer-based medical science. Prostate cancer is considered one of the leading causes of death worldwide. Analyzing microarray gene expression data with machine learning and soft computing algorithms is a useful tool for detecting prostate cancer in medical diagnosis. Although traditional machine learning methods have been applied successfully to prostate cancer detection, the large number of attributes combined with the small sample size of microarray data remains a challenge that limits their effectiveness for medical diagnosis. Selecting a subset of relevant features and choosing an appropriate machine learning method can exploit the information in microarray data to improve detection accuracy. In this paper, we propose using a correlation feature selection (CFS) method with random committee (RC) ensemble learning to detect prostate cancer from microarray gene expression data. A set of experiments is conducted on a public benchmark dataset using 10-fold cross-validation to evaluate the proposed approach. The experimental results show that the proposed approach attains 95.098% accuracy, which is higher than that of related methods on the same dataset.
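The correlation feature selection idea named in the summary can be illustrated with a small sketch. CFS scores a feature subset by how strongly its features correlate with the class and how weakly they correlate with each other. The summary does not state which correlation measure or search strategy the authors used, so the absolute Pearson correlation, the greedy forward search, the function names (`pearson`, `cfs_merit`, `cfs_forward_select`), and the synthetic data below are all assumptions made for illustration. This is not the paper's implementation, and the random committee classifier is not covered.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, columns, labels):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff) for a list of feature indices.

    r_cf is the mean |feature-class| correlation and r_ff the mean
    |feature-feature| correlation (Pearson is an assumption of this sketch).
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, labels, max_features=10):
    """Greedy forward search: add the feature that raises the merit most, stop when none does."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining and len(selected) < max_features:
        best_j, best_merit = None, best
        for j in sorted(remaining):
            merit = cfs_merit(selected + [j], columns, labels)
            if merit > best_merit + 1e-12:  # small tolerance against float noise
                best_j, best_merit = j, merit
        if best_j is None:
            break
        selected.append(best_j)
        remaining.remove(best_j)
        best = best_merit
    return selected, best


# Synthetic stand-in for microarray data: few samples, many features (hypothetical values).
rng = random.Random(0)
labels = [i % 2 for i in range(40)]
informative = [y + rng.gauss(0, 0.3) for y in labels]
redundant = [v + rng.gauss(0, 0.05) for v in informative]  # near-copy of feature 0
noise = [[rng.gauss(0, 1) for _ in labels] for _ in range(28)]
columns = [informative, redundant] + noise

selected, merit = cfs_forward_select(columns, labels, max_features=5)
print("selected feature indices:", selected)
print("subset merit:", round(merit, 3))
```

Because the denominator grows with feature-feature correlation, a near-duplicate of an already selected feature adds little to the merit, which is the property that makes this kind of filter attractive when attributes far outnumber samples.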
ISSN: 1460-4582, 1741-2811
DOI: 10.1177/1460458221989402