Improved Reptile Search Optimization Algorithm Using Chaotic Map and Simulated Annealing for Feature Selection in Medical Field

Bibliographic Details
Published in IEEE Access Vol. 10; pp. 51428-51446
Main Authors Elgamal, Zenab; Sabri, Aznul Qalid Md; Tubishat, Mohammad; Tbaishat, Dina; Makhadmeh, Sharif Naser; Alomari, Osama Ahmad
Format Journal Article
Language English
Published Piscataway IEEE 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: The increased volume of medical datasets has produced high-dimensional feature sets, which negatively affect machine learning (ML) classifiers. In ML, feature selection is fundamental for keeping the most relevant features and discarding redundant and irrelevant ones. Optimization algorithms have demonstrated their capability to solve feature selection problems. The Reptile Search Algorithm (RSA) is a new nature-inspired optimization algorithm that simulates crocodiles' encircling and hunting behavior. The unique search strategy of RSA obtains promising results compared to other optimization algorithms. However, when applied to high-dimensional feature selection problems, RSA suffers from limited population diversity and a tendency to become trapped in local optima. An improved metaheuristic optimizer, the Improved Reptile Search Algorithm (IRSA), is proposed to overcome these limitations and adapt RSA to the feature selection problem. Two main improvements are made to the standard RSA. The first applies chaos theory at the initialization phase of RSA to enhance its exploration of the search space. The second combines the Simulated Annealing (SA) algorithm with the exploitation search to avoid the local optima problem. IRSA's performance was evaluated on 20 medical benchmark datasets from the UCI machine learning repository. IRSA is also compared with the standard RSA and with state-of-the-art optimization algorithms, including Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Grasshopper Optimization Algorithm (GOA) and Slime Mould Optimization (SMO). The evaluation metrics include the number of selected features, classification accuracy, fitness value, the Wilcoxon statistical test ($p$-value), and the convergence curve. Based on the results obtained, IRSA confirmed its superiority over the original RSA algorithm and the other optimization algorithms on the majority of the medical datasets.
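
The two improvements named in the summary can be illustrated with a small Python sketch. It is not the authors' implementation: the summary does not say which chaotic map, binarization rule, fitness weighting, classifier, or cooling schedule IRSA uses, so the logistic map, the 0.5 threshold, `alpha=0.99`, the geometric cooling, and the `toy_error` stand-in for a classifier's error rate are all assumptions chosen for illustration. The RSA position update itself is omitted.

```python
import math
import random


def logistic_map_population(n_agents, n_features, r=4.0, seed=0.7):
    """Chaotic initialization (assumed logistic map): x <- r * x * (1 - x).

    The seed should avoid 0, 0.25, 0.5, 0.75 and 1, where the r=4 map
    collapses to a fixed point.
    """
    x = seed
    population = []
    for _ in range(n_agents):
        row = []
        for _ in range(n_features):
            x = r * x * (1.0 - x)
            row.append(x)
        population.append(row)
    return population


def binarize(position, threshold=0.5):
    """Assumed rule: feature j is selected when its value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in position]


def fitness(mask, error_rate, alpha=0.99):
    """Assumed wrapper fitness: weighted error plus fraction of features kept (lower is better)."""
    ratio = sum(mask) / len(mask)
    return alpha * error_rate(mask) + (1.0 - alpha) * ratio


def simulated_annealing(mask, error_rate, t0=1.0, cooling=0.95, steps=200, rng=None):
    """Local search around a feature mask using single-bit flips and geometric cooling."""
    rng = rng or random.Random(0)
    current, current_fit = list(mask), fitness(mask, error_rate)
    best, best_fit = list(current), current_fit
    temp = t0
    for _ in range(steps):
        neighbor = list(current)
        j = rng.randrange(len(neighbor))
        neighbor[j] = 1 - neighbor[j]  # flip one feature in or out
        neighbor_fit = fitness(neighbor, error_rate)
        delta = neighbor_fit - current_fit
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_fit = neighbor, neighbor_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        temp *= cooling
    return best, best_fit


def toy_error(mask):
    """Hypothetical stand-in for a classifier's error: only features 0 and 2 are informative."""
    missing = 2 - mask[0] - mask[2]
    return 0.5 * missing / 2
```

In a real experiment `toy_error` would be replaced by the cross-validated error of a classifier trained on the selected columns, and `simulated_annealing` would be called on candidate solutions produced by the RSA exploitation step, for example `simulated_annealing(binarize(logistic_map_population(5, 8)[0]), toy_error)`.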
ISSN:2169-3536
DOI:10.1109/ACCESS.2022.3174854