Feature selection based on mutual information with correlation coefficient

Bibliographic Details
Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 52, No. 5, pp. 5457-5474
Main Authors: Zhou, Hongfang; Wang, Xiqian; Zhu, Rourou
Format: Journal Article
Language: English
Published: New York: Springer US, 01.03.2022
Summary: Feature selection is an important preprocessing step in machine learning. It selects the crucial features by removing irrelevant or redundant features from the original feature set. Most feature selection algorithms focus on maximizing relevant information and minimizing redundant information. In order to remove more redundant information in the evaluation criterion, we propose a feature selection method based on mutual information with the correlation coefficient (CCMI) in this paper. We introduce the correlation coefficient and combine it with mutual information to measure the relationship between different features. We use the absolute value of the correlation coefficient between two different features as the weight of the redundancy term, which is expressed by mutual information, in the evaluation criterion. In order to select low-redundancy features effectively, we also apply the principle of minimization in the evaluation criterion. In comparisons with 7 popular algorithms on 12 data sets, CCMI achieved the highest average classification accuracy with two classifiers, SVM and KNN. Experimental results show that the proposed CCMI has better feature classification capability.
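
The abstract describes the CCMI criterion only at a high level. The sketch below shows one way such a correlation-weighted, minimization-based criterion could be applied in a greedy forward search; the exact scoring formula, the discretization used to estimate pairwise mutual information, and the function name ccmi_select are assumptions made for illustration, not the authors' published implementation.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

def ccmi_select(X, y, k, n_bins=10):
    """Greedily select k features from X of shape (n_samples, n_features)."""
    n_features = X.shape[1]
    # Relevance term: mutual information between each feature and the class label.
    relevance = mutual_info_classif(X, y)
    # Pairwise |correlation| between features, used to weight the redundancy term.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    # Discretize each feature once so pairwise mutual information can be estimated.
    binned = np.column_stack([
        np.digitize(X[:, j], np.histogram_bin_edges(X[:, j], n_bins))
        for j in range(n_features)
    ])
    selected, remaining = [], list(range(n_features))
    while len(selected) < k and remaining:
        best_f, best_score = None, -np.inf
        for f in remaining:
            if selected:
                # Redundancy: mutual information with an already-selected feature,
                # weighted by |correlation|; the candidate whose worst-case
                # redundancy is smallest (relative to its relevance) wins.
                redundancy = max(
                    corr[f, s] * mutual_info_score(binned[:, f], binned[:, s])
                    for s in selected
                )
            else:
                redundancy = 0.0
            score = relevance[f] - redundancy
            if score > best_score:
                best_f, best_score = f, score
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

# Example usage on any numeric feature matrix X and label vector y:
#   idx = ccmi_select(X, y, k=5)
#   X_reduced = X[:, idx]
```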
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-021-02524-x