Genetic Algorithm-Neural Network (GANN): a study of neural network activation functions and depth of genetic algorithm search applied to feature selection

Bibliographic Details
Published in: International Journal of Machine Learning and Cybernetics, Vol. 1, No. 1-4, pp. 75-87
Main Authors: Tong, Dong Ling; Mintram, Robert
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer-Verlag, 01.12.2010 (Springer Nature B.V.)

Summary: Hybrid genetic algorithms (GA) and artificial neural networks (ANN) are not new to machine learning. Such hybrid systems have proven very successful in classification and prediction problems. However, little attention has been paid to this architecture as a feature selection method, or to the significance of the ANN activation function and the number of GA evaluations for feature selection performance. The activation function is one of the core components of the ANN architecture and influences the learning and generalization capability of the network. Meanwhile, the GA searches for an optimal ANN classifier given a set of chromosomes selected from those available. The objective of the GA is to combine the search for optimal chromosome choices with the search for an optimal classifier for each choice. The process operates as a form of co-evolution, with the eventual objective of finding an optimal chromosome selection rather than an optimal classifier. The selection of an optimal chromosome set is referred to in this paper as feature selection. Quantitative comparisons of four of the most commonly used ANN activation functions against ten GA evaluation step counts and three population sizes are presented. These studies employ four high-dimensional data sets with few significant instances; that is, each datum has a high attribute count, and the unusual or abnormal data are sparse within the data set. Results suggest that the hyperbolic tangent (tanh) activation function outperforms the other common activation functions by extracting a smaller but more significant feature set. Furthermore, it was found that fitness evaluation counts ranging from 20,000 to 40,000, with population sizes ranging from 200 to 300, deliver optimal feature selection capability; optimal here again means a smaller but more significant feature set.
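The abstract describes the GANN architecture only at a high level, so the sketch below is an illustrative reconstruction rather than the authors' implementation: each chromosome is a binary feature mask, the fitness of a mask is the accuracy of a small one-hidden-layer tanh network trained on the selected features minus a per-feature penalty (an assumed trade-off to favour compact subsets), and the GA evolves the population by truncation selection, one-point crossover, and bit-flip mutation. All parameters here (hidden layer size, penalty weight, mutation rate, the toy data set) are assumptions for the example; the paper's reported optimum uses populations of 200-300 and 20,000-40,000 fitness evaluations, far larger than this demo.

```python
# Minimal sketch of GA-driven feature selection with a small tanh ANN as the
# fitness evaluator. Encoding, fitness penalty, and network shape are
# illustrative assumptions, not the authors' exact method.
import numpy as np

rng = np.random.default_rng(0)

def train_ann(X, y, hidden=8, epochs=200, lr=0.1):
    """Train a one-hidden-layer tanh network with plain gradient descent."""
    n, d = X.shape
    W1 = rng.normal(0, 0.5, (d, hidden))
    W2 = rng.normal(0, 0.5, (hidden, 1))
    for _ in range(epochs):
        H = np.tanh(X @ W1)                      # hidden layer, tanh activation
        p = 1 / (1 + np.exp(-(H @ W2)))          # sigmoid output for 2 classes
        g = (p - y[:, None]) / n                 # gradient of the log-loss
        W2 -= lr * H.T @ g
        W1 -= lr * X.T @ ((g @ W2.T) * (1 - H**2))
    return W1, W2

def fitness(mask, X, y):
    """Accuracy of an ANN trained on the selected features, with a small
    penalty per feature to favour compact subsets (an assumed trade-off)."""
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask.astype(bool)]
    W1, W2 = train_ann(Xs, y)
    p = 1 / (1 + np.exp(-(np.tanh(Xs @ W1) @ W2)))
    acc = np.mean((p[:, 0] > 0.5) == y)
    return acc - 0.01 * mask.sum()

# Toy data: 200 samples, 30 features, only the first 3 carry signal.
X = rng.normal(size=(200, 30))
y = (X[:, :3].sum(axis=1) > 0).astype(float)

pop_size, n_gen = 20, 15                 # far smaller than the paper's settings
pop = (rng.random((pop_size, X.shape[1])) < 0.3).astype(int)
for gen in range(n_gen):
    scores = np.array([fitness(ind, X, y) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
    children = []
    while len(children) < pop_size - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, X.shape[1])
        child = np.concatenate([a[:cut], b[cut:]])   # one-point crossover
        flip = rng.random(X.shape[1]) < 0.02         # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind, X, y) for ind in pop])]
print("selected features:", np.flatnonzero(best))
```

On this toy data, the per-feature penalty pressures the GA toward the three informative features; swapping np.tanh for another activation in the hidden layer is the kind of activation function comparison the study quantifies.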
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-010-0004-x