Feature selection for speaker verification using genetic programming
We present a study examining feature selection from high performing models evolved using genetic programming (GP) on the problem of automatic speaker verification (ASV). ASV is a highly unbalanced binary classification problem in which a given speaker must be verified against everyone else. We evolv...
Saved in:
Published in | Evolutionary intelligence Vol. 10; no. 1-2; pp. 1 - 21 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
Berlin/Heidelberg
Springer Berlin Heidelberg
01.07.2017
Springer Nature B.V |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | We present a study examining feature selection from high performing models evolved using genetic programming (GP) on the problem of automatic speaker verification (ASV). ASV is a highly unbalanced binary classification problem in which a given speaker must be verified against everyone else. We evolve classification models for 10 individual speakers using a variety of fitness functions and data sampling techniques and examine the generalisation of each model on a 1:9 unbalanced set. A significant difference between train and test performance is found which may indicate overfitting in the models. Using only the best generalising models, we examine two methods for selecting the most important features. We compare the performance of a number of tuned machine learning classifiers using the full 275 features and a reduced set of 20 features from both feature selection methods. Results show that using only the top 20 features found in high performing GP programs led to test classifications that are as good as, or better than, those obtained using all data in the majority of experiments undertaken. The classification accuracy between speakers varies considerably across all experiments showing that some speakers are easier to classify than others. This indicates that in such real-world classification problems, the content and quality of the original data has a very high influence on the quality of results obtainable. |
---|---|
ISSN: | 1864-5909 1864-5917 |
DOI: | 10.1007/s12065-016-0150-5 |