Genetic Algorithm Based Feature Selection and Parameter Optimization for Support Vector Regression Applied to Semantic Textual Similarity


Bibliographic Details
Published in	Shanghai jiao tong da xue xue bao Vol. 20; no. 2; pp. 143 - 148
Main Authors	SU Bai-hua (苏柏桦), WANG Ying-lin (王英林)
Format	Journal Article
Language	English
Published	Heidelberg Shanghai Jiaotong University Press 01.04.2015
Subjects	Architecture; Computer Science; Electrical Engineering; Engineering; Genetic algorithms; Learning; Life Sciences; Materials Science; Mathematical analysis; Mathematical models; Regression; Semantics; Similarity; Tasks; TP 311.5; support vector regression (SVR); semantic textual similarity (STS); feature selection
ISSN	1007-1172 1995-8188
DOI	10.1007/s12204-015-1602-2

More Information
Summary:	Semantic textual similarity (STS) is a common task in natural language processing (NLP). STS measures the degree of semantic equivalence between two textual snippets. Recently, machine learning methods have been applied to this task, including methods based on support vector regression (SVR). However, a large number of features are involved in the learning process, some of which are noisy and irrelevant to the result. Furthermore, the choice of parameters significantly influences the prediction performance of the SVR model. In this paper, we propose a genetic algorithm (GA) that selects the effective features and optimizes the parameters of the learning process simultaneously. To evaluate the proposed approach, we use the STS-2012 dataset in the experiment. Compared with grid search, the proposed GA-based approach achieves better regression performance.
Bibliography:	31-1943/U; Keywords: support vector regression (SVR), feature selection, semantic textual similarity (STS); SU Bai-hua, WANG Ying-lin (1. Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240, China; 2. Department of Computer Science and Technology, Shanghai University of Finance and Economics, Shanghai 200433, China)
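
Illustrative sketch (not from the article): the summary describes a GA that selects features and tunes SVR parameters at the same time, but this record gives no details of the encoding or operators. The Python sketch below shows one common way such a search can be set up, under stated assumptions: each chromosome is a feature bit mask plus log10-scaled values for C, gamma and epsilon; the bounds in `PARAM_BOUNDS`, the uniform crossover, the Gaussian mutation, the tournament selection and the one-individual elitism are all assumptions, and `toy_fitness` is a hypothetical stand-in for a real regression score.

```python
import random

# Assumed log10 search ranges for the SVR parameters C, gamma and epsilon.
PARAM_BOUNDS = [(-2.0, 3.0), (-4.0, 1.0), (-3.0, 0.0)]


def random_individual(n_features, rng):
    """Chromosome = (feature bit mask, log10-scaled SVR parameters)."""
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    params = [rng.uniform(lo, hi) for lo, hi in PARAM_BOUNDS]
    return mask, params


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    mask = [x if rng.random() < 0.5 else y for x, y in zip(a[0], b[0])]
    params = [x if rng.random() < 0.5 else y for x, y in zip(a[1], b[1])]
    return mask, params


def mutate(ind, rng, p_flip=0.05, sigma=0.3):
    """Flip mask bits with probability p_flip; jitter parameters within bounds."""
    mask = [(not bit) if rng.random() < p_flip else bit for bit in ind[0]]
    params = [min(hi, max(lo, g + rng.gauss(0.0, sigma)))
              for g, (lo, hi) in zip(ind[1], PARAM_BOUNDS)]
    return mask, params


def tournament(scored, rng, k=3):
    """Return the fittest of k randomly drawn (fitness, individual) pairs."""
    return max(rng.sample(scored, k), key=lambda s: s[0])[1]


def evolve(fitness, n_features, pop_size=30, generations=40, seed=0):
    """Maximize fitness(individual); returns (best individual, best-per-generation)."""
    rng = random.Random(seed)
    pop = [random_individual(n_features, rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]
        elite = max(scored, key=lambda s: s[0])
        history.append(elite[0])
        children = [
            mutate(crossover(tournament(scored, rng), tournament(scored, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite[1]] + children  # elitism: the best individual survives unchanged
    return max(pop, key=fitness), history


def decode(ind):
    """Turn a chromosome into selected feature indices and SVR parameter values."""
    mask, params = ind
    selected = [i for i, bit in enumerate(mask) if bit]
    names = ("C", "gamma", "epsilon")
    return selected, {name: 10.0 ** g for name, g in zip(names, params)}


def toy_fitness(ind):
    """Hypothetical stand-in score: features 0 and 2 help, others hurt, log10(C) near 1 is best."""
    mask, params = ind
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in (0, 2))
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in (0, 2))
    return hits - 0.5 * noise - abs(params[0] - 1.0)
```

Calling `evolve(toy_fitness, n_features=6)` returns the best chromosome and the best fitness seen in each generation; `decode` then yields the selected feature indices and the parameter values. In a real experiment the fitness function would presumably train an SVR on the selected feature columns and return a validation score (for example via scikit-learn's `SVR` and `cross_val_score`), which is where the comparison with grid search mentioned in the summary would be made.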