A Chinese Microblog Opinion Target Extraction Method Based on Multi-Feature Fusion
Published in | 计算机应用研究 Vol. 33; no. 2; pp. 378 - 383 |
---|---|
Main Author | Li Jingyu |
Format | Journal Article |
Language | Chinese |
Published | Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, 2016 |
ISSN | 1001-3695 |
DOI | 10.3969/j.issn.1001-3695.2016.02.013 |
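Summary: | Opinion target extraction from Chinese microblogs is a foundational task in Chinese microblog sentiment analysis; it has attracted wide attention from researchers and has significant research value. Taking the characteristics of microblog text into account, the method first preprocesses the text, then uses syntactic analysis to build a candidate set of opinion targets comprising nouns, noun phrases, and microblog topics, and finally filters the candidates by multi-feature fusion, using an SVM model and a weighted model respectively. The features used are semantic role information, minimum distance, and term frequency. Experiments show the algorithm to be effective: after candidate filtering, the SVM model reaches an F-value of 0.3573 and the weighted model reaches an F-value of 0.4059. |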
---|---|
Bibliography: | Keywords: opinion target; candidate set of opinion target; syntactic analysis; semantic role labeling; support vector machine. 51-1196/TP. Micro blogs have become widely popular and have been drawing more and more attention from researchers. Based on the characteristics of Chinese micro blogs, this paper puts forward a three-step strategy. It first normalizes the corpus, then builds an opinion target candidate set including nouns, noun phrases, and micro blog topics, and finally applies SVM and score rank to filter the candidate set with respect to semantic role labeling, minimum distance, and term frequency. Experimental results confirm that the algorithm is effective: the SVM model achieves a 30.42% F-value, and the score rank achieves a 40.59% F-value. Li Jingyu, Zhang Yangsen, Jiang Yuru (Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, China) |
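
The summary describes a weighted model that fuses semantic role information, minimum distance, and term frequency to filter candidate opinion targets. The record does not give the paper's actual scoring formula, so the following is only a minimal sketch of one way such a weighted ranking could look. The `Candidate` fields, the normalization, the weights `(0.5, 0.3, 0.2)`, and the reading of "minimum distance" as the token distance to the nearest sentiment word are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate opinion target (hypothetical structure for illustration)."""
    text: str
    has_semantic_role: bool  # assumed: candidate fills a semantic role of some predicate
    min_distance: int        # assumed: tokens between candidate and nearest sentiment word
    frequency: int           # occurrences of the candidate in the microblog text


def score(c, max_distance, max_frequency, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of three features, each scaled to [0, 1].

    The weights are placeholders, not values from the paper.
    """
    w_role, w_dist, w_freq = weights
    role = 1.0 if c.has_semantic_role else 0.0
    # A smaller distance should score higher, so invert the scaled distance.
    closeness = 1.0 - c.min_distance / max_distance if max_distance else 1.0
    freq = c.frequency / max_frequency if max_frequency else 0.0
    return w_role * role + w_dist * closeness + w_freq * freq


def rank_candidates(candidates, weights=(0.5, 0.3, 0.2)):
    """Return (text, score) pairs sorted from most to least likely target."""
    if not candidates:
        return []
    max_d = max(c.min_distance for c in candidates)
    max_f = max(c.frequency for c in candidates)
    scored = [(c.text, score(c, max_d, max_f, weights)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    demo = [
        Candidate("screen", True, 1, 3),
        Candidate("phone", False, 4, 2),
        Candidate("weather", False, 8, 1),
    ]
    for text, s in rank_candidates(demo):
        print(f"{text}\t{s:.4f}")
```

In a setup like this, the top-ranked candidates (or those above some threshold) would be kept as opinion targets. The SVM variant mentioned in the summary would instead feed the same three features to a trained classifier.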