A Multi-Feature Fusion Method for Opinion Target Extraction from Chinese Microblogs

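Opinion target extraction from Chinese microblogs is a foundational task for Chinese microblog sentiment analysis; it has attracted wide attention from researchers and has significant research value. Taking the characteristics of microblog text into account, the method first preprocesses the text, then uses syntactic analysis to build a candidate set of opinion targets that includes nouns, noun phrases, and microblog topics. It then filters the candidates by fusing multiple features, once with an SVM model and once with a weighted model; the features are semantic role information, minimum distance, and term frequency. Experiments show the algorithm is effective: after candidate filtering, the SVM model reaches an F-value of 0.3573 and the weighted model reaches an F-value of 0.4059.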
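The weighted filtering step can be illustrated with a minimal sketch. The record does not give the paper's actual weights, normalization, or scoring formula, so everything below is an assumption made for illustration: the `Candidate` fields, the `WEIGHTS` values, and the `score` and `rank_candidates` helpers are hypothetical names. The sketch only shows how three features of the kind named in the abstract could be combined into one score and used to rank candidates.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    is_semantic_argument: bool  # assumed encoding of the semantic role feature
    min_distance: int           # assumed: tokens to the nearest opinion word
    frequency: int              # how often the candidate occurs


# Hypothetical weights; the paper's real values are not given in this record.
WEIGHTS = {"role": 0.5, "distance": 0.3, "frequency": 0.2}


def score(candidate: Candidate, max_frequency: int) -> float:
    """Combine the three features into a single weighted score in [0, 1]."""
    role = 1.0 if candidate.is_semantic_argument else 0.0
    # Closer candidates score higher; distance 0 maps to 1.0.
    distance = 1.0 / (1.0 + candidate.min_distance)
    # Normalize frequency by the largest frequency in the candidate set.
    frequency = candidate.frequency / max_frequency if max_frequency else 0.0
    return (WEIGHTS["role"] * role
            + WEIGHTS["distance"] * distance
            + WEIGHTS["frequency"] * frequency)


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates sorted from highest to lowest weighted score."""
    if not candidates:
        return []
    max_frequency = max(c.frequency for c in candidates)
    return sorted(candidates,
                  key=lambda c: score(c, max_frequency),
                  reverse=True)
```

In a setup like this, the top-ranked candidates (or those above a chosen threshold) would be kept as opinion targets. The SVM variant would instead pass the same feature values to a trained classifier.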

Bibliographic Details
Published in	计算机应用研究 (Application Research of Computers) Vol. 33; no. 2; pp. 378-383
Main Author	Li Jingyu (李景玉); Zhang Yangsen (张仰森); Jiang Yuru (蒋玉茹)
Format	Journal Article
Language	Chinese
Published	Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, China; 2016
Subjects	opinion target (评价对象); candidate set of opinion target (评价对象候选集); syntactic analysis (句法分析); semantic role labeling (语义角色标注); support vector machine (支持向量机)
ISSN	1001-3695
DOI	10.3969/j.issn.1001-3695.2016.02.013

Bibliography:	opinion target; candidate set of opinion target; syntactic analysis; semantic role labeling; support vector machine 51-1196/TP Widely known, micro blogs have been drawing more and more attention from researchers. Based on the characteristics of Chinese micro blogs, this paper puts forward a three-step strategy. First, the corpus is normalized. Next, an opinion target candidate set is built, including nouns, noun phrases and micro blog topics. Finally, SVM and score rank are applied to filter the candidate set, using semantic role labeling, minimum distance and term frequency. Experimental results confirm that the algorithm is effective: the SVM model achieves a 30.42% F-value, and the score rank achieves a 40.59% F-value. Li Jingyu, Zhang Yangsen, Jiang Yuru (Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, China)