Opinion Target Extraction from Chinese Microblogs Based on Multi-Feature Fusion

Bibliographic Details
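Published in: 计算机应用研究 (Application Research of Computers), Vol. 33, no. 2, pp. 378-383
Main Authors: Li Jingyu (李景玉), Zhang Yangsen (张仰森), Jiang Yuru (蒋玉茹)
Format: Journal Article
Language: Chinese
Published: Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, 2016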
ISSN: 1001-3695
DOI: 10.3969/j.issn.1001-3695.2016.02.013

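Summary: Opinion target extraction from Chinese microblogs is a fundamental task in Chinese microblog sentiment analysis; it has attracted wide attention from researchers and has significant research value. Taking the characteristics of microblog text into account, the method first preprocesses the text, then uses syntactic analysis to build a candidate set of opinion targets comprising nouns, noun phrases, and microblog topics, and finally filters the candidates by multi-feature fusion, implemented separately with an SVM model and a weighted model. The features used are semantic role information, minimum distance, and term frequency. Experiments show the algorithm to be effective: after candidate filtering, the SVM model reaches an F-value of 0.3573 and the weighted model reaches an F-value of 0.4059.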
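The weighted model named in the summary can be pictured with a minimal sketch. The record does not give the actual weights, normalization, or threshold, so everything below is an assumption made for illustration: the `Candidate` fields, the weights `(0.5, 0.3, 0.2)`, the cap `max_tf`, the reading of "minimum distance" as a token distance to the nearest sentiment word, and the 0.5 cutoff are hypothetical and are not the authors' values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate opinion target with the three features named in the summary."""
    text: str
    has_semantic_role: bool  # assumed: candidate fills a semantic role of a predicate
    min_distance: int        # assumed: tokens to the nearest sentiment word
    term_frequency: int      # occurrences of the candidate in the corpus


def score(c: Candidate, weights=(0.5, 0.3, 0.2), max_tf: int = 10) -> float:
    """Fuse the three features into one score in [0, 1] (hypothetical weighting)."""
    w_role, w_dist, w_tf = weights
    role = 1.0 if c.has_semantic_role else 0.0
    closeness = 1.0 / (1.0 + c.min_distance)       # nearer candidates score higher
    freq = min(c.term_frequency, max_tf) / max_tf  # capped, normalized frequency
    return w_role * role + w_dist * closeness + w_tf * freq


def select_targets(candidates, threshold: float = 0.5):
    """Rank candidates by fused score and keep those at or above the threshold."""
    ranked = sorted(candidates, key=score, reverse=True)
    return [c.text for c in ranked if score(c) >= threshold]


if __name__ == "__main__":
    pool = [
        Candidate("screen", True, 1, 5),
        Candidate("today", False, 6, 2),
    ]
    print(select_targets(pool))
```

In this toy setup a noun that fills a semantic role and sits next to a sentiment word outranks a distant, role-less noun. An SVM-based variant would instead feed the same three features to a classifier trained on labeled candidates.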
Keywords: opinion target; candidate set of opinion target; syntactic analysis; semantic role labeling; support vector machine
CN: 51-1196/TP
Microblogs have become widely used and are drawing increasing attention from researchers. Based on the characteristics of Chinese microblogs, this paper puts forward a three-step strategy. First, the corpus is normalized. Next, an opinion target candidate set is built, comprising nouns, noun phrases, and microblog topics. Finally, an SVM model and a score-ranking model are applied to filter the candidate set, using semantic role labeling, minimum distance, and term frequency as features. Experimental results confirm that the algorithm is effective: the SVM model achieves an F-value of 30.42%, and the score-ranking model achieves an F-value of 40.59%.
Li Jingyu, Zhang Yangsen, Jiang Yuru (Institute of Intelligence Information Processing, Beijing Information Science & Technology University, Beijing 100192, China)
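For readers unfamiliar with the F-value reported above, the following is a small sketch of how such a score is commonly computed. It assumes the balanced F-measure (the harmonic mean of precision and recall) and exact string matching between extracted and gold opinion targets. The record does not state the matching criterion used in the paper, so both choices are assumptions, and `f_value` is a hypothetical helper name.

```python
def f_value(predicted, gold) -> float:
    """Balanced F-measure over exact-match opinion targets (assumed criterion)."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)        # targets that were extracted and are in the gold set
    if correct == 0:
        return 0.0                         # also covers empty predictions or empty gold set
    precision = correct / len(predicted)   # share of extracted targets that are right
    recall = correct / len(gold)           # share of gold targets that were found
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # One of two extracted targets is correct; one of three gold targets is found.
    print(f_value(["screen", "today"], ["screen", "battery", "price"]))
```

With a precision of 1/2 and a recall of 1/3, the harmonic mean works out to 0.4, which is on the same scale as the values reported in the abstract.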