Multi-Aspect Performance Recognition in Automobile Review Text Based on Multi-Label Learning
Published in | 计算机工程与科学 (Computer Engineering & Science) Vol. 38; no. 1; pp. 188-194 |
---|---|
Main Author | ZHANG Jing |
Format | Journal Article |
Language | Chinese |
Published | School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, 2016 |
Subjects | |
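Summary: | To address the multiple performance aspects that appear in automobile product review text, a method based on multi-label learning is proposed for recognizing multiple performance aspects in automobile reviews. First, text mining methods are combined with a multi-label text feature selection method to select features, converting the unstructured text into a structured multi-label dataset. On this basis, four multi-label classification methods are used to assign one or more aspect labels to each review document to be recognized. Finally, eight multi-label evaluation metrics are used to assess aspect recognition performance. Experiments on the Sina automobile review corpus show that the subset accuracy of aspect recognition reaches 95%, verifying the feasibility of the method. |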
Bibliography: | ZHANG Jing, LI De-yu, WANG Su-ge (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China) To address the multiple performance aspects that appear in automotive product reviews, this paper proposes a method for recognizing multiple performance aspects in car review text based on multi-label learning. First, appropriate words are selected as features using a multi-label text feature selection method combined with text mining techniques, and the unstructured document corpus is transformed into a structured multi-label dataset. Next, four multi-label classification methods are used to assign one or more aspect labels to each unrecognized review text. Finally, the recognition performance for multiple aspects is assessed with eight multi-label evaluation metrics. On the Sina car review corpus, experimental results indicate that the subset accuracy reaches 95%. Hence, our method was feasible for recognizing the multiple aspects of automobi |
ISSN: | 1007-130X |
DOI: | 10.3969/j.issn.1007-130X.2016.01.031 |
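
The summary reports results in terms of subset accuracy, which counts a review as correct only when its predicted set of aspect labels matches the true set exactly. The following is a minimal sketch of that metric, not the authors' implementation. The aspect names and the toy data are hypothetical. Hamming loss is included only as a common companion metric; the record does not say which eight metrics the paper uses.

```python
from typing import List, Set


def subset_accuracy(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Fraction of documents whose predicted label set equals the true set exactly."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("inputs must be non-empty and of equal length")
    exact_matches = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact_matches / len(true_sets)


def hamming_loss(true_sets: List[Set[str]], pred_sets: List[Set[str]],
                 label_space: Set[str]) -> float:
    """Fraction of (document, label) pairs that are wrongly assigned or missed."""
    if len(true_sets) != len(pred_sets) or not true_sets or not label_space:
        raise ValueError("inputs must be non-empty and of equal length")
    # The symmetric difference holds labels present in exactly one of the two sets.
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * len(label_space))


# Hypothetical aspect labels and toy annotations, for illustration only.
ASPECTS = {"fuel", "comfort", "handling", "space"}
true_labels = [{"fuel", "comfort"}, {"handling"}, {"space", "comfort"}, {"fuel"}]
pred_labels = [{"fuel", "comfort"}, {"handling", "comfort"}, {"space", "comfort"}, {"fuel"}]

subset_acc = subset_accuracy(true_labels, pred_labels)
h_loss = hamming_loss(true_labels, pred_labels, ASPECTS)
print(f"subset accuracy = {subset_acc}, Hamming loss = {h_loss}")
```

Subset accuracy is the strictest of the usual multi-label metrics: in the toy data, the second review gets one extra label and is counted as fully wrong, while Hamming loss penalizes only that single label.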