Computing Word Similarity on Large-Scale Corpus

Bibliographic Details
Published in: 2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC), pp. 1076-1079
Main Authors: Tao Xu, Weiguang Qu, Xuri Tang, Dexin Ding, Bin Li, Hui Li
Format: Conference Proceeding
Language: English
Published: IEEE 01.12.2009
Summary: This paper proposes a novel approach for word similarity computation based on word sense vectors. The word sense vector is built using HIT-IR Tongyici Cilin (extended) for concept generalization and is further modified by the use of relative and absolute frequency filters. Experiments show that the approach not only overcomes the problem of similarity computation of unseen words but also yields a result closer to human judgment when compared to word similarity computation approaches based on dictionaries.
ISBN: 142445543X, 9781424455430
DOI: 10.1109/ICICIC.2009.145