A practical framework for formalizing and extracting Chinese collocations

Bibliographic Details
Published in 2011 7th International Conference on Natural Language Processing and Knowledge Engineering, pp. 390-396
Main Authors Weiguang Qu, Xuri Tang, Junsheng Zhou, Yanhui Gu, Bin Li
Format Conference Proceeding
Language English
Published IEEE 01.11.2011

Summary: In this paper we argue for a word-sense-based formalization of collocation and propose a seed-based approach to collocation extraction for specific purposes. The approach uses the RFR_SUM model to iteratively classify the senses of polysemous words in the corpus. Collocation strength is also obtained by RFR. To capture the syntactic relations inside collocations, the paper presents a frame-based extraction method that uses word-related frames to automatically obtain collocations with structural information from a large-scale corpus, with an average accuracy of 89.69%.
DOI: 10.1109/NLPKE.2011.6138230