Resources for Turkish morphological processing
Published in | Language Resources and Evaluation Vol. 45; no. 2; pp. 249 - 261 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Dordrecht: Springer, 01.05.2011; Springer Netherlands; Springer Nature B.V |
Summary: | We present a set of language resources and tools—a morphological parser, a morphological disambiguator, and a text corpus—for exploiting Turkish morphology in natural language processing applications. The morphological parser is a state-of-the-art finite-state transducer-based implementation of Turkish morphology. The disambiguator is based on the averaged perceptron algorithm and has the best accuracy reported for Turkish in the literature. The text corpus has been compiled from the web and contains about 500 million tokens. This is the largest Turkish web corpus published. |
---|---|
ISSN: | 1574-020X 1572-8412 1574-0218 |
DOI: | 10.1007/s10579-010-9128-6 |