Resources for Turkish morphological processing

Bibliographic Details
Published in: Language Resources and Evaluation, Vol. 45, no. 2, pp. 249-261
Main Authors: Sak, Haşim; Güngör, Tunga; Saraçlar, Murat
Format: Journal Article
Language: English
Published: Dordrecht, Springer, 01.05.2011
Springer Netherlands
Springer Nature B.V
Summary: We present a set of language resources and tools—a morphological parser, a morphological disambiguator, and a text corpus—for exploiting Turkish morphology in natural language processing applications. The morphological parser is a state-of-the-art finite-state transducer-based implementation of Turkish morphology. The disambiguator is based on the averaged perceptron algorithm and has the best accuracy reported for Turkish in the literature. The text corpus has been compiled from the web and contains about 500 million tokens. This is the largest Turkish web corpus published.
ISSN: 1574-020X
1572-8412
1574-0218
DOI: 10.1007/s10579-010-9128-6