A Korean Learner Corpus and Its Features

Bibliographic Details
Published in: EONEOHAG no. 75; pp. 69-85
Main Authors: Jungyeul Park (박정열), Jung Hee Lee (이정희)
Format: Journal Article
Language: English
Published: 사단법인 한국언어학회, 01.08.2016
ISSN: 1225-7494, 2508-4429
DOI: 10.17290/jlsk.2016..75.69

Summary: In this paper, we present a Korean learner corpus and aim to find features that characterize it. The corpus is based on an open writing test given to Korean learners (beginner, intermediate, and advanced students) on various topics; the responses were manually evaluated and scored. We explore several types of features in the learner corpus, starting with the pre-processing of Korean sentences. Some features are measured automatically using part-of-speech tagging; these concern the number of tokens and the correct use of functional morphemes. Syntax-related and topic-related features are measured using automatic syntactic parsing and statistical language models. These features can be used for language proficiency identification and for other learner-corpus applications that rely on machine learning techniques.
KCI Citation Count: 0
Bibliography: G704-000314.2016..75.004
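
The summary mentions features derived from part-of-speech tagging, such as token counts and the use of functional morphemes. As a rough illustration only, the sketch below computes surface counts from text that has already been tagged. It assumes Sejong-style tags, in which particle tags begin with "J" and ending tags begin with "E"; the record does not state which tag set or tagger the authors used, and the function name `surface_features` is hypothetical. The sketch only counts how often functional morphemes occur; judging whether they are used correctly, as the paper describes, would need additional information that is not modeled here.

```python
# Assumption: Sejong-style POS tags, where particle tags start with "J"
# (e.g. JX, JKB) and ending tags start with "E" (e.g. EF, EC).
# The tag set actually used in the paper is not given in this record.
FUNCTIONAL_PREFIXES = ("J", "E")


def surface_features(tagged_sentences):
    """Compute simple counts from POS-tagged text.

    tagged_sentences: list of sentences, each a list of (morpheme, tag) pairs.
    Returns a dict of counts and ratios (hypothetical feature names).
    """
    n_sentences = len(tagged_sentences)
    tags = [tag for sentence in tagged_sentences for _, tag in sentence]
    n_morphemes = len(tags)
    # str.startswith accepts a tuple of prefixes.
    n_functional = sum(1 for tag in tags if tag.startswith(FUNCTIONAL_PREFIXES))
    return {
        "n_sentences": n_sentences,
        "n_morphemes": n_morphemes,
        "morphemes_per_sentence": n_morphemes / n_sentences if n_sentences else 0.0,
        "functional_ratio": n_functional / n_morphemes if n_morphemes else 0.0,
    }


# Hand-tagged example sentence: "저는 학교에 갑니다" ("I go to school").
example = [[
    ("저", "NP"), ("는", "JX"),
    ("학교", "NNG"), ("에", "JKB"),
    ("가", "VV"), ("ㅂ니다", "EF"),
]]
features = surface_features(example)
```

A dictionary of this kind could serve as one row of a feature table for a proficiency classifier; the syntax-related and language-model features named in the summary would need a parser and a trained model and are left out of this sketch.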