Study of Machine-Learning Classifier and Feature Set Selection for Intent Classification of Korean Tweets about Food Safety

Bibliographic Details
Published in: Journal of Information Science Theory and Practice, Vol. 2, no. 3, pp. 29-39
Main Authors: Yeom, Ha-Neul; Hwang, Myunggwon; Hwang, Mi-Nyeong; Jung, Hanmin
Format: Journal Article
Language: English
Published: Daejeon: Korean Institute of Science and Technology Information, 01.09.2014
Korea Institute of Science and Technology Information
한국과학기술정보연구원
ISSN: 2287-9099
2287-4577
DOI: 10.1633/JISTaP.2014.2.3.3

Summary: In recent years, several studies have proposed using the Twitter micro-blogging service to track trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Because keyword use in most tweets is irregular, we focus on selecting an optimal machine-learning classifier and feature set for classifying the collected tweets. We build classifier models using the Naive Bayes, Naive Bayes Multinomial, Support Vector Machine, and Decision Tree algorithms, all of which show good performance. To select an optimum feature set, we first construct a basic feature set as a baseline for performance comparison, against which further test feature sets are evaluated. Experiments show that precision and F-measure are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting the Substantive, Predicate, Modifier, and Interjection parts of speech.
Bibliography: G704-001608.2014.2.3.001
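
The approach described in the summary (filter each tweet to selected parts of speech, then classify with a Naive Bayes Multinomial model) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names, the class labels `food_safety` and `other`, the helper `extract_features`, and the toy English tokens are all assumptions made for illustration, and the classifier is a plain add-one-smoothed multinomial Naive Bayes written with the Python standard library only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical coarse POS tags; the tag set actually used in the paper
# is not given in this record.
KEEP_TAGS = {"SUBSTANTIVE", "PREDICATE", "MODIFIER", "INTERJECTION"}


def extract_features(tagged_tweet, keep_tags=KEEP_TAGS):
    """Keep only the tokens whose POS tag is in keep_tags."""
    return [token for token, tag in tagged_tweet if tag in keep_tags]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            self.token_counts[label].update(tokens)
        self.vocabulary = {t for c in self.token_counts.values() for t in c}
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """Return the unnormalized log posterior of each class."""
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.token_counts[label]
            total = sum(counts.values())
            score = math.log(doc_count / self.total_docs)  # class prior
            for token in tokens:
                if token in self.vocabulary:  # skip tokens never seen in training
                    score += math.log((counts[token] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy, made-up training data: (token, POS tag) pairs plus an intent label.
TRAIN = [
    ([("restaurant", "SUBSTANTIVE"), ("food", "SUBSTANTIVE"),
      ("poisoning", "SUBSTANTIVE"), ("got", "PREDICATE"),
      ("at", "PARTICLE")], "food_safety"),
    ([("expired", "MODIFIER"), ("milk", "SUBSTANTIVE"),
      ("sold", "PREDICATE"), ("ugh", "INTERJECTION")], "food_safety"),
    ([("lunch", "SUBSTANTIVE"), ("delicious", "MODIFIER"),
      ("was", "PREDICATE"), ("wow", "INTERJECTION")], "other"),
    ([("new", "MODIFIER"), ("cafe", "SUBSTANTIVE"),
      ("opened", "PREDICATE"), ("in", "PARTICLE")], "other"),
]

train_docs = [extract_features(tweet) for tweet, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
model = MultinomialNaiveBayes().fit(train_docs, train_labels)

new_tweet = [("milk", "SUBSTANTIVE"), ("poisoning", "SUBSTANTIVE"),
             ("the", "PARTICLE")]
print(model.predict(extract_features(new_tweet)))
```

Changing `KEEP_TAGS` is the point where alternative test feature sets would be swapped in and compared against a baseline; for real Korean tweets, a morphological analyzer would be needed to produce the (token, tag) pairs in the first place.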