Machine Learning Model for Detecting Fake News Content in Indonesian-Language Online Media

Bibliographic Details
Published in 2022 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), pp. 302-307
Main Authors Yanuar Risca Pratiwi, Inggrid; Ferdita Nugraha, Anggit; Pristyanto, Yoga; Faticha Alfa Aziza, Rifda; Kuswanto, Jeki; Hadi Purwanto, Ibnu
Format Conference Proceeding
Language English
Published IEEE 16.11.2022
Summary: The growth of the internet is one of the factors behind the information explosion. Websites and blogs that were once used only to convey information have become media for producing it. As a result, much of the information now in circulation is accessible without validation or legitimation by a credible group. News content is a particular concern: because anyone can publish news without prior validation, its validity is questionable and fake news spreads easily. Fake news is hazardous because it can shape readers' perceptions and can even cause divisions between individuals and groups. Awareness of fake news must be firmly established to prevent harm and reduce the losses it causes, and system support is needed to verify the validity of news. This study therefore experimented with detecting fake news content using a machine learning model, specifically for Indonesian online media. The experiment used the "Indonesian Hoax News Detection Dataset", an N-gram model and Term Frequency-Inverse Document Frequency (TF-IDF) for feature extraction, and the Naïve Bayes algorithm as the detection model. The best performance was obtained with the trigram model for feature extraction: an accuracy of 71.00%, a precision of 74%, a recall of 92.8%, and an F1-score of 80.00%. It is hoped that these results can serve as a reference for developing other models to detect fake news content, especially in Indonesian-language online media.
DOI:10.1109/ICIMCIS56303.2022.10017528
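
The pipeline named in the summary (word n-grams, TF-IDF weighting, Naïve Bayes) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not describe their preprocessing, tokenization, or smoothing, so the whitespace tokenizer, the smoothed IDF formula, the Laplace constant `alpha`, the class name `NgramTfidfNaiveBayes`, the `hoax`/`valid` label strings, and the four toy sentences are all hypothetical. The toy sentences are invented and are not taken from the "Indonesian Hoax News Detection Dataset".

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n):
    """Return the word-level n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NgramTfidfNaiveBayes:
    """Multinomial Naive Bayes over TF-IDF-weighted word n-grams (hypothetical helper)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n          # n-gram order; 3 = trigram, the best setting in the summary
        self.alpha = alpha  # Laplace smoothing constant (assumed, not from the paper)

    def _tfidf(self, text):
        # Term frequency times IDF, ignoring n-grams never seen in training.
        counts = Counter(g for g in word_ngrams(text, self.n) if g in self.idf)
        return {g: c * self.idf[g] for g, c in counts.items()}

    def fit(self, texts, labels):
        n_docs = len(texts)
        # Document frequency: in how many training texts each n-gram occurs.
        df = Counter(g for t in texts for g in set(word_ngrams(t, self.n)))
        # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1
        self.idf = {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in df.items()}
        self.classes = sorted(set(labels))
        self.log_prior = {
            c: math.log(list(labels).count(c) / n_docs) for c in self.classes
        }
        # Accumulate TF-IDF mass per class and n-gram.
        weights = {c: defaultdict(float) for c in self.classes}
        for text, label in zip(texts, labels):
            for g, w in self._tfidf(text).items():
                weights[label][g] += w
        # Laplace-smoothed log-likelihood of each n-gram given the class.
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(weights[c].values()) + self.alpha * len(self.idf)
            self.log_likelihood[c] = {
                g: math.log((weights[c].get(g, 0.0) + self.alpha) / total)
                for g in self.idf
            }
        return self

    def predict(self, text):
        feats = self._tfidf(text)
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_likelihood[c][g] for g, w in feats.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Invented toy data, for illustration only.
train_texts = [
    "pemerintah mengumumkan jadwal vaksinasi resmi",
    "kementerian merilis data ekonomi terbaru",
    "sebarkan sekarang obat ajaib sembuhkan semua penyakit",
    "sebarkan sekarang obat ajaib ini gratis",
]
train_labels = ["valid", "valid", "hoax", "hoax"]

model = NgramTfidfNaiveBayes(n=3).fit(train_texts, train_labels)
print(model.predict("obat ajaib sembuhkan semua penyakit"))
```

Changing `n` to 1 or 2 gives the unigram and bigram variants of the same sketch. A text that shares no n-gram with the training data falls back to the class priors, which is one reason higher-order n-grams need a reasonably large corpus.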