Alphabet Flatting as a variant of n-gram feature extraction method in ensemble classification of fake news

Bibliographic Details
Published in	Engineering applications of artificial intelligence Vol. 120; p. 105882
Main Authors	Ksieniewicz, Paweł; Zyblewski, Paweł; Borek-Marciniec, Weronika; Kozik, Rafał; Choraś, Michał; Woźniak, Michał
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.04.2023
Subjects	Classifier ensemble; Fake news; n-gram; Natural language processing; Pattern recognition
Summary:	Detecting disinformation has become a significant challenge in the modern world. Most communication media and most sources of information about reality reside on distributed network services, where published content is usually not subject to any initial verification. Among the few tools that seem able to process such large volumes of data efficiently are pattern recognition methods that use features extracted through Natural Language Processing models and procedures. This paper proposes Alphabet Flatting, a modification of the preprocessing step for feature extraction from large language corpora. It allows the construction of diverse classifier ensembles integrated by support accumulation, whose generalization power may compete with the quality of state-of-the-art models in environments with strict time constraints. The proposed method has been thoroughly evaluated in a set of computer experiments, whose results suggest its potential usefulness in automatic systems for preventing the spread of fake news.
ISSN:	0952-1976 1873-6769
DOI:	10.1016/j.engappai.2023.105882