The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations

Bibliographic Details
Main Authors	Duncan, Jeremiah; Fallas, Fabian; Gropp, Chris; Herron, Emily; Mahbub, Maria; Olaya, Paula; Ponce, Eduardo; Samuel, Tabitha K; Schultz, Daniel; Srinivasan, Sudarshan; Tang, Maofeng; Zenkov, Viktor; Zhou, Quan; Begoli, Edmon
Format	Journal Article
Language	English
Published	23.02.2021
Subjects	Computer Science - Artificial Intelligence; Computer Science - Computation and Language
DOI	10.48550/arxiv.2102.11917

Summary:	Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. This technique has multiple uses in tasks such as authorship attribution, plagiarism detection, style analysis, and the identification of sources of misinformation. The focus of this paper is to explore the limitations and sensitivity of established approaches to adversarial manipulations of inputs. To this end, and using those established techniques, we first developed an experimental framework for author detection and input perturbations. Next, we experimentally evaluated the performance of the authorship detection model against a collection of semantic-preserving adversarial perturbations of input narratives. Finally, we compared and analyzed the effects of different perturbation strategies, and of input and model configurations, on the author detection model.