The numbers reveal the author: a stylometric comparison of German-language modernist texts

The present study pertains to stylometry (and, more broadly, to quantitative linguistics). The novel quantitative method of studying the author's style of literary texts, based on the analysis of statistics of numerals found in them, is applied to literary texts in German. A computer program ha...

Full description

Saved in:
Bibliographic Details
Published inФилология: научные исследования no. 11; pp. 50 - 62
Main Author Zenkov, Andrei Viacheslavovich
Format Journal Article
LanguageEnglish
Published 01.11.2024
Online AccessGet full text
ISSN2454-0749
2454-0749
DOI10.7256/2454-0749.2024.11.72167

Cover

Loading…
More Information
Summary:The present study pertains to stylometry (and, more broadly, to quantitative linguistics). The novel quantitative method of studying the author's style of literary texts, based on the analysis of statistics of numerals found in them, is applied to literary texts in German. A computer program has been developed to search in the text for cardinal and ordinal numerals expressed both in numbers and verbally (in different word forms). The program automatically removes phraseological units and stable combinations from the text that accidentally (without the author's intention) contain numerals. Previously, the text is manually cleared of auxiliary numerals such as pagination, chapter numbers, etc. It is shown that the numerals used by the author in the (artistic) text are individual for each author; their totality is a characteristic feature (author's invariant, "fingerprint") that distinguishes the texts written by different authors. A comparative stylometric analysis of a number of literary works by Thomas Mann, Hermann Broch, Robert Musil, and Elias Canetti – the representatives of German-language literary modernism of the 20th century – is performed. Substantial authorial differences in the manner of using numerals were discovered. The results of the analysis were subjected to hierarchical clustering process (the Manhattan metric; Complete linkage and Between-groups methods). The cluster analysis correctly distributed the texts according to their authorship. The use of various clustering methods for text analysis enhances the significance of the results obtained and confirms their non-random nature. This demonstrates that the novel method of stylometry is able to accurately attribute literary texts to their correct authors.
ISSN:2454-0749
2454-0749
DOI:10.7256/2454-0749.2024.11.72167