The numbers reveal the author: a stylometric comparison of German-language modernist texts

Bibliographic Details
Published in Филология: научные исследования, no. 11, pp. 50-62
Main Author Zenkov, Andrei Viacheslavovich
Format Journal Article
Language English
Published 01.11.2024
ISSN 2454-0749
DOI 10.7256/2454-0749.2024.11.72167

Abstract This study belongs to stylometry and, more broadly, to quantitative linguistics. A new quantitative method for studying authorial style, based on the statistics of the numerals found in literary texts, is applied to literary texts in German. A computer program was developed to search a text for cardinal and ordinal numerals, whether written as digits or as words (in their various word forms). The program automatically removes phraseological units and fixed expressions that contain numerals incidentally, without authorial intent. Beforehand, the text is manually cleared of auxiliary numerals such as page numbers and chapter numbers. The study shows that the numerals an author uses in a literary text are individual to that author; taken together, they form a characteristic feature (an authorial invariant, or "fingerprint") that distinguishes texts written by different authors. A comparative stylometric analysis is performed on a number of literary works by Thomas Mann, Hermann Broch, Robert Musil, and Elias Canetti, representatives of 20th-century German-language literary modernism. Substantial differences between the authors in their use of numerals were found. The results were subjected to hierarchical clustering (Manhattan metric; complete-linkage and between-groups methods). The cluster analysis correctly grouped the texts by author. Using several clustering methods strengthens the significance of the results and supports their non-random nature. This demonstrates that the new method can accurately attribute literary texts to their authors.
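The pipeline outlined in the abstract (find numerals, drop fixed expressions, compare texts with the Manhattan metric, cluster hierarchically) can be illustrated with a minimal sketch. This is not the author's program. The tiny numeral lexicon, the idiom list, the choice of relative frequencies of the values 1 to 12 as the feature vector, and the four toy sentences are all assumptions made for illustration. "Between-groups" linkage is assumed here to correspond to average linkage.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-lexicon of German cardinals (the real program also covers
# ordinals and inflected word forms, which this sketch does not attempt).
NUMERALS = {"eins": 1, "zwei": 2, "drei": 3, "vier": 4, "fünf": 5, "sechs": 6,
            "sieben": 7, "acht": 8, "neun": 9, "zehn": 10, "elf": 11, "zwölf": 12}
# Hypothetical list of fixed expressions whose numerals are not authorial choices.
IDIOMS = ["unter vier augen", "alle neune"]
TOKEN = re.compile(r"\d+|[a-zäöüß]+")

def numeral_profile(text):
    """Relative frequency of the numerals 1..12 (digits or words) in a text."""
    lowered = text.lower()
    for idiom in IDIOMS:                      # strip fixed expressions first
        lowered = lowered.replace(idiom, " ")
    counts = Counter()
    for tok in TOKEN.findall(lowered):
        value = int(tok) if tok.isdigit() else NUMERALS.get(tok)
        if value is not None and 1 <= value <= 12:
            counts[value] += 1
    total = sum(counts.values())
    return [counts[v] / total if total else 0.0 for v in range(1, 13)]

def manhattan(p, q):
    """Manhattan (city-block) distance between two profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def agglomerate(profiles, n_clusters, linkage="complete"):
    """Naive agglomerative clustering; linkage is 'complete' or 'average'."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        d = [manhattan(profiles[a], profiles[b]) for a in c1 for b in c2]
        return max(d) if linkage == "complete" else sum(d) / len(d)

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Invented toy sentences standing in for two "authors", A and B.
texts = {
    "A1": "Zwei Männer und zwei Frauen warteten drei Tage.",
    "A2": "Zwei Brüder sprachen unter vier Augen, zwei Stunden, drei Mal.",
    "B1": "Sieben Jahre, sieben Monate und 9 Tage.",
    "B2": "Nach sieben Wochen kamen neun Briefe und sieben Pakete.",
}
profiles = {name: numeral_profile(t) for name, t in texts.items()}
for method in ("complete", "average"):
    print(method, agglomerate(profiles, 2, linkage=method))
```

Running both linkage rules on the same distance data mirrors the abstract's point that agreement between clustering methods makes a grouping more credible. For real corpora, an established library routine would be preferable to this quadratic-time loop.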
Open Access Yes
Peer Reviewed Yes