How Good Is Good Enough? Establishing Quality Thresholds for the Automatic Text Analysis of Retro-Digitized Comics

Bibliographic Details
Published in MultiMedia Modeling, pp. 662-671
Main Authors Hartel, Rita; Dunst, Alexander
Format Book Chapter
Language English
Published Cham: Springer International Publishing
Series Lecture Notes in Computer Science
Summary: Stylometry in the form of simple statistical text analysis has proven to be a powerful tool for text classification, e.g. in the form of authorship attribution. When analyzing retro-digitized comics, manga and graphic novels, the researcher is confronted with the problem that automated text recognition (ATR) still leads to results that have comparatively high error rates, while the manual transcription of texts remains highly time-consuming. In this paper, we present an approach and measures that specify whether stylometry based on unsupervised ATR will produce reliable results for a given dataset of comics images.
ISBN: 9783030057152, 3030057151
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-030-05716-9_59