Punctuation Restoration System for Slovene Language

Bibliographic Details
Published in Research Challenges in Information Science Vol. 385; pp. 509-514
Main Authors Bajec, Marko; Janković, Marko; Žitnik, Slavko; Bajec, Iztok Lebar
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2020
Springer International Publishing
Series Lecture Notes in Business Information Processing
Summary: Punctuation restoration is the process of adding punctuation symbols to raw text. It is typically used as a post-processing step for Automatic Speech Recognition (ASR) systems. In this paper we present an approach to punctuation restoration for texts in the Slovene language. The system is trained using bidirectional Recurrent Neural Networks fed by word embeddings only. The evaluation results show that our approach is capable of restoring punctuation with high recall and precision. The F1 score is particularly high for commas and periods, which are considered the most important punctuation symbols for understanding ASR-based transcripts.
ISBN: 9783030503154; 3030503151
ISSN: 1865-1348; 1865-1356
DOI: 10.1007/978-3-030-50316-1_31
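
Illustrative sketch (not taken from the chapter): the summary describes punctuation restoration evaluated by precision, recall and F1 per punctuation symbol. One common way to frame the task, assumed here, is sequence labeling, where each word is tagged with the punctuation mark that follows it. The label names (COMMA, PERIOD, QUESTION, O), the lowercasing step, and the helper names `to_labels` and `prf` are hypothetical choices for this example; the authors' actual preprocessing and label set are not given in this record.

```python
# Assumed label inventory; the chapter's real label set is not stated in this record.
PUNCT = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}


def to_labels(text):
    """Turn punctuated text into (word, label) pairs.

    The label is the punctuation mark that follows the word, or "O" for none.
    Words are lowercased to mimic raw ASR output (an assumption).
    """
    pairs = []
    for token in text.split():
        label = "O"
        if token[-1] in PUNCT:
            label = PUNCT[token[-1]]
            token = token[:-1]
        if token:
            pairs.append((token.lower(), label))
    return pairs


def prf(gold, pred, cls):
    """Precision, recall and F1 for one punctuation class over aligned label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    print(to_labels("Dober dan, kako ste?"))
    gold = ["O", "COMMA", "O", "PERIOD"]
    pred = ["O", "COMMA", "COMMA", "PERIOD"]  # one spurious comma
    print(prf(gold, pred, "COMMA"))
```

In this framing, a model such as the bidirectional recurrent network mentioned in the summary would predict one label per input word, and `prf` would be applied separately to each punctuation class.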