A Study of Data Augmentation for ASR Robustness in Low Bit Rate Contact Center Recordings Including Packet Losses

Bibliographic Details
Published in Applied Sciences Vol. 12; no. 3; p. 1580
Main Authors Fernández-Gallego, María Pilar; Toledano, Doroteo T.
Format Journal Article
Language English
Published Basel MDPI AG 01.02.2022
Summary: Client conversations in contact centers are nowadays routinely recorded for a number of reasons; in many cases, simply because current legislation requires it. Even where recording is not required, conversations between customers and agents can be a valuable source of information about clients or future clients, call center agents, market trends, etc. Analyzing these recordings provides an excellent opportunity to learn about the business and its possibilities. The current state of the art in Automatic Speech Recognition (ASR) allows this information to be effectively extracted and used. However, conversations are usually stored in highly compressed formats to save space, and, because these systems commonly use Voice-over-IP (VoIP), they typically contain packet losses that produce short interruptions in the speech signal. Both effects, and especially packet loss, have a negative impact on ASR performance. This article presents an extensive study of the importance of these effects on modern ASR systems and of the effectiveness of several data augmentation techniques in increasing their robustness. In addition, the well-known Packet Loss Concealment (PLC) method of ITU-T G.711 (Appendix I) is applied in combination with data augmentation techniques to analyze the improvement in ASR performance on signals affected by packet losses.
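As an illustration of the kind of degradation the summary describes, the following is a minimal sketch of packet-loss simulation for data augmentation. It is not the authors' method: the function name `simulate_packet_loss`, the 8 kHz sample rate, the 20 ms packet length, the independent (uniform random) loss model, and the zero-filling of lost packets are all assumptions chosen for illustration.

```python
import random


def simulate_packet_loss(samples, sample_rate=8000, packet_ms=20,
                         loss_rate=0.1, seed=None):
    """Return a copy of `samples` with randomly chosen packets zeroed out.

    Assumptions (not taken from the article): packets have a fixed
    duration, each packet is lost independently with probability
    `loss_rate`, and a lost packet is replaced by silence (zeros).
    """
    rng = random.Random(seed)
    packet_len = sample_rate * packet_ms // 1000  # samples per packet
    out = list(samples)
    lost = []  # indices of the packets that were dropped
    for start in range(0, len(out), packet_len):
        if rng.random() < loss_rate:
            end = min(start + packet_len, len(out))
            out[start:end] = [0] * (end - start)
            lost.append(start // packet_len)
    return out, lost


# Usage: degrade 0.2 s of a dummy constant signal (10 packets of 160 samples).
clean = [1] * 1600
degraded, lost_packets = simulate_packet_loss(clean, loss_rate=0.3, seed=0)
```

Training an ASR model on both `clean` and `degraded` versions of the same utterances is one simple way to apply this as augmentation; a burst-loss model or a concealment step could be substituted for the zero-filling used here.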
ISSN: 2076-3417
DOI: 10.3390/app12031580