Gigant-KTTS dataset: Towards building an extensive gigant dataset for Kurdish text-to-speech systems

Bibliographic Details
Published in	Data in brief Vol. 55; p. 110753
Main Authors	Ahmad, Hawraz A., Rashid, Tarik A.
Format	Journal Article
Language	English
Published	Netherlands: Elsevier Inc, 01.08.2024
Subjects	Central Kurdish language; Classification segmentation; Data; Dataset; Deep learning; Detection; Speech corpus; Speech system
ISSN	2352-3409
DOI	10.1016/j.dib.2024.110753

Summary:	Speech synthesis is now part of everyday computing all around the world. A speech corpus is the primary data source for developing a speech system, and this work describes the construction of one for Central Kurdish. Two main issues still prevent such systems from achieving the best possible performance: the lack of efficiency in training and analysis, and the difficulty of modelling. The biggest obstacle to text-to-speech in the Kurdish language is the lack of text and speech recognition tools, even though around 30 million people speak Kurdish in different countries. To address this issue, the authors introduce a large-vocabulary Kurdish Text-to-Speech Dataset (KTTS, Gigant), which includes a pronunciation lexicon and a speech corpus for the Central Kurdish dialect. The sentences cover a variety of subjects and were recorded in a voice recording studio by a Kurdish man who is a dubber. The goal of the speech corpus is to provide a collection of sentences that accurately reflects real data about the Central Kurdish dialect. A combination of audio and visual sources was used to record the 6,078 sentences, which span 12 document topics. They were recorded in a controlled environment using low-noise microphones. The total recording duration is 13.63 h, and the recorded sentences are in “.wav” format.