Preliminary Study on SSCF-derived Polar Coordinate for ASR

Venue: ACET 2022, Dec 2022, Phnom Penh, Cambodia

Bibliographic Details
Main Authors: Leang, Sotheara; Castelli, Eric; Vaufreydaz, Dominique; Sam, Sethserey
Format: Journal Article
Language: English
Published: 30.11.2022
Abstract: The transition angles are defined to describe the vowel-to-vowel transitions in the acoustic space of the Spectral Subband Centroids, and the findings show that they are similar across speakers and speaking rates. In this paper, we investigate the use of polar coordinates instead of angles to describe a speech signal by characterizing its acoustic trajectory, and we apply them to Automatic Speech Recognition. According to the experimental results on the BRAF100 dataset, the polar coordinates achieved significantly higher accuracy than the angles in mixed and cross-gender speech recognition, demonstrating that these representations are better at defining the acoustic trajectory of the speech signal. Furthermore, the accuracy improved significantly when they were used together with their first- and second-order derivatives ($\Delta$, $\Delta\Delta$), especially in cross-female recognition. However, the results showed that they were not much more gender-independent than the conventional Mel-frequency Cepstral Coefficients (MFCCs).
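The abstract does not give the exact feature definitions, so the following is only a minimal sketch of the general idea, under stated assumptions: a trajectory is taken to be a sequence of 2-D points (for example, two subband-centroid values per frame), each frame-to-frame step is converted to a (radius, angle) pair, and the first- and second-order derivatives are approximated with a simple central difference. The function names `polar_trajectory` and `delta` and the toy trajectory are hypothetical and are not taken from the paper.

```python
import math


def polar_trajectory(points):
    """Convert a 2-D trajectory into per-step (radius, angle) pairs.

    `points` is a list of (x, y) tuples. Treating each step between
    consecutive frames as a vector is an assumption made for this sketch.
    """
    polar = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        # radius = step length, angle = step direction in radians
        polar.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return polar


def delta(features):
    """Central-difference derivative of a sequence of feature tuples.

    Edge frames reuse their nearest neighbour, so the output has the same
    length as the input. Angle wrap-around is not handled here.
    """
    last = len(features) - 1
    out = []
    for i in range(len(features)):
        prev = features[max(i - 1, 0)]
        nxt = features[min(i + 1, last)]
        out.append(tuple((n - p) / 2.0 for p, n in zip(prev, nxt)))
    return out


# Toy trajectory (hypothetical values): one step right, then one step up.
trajectory = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
polar = polar_trajectory(trajectory)
d1 = delta(polar)   # first-order derivative
d2 = delta(d1)      # second-order derivative
# Concatenate static, first-order and second-order features per step.
features = [p + a + b for p, a, b in zip(polar, d1, d2)]
```

Keeping the radius alongside the angle retains the size of each step as well as its direction, which is one plausible reading of why polar coordinates could carry more information than angles alone.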
Authors and affiliations:
Leang, Sotheara (CADT, M-PSI)
Castelli, Eric (M-PSI)
Vaufreydaz, Dominique (M-PSI)
Sam, Sethserey (CADT)
DOI: https://doi.org/10.48550/arXiv.2212.01245 (view paper in arXiv)
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0
Open access link: https://arxiv.org/abs/2212.01245
Subject Terms: Computer Science - Artificial Intelligence; Computer Science - Learning; Computer Science - Sound