A segmental probabilistic model of speech using an orthogonal polynomial representation: Application to text-independent speaker verification

Bibliographic Details
Published in	Speech Communication, Vol. 18, no. 3, pp. 291-304
Main Authors	Liu, Chi-shi, Wang, Hsiao-chuan
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.05.1996
Subjects	Iterative algorithm; Orthogonal polynomial function; Segment model; Speaker verification
Summary:	A segmental probabilistic model based on an orthogonal polynomial representation of speech signals is proposed. Unlike the conventional frame-based probabilistic model, this segment-based model groups consecutive frames with similar acoustic characteristics into an acoustic segment and represents the segment by an orthogonal polynomial function. An iterative algorithm that performs recognition and segmentation is proposed for estimating the parameters of the segment model. The segment model is applied to text-independent speaker verification. Tests were carried out on a 20-speaker database. With the best version of the model, an equal error rate of 0.59% is reached for test utterances of 10 digits. This corresponds to a relative error rate reduction of more than 50% compared with the conventional frame-based probabilistic model.
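Illustration:	The summary does not give the paper's formulas, so the following is only a minimal sketch of what representing a segment by an orthogonal polynomial function can look like. It assumes a single scalar feature per frame and builds a discrete orthonormal polynomial basis by Gram-Schmidt on the monomials. The function names (`orthonormal_basis`, `fit_segment`, `reconstruct`), the choice of Gram-Schmidt, and the polynomial order are assumptions for illustration, not the authors' method.

```python
def orthonormal_basis(n_frames, order):
    """Discrete orthonormal polynomial basis over n_frames points.

    Assumption: the basis is built by Gram-Schmidt on the monomials
    1, t, t^2, ... sampled on a normalized time axis in [0, 1].
    """
    t = [i / (n_frames - 1) if n_frames > 1 else 0.0 for i in range(n_frames)]
    basis = []
    for k in range(order + 1):
        v = [ti ** k for ti in t]
        # Remove the components along the basis vectors found so far.
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm < 1e-12:
            break  # segment too short to support a higher order
        basis.append([x / norm for x in v])
    return basis


def fit_segment(frames, order=2):
    """Project one feature trajectory (one float per frame) onto the basis."""
    basis = orthonormal_basis(len(frames), order)
    coeffs = [sum(f * b for f, b in zip(frames, bk)) for bk in basis]
    return coeffs, basis


def reconstruct(coeffs, basis):
    """Rebuild the smoothed trajectory from the polynomial coefficients."""
    n = len(basis[0])
    return [sum(c * bk[i] for c, bk in zip(coeffs, basis)) for i in range(n)]


# Hypothetical segment: six frames of a slowly varying feature value.
segment = [1.0, 1.4, 1.9, 2.1, 2.0, 1.6]
coeffs, basis = fit_segment(segment, order=2)
smoothed = reconstruct(coeffs, basis)
```

In this sketch a variable-length segment is summarized by at most `order + 1` coefficients per feature, which is the kind of compact description a segment-level probabilistic model could be trained on instead of individual frames.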
ISSN:	0167-6393, 1872-7182
DOI:	10.1016/0167-6393(96)00014-3