A Bayesian Attention Neural Network Layer for Speaker Recognition

Bibliographic Details
Published in	ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6241-6245
Main Authors	Zhu, Weizhong; Pelecanos, Jason
Format	Conference Proceeding
Language	English
Published	IEEE 01.05.2019
Subjects	attention modeling; Bayes methods; Bayesian statistics; deep neural networks; Neural networks; NIST; Speaker recognition; Training; Training data
Summary:	Neural network based attention modeling has found utility in areas such as visual analysis, speech recognition and more recently speaker recognition. Attention represents a gating (or weighting) function on information and governs how the corresponding statistics are accumulated. In the context of speaker recognition, attention can be incorporated as a frame weighted mean of an information stream. These weights can be made to sum to one (the standard approach) or be calculated in other ways. If the weights can be made to represent event observation probabilities, we can extend the approach to be within a Bayesian framework. More specifically, we combine prior information with the frame weighted statistics to produce an adapted or posterior estimate of the mean. We evaluate the proposed method on NIST data.
ISSN:	2379-190X
DOI:	10.1109/ICASSP.2019.8682953
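
Illustrative sketch (not taken from the paper): the summary contrasts a frame weighted mean whose weights sum to one with a Bayesian variant that combines prior information with the frame weighted statistics. The Python below shows one plausible reading of that contrast. It assumes a MAP-style update in which sigmoid outputs act as per-frame observation probabilities and the prior mean counts as `prior_count` pseudo-observations. The function names, the sigmoid choice, and `prior_count` are hypothetical and may differ from the authors' formulation.

```python
import math


def _sigmoid(score):
    """Numerically stable logistic function."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


def softmax_attention_mean(frames, scores):
    """Standard attention pooling: softmax weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def bayesian_attention_mean(frames, scores, prior_mean, prior_count=1.0):
    """Assumed MAP-style pooling: weights are observation probabilities.

    The prior mean is treated as `prior_count` pseudo-observations
    (a hypothetical hyperparameter), so frames with low probability
    leave the estimate close to the prior.
    """
    probs = [_sigmoid(s) for s in scores]
    evidence = sum(probs)  # soft count of observed frames
    dim = len(prior_mean)
    return [
        (sum(p * f[d] for p, f in zip(probs, frames)) + prior_count * prior_mean[d])
        / (evidence + prior_count)
        for d in range(dim)
    ]


if __name__ == "__main__":
    frames = [[1.0, 2.0], [3.0, 4.0]]
    print(softmax_attention_mean(frames, [0.0, 0.0]))
    print(bayesian_attention_mean(frames, [0.0, 0.0], prior_mean=[0.0, 0.0]))
```

Under these assumptions, when every frame has a low observation probability the pooled estimate stays near the prior mean, whereas the sum-to-one version always returns a weighted average of the frames regardless of how little evidence they carry.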