Robust Audiovisual Liveness Detection for Biometric Authentication Using Deep Joint Embedding and Dynamic Time Warping

Bibliographic Details
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3026-3030
Main Authors: Aides, Amit; Dov, David; Aronowitz, Hagai
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2018

Summary: We address the problem of liveness detection in audiovisual recordings for preventing spoofing attacks in biometric authentication systems. We assume that liveness is detected from a recording of a speaker saying a predefined phrase and that another recording of the same phrase is available a priori, a setting that is common in text-dependent authentication systems. We propose to measure liveness by comparing the alignments of the audio and the video to the a priori recorded sequence using dynamic time warping. The alignments are computed in a joint feature space into which the audio and video are embedded using deep convolutional neural networks. We investigate the robustness of the proposed algorithm across datasets by training and testing it on different datasets. Experimental results demonstrate that the proposed algorithm generalizes well across datasets, providing improved performance compared to competing methods.
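As a rough illustration of the alignment step described in the summary, the sketch below computes a dynamic-time-warping cost between two sequences of embedding vectors and combines the audio-side and video-side alignment costs into a single score. The embedding sequences, the additive `liveness_score` combination, and all function names here are hypothetical illustrations; the paper's actual joint CNN embedding and scoring procedure are not reproduced.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of embedding
    vectors (each a list of equal-length float tuples), using the
    Euclidean distance as the per-frame cost."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning the first
    # i frames of seq_a with the first j frames of seq_b
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of seq_a
                                 cost[i][j - 1],      # skip a frame of seq_b
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m]

def liveness_score(audio_emb, video_emb, enrolled_emb):
    """Hypothetical liveness score: how well the audio and video
    embeddings of the test utterance each align to the a priori
    enrolled sequence in the joint space. Lower suggests 'live'."""
    return (dtw_distance(audio_emb, enrolled_emb)
            + dtw_distance(video_emb, enrolled_emb))
```

In a spoofing attack (e.g., a replayed video paired with synthetic audio), one of the two modalities tends to align poorly to the enrolled phrase, inflating the combined cost; DTW tolerates the natural speaking-rate differences between the two genuine recordings.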
ISSN: 2379-190X
DOI: 10.1109/ICASSP.2018.8462307