LDCformer: Incorporating Learnable Descriptive Convolution to Vision Transformer for Face Anti-Spoofing


Bibliographic Details
Published in: 2023 IEEE International Conference on Image Processing (ICIP), pp. 121 - 125
Main Authors: Huang, Pei-Kai; Chiang, Cheng-Hsuan; Chong, Jun-Xiong; Chen, Tzu-Hsien; Ni, Hui-Yu; Hsu, Chiou-Ting
Format: Conference Proceeding
Language: English
Published: IEEE, 08.10.2023

Summary: Face anti-spoofing (FAS) aims to counter facial presentation attacks and relies heavily on identifying live/spoof discriminative features. While the vision transformer (ViT) has shown promising potential in recent FAS methods, there remains a lack of studies examining the value of incorporating local descriptive feature learning with ViT. In this paper, we propose a novel LDCformer that incorporates Learnable Descriptive Convolution (LDC) into ViT, aiming to learn the distinguishing characteristics of FAS by modeling long-range dependencies among locally descriptive features. In addition, we propose to extend LDC to a Decoupled Learnable Descriptive Convolution (Decoupled-LDC) to improve optimization efficiency. With the new Decoupled-LDC, we further develop an extended model, LDCformer D, for FAS. Extensive experiments on FAS benchmarks show that LDCformer D outperforms previous methods on most protocols in both intra-domain and cross-domain testing. The code is available at https://github.com/Pei-KaiHuang/ICIP23_D-LDCformer.
DOI: 10.1109/ICIP49359.2023.10222330
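The summary names a Learnable Descriptive Convolution but this record does not define it. As a rough illustration only, the sketch below treats such a layer as an ordinary 2-D convolution whose kernel is modulated element-wise by a second, learnable descriptor mask that captures local relationships; the function name `ldc2d` and the mask `d` are assumptions for this sketch, not the authors' exact formulation. When `d` is all ones, the operation reduces to a vanilla convolution.

```python
import numpy as np

def ldc2d(x, w, d):
    """Sketch of a descriptive convolution (assumed form, not the paper's).

    x: (H, W) input feature map
    w: (k, k) convolution kernel (learnable in a real model)
    d: (k, k) descriptor mask (learnable in a real model); modulates
       each kernel tap so the layer can emphasize local texture cues
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)            # zero-pad so output matches input size
    wd = w * d                     # element-wise "descriptive" kernel
    H, W = x.shape
    out = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * wd)
    return out
```

In a trained network both `w` and `d` would be optimized jointly; the paper's Decoupled-LDC variant reportedly separates these factors to improve optimization efficiency, which this single-function sketch does not attempt to reproduce.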