LDCformer: Incorporating Learnable Descriptive Convolution to Vision Transformer for Face Anti-Spoofing
Published in | 2023 IEEE International Conference on Image Processing (ICIP), pp. 121-125 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 08.10.2023 |
Summary: | Face anti-spoofing (FAS) aims to counter facial presentation attacks and relies heavily on identifying live/spoof discriminative features. While the vision transformer (ViT) has shown promising potential in recent FAS methods, there remains a lack of studies examining the value of incorporating local descriptive feature learning with ViT. In this paper, we propose a novel LDCformer by incorporating Learnable Descriptive Convolution (LDC) into ViT, aiming to learn the distinguishing characteristics of FAS by modeling long-range dependencies of locally descriptive features. In addition, we extend LDC to a Decoupled Learnable Descriptive Convolution (Decoupled-LDC) to improve optimization efficiency. With the new Decoupled-LDC, we further develop an extended model, LDCformer-D, for FAS. Extensive experiments on FAS benchmarks show that LDCformer-D outperforms previous methods on most protocols in both intra-domain and cross-domain testing. The code is available at https://github.com/Pei-KaiHuang/ICIP23_D-LDCformer. |
---|---|
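The summary describes a convolution that captures locally descriptive (difference-based) features rather than raw intensities. The exact LDC formulation is not given in this record, so the following is only a minimal NumPy sketch of the general idea, in the style of central-difference convolutions: each 3x3 patch response blends a vanilla convolution with a neighbor-minus-center descriptive term. The mask `d` (standing in for the learnable descriptor) and the blend weight `theta` are hypothetical names, not from the paper.

```python
import numpy as np

def descriptive_conv2d(x, w, d, theta=0.7):
    """Sketch of a descriptive convolution on a single-channel image.

    x: (H, W) input image.
    w: (3, 3) vanilla kernel weights.
    d: (3, 3) descriptor mask; in an LDC-style layer this would be a
       learnable parameter optimized jointly with w (hypothetical here).
    theta: blend between the vanilla and descriptive responses.
    """
    H, W = x.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = x[i:i + 3, j:j + 3]
            # Standard convolution response on the patch.
            vanilla = np.sum(w * patch)
            # Descriptive response: weighted differences to the center pixel,
            # which emphasizes local texture over absolute intensity.
            diff = np.sum(w * d * (patch - patch[1, 1]))
            out[i, j] = (1 - theta) * vanilla + theta * diff
    return out
```

With `theta=0` this reduces to a plain convolution, and with `theta=1` it responds only to local intensity differences; a learnable variant would optimize `w`, `d`, and the blend end to end before feeding the resulting feature maps to the transformer.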
DOI: | 10.1109/ICIP49359.2023.10222330 |