Spatio-temporal convolutional emotional attention network for spotting macro- and micro-expression intervals in long video sequences

Bibliographic Details
Published in	Pattern Recognition Letters, Vol. 162, pp. 89-96
Main Authors	Pan, Hang, Xie, Lun, Wang, Zhiliang
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.10.2022
Subjects	Attention model; Convolution; Macro-expression; Micro-expression; Spatio-temporal feature
Summary:	• The STCEAN model considers the changes of spatial features in the temporal dimension.
• The two-head MAS focuses on attention weights for different emotional dimensions.
• The STCEAN model uses a focal loss function to reduce sample imbalance.
• A Leave-Half-Subject-Out (LHSO) cross-validation method is used to reduce training time.

Emotion detection based on facial micro-expressions is essential in high-risk tasks such as criminal investigation or lie detection. In such tasks, micro-expressions often occur when people use facial expressions to conceal their actual emotional states. Spotting macro- and micro-expression intervals in long video sequences has therefore become an active research topic. Considering the differences in duration and facial muscle movement intensity between macro- and micro-expressions, we propose a novel Spatio-temporal Convolutional Emotional Attention Network (STCEAN) for spotting macro- and micro-expression intervals in long video sequences. The spatial features of each frame in the video sequence are extracted by a convolutional neural network. An emotional self-attention model then analyzes the temporal weights of the spatial features in different emotional dimensions. The emotional weights in the temporal dimension are filtered to spot macro- and micro-expression intervals. Finally, the STCEAN model is jointly optimized with a dual emotional focal loss over macro- and micro-expressions to address the problem of sample imbalance. Experimental results on the CAS(ME)² and SAMM-LV datasets show that the STCEAN model achieves competitive results in the Facial Micro-Expression Challenge 2021.
ISSN:	0167-8655 1872-7344
DOI:	10.1016/j.patrec.2022.09.008
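
The summary states that a dual focal loss over macro- and micro-expressions is used against sample imbalance, but this record does not give its form. As a rough illustration only, the sketch below uses the standard binary focal loss, -alpha_t * (1 - p_t)^gamma * log(p_t), and assumes the joint objective is a plain sum of one term per expression type. The function names, the default values of `gamma` and `alpha`, and the summation are all assumptions, not the authors' implementation.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss for one frame.

    p: predicted probability of the positive class, y: label (0 or 1).
    gamma and alpha are illustrative defaults, not values from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)**gamma shrinks the loss of well-classified (easy) frames
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a sequence of per-frame predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def dual_focal_loss(macro_probs, macro_labels, micro_probs, micro_labels,
                    gamma=2.0, alpha=0.25):
    """Hypothetical joint objective: macro term plus micro term (assumed sum)."""
    return (mean_focal_loss(macro_probs, macro_labels, gamma, alpha)
            + mean_focal_loss(micro_probs, micro_labels, gamma, alpha))


if __name__ == "__main__":
    # Mostly background frames (label 0) with one rare expression frame (label 1).
    macro_p, macro_y = [0.1, 0.2, 0.8, 0.1], [0, 0, 1, 0]
    micro_p, micro_y = [0.05, 0.3, 0.1, 0.1], [0, 1, 0, 0]
    print(dual_focal_loss(macro_p, macro_y, micro_p, micro_y))
```

With `gamma=0` and `alpha=0.5` each term reduces to half the ordinary cross-entropy, which makes the down-weighting of easy frames at larger `gamma` easy to check.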