Hierarchical global and local transformer for pain estimation with facial expression videos


Bibliographic Details
Published in Pattern analysis and applications : PAA Vol. 27; no. 3
Main Authors Liu, Hongrui; Xu, Haochen; Qiu, Jinheng; Wu, Shizhe; Liu, Manhua
Format Journal Article
Language English
Published London Springer London 01.09.2024
Springer Nature B.V
Summary: Automatic pain intensity estimation from facial expression analysis has important applications in medicine and healthcare. Most existing works directly transfer typical face recognition models to the pain estimation task, which may not perform well because facial expressions of pain are spontaneous and involve subtle facial variations. Pain estimation from facial video remains challenging because it relies on modeling semantic facial parts and extracting fine-grained, dynamic features. In this study, we propose a hierarchical global and local transformer (HGLT) model for pain estimation from facial expression videos. The HGLT model consists of an image frame embedding subnetwork and a temporal embedding subnetwork, which together extract spatio-temporal features. In the frame embedding subnetwork, a multi-head local attention mechanism extracts the local fine-grained features related to micro variations of pain, followed by hierarchical self-attention pooling that integrates the global and local features. In the temporal embedding subnetwork, a transformer encoder with temporal attention models the temporal relationships among video frames and captures the dynamic facial variations. A correlation loss is proposed to alleviate the long-tailed imbalance in the distribution of pain intensities. The proposed method is evaluated on the UNBC-McMaster Shoulder Pain, BioVid Heat Pain, and DAiSEE datasets. Experimental results indicate that the method achieves performance competitive with state-of-the-art methods.
ISSN: 1433-7541, 1433-755X
DOI: 10.1007/s10044-024-01302-y
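
Illustrative note: the summary mentions a correlation loss aimed at the long-tailed distribution of pain intensities, but it does not give the formula. The sketch below shows one common way such a term can be built, as one minus the Pearson correlation between predicted and labelled intensities, optionally added to a mean squared error. The function names, the `weight` parameter, and the combination with MSE are assumptions made for illustration and are not taken from the article.

```python
import math


def pearson_correlation(pred, target):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    denom = math.sqrt(var_p * var_t)
    if denom == 0.0:
        # A constant sequence carries no ordering information;
        # treat it as uncorrelated (assumption for this sketch).
        return 0.0
    return cov / denom


def correlation_loss(pred, target):
    """Hypothetical loss term: 0 for perfect positive correlation, 2 for perfect negative."""
    return 1.0 - pearson_correlation(pred, target)


def combined_loss(pred, target, weight=0.5):
    """Hypothetical objective: MSE plus a weighted correlation term."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + weight * correlation_loss(pred, target)


if __name__ == "__main__":
    # Long-tailed toy labels: most frames show no pain (intensity 0).
    labels = [0, 0, 0, 0, 0, 0, 1, 3]
    always_zero = [0.0] * len(labels)
    print(correlation_loss(always_zero, labels))
    print(combined_loss(always_zero, labels))
```

A term of this kind is one plausible way to discourage a model from always predicting the majority (no-pain) intensity: a constant prediction can reach a fairly low squared error on long-tailed labels, yet it receives a correlation loss of 1 in this sketch because it does not follow the trend of the labels at all.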