Cross-attention-based hybrid ViT-CNN fusion network for action recognition in visible and infrared videos

Human action recognition (HAR) in videos is a critical task in computer vision, but traditional methods relying solely on visible (RGB) data face challenges in low-light or occluded scenarios. Infrared (IR) imagery offers robustness in such conditions, yet effectively fusing IR and visible modalitie...

Bibliographic Details
Published in Pattern analysis and applications : PAA Vol. 28; no. 3
Main Authors Imran, Javed; Gupta, Himanshu
Format Journal Article
Language English
Published Heidelberg Springer Nature B.V 01.09.2025
ISSN 1433-7541, 1433-755X
DOI 10.1007/s10044-025-01493-y
