Leveraging Temporal Contextualization for Video Action Recognition
We propose Temporally Contextualized CLIP (TC-CLIP), a novel framework for video understanding that leverages essential temporal information through global interactions in the spatio-temporal domain within a video. Specifically, we introduce Temporal Contextualization (TC), a layer-wise tem...
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 15.04.2024 |