Multi-scale Motion Feature Integration for Action Recognition
Published in | 2023 9th International Conference on Computer and Communications (ICCC) pp. 1776 - 1781 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 08.12.2023 |
Summary: | Analyzing video data with intricate temporal structures and extracting comprehensive motion information remains a significant challenge. In this work, we introduce the multi-scale motion feature integration (MMFI) network, which leverages two key modules for motion analysis: the progressive local context aggregation (PLCA) module and the multi-scale motion excitation (MSME) module. The PLCA module captures frame-level motion details by incrementally processing frame-wise differences near the input frame in the early stages of the network. The MSME module provides motion-attentive channel weights in deeper layers with higher dimensions, incorporating short- and long-range segment-level motion information. These modules synergistically capture motion details across various scales. Our approach is evaluated on the large-scale video dataset Something-Something V1, yielding state-of-the-art performance with minimal computational overhead. |
---|---|
ISSN: | 2837-7109 |
DOI: | 10.1109/ICCC59590.2023.10507593 |
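The summary describes two ideas: differences between nearby frames as a motion cue, and channel weights derived from short- and long-range segment-level motion. The sketch below is only a toy illustration of those two ideas in plain Python, not the authors' PLCA or MSME implementation. The function names, the choice of strides `(1, 2)`, the mean-absolute-difference "energy", and the sigmoid gate are all assumptions made for illustration; the record does not specify any of them.

```python
import math

def temporal_differences(seq, stride=1):
    """Element-wise differences between items `stride` steps apart (hypothetical helper)."""
    return [
        [b - a for a, b in zip(seq[t], seq[t + stride])]
        for t in range(len(seq) - stride)
    ]

def motion_channel_weights(features, strides=(1, 2)):
    """Per-channel weights in (0, 1) from multi-stride differences.

    Assumption: each row of `features` is one segment already pooled to a
    per-channel vector; stride 1 stands in for short-range motion and
    stride 2 for longer-range motion.
    """
    channels = len(features[0])
    energy = [0.0] * channels
    for stride in strides:
        diffs = temporal_differences(features, stride)
        for c in range(channels):
            # Mean absolute change of channel c at this temporal stride.
            energy[c] += sum(abs(d[c]) for d in diffs) / len(diffs)
    # Average over strides, then squash with a sigmoid gate.
    return [1.0 / (1.0 + math.exp(-e / len(strides))) for e in energy]

def excite(features, weights):
    """Scale every channel of every segment by its motion weight."""
    return [[x * w for x, w in zip(row, weights)] for row in features]

# Toy input: 4 segments, 2 channels; channel 0 is static, channel 1 changes.
features = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
weights = motion_channel_weights(features)
excited = excite(features, weights)
```

In this toy input the static channel receives the neutral weight 0.5 (a sigmoid of zero motion energy), while the changing channel receives a larger weight, which is the qualitative behavior a motion-attentive gate is meant to show.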