Extreme Low Resolution Activity Recognition with Confident Spatial-Temporal Attention Transfer
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 08.09.2019 |
Summary: | Activity recognition on extreme low-resolution videos, e.g., a resolution of
12*16 pixels, plays a vital role in far-view surveillance and
privacy-preserving multimedia analysis. Low-resolution videos only contain
limited information. Because the same activity can be captured in both
high-resolution (HR) and extreme low-resolution (eLR) videos, it is worth
studying how relevant HR data can be used to improve eLR activity
recognition. In this work, we propose a novel Confident Spatial-Temporal
Attention Transfer (CSTAT) for eLR activity recognition. CSTAT can acquire
information from HR data by reducing the attention differences with a
transfer-learning strategy. In addition, the credibility of the supervisory
signal is taken into account to make the transfer process more confident.
Experimental results on two well-known datasets, i.e., UCF101 and HMDB51,
demonstrate that the proposed method effectively improves the accuracy of
eLR activity recognition and achieves an accuracy of 59.23% on 12*16 videos in
HMDB51, a state-of-the-art result. |
---|---|
DOI: | 10.48550/arxiv.1909.03580 |
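
The summary describes the idea only at a high level: an eLR model learns from an HR model by reducing the difference between their attention, and the supervisory signal is weighted by its credibility. The sketch below illustrates that general idea and is not the paper's formulation. It assumes that attention is the channel-wise sum of squared activations, normalized to unit length, and that credibility is the HR model's softmax probability for the true class. The function names `attention_map`, `softmax`, and `confident_attention_loss` are hypothetical, and both feature maps are assumed to have been brought to the same number of flattened space-time positions.

```python
import math


def attention_map(features):
    """Collapse [channels][positions] activations into a unit-norm attention vector.

    Assumption: attention is the sum of squared activations over channels,
    where positions are flattened space-time locations.
    """
    n_pos = len(features[0])
    energy = [sum(channel[p] ** 2 for channel in features) for p in range(n_pos)]
    norm = math.sqrt(sum(e * e for e in energy)) or 1.0  # avoid division by zero
    return [e / norm for e in energy]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confident_attention_loss(student_feats, teacher_feats, teacher_logits, label):
    """Squared attention difference, scaled by the teacher's confidence.

    Assumption: the credibility of the HR (teacher) signal is approximated by
    the probability the teacher assigns to the ground-truth class.
    """
    a_student = attention_map(student_feats)
    a_teacher = attention_map(teacher_feats)
    difference = sum((s - t) ** 2 for s, t in zip(a_student, a_teacher))
    confidence = softmax(teacher_logits)[label]
    return confidence * difference


if __name__ == "__main__":
    hr_feats = [[1.0, 0.0], [0.0, 0.0]]   # teacher attends to position 0
    elr_feats = [[0.0, 1.0]]              # student attends to position 1
    print(confident_attention_loss(elr_feats, hr_feats, [0.0, 0.0], label=0))
```

In this sketch a teacher that is unsure about the true class contributes a smaller penalty, so unreliable HR attention pulls the eLR model less strongly. In training, such a term would typically be added to the usual classification loss.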