Attention-Based Pedestrian Attribute Analysis

Bibliographic Details
Published in: IEEE Transactions on Image Processing, Vol. 28, No. 12, pp. 6126-6140
Main Authors: Tan, Zichang; Yang, Yang; Wan, Jun; Hang, Hanyuan; Guo, Guodong; Li, Stan Z.
Format: Journal Article
Language: English
Published: United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2019

Summary: Recognizing pedestrian attributes in surveillance scenes is an inherently challenging task, especially for pedestrian images with large pose variations, complex backgrounds, and varied camera viewing angles. To select important and discriminative regions or pixels despite these variations, three attention mechanisms are proposed: parsing attention, label attention, and spatial attention. These attention mechanisms aim to access effective information by approaching the problem from different perspectives. Specifically, the parsing attention extracts discriminative features by learning not only where to attend but also how to aggregate features from different semantic regions of the human body, e.g., the head and upper body. The label attention collects discriminative features for each attribute in a targeted manner. Unlike the parsing and label attention mechanisms, the spatial attention considers the problem from a global perspective, selecting a few important and discriminative image regions or pixels shared by all attributes. A joint learning framework, formulated in a multi-task-like way, then learns these three attention mechanisms concurrently to extract complementary and correlated features. This framework is named Joint Learning of Parsing attention, Label attention, and Spatial attention for Pedestrian Attributes Analysis (JLPLS-PAA for short). Extensive comparative evaluations on multiple large-scale benchmarks, including the PA-100K, RAP, PETA, Market-1501, and Duke attribute datasets, demonstrate the effectiveness of the proposed JLPLS-PAA framework for pedestrian attribute analysis.
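The spatial attention idea described in the summary, weighting spatial positions of a feature map so that a few discriminative regions dominate the pooled representation shared by all attributes, can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (a single scoring vector standing in for a learned 1x1 convolution, softmax normalization over positions); it is not the authors' JLPLS-PAA implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(features, w):
    """Attention-weighted pooling over spatial positions.

    features: (H, W, C) feature map from a CNN backbone.
    w:        (C,) scoring vector (stand-in for a learned 1x1 conv).
    Returns the pooled (C,) feature and the (H, W) attention map.
    """
    H, W, C = features.shape
    flat = features.reshape(H * W, C)            # flatten the spatial grid
    scores = flat @ w                            # one relevance score per position
    attn = softmax(scores, axis=0)               # normalize over all positions
    pooled = (attn[:, None] * flat).sum(axis=0)  # attention-weighted sum
    return pooled, attn.reshape(H, W)

rng = np.random.default_rng(0)
feats = rng.normal(size=(7, 7, 16))   # e.g. a 7x7x16 backbone output
w = rng.normal(size=16)
pooled, attn_map = spatial_attention(feats, w)
print(pooled.shape, attn_map.shape)
```

In the full framework, the attention maps from the parsing, label, and spatial branches would be learned jointly rather than computed from a fixed scoring vector as above.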
Bibliography: ObjectType-Article-1; SourceType-Scholarly Journals-1; ObjectType-Feature-2
ISSN: 1057-7149
EISSN: 1941-0042
DOI: 10.1109/TIP.2019.2919199