Relational Attention Network for Crowd Counting

Bibliographic Details
Published in: Proceedings / IEEE International Conference on Computer Vision, pp. 6787 - 6796
Main Authors: Zhang, Anran; Shen, Jiayi; Xiao, Zehao; Zhu, Fan; Zhen, Xiantong; Cao, Xianbin; Shao, Ling
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2019

Summary: Crowd counting is receiving rapidly growing research interest due to its potential application value in numerous real-world scenarios. However, owing to challenges such as occlusion, insufficient resolution, and dynamic backgrounds, crowd counting remains an unsolved problem in computer vision. Density estimation is a popular strategy for crowd counting, where conventional density estimation methods perform pixel-wise regression without explicitly accounting for the interdependence of pixels. As a result, independent pixel-wise predictions can be noisy and inconsistent. To address this issue, we propose a Relational Attention Network (RANet) with a self-attention mechanism for capturing the interdependence of pixels. RANet enhances the self-attention mechanism by accounting for both short-range and long-range interdependence of pixels, implemented respectively as local self-attention (LSA) and global self-attention (GSA). We further introduce a relation module that fuses LSA and GSA to obtain more informative aggregated feature representations. We conduct extensive experiments on four public datasets: ShanghaiTech A, ShanghaiTech B, UCF-CC-50, and UCF-QNRF. Experimental results on all datasets show that RANet consistently reduces estimation errors and surpasses state-of-the-art approaches by large margins.
ISSN: 2380-7504
DOI: 10.1109/ICCV.2019.00689
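
The summary describes pixel-wise density regression refined by local self-attention (LSA) for short-range dependencies, global self-attention (GSA) for long-range dependencies, and a relation module that fuses the two. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, not the authors' implementation: the class names, the window size of 8, the channel widths, and the 1x1-convolution fusion standing in for the relation module are all assumptions made here for clarity.

```python
# Hypothetical sketch (not the paper's released code): local and global
# self-attention over a backbone feature map, fused and regressed into a
# per-pixel density map whose sum approximates the crowd count.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalSelfAttention(nn.Module):
    """Self-attention across all spatial positions (long-range interdependence)."""

    def __init__(self, channels):
        super().__init__()
        reduced = max(channels // 2, 1)
        self.query = nn.Conv2d(channels, reduced, 1)
        self.key = nn.Conv2d(channels, reduced, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)      # (B, HW, C')
        k = self.key(x).flatten(2)                        # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)      # (B, HW, C)
        attn = torch.softmax(q @ k / q.shape[-1] ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + x                                    # residual connection


class LocalSelfAttention(nn.Module):
    """The same attention, restricted to non-overlapping windows (short-range)."""

    def __init__(self, channels, window=8):
        super().__init__()
        self.window = window
        self.attn = GlobalSelfAttention(channels)

    def forward(self, x):
        b, c, h, w = x.shape
        win = self.window
        x = F.pad(x, (0, (-w) % win, 0, (-h) % win))      # pad to a multiple of win
        hp, wp = x.shape[-2:]
        tiles = (x.reshape(b, c, hp // win, win, wp // win, win)
                  .permute(0, 2, 4, 1, 3, 5)
                  .reshape(-1, c, win, win))              # one tile per window
        tiles = self.attn(tiles)                          # attend within each window
        x = (tiles.reshape(b, hp // win, wp // win, c, win, win)
                  .permute(0, 3, 1, 4, 2, 5)
                  .reshape(b, c, hp, wp))
        return x[:, :, :h, :w]                            # drop padding


class RANetSketch(nn.Module):
    """Illustrative LSA + GSA fusion followed by a density-regression head."""

    def __init__(self, channels=512, window=8):
        super().__init__()
        self.lsa = LocalSelfAttention(channels, window)
        self.gsa = GlobalSelfAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)  # stand-in for the relation module
        self.head = nn.Conv2d(channels, 1, 1)             # per-pixel density

    def forward(self, feats):
        fused = self.fuse(torch.cat([self.lsa(feats), self.gsa(feats)], dim=1))
        return F.relu(self.head(fused))                   # non-negative density map


feats = torch.randn(1, 512, 48, 64)   # stand-in for backbone features
density = RANetSketch()(feats)        # (1, 1, 48, 64) density map
print(density.sum().item())           # summing the map gives the count estimate
```

Summing the predicted density map is the standard readout in density-based counting; only the combination of local, global, and fused attention mirrors the summary, while every architectural detail above is an assumption.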