Single-Image Crowd Counting via Multi-Column Convolutional Neural Network

This paper aims to develop a method that can accurately estimate the crowd count from an individual image with arbitrary crowd density and arbitrary perspective. To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its...

Bibliographic Details
Published in	2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 589-597
Main Authors	Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2016
Subjects	Detectors; Distortion; Feature extraction; Head; Image resolution; Image segmentation; Neural networks
Abstract	This paper aims to develop a method that can accurately estimate the crowd count from an individual image with arbitrary crowd density and arbitrary perspective. To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map. The proposed MCNN allows the input image to be of arbitrary size or resolution. By utilizing filters with receptive fields of different sizes, the features learned by each column CNN are adaptive to variations in people/head size due to perspective effects or image resolution. Furthermore, the true density map is computed accurately based on geometry-adaptive kernels, which do not require knowing the perspective map of the input image. Since existing crowd counting datasets do not adequately cover all the challenging situations considered in our work, we have collected and labelled a large new dataset that includes 1198 images with about 330,000 heads annotated. On this challenging new dataset, as well as all existing datasets, we conduct extensive experiments to verify the effectiveness of the proposed model and method. In particular, with the proposed simple MCNN model, our method outperforms all existing methods. In addition, experiments show that our model, once trained on one dataset, can be readily transferred to a new dataset.
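Illustrative note: the abstract mentions ground-truth density maps built from geometry-adaptive kernels without giving the construction. The sketch below shows one plausible reading, assuming each annotated head contributes a Gaussian whose width is proportional to the mean distance to its nearest annotated neighbours. The function name `density_map`, the values `k=3` and `beta=0.3`, the fallback width, and the per-kernel normalization over the image grid are assumptions made for illustration and are not stated in this record.

```python
import math

def density_map(heads, height, width, k=3, beta=0.3, fallback_sigma=4.0):
    """Sum one Gaussian per annotated head; sigma adapts to local head spacing.

    heads: list of (x, y) pixel coordinates (hypothetical annotation format).
    k, beta, fallback_sigma: illustrative values, assumed rather than sourced.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for i, (x, y) in enumerate(heads):
        # Distances from this head to every other annotated head, ascending.
        dists = sorted(math.hypot(x - u, y - v)
                       for j, (u, v) in enumerate(heads) if j != i)
        nearest = dists[:k]
        # Kernel width proportional to the mean nearest-neighbour distance,
        # so dense regions get narrow kernels and sparse regions wide ones.
        sigma = beta * sum(nearest) / len(nearest) if nearest else fallback_sigma
        sigma = max(sigma, 0.5)  # guard against coincident annotations
        kernel = [[math.exp(-((c - x) ** 2 + (r - y) ** 2) / (2 * sigma ** 2))
                   for c in range(width)] for r in range(height)]
        total = sum(map(sum, kernel))
        if total == 0.0:
            continue  # head lies too far outside the grid to contribute
        # Normalize so each head adds exactly 1 to the integral of the map.
        for r in range(height):
            for c in range(width):
                dmap[r][c] += kernel[r][c] / total
    return dmap

# Usage: four hypothetical head annotations on a 32x32 image.
heads = [(5, 5), (8, 6), (20, 20), (22, 25)]
dmap = density_map(heads, 32, 32)
estimated_count = sum(map(sum, dmap))  # integrates to roughly len(heads)
```

Because every kernel is normalized over the grid, summing the map recovers the head count, which is the property that lets a network trained to regress such a map also produce a crowd count. No perspective map is used at any point; only the annotation coordinates are needed.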
Author	Shenghua Gao Desen Zhou Yingying Zhang Siqin Chen Yi Ma
Author_xml	– sequence: 1 surname: Yingying Zhang fullname: Yingying Zhang email: zhangyy2@shanghaitech.edu.cn – sequence: 2 surname: Desen Zhou fullname: Desen Zhou email: zhouds@shanghaitech.edu.cn – sequence: 3 surname: Siqin Chen fullname: Siqin Chen email: chensq@shanghaitech.edu.cn – sequence: 4 surname: Shenghua Gao fullname: Shenghua Gao email: gaoshh@shanghaitech.edu.cn – sequence: 5 surname: Yi Ma fullname: Yi Ma email: mayi@shanghaitech.edu.cn
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IH CBEJK RIE RIO
DOI	10.1109/CVPR.2016.70
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present
Database_xml	– sequence: 1 dbid: RIE name: IEEE/IET Electronic Library (IEL) (UW System Shared) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Applied Sciences Computer Science
EISBN	9781467388511 1467388513
EISSN	1063-6919
EndPage	597
ExternalDocumentID	7780439
Genre	orig-research
GroupedDBID	23M 29F 29O 6IE 6IH 6IK ABDPE ACGFS ALMA_UNASSIGNED_HOLDINGS CBEJK IPLJI M43 RIE RIO RNS
ID	FETCH-LOGICAL-i241t-b4cd1676811947ac68705e3807bb79f8584f8ff8d7e7000ab9f74616bdbfe21a3
IEDL.DBID	RIE
IngestDate	Wed Aug 27 01:54:34 EDT 2025
IsPeerReviewed	false
IsScholarly	true
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i241t-b4cd1676811947ac68705e3807bb79f8584f8ff8d7e7000ab9f74616bdbfe21a3
PageCount	9
ParticipantIDs	ieee_primary_7780439
PublicationCentury	2000
PublicationDate	2016-06
PublicationDateYYYYMMDD	2016-06-01
PublicationDate_xml	– month: 06 year: 2016 text: 2016-06
PublicationDecade	2010
PublicationTitle	2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
PublicationTitleAbbrev	CVPR
PublicationYear	2016
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0001968189 ssj0023720 ssj0003211698
SourceID	ieee
SourceType	Publisher
StartPage	589
URI	https://ieeexplore.ieee.org/document/7780439
hasFullText	1
inHoldings	1