Crowd Counting in Large Surveillance Areas by Fusing Audio and WiFi Sniffing Data

Bibliographic Details
Published in	2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors	Guo, Rui; Huang, Baoqi; Hao, Lifei; Jia, Bing
Format	Conference Proceeding
Language	English
Published	IEEE 30.06.2024
Subjects	Accuracy; audio sensor; Correlation; Costs; Crowd counting; deep learning; Fluctuations; Fuses; multi-modal fusion; Neural networks; Surveillance; WiFi sniffing

Abstract	Popular vision-based crowd counting methods suffer from high cost, limited coverage, and high complexity, which makes them difficult to apply to large surveillance areas. Emerging WiFi-based methods suit large surveillance areas but offer limited accuracy because WiFi sniffing data are sparse and random. Because variations in audio data are spatio-temporally correlated with crowd fluctuations, this paper proposes fusing audio and WiFi sniffing data for crowd counting through a Cross-modal Multi-level Perception Network (CMPN). CMPN extracts crowd features from the bimodal data, leveraging temporal continuity to compensate for sparse WiFi sniffing data, and mines the correlations among intra- and inter-modality crowd features for accurate crowd counting. Extensive experiments on a real campus with a surveillance area of about 4000 m² show that CMPN achieves a mean absolute error of 5.88, a 22.12% reduction compared with the state-of-the-art WiFi-only method.
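The abstract reports two figures, a mean absolute error and a percentage reduction. As a minimal sketch of how such figures are conventionally computed (not the authors' evaluation code), the Python below defines both quantities. The function names and the sample counts are hypothetical and are not taken from the paper, and the sketch assumes the reduction is measured relative to the baseline method's error.

```python
from typing import Sequence


def mean_absolute_error(true_counts: Sequence[float],
                        predicted_counts: Sequence[float]) -> float:
    """Average absolute difference between true and predicted crowd counts."""
    if len(true_counts) != len(predicted_counts) or len(true_counts) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_counts, predicted_counts))
    return total / len(true_counts)


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Fraction by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical counts for illustration only (not data from the paper).
example_true = [10, 20, 30]
example_pred = [12, 18, 33]
example_mae = mean_absolute_error(example_true, example_pred)
print(f"MAE: {example_mae:.3f}")
print(f"Reduction: {relative_reduction(10.0, 8.0):.1%}")
```

Under that assumption, a reported reduction of 22.12% would correspond to `relative_reduction(baseline, 5.88)` returning about 0.2212 for the WiFi-only baseline's error.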
Author Details	Guo, Rui (32109005@mail.imu.edu.cn); Huang, Baoqi (cshbq@imu.edu.cn); Hao, Lifei (cshlf@imu.edu.cn); Jia, Bing (jiabing@imu.edu.cn). All authors: Inner Mongolia University, China
DOI	10.1109/IJCNN60899.2024.10651535
Discipline	Computer Science
EISBN	9798350359312
EISSN	2161-4407
Funding	National Natural Science Foundation of China (funder ID 10.13039/501100001809); Natural Science Foundation of Inner Mongolia (funder ID 10.13039/501100004763)
URI	https://ieeexplore.ieee.org/document/10651535