Crowd Counting in Large Surveillance Areas by Fusing Audio and WiFi Sniffing Data

Bibliographic Details
Published in 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors Guo, Rui; Huang, Baoqi; Hao, Lifei; Jia, Bing
Format Conference Proceeding
Language English
Published IEEE 30.06.2024
Abstract Popular vision-based crowd counting methods suffer from high cost, limited coverage and high complexity, making them difficult to apply to large surveillance areas, while emerging WiFi-based methods, which are suitable for large surveillance areas, offer limited accuracy due to the sparsity and randomness of WiFi sniffing data. Considering that variations in audio data are spatio-temporally correlated with crowd fluctuations, this paper proposes to fuse audio and WiFi sniffing data for crowd counting by developing a Cross-modal Multi-level Perception Network, termed CMPN. The CMPN can not only extract crowd features from the bimodal data, leveraging temporal continuity to compensate for sparse WiFi sniffing data, but also mine the correlation of intra- and inter-modality crowd features for accurate crowd counting. Extensive experiments conducted on a real campus with a surveillance area of about 4000 m² demonstrate that the CMPN achieves a mean absolute error of 5.88, a 22.12% reduction compared to the state-of-the-art WiFi-only method.
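The two figures quoted in the abstract (a mean absolute error of 5.88 and a 22.12% reduction) can be related with a short sketch. The helper names below are illustrative and not taken from the paper, and the baseline value is only what the two reported numbers imply, subject to rounding; the abstract does not state it.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true crowd counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and equal in length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_reduction(baseline_mae, new_mae):
    """Fractional drop in MAE relative to a baseline method."""
    return (baseline_mae - new_mae) / baseline_mae


# Values reported in the abstract.
cmpn_mae = 5.88
reduction = 0.2212

# Assumption: the reduction is measured relative to the WiFi-only MAE,
# so the implied (not reported) baseline is new / (1 - reduction).
implied_baseline_mae = cmpn_mae / (1 - reduction)
```

Under that assumption the WiFi-only baseline would sit at roughly 7.55; the full paper should be consulted for the exact figure.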
Author Jia, Bing
Guo, Rui
Hao, Lifei
Huang, Baoqi
Author_xml – sequence: 1
  givenname: Rui
  surname: Guo
  fullname: Guo, Rui
  email: 32109005@mail.imu.edu.cn
  organization: Inner Mongolia University,China
– sequence: 2
  givenname: Baoqi
  surname: Huang
  fullname: Huang, Baoqi
  email: cshbq@imu.edu.cn
  organization: Inner Mongolia University,China
– sequence: 3
  givenname: Lifei
  surname: Hao
  fullname: Hao, Lifei
  email: cshlf@imu.edu.cn
  organization: Inner Mongolia University,China
– sequence: 4
  givenname: Bing
  surname: Jia
  fullname: Jia, Bing
  email: jiabing@imu.edu.cn
  organization: Inner Mongolia University,China
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/IJCNN60899.2024.10651535
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEL
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEL
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798350359312
EISSN 2161-4407
EndPage 8
ExternalDocumentID 10651535
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
– fundername: Natural Science Foundation of Inner Mongolia
  funderid: 10.13039/501100004763
ID FETCH-ieee_primary_106515353
IEDL.DBID RIE
IngestDate Wed Sep 18 05:50:09 EDT 2024
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-ieee_primary_106515353
ParticipantIDs ieee_primary_10651535
PublicationCentury 2000
PublicationDate 2024-June-30
PublicationDateYYYYMMDD 2024-06-30
PublicationDate_xml – month: 06
  year: 2024
  text: 2024-June-30
  day: 30
PublicationDecade 2020
PublicationTitle 2024 International Joint Conference on Neural Networks (IJCNN)
PublicationTitleAbbrev IJCNN
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0023993
Score 3.8473034
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Accuracy
audio sensor
Correlation
Costs
Crowd counting
deep learning
Fluctuations
Fuses
multi-modal fusion
Neural networks
Surveillance
WiFi sniffing
Title Crowd Counting in Large Surveillance Areas by Fusing Audio and WiFi Sniffing Data
URI https://ieeexplore.ieee.org/document/10651535