Identifying Human Behaviors Using Synchronized Audio-Visual Cues

In this paper, a human behavior recognition method using multimodal features is presented. We focus on modeling individual and social behaviors of a subject (e.g., friendly/aggressive or hugging/kissing behaviors) with a hidden conditional random field (HCRF) in a supervised framework. Each video is...

Bibliographic Details
Published in: IEEE transactions on affective computing, Vol. 8, no. 1, pp. 54-66
Main Authors: Vrigkas, Michalis; Nikou, Christophoros; Kakadiaris, Ioannis A.
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 01.01.2017
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2015.2507168

Abstract: In this paper, a human behavior recognition method using multimodal features is presented. We focus on modeling individual and social behaviors of a subject (e.g., friendly/aggressive or hugging/kissing behaviors) with a hidden conditional random field (HCRF) in a supervised framework. Each video is represented by a vector of spatio-temporal visual features (STIP, head orientation and proxemic features) along with audio features (MFCCs). We propose a feature pruning method for removing irrelevant and redundant features based on the spatio-temporal neighborhood of each feature in a video sequence. The proposed framework assumes that human movements are highly correlated with sound emissions. For this reason, canonical correlation analysis (CCA) is employed to find the correlation between the audio and video features prior to fusion. The experimental results, obtained on two human behavior recognition datasets including political speeches and human interactions from TV shows, attest to the advantages of the proposed method compared with several baseline and alternative human behavior recognition methods.
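
The CCA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cca_fuse`, the ridge term `reg`, the number of components, and the choice to fuse by concatenating the projected features are all assumptions made for illustration. Rows of `X` and `Y` are assumed to be time-aligned audio (e.g., MFCC) and visual feature vectors.

```python
import numpy as np


def _inv_sqrt(S):
    # Inverse matrix square root of a symmetric positive-definite matrix.
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def cca_fuse(X, Y, n_components=2, reg=1e-6):
    """Project two time-aligned feature sets with CCA and concatenate them.

    X: (n, p) array, e.g. audio features per frame (assumed layout).
    Y: (n, q) array, e.g. visual features per frame (assumed layout).
    reg: small ridge term (an assumption) that keeps the covariances invertible.
    Returns the fused (n, 2 * n_components) features and the canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Regularized within-set covariances and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each set; the SVD of the whitened cross-covariance yields
    # the canonical directions and the canonical correlations.
    Kx, Ky = _inv_sqrt(Sxx), _inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)

    A = Kx @ U[:, :n_components]      # projection for X
    B = Ky @ Vt[:n_components].T      # projection for Y

    # Fusion by concatenating the maximally correlated projections
    # (one simple choice among several possible fusion schemes).
    fused = np.hstack([Xc @ A, Yc @ B])
    return fused, s[:n_components]
```

Calling `cca_fuse(X, Y)` on two matrices with the same number of rows returns one fused feature vector per row, which could then be passed to a sequence classifier such as the HCRF the abstract describes.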
Authors:
1. Vrigkas, Michalis (mvrigkas@cs.uoi.gr), Dept. of Comput. Sci. & Eng., Univ. of Ioannina, Ioannina, Greece
2. Nikou, Christophoros (cnikou@cs.uoi.gr), Dept. of Comput. Sci. & Eng., Univ. of Ioannina, Ioannina, Greece
3. Kakadiaris, Ioannis A. (ioannisk@uh.edu), Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
CODEN ITACBQ
Copyright: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017
Discipline: Computer Science
EISSN: 1949-3045
Genre: Original research
Funding: UH Hugh Roy; Lillie Cranz Cullen Endowment Fund
Peer reviewed: yes
Scholarly: yes
License: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
Page count: 13
Abbreviated title: T-AFFC
Subjects: audio-visual synchronization; Behavior; canonical correlation analysis; Computational modeling; Correlation; Correlation analysis; Feature extraction; Feature recognition; Hidden conditional random fields; Human behavior; human behavior recognition; Human motion; Human performance; multimodal fusion; Pruning; Synchronization; Video sequences; Visualization
URI: https://ieeexplore.ieee.org/document/7350122
URI: https://www.proquest.com/docview/2174436342