Identifying Human Behaviors Using Synchronized Audio-Visual Cues
In this paper, a human behavior recognition method using multimodal features is presented. We focus on modeling individual and social behaviors of a subject (e.g., friendly/aggressive or hugging/kissing behaviors) with a hidden conditional random field (HCRF) in a supervised framework. Each video is...
Published in | IEEE transactions on affective computing Vol. 8; no. 1; pp. 54 - 66 |
---|---|
Main Authors | Vrigkas, Michalis; Nikou, Christophoros; Kakadiaris, Ioannis A. |
Format | Journal Article |
Language | English |
Published | Piscataway; IEEE; 01.01.2017; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
ISSN | 1949-3045 |
DOI | 10.1109/TAFFC.2015.2507168 |
Abstract | In this paper, a human behavior recognition method using multimodal features is presented. We focus on modeling individual and social behaviors of a subject (e.g., friendly/aggressive or hugging/kissing behaviors) with a hidden conditional random field (HCRF) in a supervised framework. Each video is represented by a vector of spatio-temporal visual features (STIP, head orientation and proxemic features) along with audio features (MFCCs). We propose a feature pruning method for removing irrelevant and redundant features based on the spatio-temporal neighborhood of each feature in a video sequence. The proposed framework assumes that human movements are highly correlated with sound emissions. For this reason, canonical correlation analysis (CCA) is employed to find the correlation between the audio and video features prior to fusion. The experimental results, obtained on two human behavior recognition datasets comprising political speeches and human interactions from TV shows, attest to the advantages of the proposed method compared with several baseline and alternative human behavior recognition methods. |
---|---|
Author | Vrigkas, Michalis; Nikou, Christophoros; Kakadiaris, Ioannis A. |
Author_xml | 1. Vrigkas, Michalis (mvrigkas@cs.uoi.gr), Dept. of Comput. Sci. & Eng., Univ. of Ioannina, Ioannina, Greece; 2. Nikou, Christophoros (cnikou@cs.uoi.gr), Dept. of Comput. Sci. & Eng., Univ. of Ioannina, Ioannina, Greece; 3. Kakadiaris, Ioannis A. (ioannisk@uh.edu), Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA |
CODEN | ITACBQ |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017 |
Discipline | Computer Science |
EISSN | 1949-3045 |
EndPage | 66 |
ExternalDocumentID | 10_1109_TAFFC_2015_2507168 7350122 |
Genre | orig-research |
GrantInformation_xml | – fundername: UH Hugh Roy – fundername: Lillie Cranz Cullen Endowment Fund |
ISSN | 1949-3045 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
PageCount | 13 |
PublicationCentury | 2000 |
PublicationDate | 2017-01-01 |
PublicationDateYYYYMMDD | 2017-01-01 |
PublicationDate_xml | – month: 01 year: 2017 text: 2017-01-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Piscataway |
PublicationPlace_xml | – name: Piscataway |
PublicationTitle | IEEE transactions on affective computing |
PublicationTitleAbbrev | T-AFFC |
PublicationYear | 2017 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 54 |
SubjectTerms | audio-visual synchronization; Behavior; canonical correlation analysis; Computational modeling; Correlation; Correlation analysis; Feature extraction; Feature recognition; Hidden conditional random fields; Human behavior; human behavior recognition; Human motion; Human performance; multimodal fusion; Pruning; Synchronization; Video sequences; Visualization |
Title | Identifying Human Behaviors Using Synchronized Audio-Visual Cues |
URI | https://ieeexplore.ieee.org/document/7350122 https://www.proquest.com/docview/2174436342 |
Volume | 8 |
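
The abstract states that canonical correlation analysis (CCA) is used to find the correlation between audio and video features prior to fusion. As a rough illustration of that step only, the following is a minimal sketch of textbook CCA on synthetic data. It is not the authors' implementation: the function names (`make_demo_data`, `cca`, `fuse`), the feature dimensions, the small ridge term `reg`, and the choice to fuse by concatenating the projected features are all assumptions made for this example.

```python
import numpy as np


def make_demo_data(seed=0, n_samples=500, d_audio=13, d_video=20):
    """Synthetic 'audio' and 'video' features driven by a shared 2-D latent signal.

    All sizes are arbitrary choices for the demo, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((n_samples, 2))
    audio = latent @ rng.standard_normal((2, d_audio))
    audio += 0.1 * rng.standard_normal((n_samples, d_audio))
    video = latent @ rng.standard_normal((2, d_video))
    video += 0.1 * rng.standard_normal((n_samples, d_video))
    return audio, video


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca(audio, video, n_components=2, reg=1e-6):
    """Textbook CCA: whiten each modality, then take an SVD of the cross-covariance.

    Returns the two projection matrices and the canonical correlations.
    The ridge term `reg` is an assumption added for numerical stability.
    """
    a = audio - audio.mean(axis=0)
    v = video - video.mean(axis=0)
    n = a.shape[0]
    cov_aa = a.T @ a / (n - 1) + reg * np.eye(a.shape[1])
    cov_vv = v.T @ v / (n - 1) + reg * np.eye(v.shape[1])
    cov_av = a.T @ v / (n - 1)
    white_a, white_v = inv_sqrt(cov_aa), inv_sqrt(cov_vv)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    u, s, vt = np.linalg.svd(white_a @ cov_av @ white_v)
    proj_a = white_a @ u[:, :n_components]
    proj_v = white_v @ vt.T[:, :n_components]
    return proj_a, proj_v, s[:n_components]


def fuse(audio, video, proj_a, proj_v):
    """Project both modalities into the correlated subspace and concatenate them."""
    a = audio - audio.mean(axis=0)
    v = video - video.mean(axis=0)
    return np.hstack([a @ proj_a, v @ proj_v])


if __name__ == "__main__":
    audio, video = make_demo_data()
    proj_a, proj_v, corrs = cca(audio, video)
    fused = fuse(audio, video, proj_a, proj_v)
    print("canonical correlations:", corrs)
    print("fused feature shape:", fused.shape)
```

In this sketch the fused vector would then be passed to a downstream classifier; the record indicates the paper uses a hidden conditional random field for that role, which is not reproduced here.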