Audio Visual Emotion Recognition Using Cross Correlation and Wavelet Packet Domain Features

Bibliographic Details
Published in 2017 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON ECE), pp. 233-236
Main Authors Noor, Shamman; Dhrubo, Ehsan Ahmed; Minhaz, Ahmed Tahseen; Shahnaz, Celia; Fattah, Shaikh Anowarul
Format Conference Proceeding
Language English
Published IEEE 01.12.2017
DOI 10.1109/WIECON-ECE.2017.8468871

Abstract The better a machine understands non-verbal forms of communication, such as emotion, the better the level of human-machine interaction that can be achieved. This paper describes a method for recognizing emotions from human speech and visual data. For feature extraction, videos covering 6 classes of emotion (Happy, Sad, Fear, Disgust, Angry, and Surprise) from 44 different subjects in the eNTERFACE05 database are used. As video features, Horizontal and Vertical Cross Correlation (HCCR and VCCR) signals, extracted from the eye and mouth regions, are used. As speech features, Perceptual Linear Predictive Coefficients (PLPC) and Mel-frequency Cepstral Coefficients (MFCC) extracted from Wavelet Packet Coefficients are used in conjunction with PLPC and MFCC extracted from the original signal. For each type of feature, the K-Nearest Neighbour (KNN) multiclass classification method is applied separately to identify emotions expressed in speech and through facial movement. The emotion expressed in a video file is then identified by concatenating the speech and video features and applying KNN classification.
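The final step of the abstract, concatenating speech and video features and classifying with KNN, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensions, the value of k, the Euclidean distance, the train/test split, and the synthetic feature vectors (standing in for the PLPC/MFCC and HCCR/VCCR features) are all assumptions made for illustration, and the helper names `fuse` and `knn_predict` are hypothetical.

```python
import numpy as np

EMOTIONS = ["Happy", "Sad", "Fear", "Disgust", "Angry", "Surprise"]

def fuse(speech_feats, video_feats):
    """Feature-level fusion: concatenate per-clip speech and video vectors."""
    return np.concatenate([speech_feats, video_feats], axis=1)

def knn_predict(train_x, train_y, query_x, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    # Majority vote per query; ties resolve to the smallest label index.
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in data (assumption): class-dependent mean plus small noise.
rng = np.random.default_rng(0)
n_per_class, d_speech, d_video = 10, 8, 4  # hypothetical sizes
labels = np.repeat(np.arange(len(EMOTIONS)), n_per_class)
speech = labels[:, None] * 3.0 + rng.normal(scale=0.1, size=(labels.size, d_speech))
video = labels[:, None] * 3.0 + rng.normal(scale=0.1, size=(labels.size, d_video))

fused = fuse(speech, video)

# Hypothetical split: even-indexed clips for training, odd-indexed for testing.
train_mask = np.arange(labels.size) % 2 == 0
pred = knn_predict(fused[train_mask], labels[train_mask], fused[~train_mask], k=3)
accuracy = float((pred == labels[~train_mask]).mean())
```

The same `knn_predict` call could be applied to `speech` or `video` alone to mirror the separate per-modality classification the abstract mentions; the well-separated synthetic classes say nothing about accuracy on eNTERFACE05.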
Author Minhaz, Ahmed Tahseen
Noor, Shamman
Fattah, Shaikh Anowarul
Dhrubo, Ehsan Ahmed
Shahnaz, Celia
Author_xml – sequence: 1
  givenname: Shamman
  surname: Noor
  fullname: Noor, Shamman
  organization: Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
– sequence: 2
  givenname: Ehsan Ahmed
  surname: Dhrubo
  fullname: Dhrubo, Ehsan Ahmed
  organization: Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
– sequence: 3
  givenname: Ahmed Tahseen
  surname: Minhaz
  fullname: Minhaz, Ahmed Tahseen
  organization: Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
– sequence: 4
  givenname: Celia
  surname: Shahnaz
  fullname: Shahnaz, Celia
  organization: Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
– sequence: 5
  givenname: Shaikh Anowarul
  surname: Fattah
  fullname: Fattah, Shaikh Anowarul
  organization: Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1000, Bangladesh
ContentType Conference Proceeding
DOI 10.1109/WIECON-ECE.2017.8468871
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
EISBN 1538626209
9781538626207
1538626217
9781538626214
EndPage 236
ExternalDocumentID 8468871
Genre orig-research
IngestDate Wed Aug 27 02:51:30 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
PageCount 4
ParticipantIDs ieee_primary_8468871
PublicationCentury 2000
PublicationDate 2017-Dec.
PublicationDateYYYYMMDD 2017-12-01
PublicationDate_xml – month: 12
  year: 2017
  text: 2017-Dec.
PublicationDecade 2010
PublicationTitle 2017 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON ECE)
PublicationTitleAbbrev WIECON-ECE
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 233
SubjectTerms Correlation
Emotion recognition
Feature extraction
Horizontal and Vertical cross correlation
Mel frequency cepstral coefficient
Mel Frequency Cepstral Coefficient (MFCC)
Perceptual Linear Predictive Coefficient (PLPC)
Speech recognition
Videos
Viola Jones Algorithm
Visualization
Title Audio Visual Emotion Recognition Using Cross Correlation and Wavelet Packet Domain Features
URI https://ieeexplore.ieee.org/document/8468871