Sentiment analysis using image-based deep spectrum features

Bibliographic Details
Published in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pp. 26-29
Main Authors: Amiriparian, Shahin; Cummins, Nicholas; Ottl, Sandra; Gerczuk, Maurice; Schuller, Bjorn
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2017

Summary: We test the suitability of our novel deep spectrum feature representation for performing speech-based sentiment analysis. Deep spectrum features are formed by passing spectrograms through a pre-trained image convolutional neural network (CNN) and have been shown to capture useful emotion information in speech; however, their usefulness for sentiment analysis is yet to be investigated. Using a data set of movie reviews collected from YouTube, we compare deep spectrum features combined with the bag-of-audio-words (BoAW) paradigm with a state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) based BoAW system when performing a binary sentiment classification task. Key results presented indicate the suitability of both features for the proposed task. The deep spectrum features achieve an unweighted average recall of 74.5 %. The results provide further evidence for the effectiveness of deep spectrum features as a robust feature representation for speech analysis.
DOI: 10.1109/ACIIW.2017.8272618
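
The feature extraction step described in the summary (a spectrogram rendered as an image, passed through a pre-trained image CNN, with an internal activation vector used as the feature) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' exact toolchain: the choice of librosa for the spectrogram, a torchvision AlexNet as the image CNN, the fc7 layer, the viridis colour map, and the helper name deep_spectrum_features are all assumptions made here for illustration.

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
import torch
import torchvision.models as models
import torchvision.transforms as transforms

def deep_spectrum_features(wav_path):
    """Return an illustrative deep spectrum feature vector for one audio file."""
    # 1. Compute a log-mel spectrogram of the speech signal.
    y, sr = librosa.load(wav_path, sr=16000)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # 2. Render the spectrogram as an RGB image, since the CNN expects image input.
    fig, ax = plt.subplots(figsize=(2.24, 2.24), dpi=100)
    ax.axis("off")
    librosa.display.specshow(log_mel, sr=sr, cmap="viridis", ax=ax)
    fig.canvas.draw()
    img = np.asarray(fig.canvas.buffer_rgba())[:, :, :3]  # drop alpha channel
    plt.close(fig)

    # 3. Forward the image through a pre-trained image CNN (AlexNet assumed here,
    #    torchvision >= 0.13) and keep the activations of the second fully
    #    connected layer as the feature vector.
    cnn = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
    preprocess = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    x = preprocess(img).unsqueeze(0)
    with torch.no_grad():
        h = cnn.avgpool(cnn.features(x))
        feats = cnn.classifier[:6](torch.flatten(h, 1))  # 4096-dimensional vector
    return feats.squeeze(0).numpy()

In the paper's pipeline, such per-clip feature vectors are additionally quantised with a bag-of-audio-words codebook before the binary sentiment classifier is trained; that step and the MFCC baseline are omitted from this sketch.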