Multimodal fusion using dynamic hybrid models
| Published in | IEEE Winter Conference on Applications of Computer Vision, pp. 556 - 563 |
|---|---|
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.03.2014 |
| Summary | We propose a novel hybrid model that exploits the strengths of discriminative classifiers along with the representational power of generative models. Our focus is on detecting multimodal events in time-varying sequences. Discriminative classifiers have been shown to achieve higher performance than the corresponding generative likelihood-based classifiers. Generative models, on the other hand, learn a rich informative space that allows for data generation and joint feature representation, which discriminative models lack. We employ a deep temporal generative model for unsupervised learning of a shared representation across multiple modalities with time-varying data. The temporal generative model accounts for short-term temporal phenomena and allows missing data to be filled in by generating data within or across modalities. The hybrid model augments the temporal generative model with a temporal discriminative model for event detection and classification, which enables modeling long-range temporal dynamics. We evaluate our approach on audio-visual datasets (AVEC, AVLetters, and CUAVE) and demonstrate its superiority over the state of the art. |
| ISSN | 1550-5790; 2642-9381 |
| DOI | 10.1109/WACV.2014.6836053 |
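
The summary above describes a two-stage hybrid: an unsupervised temporal generative model that learns a shared audio-visual representation (and can fill in missing data by decoding across modalities), followed by a temporal discriminative model for event detection and classification. This record does not specify the authors' exact architectures, so the following is only a minimal sketch under assumed stand-ins: a GRU sequence autoencoder plays the generative role and an LSTM classifier the discriminative one. All class names (`SharedTemporalAutoencoder`, `TemporalEventClassifier`), dimensions, and training details here are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of a generative-plus-discriminative multimodal
# pipeline; NOT the paper's implementation.
import torch
import torch.nn as nn

class SharedTemporalAutoencoder(nn.Module):
    """Generative stage (assumed): encodes concatenated per-frame audio and
    video features into a shared latent sequence and reconstructs both
    modalities, so one modality can be regenerated from the shared code."""
    def __init__(self, audio_dim, video_dim, latent_dim):
        super().__init__()
        self.encoder = nn.GRU(audio_dim + video_dim, latent_dim, batch_first=True)
        self.audio_decoder = nn.Linear(latent_dim, audio_dim)
        self.video_decoder = nn.Linear(latent_dim, video_dim)

    def forward(self, audio, video):
        # z: shared latent sequence, one code per frame
        z, _ = self.encoder(torch.cat([audio, video], dim=-1))
        return self.audio_decoder(z), self.video_decoder(z), z

class TemporalEventClassifier(nn.Module):
    """Discriminative stage (assumed): models longer-range dynamics over
    the shared latent sequence and predicts an event label per sequence."""
    def __init__(self, latent_dim, hidden_dim, num_classes):
        super().__init__()
        self.rnn = nn.LSTM(latent_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, z):
        h, _ = self.rnn(z)
        return self.head(h[:, -1])  # classify from the final hidden state

# Toy usage: batch of 8 sequences, 20 frames, made-up feature sizes.
audio = torch.randn(8, 20, 40)   # e.g. 40-dim audio features per frame
video = torch.randn(8, 20, 64)   # e.g. 64-dim visual features per frame
gen = SharedTemporalAutoencoder(audio_dim=40, video_dim=64, latent_dim=32)
clf = TemporalEventClassifier(latent_dim=32, hidden_dim=64, num_classes=5)

a_hat, v_hat, z = gen(audio, video)
# Unsupervised reconstruction objective for the generative stage.
recon_loss = nn.functional.mse_loss(a_hat, audio) + nn.functional.mse_loss(v_hat, video)
# Discriminative stage consumes the learned shared codes.
logits = clf(z.detach())
```

One design point this sketch tries to mirror: the generative encoder operates on short windows and is trained unsupervised, while the downstream recurrent classifier is what captures long-range temporal dynamics, matching the division of labor the summary describes.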