Variability modelling for audio events detection in movies

Bibliographic Details
Published in	Multimedia Tools and Applications, Vol. 74, no. 4, pp. 1143-1173
Main Authors	Penet, Cédric, Demarty, Claire-Hélène, Gravier, Guillaume, Gros, Patrick
Format	Journal Article
Language	English
Published	Boston: Springer US; Springer Nature B.V; Springer Verlag, 01.02.2015
Subjects	Bayesian analysis; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Engineering Sciences; Explosions; Factor analysis; Modelling; Motion pictures; Multimedia; Multimedia computer applications; Multimedia Information Systems; Pattern recognition; Robustness; Signal and Image Processing; Sound; Special effects; Special Purpose and Application-Based Systems; State of the art; Studies; Tasks; Television news; Audio events detection; Movies; Factor Analysis; Bayesian network; Audio words

More Information
Summary:	Detecting audio events in Hollywood movies is a complex task because of the variability between the movies' soundtracks. This inter-movie variability is shown to impair audio event detection results in a realistic framework. In this article, we propose to model the variability using a factor analysis technique, which we then use to compensate the audio features. The factor analysis compensation is validated using the state-of-the-art system based on multiple audio word sequences and contextual Bayesian networks that we previously developed in Penet et al. (2013). Results obtained on the same publicly available dataset for the detection of gunshots and explosions show an improvement in the handling of the variability, while preserving the robustness of the previous system. Furthermore, the system is applied to the detection of screams and proves its ability to generalise to other types of events. The results also emphasise that, in addition to modelling variability, adding concepts to the system may benefit the precision rates.
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-014-2038-7