Similarity-based join over audio feeds in a multimedia data stream management system

Bibliographic Details
Published in	Bell Labs technical journal Vol. 18; no. 1; pp. 195 - 212
Main Authors	Maison, Rafal; Majda, Ewelina; Dobrowolski, Andrzej P.; Zakrzewicz, Maciej
Format	Journal Article
Language	English
Published	Murray Hill IEEE 01.06.2013
Subjects	Audio data; Data communication; Data transmission; Management systems; Multimedia; Multimedia communication; Multimedia databases; Parametrization; Signal processing; Speaker recognition; Speech processing; Speech recognition; Streaming media; Voice
Summary:	Over the last several years, processing of high-performance data streams has become very important in various domains. A new type of data processing is needed for applications where input data streams are modeled as multimedia data streams, such as audio and video feeds. For example, in the public safety sector, monitoring and automatic identification of particular individuals suspected of terrorist or criminal activity requires the processing of complex audio and video streams, which is beyond the capabilities of a typical data stream management system (DSMS). The concept of a multimedia data stream management system (MMDSMS) has recently been introduced to process continuous queries over dynamic multimedia data streams effectively. In this paper, we address MMDSMS functionalities related to speaker recognition problems in the area of detecting individuals who may pose security threats. We focus on audio feed processing using our novel similarity-based join and on parameterization of the multimedia signal for the process of recognition. We propose a set of signal parameters which clearly discriminate among individual voices by describing the signal using a homomorphic processing method. Our research was primarily focused on assessing the applicability of cepstral analysis in speech recognition systems, based on a set of acquired digitized voice samples. We developed a research prototype to assess the proposed concepts, and verified the effectiveness of our framework in a lab environment.
ISSN:	1089-7089 1538-7305
DOI:	10.1002/bltj.21599
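
Illustrative sketch (not taken from the article): the summary mentions cepstral analysis, a homomorphic processing method, and a similarity-based join over audio feeds. The Python below shows one possible reading of those two ideas, using only the standard library. The function names (real_cepstrum, cosine_similarity, similarity_join), the use of cosine similarity, the threshold of 0.9, and the choice of 12 coefficients are all assumptions made for illustration; the record does not describe the authors' actual parameters or join operator.

```python
import cmath
import math


def real_cepstrum(frame):
    """Real cepstrum of one frame: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # Forward DFT (naive O(n^2); adequate for short illustrative frames).
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]
    # Log magnitude; the small epsilon avoids log(0).
    log_mag = [math.log(abs(s) + 1e-12) for s in spectrum]
    # Inverse DFT; the result is real up to rounding, so keep the real part.
    return [
        (sum(v * cmath.exp(2j * math.pi * k * t / n)
             for k, v in enumerate(log_mag)) / n).real
        for t in range(n)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_join(stream_frames, reference_voices, threshold=0.9, n_coeffs=12):
    """Pair each stream frame with every reference voice whose cepstral
    features are at least `threshold` similar (hypothetical join predicate).

    Returns a list of (frame_index, speaker_id, score) tuples.
    """
    # Skip coefficient 0 (overall level) and keep the next n_coeffs values.
    refs = {
        speaker_id: real_cepstrum(frame)[1:n_coeffs + 1]
        for speaker_id, frame in reference_voices.items()
    }
    matches = []
    for index, frame in enumerate(stream_frames):
        features = real_cepstrum(frame)[1:n_coeffs + 1]
        for speaker_id, ref_features in refs.items():
            score = cosine_similarity(features, ref_features)
            if score >= threshold:
                matches.append((index, speaker_id, score))
    return matches
```

In this sketch the join predicate is a similarity threshold rather than an equality test, which is the general idea behind a similarity-based join; a production system would use an FFT library, windowed frames, and a tuned distance measure.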