Audio Recording Location Identification Using Acoustic Environment Signature
Published in: IEEE Transactions on Information Forensics and Security, Vol. 8, No. 11, pp. 1746–1759
Main Authors: ,
Format: Journal Article
Language: English
Published: New York, NY: IEEE, 01.11.2013
Summary: An audio recording is subject to a number of possible distortions and artifacts, for example those due to acoustic reverberation and background noise. The acoustic reverberation depends on the shape and composition of the room and causes temporal and spectral smearing of the recorded sound. The background noise, on the other hand, depends on the secondary audio sources active in the evidentiary recording. Extracting these acoustic cues from an audio recording is an important but challenging task. Temporal changes in the estimated reverberation and background noise can be used for dynamic acoustic environment identification (AEI), audio forensics, and ballistic settings. We describe a statistical technique based on spectral subtraction to estimate the amount of reverberation, and a nonlinear filtering technique based on particle filtering to estimate the background noise. The effectiveness of the proposed method is tested on a data set of speech recordings from two human speakers (one male and one female) made in eight acoustic environments with four commercial-grade microphones. Performance is evaluated for various experimental settings, such as microphone-independent, semi-blind, and full-blind AEI, and robustness to MP3 compression. Performance of the proposed framework is also evaluated using Temporal Derivative-based Spectrum and Mel-Cepstrum (TDSM) features. Experimental results show that the proposed method improves AEI performance compared with the direct method (i.e., extracting the feature vector from the audio recording directly), and that the proposed scheme is robust to MP3 compression attacks.
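The abstract describes two estimation steps: a spectral-subtraction-based estimate of the reverberation and a particle-filtering-based estimate of the background noise. Below is a minimal sketch of the spectral-subtraction side only, assuming a mono recording; it uses a minimum-statistics noise floor and a crude frame-energy decay slope as a stand-in for the paper's statistical reverberation estimate. The function and parameter names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the authors' implementation): a spectral-subtraction-style
# background-noise floor estimate plus a crude reverberation-decay statistic.
import numpy as np
from scipy.signal import stft

def estimate_noise_and_reverb(x, fs, frame_len=1024, hop=512):
    """Return a per-frequency noise-floor estimate and a rough decay statistic."""
    # Short-time Fourier transform: magnitude spectrum per frame
    _, _, Z = stft(x, fs=fs, nperseg=frame_len, noverlap=frame_len - hop)
    mag = np.abs(Z)                              # shape: (freq_bins, frames)

    # Spectral-subtraction-style noise floor: minimum magnitude per bin over time
    noise_floor = np.min(mag, axis=1)

    # Subtract the noise floor; keep the residual (direct + reverberant) part
    residual = np.maximum(mag - noise_floor[:, None], 0.0)

    # Crude reverberation cue: average frame-to-frame energy decay (dB/frame)
    # over decaying regions, standing in for the paper's statistical estimate.
    frame_energy_db = 10 * np.log10(np.sum(residual ** 2, axis=0) + 1e-12)
    diffs = np.diff(frame_energy_db)
    decay_slope = np.mean(diffs[diffs < 0]) if np.any(diffs < 0) else 0.0

    return noise_floor, decay_slope

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs * 2) / fs
    # Synthetic test signal: a tone burst plus low-level white noise
    x = np.sin(2 * np.pi * 440 * t) * (t < 1.0) + 0.01 * np.random.randn(t.size)
    noise_floor, decay_slope = estimate_noise_and_reverb(x, fs)
    print("noise-floor bins:", noise_floor.shape, "decay slope (dB/frame):", decay_slope)
```

In the paper's framework such per-recording estimates feed a classifier for acoustic environment identification; this sketch only illustrates the kind of low-level cues involved.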
ISSN: 1556-6013; 1556-6021
DOI: 10.1109/TIFS.2013.2278843