A General Compression Approach to Multi-Channel Three-Dimensional Audio

Bibliographic Details
Published in	IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, no. 8, pp. 1676-1688
Main Authors	Cheng, Bin; Ritz, Christian; Burnett, Ian; Zheng, Xiguang
Format	Journal Article
Language	English
Published	Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 01.08.2013
Subjects	3D audio; Acoustic signal; Acoustic signal processing; Applied sciences; Audio coding; Audio signal; Audio signal processing; Azimuth; Bit rate; Coding, codes; Encoding; Exact sciences and technology; Information rate; Information, signal and communications theory; Information transmission; Localization; Loudspeaker; Loudspeakers; Low bit rate; Miscellaneous; Multiple channel; Quantization (signal); Signal and communications theory; Signal processing; Sound source; Telecommunications and information theory; Three dimensional model; Three-dimensional displays; Time-frequency analysis
Summary:	This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel.
ISSN:	1558-7916, 1558-7924
DOI:	10.1109/TASL.2013.2260156
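
The summary describes deriving a mono downmix plus side information that locates sound sources in each time-frequency region. The sketch below illustrates that general idea only and is not the authors' algorithm: it assumes loudspeakers on a horizontal ring (azimuth only, no elevation), a simple non-overlapping windowed FFT, and an energy-weighted vector sum as the localization estimate. The names `stft` and `mono_downmix_with_azimuth` are hypothetical, and the subsequent audio-coder stage mentioned in the summary is omitted.

```python
import numpy as np


def stft(x, frame=256):
    """Hypothetical helper: non-overlapping Hann-windowed FFT frames."""
    n = len(x) // frame
    frames = x[: n * frame].reshape(n, frame) * np.hanning(frame)
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame // 2 + 1)


def mono_downmix_with_azimuth(channels, speaker_az_deg, frame=256):
    """Return a mono downmix spectrum and a per-bin azimuth estimate.

    Assumption: the source direction in each time-frequency bin is
    approximated by the energy-weighted sum of loudspeaker direction
    vectors. This is a common heuristic, not the paper's stated method.
    """
    spectra = np.stack([stft(ch, frame) for ch in channels])  # (C, T, F)
    energy = np.abs(spectra) ** 2
    az = np.deg2rad(np.asarray(speaker_az_deg, dtype=float))[:, None, None]

    # Energy-weighted direction vector per time-frequency bin.
    x = (energy * np.cos(az)).sum(axis=0)
    y = (energy * np.sin(az)).sum(axis=0)
    azimuth_deg = np.rad2deg(np.arctan2(y, x))  # side information, (T, F)

    downmix = spectra.sum(axis=0)  # plain sum of channels, (T, F)
    return downmix, azimuth_deg


# Usage: four loudspeakers, a 1 kHz tone fed equally to the 0 and 90 degree ones.
fs, frame = 8000, 256
t = np.arange(1024) / fs
tone = np.sin(2 * np.pi * 1000 * t)
silence = np.zeros_like(tone)
downmix, azimuth = mono_downmix_with_azimuth(
    [tone, tone, silence, silence], [0, 90, 180, -90], frame
)
peak_bin = int(round(1000 * frame / fs))  # bin containing the 1 kHz tone
print(azimuth[:, peak_bin])  # estimated direction lies midway between the two speakers
```

In this form the side information costs one angle per time-frequency bin; a practical coder would group bins into bands and quantize the angles, choices that this sketch leaves open.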