A General Compression Approach to Multi-Channel Three-Dimensional Audio

Bibliographic Details
Published in IEEE Transactions on Audio, Speech, and Language Processing Vol. 21; no. 8; pp. 1676-1688
Main Authors Cheng, Bin, Ritz, Christian, Burnett, Ian, Zheng, Xiguang
Format Journal Article
Language English
Published Piscataway, NJ IEEE 01.08.2013
Institute of Electrical and Electronics Engineers
Abstract This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono-downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel.
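As a rough illustration of the idea summarized in the abstract (a mono downmix accompanied by side information describing source location), the following Python sketch processes a single time-frequency bin. It is not the authors' published algorithm. The function name `downmix_with_azimuth`, the horizontal-only ring of 16 equally spaced loudspeakers, and the energy-weighted direction estimate are all assumptions made for illustration; the paper addresses full 3D localization, which would also need elevation.

```python
import math


def downmix_with_azimuth(channel_bins, speaker_azimuths_deg):
    """Reduce one time-frequency bin of a multi-channel signal to a mono
    value plus an estimated source azimuth (hypothetical helper).

    channel_bins: complex spectral values, one per loudspeaker channel.
    speaker_azimuths_deg: loudspeaker azimuths in degrees, same order.
    Returns (mono_value, azimuth_deg), with azimuth_deg None for a silent bin.
    """
    if len(channel_bins) != len(speaker_azimuths_deg):
        raise ValueError("one azimuth is needed per channel")

    # Mono downmix: plain sum of the channel values (assumed, unnormalized).
    mono = sum(channel_bins)

    # Energy-weighted sum of unit vectors pointing at each loudspeaker.
    x = 0.0
    y = 0.0
    for value, azimuth_deg in zip(channel_bins, speaker_azimuths_deg):
        energy = abs(value) ** 2
        angle = math.radians(azimuth_deg)
        x += energy * math.cos(angle)
        y += energy * math.sin(angle)

    if x == 0.0 and y == 0.0:
        return mono, None  # no energy, so no direction to report

    # Direction of the resulting vector, wrapped to [0, 360).
    return mono, math.degrees(math.atan2(y, x)) % 360.0


# Assumed layout: 16 loudspeakers on a horizontal ring, 22.5 degrees apart.
SPEAKER_AZIMUTHS_DEG = [i * 360.0 / 16 for i in range(16)]
```

In a complete coder, the mono values from all bins would be passed to a conventional audio codec and the per-bin azimuths quantized as side information; both steps are outside the scope of this sketch.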
Author Zheng, Xiguang
Burnett, Ian
Cheng, Bin
Ritz, Christian
Author_xml – sequence: 1
  givenname: Bin
  surname: Cheng
  fullname: Cheng, Bin
  email: bchen@dolby.com
  organization: ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, Australia
– sequence: 2
  givenname: Christian
  surname: Ritz
  fullname: Ritz, Christian
  email: critz@uow.edu.au
  organization: ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, NSW, Australia
– sequence: 3
  givenname: Ian
  surname: Burnett
  fullname: Burnett, Ian
  email: ian.burnett@rmit.edu.au
  organization: School of Electrical and Computer Engineering, RMIT University
– sequence: 4
  givenname: Xiguang
  surname: Zheng
  fullname: Zheng, Xiguang
  email: xzhen@dolby.com
  organization: ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, Australia
CODEN ITASD8
CitedBy_id crossref_primary_10_1186_1687_4722_2014_10
crossref_primary_10_1186_s13636_016_0091_z
crossref_primary_10_3390_app7121301
crossref_primary_10_1049_el_2015_3422
crossref_primary_10_1109_TASLP_2015_2419980
crossref_primary_10_1007_s11042_015_2463_2
Cites_doi 10.1049/el:20081199
10.1109/JPROC.2010.2102310
10.1109/TSA.2003.818108
10.1109/ICASSP.2007.366604
10.1155/ASP.2005.1305
10.1155/2010/415840
10.1109/ICASSP.2011.5946328
10.1109/ICASSP.2008.4517623
10.1109/TSA.2003.818109
10.1007/978-1-4615-0327-9
ContentType Journal Article
Copyright 2014 INIST-CNRS
Copyright_xml – notice: 2014 INIST-CNRS
DOI 10.1109/TASL.2013.2260156
Discipline Engineering
Applied Sciences
EISSN 1558-7924
EndPage 1688
ExternalDocumentID 10_1109_TASL_2013_2260156
27572203
6508842
Genre orig-research
ISSN 1558-7916
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 8
Keywords Time-frequency analysis
Audio signal processing
Acoustic signal
Information rate
Sound source
Acoustic signal processing
3D audio
Loudspeaker
Information transmission
Three dimensional model
Audio coding
Audio signal
Low bit rate
Localization
Multiple channel
Language English
License CC BY 4.0
OpenAccessLink https://ro.uow.edu.au/cgi/viewcontent.cgi?article=2020&context=eispapers
PageCount 13
PublicationCentury 2000
PublicationDate 2013-08-01
PublicationDateYYYYMMDD 2013-08-01
PublicationDate_xml – month: 08
  year: 2013
  text: 2013-08-01
  day: 01
PublicationDecade 2010
PublicationPlace Piscataway, NJ
PublicationPlace_xml – name: Piscataway, NJ
PublicationTitle IEEE transactions on audio, speech, and language processing
PublicationTitleAbbrev TASL
PublicationYear 2013
Publisher IEEE
Institute of Electrical and Electronics Engineers
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
StartPage 1676
SubjectTerms 3D audio
Applied sciences
Audio coding
Azimuth
Bit rate
Coding, codes
Encoding
Exact sciences and technology
Information, signal and communications theory
Loudspeakers
Miscellaneous
Quantization (signal)
Signal and communications theory
Signal processing
Telecommunications and information theory
Three-dimensional displays
Time-frequency analysis
Title A General Compression Approach to Multi-Channel Three-Dimensional Audio
URI https://ieeexplore.ieee.org/document/6508842
Volume 21