A General Compression Approach to Multi-Channel Three-Dimensional Audio
Published in | IEEE transactions on audio, speech, and language processing Vol. 21; no. 8; pp. 1676 - 1688 |
---|---|
Main Authors | Cheng, Bin; Ritz, Christian; Burnett, Ian; Zheng, Xiguang |
Format | Journal Article |
Language | English |
Published | Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 01.08.2013 |
Abstract | This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono-downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel. |
---|---|
Author | Zheng, Xiguang; Burnett, Ian; Cheng, Bin; Ritz, Christian |
Author_xml | 1. Cheng, Bin (bchen@dolby.com), ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, Australia; 2. Ritz, Christian (critz@uow.edu.au), ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, NSW, Australia; 3. Burnett, Ian (ian.burnett@rmit.edu.au), School of Electrical and Computer Engineering, RMIT University; 4. Zheng, Xiguang (xzhen@dolby.com), ICT Research Institute and School of Electrical Computer and Telecommunications Engineering, University of Wollongong, Australia |
BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=27572203$$DView record in Pascal Francis |
CODEN | ITASD8 |
CitedBy_id | crossref_primary_10_1186_1687_4722_2014_10 crossref_primary_10_1186_s13636_016_0091_z crossref_primary_10_3390_app7121301 crossref_primary_10_1049_el_2015_3422 crossref_primary_10_1109_TASLP_2015_2419980 crossref_primary_10_1007_s11042_015_2463_2 |
Cites_doi | 10.1049/el:20081199 10.1109/JPROC.2010.2102310 10.1109/TSA.2003.818108 10.1109/ICASSP.2007.366604 10.1155/ASP.2005.1305 10.1155/2010/415840 10.1109/ICASSP.2011.5946328 10.1109/ICASSP.2008.4517623 10.1109/TSA.2003.818109 10.1007/978-1-4615-0327-9 |
ContentType | Journal Article |
Copyright | 2014 INIST-CNRS |
Copyright_xml | – notice: 2014 INIST-CNRS |
DBID | 97E RIA RIE IQODW AAYXX CITATION |
DOI | 10.1109/TASL.2013.2260156 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Xplore Pascal-Francis CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering Applied Sciences |
EISSN | 1558-7924 |
EndPage | 1688 |
ExternalDocumentID | 10_1109_TASL_2013_2260156 27572203 6508842 |
Genre | orig-research |
IEDL.DBID | RIE |
ISSN | 1558-7916 |
IngestDate | Fri Aug 23 03:24:28 EDT 2024 Tue Sep 20 22:33:41 EDT 2022 Wed Jun 26 19:27:22 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 8 |
Keywords | Time-frequency analysis; Audio signal processing; Acoustic signal; Information rate; Sound source; Acoustic signal processing; 3D audio; Loudspeaker; Information transmission; Three dimensional model; Audio coding; Audio signal; Low bit rate; Localization; Multiple channel |
Language | English |
License | CC BY 4.0 |
LinkModel | DirectLink |
OpenAccessLink | https://ro.uow.edu.au/cgi/viewcontent.cgi?article=2020&context=eispapers |
PageCount | 13 |
ParticipantIDs | pascalfrancis_primary_27572203 crossref_primary_10_1109_TASL_2013_2260156 ieee_primary_6508842 |
PublicationCentury | 2000 |
PublicationDate | 2013-08-01 |
PublicationDateYYYYMMDD | 2013-08-01 |
PublicationDate_xml | – month: 08 year: 2013 text: 2013-08-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Piscataway, NJ |
PublicationPlace_xml | – name: Piscataway, NJ |
PublicationTitle | IEEE transactions on audio, speech, and language processing |
PublicationTitleAbbrev | TASL |
PublicationYear | 2013 |
Publisher | IEEE Institute of Electrical and Electronics Engineers |
Publisher_xml | – name: IEEE – name: Institute of Electrical and Electronics Engineers |
SourceID | crossref pascalfrancis ieee |
SourceType | Aggregation Database Index Database Publisher |
StartPage | 1676 |
SubjectTerms | 3D audio; Applied sciences; Audio coding; Azimuth; Bit rate; Coding, codes; Encoding; Exact sciences and technology; Information, signal and communications theory; Loudspeakers; Miscellaneous; Quantization (signal); Signal and communications theory; Signal processing; Telecommunications and information theory; Three-dimensional displays; Time-frequency analysis |
Title | A General Compression Approach to Multi-Channel Three-Dimensional Audio |
URI | https://ieeexplore.ieee.org/document/6508842 |
Volume | 21 |
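The abstract mentions a mono downmix accompanied by side information that locates sound sources in each time-frequency region. The sketch below is only a toy illustration of that general idea and is not the paper's algorithm: it assumes a horizontal ring of loudspeakers (azimuth only, no elevation), averages the channels to form the mono signal, and estimates one azimuth per frame and frequency bin from an energy-weighted sum of loudspeaker direction vectors. The function names, the frame length, and the silence threshold are all hypothetical choices made for this example.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real-valued frame; returns bins 0..N/2."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def mono_downmix_with_azimuth(channels, speaker_azimuths_deg, frame_len=64):
    """Toy mono downmix plus per-bin azimuth side information.

    channels: list of equal-length sample lists, one per loudspeaker.
    speaker_azimuths_deg: assumed loudspeaker azimuths on a horizontal ring.
    Returns (mono, azimuths), where azimuths[f][k] is the estimated direction
    in degrees for frame f and bin k, or None when the bin carries no energy.
    """
    n_ch = len(channels)
    n_samples = len(channels[0])

    # Passive downmix: sample-wise average of all loudspeaker channels.
    mono = [sum(ch[t] for ch in channels) / n_ch for t in range(n_samples)]

    # Unit direction vector of each loudspeaker, written as a complex number.
    units = [cmath.exp(1j * math.radians(a)) for a in speaker_azimuths_deg]

    azimuths = []
    for start in range(0, n_samples - frame_len + 1, frame_len):
        spectra = [dft(ch[start:start + frame_len]) for ch in channels]
        frame_az = []
        for k in range(frame_len // 2 + 1):
            # Energy-weighted sum of loudspeaker directions for this bin.
            vec = sum(abs(spec[k]) ** 2 * u for spec, u in zip(spectra, units))
            if abs(vec) > 1e-12:
                frame_az.append(math.degrees(cmath.phase(vec)))
            else:
                frame_az.append(None)  # silent bin: no direction to report
        azimuths.append(frame_az)
    return mono, azimuths
```

For instance, feeding a single tone to only the loudspeaker assumed to sit at 90 degrees should yield an azimuth close to 90 degrees in the bin that contains the tone. A practical coder would additionally quantize the side information and pass the downmix to a conventional audio coder, which this sketch does not attempt.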