Video Coding for Machines: Compact Visual Representation Compression for Intelligent Collaborative Analytics
Published in | IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 46, No. 7, pp. 5174-5191 |
Main Authors | Yang, Wenhan; Huang, Haofeng; Hu, Yueyu; Duan, Ling-Yu; Liu, Jiaying |
Format | Journal Article |
Language | English |
Published | United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.07.2024 |
ISSN | 0162-8828, 1939-3539, 2160-9292 |
DOI | 10.1109/TPAMI.2024.3367293 |
Abstract | As an emerging research practice leveraging recent advanced AI techniques, e.g., deep-model-based prediction and generation, Video Coding for Machines (VCM) is committed to bridging the largely separate research tracks of video/image compression and feature compression, and attempts to optimize compactness and efficiency jointly from a unified perspective of high-accuracy machine vision and full-fidelity human vision. With the rapid advances of deep feature representation and visual data compression in mind, in this paper we summarize VCM methodology and philosophy based on existing academic and industrial efforts. The development of VCM follows a general rate-distortion optimization, and the key modules or techniques are categorized into feature-assisted coding, scalable coding, intermediate feature compression/optimization, and machine-vision-targeted codecs, from the broader perspectives of vision tasks, analytics resources, etc. Previous works demonstrate that, although existing studies attempt to reveal the nature of scalable representation in bits when dealing with machine and human vision tasks, the generality of low-bit-rate representations, and accordingly how to support a variety of visual analytics tasks, remains rarely studied. Therefore, we investigate a novel visual information compression approach for the analytics taxonomy problem, to strengthen the capability of compact visual representations extracted from multiple tasks for visual analytics. A new perspective of task relationships versus compression is revisited. Keeping in mind the transferability among different machine vision tasks (e.g., high-level semantic and mid-level geometry-related), we aim to support multiple tasks jointly at low bit rates. In particular, to narrow the dimensionality gap between neural-network-generated features extracted from pixels and a variety of machine vision features/labels (e.g., scene class, segmentation labels), a codebook hyperprior is designed to compress the neural-network-generated features. As demonstrated in our experiments, this new hyperprior model improves feature compression efficiency by estimating the signal entropy more accurately, which enables further investigation of the granularity of abstracting compact features among different tasks. |
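The abstract describes the codebook hyperprior only at a high level. As a rough, hypothetical illustration of the idea (names, shapes, and the simple nearest-neighbor quantizer are assumptions, not the authors' model), the sketch below maps feature vectors to their nearest codebook entries and estimates the bitrate of the resulting index stream from its empirical entropy, i.e., the rate term R of the usual rate-distortion objective R + λD.

```python
import numpy as np

# Hypothetical sketch of a codebook-based feature quantizer and a rate
# estimate; illustrative only, not the paper's actual hyperprior network.

def assign_codewords(features: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each feature vector in (N, D) to the index of its nearest (K, D) codeword."""
    # Squared Euclidean distance between every feature and every codeword.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def estimated_bits(indices: np.ndarray, num_codewords: int) -> float:
    """Estimate the bits needed to entropy-code the codeword-index stream.

    Plays the role of the rate term R in R + lambda * D, using the
    empirical symbol distribution of the indices.
    """
    counts = np.bincount(indices, minlength=num_codewords).astype(float)
    probs = counts / counts.sum()
    nz = probs > 0
    bits_per_symbol = -(probs[nz] * np.log2(probs[nz])).sum()
    return float(bits_per_symbol * len(indices))
```

A feature batch that clusters tightly around a few codewords yields a low estimated rate; a prior that models the index distribution more accurately therefore spends fewer bits, which is the intuition behind "estimating the signal entropy more accurately" in the abstract.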
Author | Duan, Ling-Yu; Huang, Haofeng; Liu, Jiaying; Hu, Yueyu; Yang, Wenhan |
Author_xml | – 1. Yang, Wenhan (ORCID 0000-0002-1692-0069), yangwenhan@pku.edu.cn, Peking University, Beijing, China
– 2. Huang, Haofeng (ORCID 0000-0002-1480-7388), huang6013@pku.edu.cn, Peking University, Beijing, China
– 3. Hu, Yueyu (ORCID 0000-0003-4919-4515), huyy@pku.edu.cn, Peking University, Beijing, China
– 4. Duan, Ling-Yu (ORCID 0000-0002-4491-2023), lingyu@pcl.ac.cn, National Engineering Research Center of Visual Technology, School of Computer Science, Peking University, Beijing, China
– 5. Liu, Jiaying (ORCID 0000-0002-0468-9576), liujiaying@pku.edu.cn, Peking University, Beijing, China |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38376966 (view this record in MEDLINE/PubMed) |
CODEN | ITPIDJ |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
Discipline | Engineering; Computer Science |
EISSN | 2160-9292 1939-3539 |
EndPage | 5191 |
Genre | orig-research Journal Article |
GrantInformation_xml | – National Natural Science Foundation of China (Grants 62332010, 62088102; funder ID 10.13039/501100001809)
– Ng Teng Fong Charitable Foundation (funder ID 10.13039/501100018807)
– PKU-NTU Joint Research Institute
– AI Joint Lab of Future Urban Infrastructure
– Fuzhou Chengtou New Infrastructure Group
– Boyun Vision Company Ltd. |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 7 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
PMID | 38376966 |
PageCount | 18 |
PublicationDate | 2024-07-01 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States – name: New York |
PublicationTitle | IEEE transactions on pattern analysis and machine intelligence |
PublicationTitleAbbrev | TPAMI |
PublicationTitleAlternate | IEEE Trans Pattern Anal Mach Intell |
PublicationYear | 2024 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 5174 |
SubjectTerms | analytics taxonomy; codebook-hyperprior; Codec; Coding; compact visual representation; Data compression; Encoding; Feature extraction; Image coding; Image compression; Labels; Machine vision; multiple tasks; Neural networks; Optimization; Representations; Task analysis; Taxonomy; Video coding; Video coding for machines; Video compression; Vision systems; Visual tasks |
URI | https://ieeexplore.ieee.org/document/10440522 https://www.ncbi.nlm.nih.gov/pubmed/38376966 https://www.proquest.com/docview/3064706054 https://www.proquest.com/docview/2929540705 |
Volume | 46 |