A Switchable Deep Learning Approach for In-Loop Filtering in Video Coding
Published in | IEEE transactions on circuits and systems for video technology Vol. 30; no. 7; pp. 1871 - 1887 |
---|---|
Main Authors | Ding, Dandan; Kong, Lingyi; Chen, Guangyao; Liu, Zoe; Fang, Yong |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.07.2020 |
Subjects | |
Abstract | Deep learning offers great potential for in-loop filtering to improve both coding efficiency and subjective quality in video coding. State-of-the-art work focuses on network structure design and employs a single powerful network to solve all problems. In contrast, this paper proposes a deep-learning-based systematic approach that includes an effective Convolutional Neural Network (CNN) structure, a hierarchical training strategy, and a video-codec-oriented switchable mechanism. First, we propose a novel CNN structure, i.e., the Squeeze-and-Excitation Filtering CNN (SEFCNN), as an optional in-loop filter. To capture the non-linear interaction between channels, the SEFCNN comprises two subnets, i.e., a Feature EXtracting (FEX) subnet and a Feature ENhancing (FEN) subnet. Then, we develop a hierarchical model training strategy to adapt the two subnets to different coding scenarios. For high-rate videos with small artifacts, we train a single global model using the FEX for all types of frames, whereas for low-rate videos with large artifacts, different models are trained using both FEX and FEN for different types of frames. Finally, we propose an adaptive enhancing mechanism that is switchable between the CNN-based and the conventional methods: the CNN model is applied selectively to some frames or to some regions within a frame. Experimental results show that the proposed scheme outperforms state-of-the-art work in coding efficiency, while the computational complexity is acceptable after GPU acceleration. |
---|---|
Author | Ding, Dandan; Kong, Lingyi; Chen, Guangyao; Liu, Zoe; Fang, Yong |
Author_xml | 1. Ding, Dandan (dandanding@hznu.edu.cn), School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 2. Kong, Lingyi, School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 3. Chen, Guangyao, School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 4. Liu, Zoe (zoeliu@visionular.com), Visionular Inc., Mountain View, CA, USA; 5. Fang, Yong (fy@chd.edu.cn; ORCID 0000-0002-3345-8259), School of Information Engineering, Chang'an University, Xi'an, China |
CODEN | ITCTEM |
CitedBy_id | crossref_primary_10_1016_j_dcan_2023_09_001 crossref_primary_10_1109_TCSVT_2021_3096072 crossref_primary_10_3390_electronics13122422 crossref_primary_10_1109_LSP_2023_3277343 crossref_primary_10_1109_TCYB_2020_2998481 crossref_primary_10_1109_TIP_2021_3084345 crossref_primary_10_3390_app14188276 crossref_primary_10_1016_j_image_2023_117005 crossref_primary_10_1109_TCSVT_2024_3420435 crossref_primary_10_1109_TMM_2023_3316429 crossref_primary_10_1007_s11042_021_11214_2 crossref_primary_10_1109_TCSVT_2021_3089498 crossref_primary_10_1109_TMM_2023_3304895 crossref_primary_10_1109_TCSVT_2023_3270729 crossref_primary_10_1049_ipr2_12644 crossref_primary_10_1109_ACCESS_2021_3075623 crossref_primary_10_1109_OJSP_2021_3092598 crossref_primary_10_1109_TMM_2023_3269663 crossref_primary_10_1109_TVCG_2024_3375861 crossref_primary_10_3390_s24010299 crossref_primary_10_1016_j_image_2020_115956 crossref_primary_10_1016_j_image_2021_116409 crossref_primary_10_1145_3551641 crossref_primary_10_1109_ACCESS_2023_3301145 crossref_primary_10_1109_TCSVT_2023_3323483 crossref_primary_10_1109_TIP_2022_3152627 crossref_primary_10_1016_j_dsp_2021_103368 crossref_primary_10_1109_MMUL_2022_3159372 crossref_primary_10_1109_TDSC_2022_3140899 crossref_primary_10_1109_TIP_2021_3134465 crossref_primary_10_3390_s23052631 crossref_primary_10_3390_s24061907 crossref_primary_10_1109_TCSVT_2022_3213515 crossref_primary_10_1109_TBC_2022_3152064 crossref_primary_10_1145_3612925 crossref_primary_10_1109_TCSVT_2023_3260266 crossref_primary_10_1109_JPROC_2021_3059994 |
Cites_doi | 10.1109/TCSVT.2003.815165 10.1109/CVPRW.2017.149 10.1109/PCS.2018.8456278 10.1109/ICCV.2015.73 10.1109/CVPRW.2017.151 10.1109/TCSVT.2018.2816932 10.1109/ISCAS.2017.8050458 10.1109/TCSVT.2012.2221529 10.1109/CVPR.2018.00262 10.1109/ICIP.2017.8296236 10.3115/v1/D14-1179 10.1109/ICME.2017.8019299 10.1109/TCSVT.2017.2727682 10.1109/TPAMI.2015.2439281 10.1109/ICASSP.2017.7952409 10.1007/978-3-319-73600-6_6 10.1007/978-3-319-06895-4 10.1109/VCIP.2017.8305149 10.1109/ICIP.2017.8296284 10.1109/TCSVT.2017.2734838 10.1109/CVPR.2018.00745 10.1109/DCC.2018.00027 10.1007/978-3-319-51811-4_3 10.5594/M001518 10.1109/DCC.2017.42 10.1109/ICIP.2018.8451589 10.1109/VCIP.2017.8305033 10.1109/IVMSPW.2016.7528223 10.1109/VCIP.2017.8305104 10.1109/PCS.2018.8456249 10.1109/ICIP.2018.8451086 10.1109/CVPR.2016.182 10.1109/TIP.2018.2815841 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
DOI | 10.1109/TCSVT.2019.2935508 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Technology Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 1558-2205 |
EndPage | 1887 |
ExternalDocumentID | 10_1109_TCSVT_2019_2935508 8801877 |
Genre | orig-research |
GrantInformation_xml | – fundername: Fundamental Research Fund for the Central Universities of China grantid: 300102249304; 310824173601; 300102248303 funderid: 10.13039/501100012226 – fundername: Google Chrome University Research Program funderid: 10.13039/100006785 – fundername: National Key Research and Development Program of China grantid: 2017YFB1002803 |
IEDL.DBID | RIE |
ISSN | 1051-8215 |
IngestDate | Mon Jun 30 04:31:23 EDT 2025 Thu Apr 24 23:07:31 EDT 2025 Tue Jul 01 00:41:13 EDT 2025 Wed Aug 27 02:02:18 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 7 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
LinkModel | DirectLink |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ORCID | 0000-0002-3345-8259 |
PQID | 2419496076 |
PQPubID | 85433 |
PageCount | 17 |
PublicationCentury | 2000 |
PublicationDate | 2020-07-01 |
PublicationDateYYYYMMDD | 2020-07-01 |
PublicationDate_xml | – month: 07 year: 2020 text: 2020-07-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on circuits and systems for video technology |
PublicationTitleAbbrev | TCSVT |
PublicationYear | 2020 |
Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database Enrichment Source Index Database Publisher |
StartPage | 1871 |
SubjectTerms | Adaptation models; Artificial neural networks; CNN; Codec; Coding; Correlation; Deep learning; Encoding; enhancement; Feature extraction; Filtration; Frames (data processing); in-loop filter; Machine learning; Structural hierarchy; Training; Video coding |
Title | A Switchable Deep Learning Approach for In-Loop Filtering in Video Coding |
URI | https://ieeexplore.ieee.org/document/8801877 https://www.proquest.com/docview/2419496076 |
Volume | 30 |
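The abstract describes two codec-side decisions: which trained model to use for a given rate and frame type, and whether a region takes the CNN-filtered or the conventionally filtered reconstruction. The following is a minimal sketch of both, not the authors' implementation. The abstract gives no decision rules, so the QP threshold, the model names, the 1-bit-per-block flag, and the MSE criterion are all illustrative assumptions.

```python
from typing import List, Tuple

Block = List[List[float]]  # one 2-D block of reconstructed samples


def mse(a: Block, b: Block) -> float:
    """Mean squared error between two equally sized blocks."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def select_model(qp: int, frame_type: str, qp_threshold: int = 32) -> str:
    """Pick a model name following the hierarchical strategy in the abstract.

    Assumption: a low QP stands for "high-rate" video. The threshold value
    and the returned names are hypothetical.
    """
    if qp < qp_threshold:
        # High rate, small artifacts: one global FEX-only model for all frames.
        return "global_fex"
    # Low rate, large artifacts: a FEX+FEN model per frame type.
    return f"fex_fen_{frame_type}"


def switch_blocks(
    original: List[Block], conventional: List[Block], cnn: List[Block]
) -> Tuple[List[int], List[Block]]:
    """Encoder-side switch between the conventional and CNN outputs per block.

    Assumption: the encoder keeps the CNN output only when its MSE against
    the original is strictly lower, and signals the choice with one flag
    per block (1 = CNN, 0 = conventional).
    """
    flags: List[int] = []
    output: List[Block] = []
    for org, conv, net in zip(original, conventional, cnn):
        use_cnn = mse(org, net) < mse(org, conv)
        flags.append(1 if use_cnn else 0)
        output.append(net if use_cnn else conv)
    return flags, output
```

A decoder in this sketch would read the flags and run the CNN only on blocks flagged 1, which is one way the selective, region-level application mentioned in the abstract could be signalled.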