A Switchable Deep Learning Approach for In-Loop Filtering in Video Coding

Bibliographic Details
Published in	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 30, no. 7, pp. 1871-1887
Main Authors	Ding, Dandan; Kong, Lingyi; Chen, Guangyao; Liu, Zoe; Fang, Yong
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2020-07-01
Subjects	Adaptation models; Artificial neural networks; CNN; Codec; Coding; Correlation; Deep learning; Encoding; enhancement; Feature extraction; Filtration; Frames (data processing); in-loop filter; Machine learning; Structural hierarchy; Training; Video coding

Abstract	Deep learning offers great potential for in-loop filtering to improve both coding efficiency and subjective quality in video coding. State-of-the-art work focuses on network structure design and employs a single powerful network to handle all coding scenarios. In contrast, this paper proposes a systematic deep-learning-based approach that includes an effective Convolutional Neural Network (CNN) structure, a hierarchical training strategy, and a video-codec-oriented switchable mechanism. First, we propose a novel CNN structure, the Squeeze-and-Excitation Filtering CNN (SEFCNN), as an optional in-loop filter. To capture the non-linear interaction between channels, the SEFCNN comprises two subnets: the Feature EXtracting (FEX) subnet and the Feature ENhancing (FEN) subnet. Then, we develop a hierarchical model training strategy to adapt the two subnets to different coding scenarios. For high-rate videos with small artifacts, we train a single global model using the FEX for all frame types, whereas for low-rate videos with large artifacts, we train different models using both FEX and FEN for different frame types. Finally, we propose an adaptive enhancing mechanism that switches between the CNN-based and the conventional methods, selectively applying the CNN model to some frames or to some regions within a frame. Experimental results show that the proposed scheme outperforms state-of-the-art work in coding efficiency, while the computational complexity is acceptable after GPU acceleration.
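Two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no layer sizes or decision criterion, so the weights `w1` and `w2`, the function names, and the use of plain mean squared error as the switching rule are all assumptions made for illustration. The first function shows generic squeeze-and-excitation channel gating; the second shows a per-region choice between a CNN-filtered and a conventionally filtered reconstruction.

```python
import math


def squeeze_excite(channels, w1, w2):
    """Generic squeeze-and-excitation gating over a list of 2-D channels.

    w1 and w2 are hypothetical fully connected weights (bottleneck, then
    expansion); the paper's actual layer sizes are not given in the abstract.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excitation: FC + ReLU, then FC + sigmoid, giving a gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: every sample in a channel is multiplied by that channel's gate.
    scaled = [[[g * x for x in row] for row in ch] for ch, g in zip(channels, gates)]
    return scaled, gates


def mse(a, b):
    """Mean squared error between two equally sized flat sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def switch_per_region(original, conventional, cnn):
    """Pick, per region, whichever filtered output is closer to the original.

    Assumption: the decision criterion is plain MSE at the encoder, with one
    flag per region signalled to the decoder.
    """
    flags, output = [], []
    for org, conv, net in zip(original, conventional, cnn):
        use_cnn = mse(net, org) < mse(conv, org)  # True means "use the CNN filter"
        flags.append(use_cnn)
        output.append(net if use_cnn else conv)
    return flags, output
```

In this sketch each region is a flat list of samples, and the returned flags stand in for whatever side information a real codec would signal. A production encoder would more likely compare rate-distortion costs than raw MSE; that substitution is left open here.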
Author details	1. Ding, Dandan (dandanding@hznu.edu.cn), School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 2. Kong, Lingyi, School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 3. Chen, Guangyao, School of Information Science and Engineering, Hangzhou Normal University, Hangzhou, China; 4. Liu, Zoe (zoeliu@visionular.com), Visionular Inc., Mountain View, CA, USA; 5. Fang, Yong (fy@chd.edu.cn, ORCID 0000-0002-3345-8259), School of Information Engineering, Chang'an University, Xi'an, China
CODEN	ITCTEM
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI	10.1109/TCSVT.2019.2935508
Discipline	Engineering
EISSN	1558-2205
EndPage	1887
Genre	orig-research
GrantInformation	Fundamental Research Fund for the Central Universities of China (grants 300102249304, 310824173601, 300102248303; funder ID 10.13039/501100012226); Google Chrome University Research Program (funder ID 10.13039/100006785); National Key Research and Development Program of China (grant 2017YFB1002803)
ISSN	1051-8215
IsPeerReviewed	true
IsScholarly	true
Issue	7
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
ORCID	0000-0002-3345-8259
PageCount	17
PublicationDate	2020-07-01
PublicationPlace	New York
PublicationTitle	IEEE transactions on circuits and systems for video technology
PublicationTitleAbbrev	TCSVT
PublicationYear	2020
Publisher	IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage	1871
URI	https://ieeexplore.ieee.org/document/8801877 https://www.proquest.com/docview/2419496076
Volume	30