Theme-aware Visual Attribute Reasoning for Image Aesthetics Assessment

Bibliographic Details
Published in IEEE Transactions on Circuits and Systems for Video Technology Vol. 33; no. 9; p. 1
Main Authors Li, Leida; Huang, Yipo; Wu, Jinjian; Yang, Yuzhe; Li, Yaqian; Guo, Yandong; Shi, Guangming
Format Journal Article
Language English
Published New York IEEE 01.09.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1051-8215
EISSN 1558-2205
DOI 10.1109/TCSVT.2023.3249185

Abstract People usually assess image aesthetics according to visual attributes, such as interesting content, good lighting, and vivid color. Further, the perception of visual attributes depends on the image theme. Therefore, the inherent relationship between visual attributes and image theme is crucial for image aesthetics assessment (IAA), yet it has not been comprehensively investigated. With this motivation, this paper presents a new IAA model based on Theme-Aware Visual Attribute Reasoning (TAVAR). The underlying idea is to simulate the process of human perception of image aesthetics by performing bilevel reasoning. Specifically, a visual attribute analysis network and a theme understanding network are first pre-trained to extract aesthetic attribute features and theme features, respectively. Then, the first-level Attribute-Theme Graph (ATG) is built to investigate the coupling relationship between visual attributes and image theme. Further, a flexible aesthetics network is introduced to extract general aesthetic features, based on which the second-level Attribute-Aesthetics Graph (AAG) is built to mine the relationship between theme-aware visual attributes and aesthetic features, producing the final aesthetic prediction. Extensive experiments on four public IAA databases demonstrate the superiority of the proposed TAVAR model over state-of-the-art methods. Furthermore, TAVAR offers better explainability due to its use of visual attributes.
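The bilevel reasoning described in the abstract can be illustrated with a minimal sketch. This is not the authors' TAVAR implementation: it assumes generic graph-convolution-style message passing over fully connected graphs, and the feature sizes, random weights, and function names (`graph_layer`, `bilevel_reasoning`) are hypothetical, chosen only to show how attribute nodes might first be conditioned on a theme node and then combined with a general aesthetic node.

```python
import numpy as np

def normalized_adjacency(n):
    """Fully connected graph with self-loops, row-normalized (an assumption)."""
    a = np.ones((n, n))
    return a / a.sum(axis=1, keepdims=True)

def graph_layer(nodes, weight):
    """One message-passing step: average over neighbours, project, apply ReLU."""
    adj = normalized_adjacency(nodes.shape[0])
    return np.maximum(adj @ nodes @ weight, 0.0)

def bilevel_reasoning(attr_feats, theme_feat, aes_feat, w1, w2, w_out):
    # Level 1 (attribute-theme graph): attribute nodes plus one theme node.
    atg_nodes = np.vstack([attr_feats, theme_feat[None, :]])
    theme_aware = graph_layer(atg_nodes, w1)[:-1]  # keep the attribute nodes only
    # Level 2 (attribute-aesthetics graph): theme-aware attributes plus one aesthetic node.
    aag_nodes = np.vstack([theme_aware, aes_feat[None, :]])
    fused = graph_layer(aag_nodes, w2)
    # Pool the node features and read out a single scalar score.
    return float(fused.mean(axis=0) @ w_out)

rng = np.random.default_rng(0)
num_attrs, dim = 5, 8  # hypothetical sizes, not taken from the paper
score = bilevel_reasoning(
    rng.normal(size=(num_attrs, dim)),  # stand-in attribute features
    rng.normal(size=dim),               # stand-in theme feature
    rng.normal(size=dim),               # stand-in general aesthetic feature
    rng.normal(size=(dim, dim)),
    rng.normal(size=(dim, dim)),
    rng.normal(size=dim),
)
print(score)
```

In a real model the three feature inputs would come from the pre-trained attribute, theme, and aesthetics networks, and the graph structure and weights would be learned rather than fixed or random.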
Authors
1. Li, Leida (ORCID 0000-0001-9069-8796), School of Artificial Intelligence, Xidian University, Xi'an, China
2. Huang, Yipo (ORCID 0000-0003-0908-2180), School of Artificial Intelligence, Xidian University, Xi'an, China
3. Wu, Jinjian (ORCID 0000-0001-7501-0009), School of Artificial Intelligence, Xidian University, Xi'an, China
4. Yang, Yuzhe (ORCID 0000-0001-9098-2105), Intelligent Perception and Interaction Research Department, OPPO Research Institute, Shanghai, China
5. Li, Yaqian (ORCID 0000-0003-3582-9997), Intelligent Perception and Interaction Research Department, OPPO Research Institute, Shanghai, China
6. Guo, Yandong, Intelligent Perception and Interaction Research Department, OPPO Research Institute, Shanghai, China
7. Shi, Guangming (ORCID 0000-0003-2179-3292), School of Artificial Intelligence, Xidian University, Xi'an, China
CODEN ITCTEM
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Funding National Natural Science Foundation of China (grant IDs 61771473; 61991451; 62171340; funder ID 10.13039/501100001809)
Peer Reviewed Yes
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
Subject Terms Aesthetics
bilevel reasoning
Cognition
Data mining
Feature extraction
Image aesthetics assessment
Image color analysis
image theme
Lighting
Perception
Predictive models
Reasoning
visual attribute
Visualization
Title Theme-aware Visual Attribute Reasoning for Image Aesthetics Assessment
URI https://ieeexplore.ieee.org/document/10054147
https://www.proquest.com/docview/2861467978