Self reward design with fine-grained interpretability

Bibliographic Details
Published in Scientific Reports, Vol. 13, no. 1, Article 1638 (10 pages)
Main Authors Tjoa, Erico; Guan, Cuntai
Format Journal Article
Language English
Published London: Nature Publishing Group UK, 30.01.2023

Abstract The black-box nature of deep neural networks (DNNs) has brought attention to the issues of transparency and fairness. Deep Reinforcement Learning (Deep RL or DRL), which uses DNNs to learn its policy, value functions, and other components, is thus subject to similar concerns. This paper proposes a way to circumvent these issues through the bottom-up design of neural networks with detailed interpretability, where each neuron or layer has its own meaning and utility corresponding to a humanly understandable concept. The framework introduced in this paper, called Self Reward Design (SRD), is inspired by Inverse Reward Design, and this interpretable design can (1) solve the problem by pure design (although imperfectly) and (2) be optimized like a standard DNN. With deliberate human designs, we show that some RL problems, such as lavaland and MuJoCo, can be solved using a model constructed from standard NN components with few parameters. Furthermore, with our fish sale auction example, we demonstrate how SRD can address situations in which black-box models would not make sense, i.e., where humanly understandable, semantics-based decisions are required.
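The core idea in the abstract — a network built bottom-up so that every neuron carries a named, human-readable meaning, yet remains a standard NN that could also be optimized — can be illustrated with a minimal sketch. This is not the authors' actual architecture; the lavaland-style task, the concept names, and all weights below are illustrative assumptions only.

```python
import numpy as np

# Illustrative sketch (NOT the paper's model): a tiny interpretable-by-design
# policy for a 1D lavaland-style task. Every neuron is hand-assigned a
# human-readable concept, and the weights are set by design rather than
# learned, mirroring the SRD claim that a problem can be (imperfectly)
# solved by pure design while the model stays a standard NN.

# One-hot tile encoding: index 0 = grass, 1 = lava, 2 = goal (assumed).
TILE = {"grass": 0, "lava": 1, "goal": 2}

def one_hot(tile_name):
    v = np.zeros(3)
    v[TILE[tile_name]] = 1.0
    return v

# "Concept layer": each row is a named detector neuron.
W_concept = np.array([
    [0.0, 1.0, 0.0],   # neuron 0: "lava ahead" detector
    [0.0, 0.0, 1.0],   # neuron 1: "goal ahead" detector
])

# "Decision neuron": -1 on the lava concept, +1 on the goal concept,
# bias +0.5 so the agent also advances over plain grass.
w_decision = np.array([-1.0, 1.0])
b_decision = 0.5

def act(tile_ahead):
    """Return ('forward' | 'stay', self_reward) for the tile in front."""
    concepts = W_concept @ one_hot(tile_ahead)      # interpretable activations
    score = w_decision @ concepts + b_decision      # decision pre-activation
    action = "forward" if score > 0 else "stay"
    # Self-reward by design: the concept activations themselves define
    # the reward signal, with no external reward function.
    reward = float(concepts[1]) - float(concepts[0])
    return action, reward
```

Because every weight has a stated meaning, the policy's behavior can be read directly off the weights (e.g., it refuses to step onto lava because the lava-detector neuron outweighs the forward bias), which is the kind of fine-grained interpretability the abstract describes.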
ArticleNumber 1638
Author Tjoa, Erico (Nanyang Technological University; Alibaba Group; ericotjo001@e.ntu.edu.sg)
Guan, Cuntai (Nanyang Technological University)
ContentType Journal Article
Copyright The Author(s) 2023. This work is published under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
DOI 10.1038/s41598-023-28804-9
Discipline Biology
EISSN 2045-2322
PMID 36717641
PMCID PMC9886969
Genre Journal Article
References Shu, T., Xiong, C. & Socher, R. Hierarchical and interpretable skill acquisition in multi-task reinforcement learning. arXiv preprint arXiv:1712.07294 (2017).
Puiutta, E. & Veith, E. M. S. P. Explainable reinforcement learning: A survey. In Holzinger, A., Kieseberg, P., Tjoa, A. M. & Weippl, E. (eds.) Machine Learning and Knowledge Extraction, 77–95 (Springer International Publishing, Cham, 2020).
Verma, A., Murali, V., Singh, R., Kohli, P. & Chaudhuri, S. Programmatically interpretable reinforcement learning. In International Conference on Machine Learning, 5045–5054 (PMLR, 2018).
Juozapaitis, Z., Koul, A., Fern, A., Erwig, M. & Doshi-Velez, F. Explainable reinforcement learning via reward decomposition. In Proceedings at the International Joint Conference on Artificial Intelligence. A Workshop on Explainable Artificial Intelligence. (2019).
Clark, J. & Amodei, D. Faulty reward functions in the wild. Internet: https://blog.openai.com/faulty-reward-functions (2016).
Zambaldi, V. et al. Deep reinforcement learning with relational inductive biases. In International Conference on Learning Representations (2019).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
Hadfield-Menell, D., Milli, S., Abbeel, P., Russell, S. & Dragan, A. D. Inverse reward design. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, 6768–6777 (Curran Associates Inc., Red Hook, NY, USA, 2017).
Russell, S. J. Artificial Intelligence a Modern Approach (Pearson Education, Inc., 2010).
Miller, E. K., Freedman, D. J. & Wallis, J. D. The prefrontal cortex: Categories, concepts and cognition. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 357, 1123–1136 (2002). https://doi.org/10.1098/rstb.2002.1099
Chen, X. & He, K. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 15750–15758 (2021).
Henderson, P. et al. Deep reinforcement learning that matters. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018).
Hafner, D., Lillicrap, T., Ba, J. & Norouzi, M. Dream to control: Learning behaviors by latent imagination. In International Conference on Learning Representations (2020).
Kalweit, G. & Boedecker, J. Uncertainty-driven imagination for continuous deep reinforcement learning. In Levine, S., Vanhoucke, V. & Goldberg, K. (eds.) Proceedings of the 1st Annual Conference on Robot Learning, vol. 78 of Proceedings of Machine Learning Research, 195–206 (PMLR, 2017).
Chen, X., Fan, H., Girshick, R. & He, K. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 (2020).
Gilpin, L. H. et al. Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), 80–89 (2018).
Tjoa, E. & Guan, C. A survey on explainable artificial intelligence (XAI): Toward medical XAI. IEEE Transactions on Neural Networks and Learning Systems 1–21, https://doi.org/10.1109/TNNLS.2020.3027314 (2020).
Singh, S., Lewis, R. L. & Barto, A. G. Where do rewards come from. In Proceedings of the Annual Conference of the Cognitive Science Society, 2601–2606 (Cognitive Science Society, 2009).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, 1597–1607 (PMLR, 2020).
Racanière, S. et al. Imagination-augmented agents for deep reinforcement learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, 5694–5705 (Curran Associates Inc., Red Hook, NY, USA, 2017).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT press, 2018).
Arrieta, A. B. et al. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020). https://doi.org/10.1016/j.inffus.2019.12.012
Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
Greydanus, S., Koul, A., Dodge, J. & Fern, A. Visualizing and understanding Atari agents. In Dy, J. & Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, vol. 80 of Proceedings of Machine Learning Research, 1792–1801 (PMLR, 2018).
Oh, J., Singh, S. & Lee, H. Value prediction network. In Guyon, I. et al. (eds.) Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., 2017).
Todorov, E., Erez, T. & Tassa, Y. MuJoCo: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 5026–5033, https://doi.org/10.1109/IROS.2012.6386109 (2012).
Heuillet, A., Couthouis, F. & Díaz-Rodríguez, N. Explainability in deep reinforcement learning. Knowledge-Based Syst. 214 (2021). https://doi.org/10.1016/j.knosys.2020.106685
Kahn, G., Villaflor, A., Ding, B., Abbeel, P. & Levine, S. Self-supervised deep reinforcement learning with generalized computation graphs for robot navigation. In 2018 IEEE International Conference on Robotics and Automation (ICRA), 5129–5136, https://doi.org/10.1109/ICRA.2018.8460655 (2018).
SubjectTerms Design
Humanities and Social Sciences
Neural networks
Reinforcement
Science (multidisciplinary)
Title Self reward design with fine-grained interpretability
URI https://link.springer.com/article/10.1038/s41598-023-28804-9
https://www.ncbi.nlm.nih.gov/pubmed/36717641
https://www.proquest.com/docview/2770826615
https://www.proquest.com/docview/2771333822
https://pubmed.ncbi.nlm.nih.gov/PMC9886969
https://doaj.org/article/e173bc1786914291846045a6f6c89c76
Volume 13