Self reward design with fine-grained interpretability

Bibliographic Details
Published in Scientific Reports, Vol. 13, no. 1, Article 1638 (10 pages)
Main Authors Tjoa, Erico; Guan, Cuntai
Format Journal Article
Language English
Published London: Nature Publishing Group UK, 30.01.2023

Abstract The black-box nature of deep neural networks (DNNs) has brought attention to the issues of transparency and fairness. Deep Reinforcement Learning (Deep RL or DRL), which uses DNNs to learn its policy, value functions, and other components, is thus subject to similar concerns. This paper proposes a way to circumvent these issues through the bottom-up design of neural networks with detailed interpretability, where each neuron or layer has its own meaning and utility corresponding to a humanly understandable concept. The framework introduced in this paper, called Self Reward Design (SRD), is inspired by Inverse Reward Design, and this interpretable design can (1) solve the problem by pure design (although imperfectly) and (2) be optimized like a standard DNN. With deliberate human designs, we show that some RL problems, such as lavaland and MuJoCo, can be solved using a model constructed from standard NN components with few parameters. Furthermore, with our fish sale auction example, we demonstrate how SRD can address situations in which black-box models would not make sense, i.e., where humanly understandable, semantics-based decisions are required.
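The core idea in the abstract — a network built bottom-up so that every neuron carries a named, human-readable meaning, yet remains a standard NN that could also be optimized — can be illustrated with a minimal sketch. This is not the authors' actual architecture; the lavaland-style task, the concept names, and all weights below are illustrative assumptions only.

```python
import numpy as np

# Illustrative sketch (NOT the paper's model): a tiny interpretable-by-design
# policy for a 1D lavaland-style task. Every neuron is hand-assigned a
# human-readable concept, and the weights are set by design rather than
# learned, mirroring the SRD claim that a problem can be (imperfectly)
# solved by pure design while the model stays a standard NN.

# One-hot tile encoding: index 0 = grass, 1 = lava, 2 = goal (assumed).
TILE = {"grass": 0, "lava": 1, "goal": 2}

def one_hot(tile_name):
    v = np.zeros(3)
    v[TILE[tile_name]] = 1.0
    return v

# "Concept layer": each row is a named detector neuron.
W_concept = np.array([
    [0.0, 1.0, 0.0],   # neuron 0: "lava ahead" detector
    [0.0, 0.0, 1.0],   # neuron 1: "goal ahead" detector
])

# "Decision neuron": -1 on the lava concept, +1 on the goal concept,
# bias +0.5 so the agent also advances over plain grass.
w_decision = np.array([-1.0, 1.0])
b_decision = 0.5

def act(tile_ahead):
    """Return ('forward' | 'stay', self_reward) for the tile in front."""
    concepts = W_concept @ one_hot(tile_ahead)      # interpretable activations
    score = w_decision @ concepts + b_decision      # decision pre-activation
    action = "forward" if score > 0 else "stay"
    # Self-reward by design: the concept activations themselves define
    # the reward signal, with no external reward function.
    reward = float(concepts[1]) - float(concepts[0])
    return action, reward
```

Because every weight has a stated meaning, the policy's behavior can be read directly off the weights (e.g., it refuses to step onto lava because the lava-detector neuron outweighs the forward bias), which is the kind of fine-grained interpretability the abstract describes.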
ArticleNumber 1638
Author Tjoa, Erico (Nanyang Technological University; Alibaba Group; ericotjo001@e.ntu.edu.sg)
Guan, Cuntai (Nanyang Technological University)
ContentType Journal Article
Copyright The Author(s) 2023. This work is published under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
DOI 10.1038/s41598-023-28804-9
Discipline Biology
EISSN 2045-2322
PMID 36717641
PMCID PMC9886969
Genre Journal Article
References Shu, T., Xiong, C. & Socher, R. Hierarchical and interpretable skill acquisition in multi-task reinforcement learning. arXiv preprint arXiv:1712.07294 (2017).
Puiutta, E. & Veith, E. M. S. P. Explainable reinforcement learning: A survey. In Holzinger, A., Kieseberg, P., Tjoa, A. M. & Weippl, E. (eds.) Machine Learning and Knowledge Extraction, 77–95 (Springer International Publishing, Cham, 2020).
Verma, A., Murali, V., Singh, R., Kohli, P. & Chaudhuri, S. Programmatically interpretable reinforcement learning. In International Conference on Machine Learning, 5045–5054 (PMLR, 2018).
Juozapaitis, Z., Koul, A., Fern, A., Erwig, M. & Doshi-Velez, F. Explainable reinforcement learning via reward decomposition. In Proceedings at the International Joint Conference on Artificial Intelligence. A Workshop on Explainable Artificial Intelligence. (2019).
Clark, J. & Amodei, D. Faulty reward functions in the wild. Internet: https://blog.openai.com/faulty-reward-functions (2016).
Zambaldi, V. et al. Deep reinforcement learning with relational inductive biases. In International Conference on Learning Representations (2019).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
Hadfield-Menell, D., Milli, S., Abbeel, P., Russell, S. & Dragan, A. D. Inverse reward design. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, 6768–6777 (Curran Associates Inc., Red Hook, NY, USA, 2017).
Russell, S. J. Artificial Intelligence a Modern Approach (Pearson Education, Inc., 2010).
Miller, E. K., Freedman, D. J. & Wallis, J. D. The prefrontal cortex: Categories, concepts and cognition. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 357, 1123–1136 (2002). https://doi.org/10.1098/rstb.2002.1099
Chen, X. & He, K. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 15750–15758 (2021).
Henderson, P. et al. Deep reinforcement learning that matters. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018).
Hafner, D., Lillicrap, T., Ba, J. & Norouzi, M. Dream to control: Learning behaviors by latent imagination. In International Conference on Learning Representations (2020).
Kalweit, G. & Boedecker, J. Uncertainty-driven imagination for continuous deep reinforcement learning. In Levine, S., Vanhoucke, V. & Goldberg, K. (eds.) Proceedings of the 1st Annual Conference on Robot Learning, vol. 78 of Proceedings of Machine Learning Research, 195–206 (PMLR, 2017).
Chen, X., Fan, H., Girshick, R. & He, K. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297 (2020).
Gilpin, L. H. et al. Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), 80–89 (2018).
Tjoa, E. & Guan, C. A survey on explainable artificial intelligence (XAI): Toward medical XAI. IEEE Transactions on Neural Networks and Learning Systems 1–21, https://doi.org/10.1109/TNNLS.2020.3027314 (2020).
Singh, S., Lewis, R. L. & Barto, A. G. Where do rewards come from. In Proceedings of the Annual Conference of the Cognitive Science Society, 2601–2606 (Cognitive Science Society, 2009).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, 1597–1607 (PMLR, 2020).
Racanière, S. et al. Imagination-augmented agents for deep reinforcement learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, 5694–5705 (Curran Associates Inc., Red Hook, NY, USA, 2017).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (MIT press, 2018).
Arrieta, A. B. et al. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020). https://doi.org/10.1016/j.inffus.2019.12.012
Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
Greydanus, S., Koul, A., Dodge, J. & Fern, A. Visualizing and understanding Atari agents. In Dy, J. & Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, vol. 80 of Proceedings of Machine Learning Research, 1792–1801 (PMLR, 2018).
Oh, J., Singh, S. & Lee, H. Value prediction network. In Guyon, I. et al. (eds.) Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., 2017).
Todorov, E., Erez, T. & Tassa, Y. MuJoCo: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 5026–5033, https://doi.org/10.1109/IROS.2012.6386109 (2012).
Heuillet, A., Couthouis, F. & Díaz-Rodríguez, N. Explainability in deep reinforcement learning. Knowledge-Based Syst. 214 (2021). https://doi.org/10.1016/j.knosys.2020.106685
Kahn, G., Villaflor, A., Ding, B., Abbeel, P. & Levine, S. Self-supervised deep reinforcement learning with generalized computation graphs for robot navigation. In 2018 IEEE International Conference on Robotics and Automation (ICRA), 5129–5136, https://doi.org/10.1109/ICRA.2018.8460655 (2018).
SubjectTerms Design
Humanities and Social Sciences
Neural networks
Reinforcement
Science (multidisciplinary)
Title Self reward design with fine-grained interpretability
URI https://link.springer.com/article/10.1038/s41598-023-28804-9
https://www.ncbi.nlm.nih.gov/pubmed/36717641
https://www.proquest.com/docview/2770826615
https://www.proquest.com/docview/2771333822
https://pubmed.ncbi.nlm.nih.gov/PMC9886969
https://doaj.org/article/e173bc1786914291846045a6f6c89c76
Volume 13