Toward Human-in-the-Loop AI: Enhancing Deep Reinforcement Learning via Real-Time Human Guidance for Autonomous Driving

Bibliographic Details
Published in: Engineering (Beijing, China), Vol. 21, pp. 75–91
Main Authors: Wu, Jingda; Huang, Zhiyu; Hu, Zhongxu; Lv, Chen (lyuchen@ntu.edu.sg)
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 February 2023
ISSN: 2095-8099
DOI: 10.1016/j.eng.2022.05.017
Copyright: 2022 The Author
License: Open access article under the CC BY-NC-ND license
Subjects: Human-in-the-loop AI; Deep reinforcement learning; Human guidance; Autonomous driving
Online Access: https://doaj.org/article/e53531b06dac4965b72fd08c185600ef

Abstract
Due to its limited intelligence and abilities, machine learning currently cannot handle many situations and thus cannot completely replace humans in real-world applications. Because humans exhibit robustness and adaptability in complex scenarios, it is crucial to introduce humans into the training loop of artificial intelligence (AI), leveraging human intelligence to further advance machine learning algorithms. In this study, a real-time human-guidance-based deep reinforcement learning method (Hug-DRL) is developed for policy training in an end-to-end autonomous driving case. With a newly designed mechanism for control transfer between the human and the automation, the human can intervene and correct the agent’s unreasonable actions in real time when necessary during model training. Based on this human-in-the-loop guidance mechanism, an improved actor-critic architecture with modified policy and value networks is developed. The fast convergence of the proposed Hug-DRL allows real-time human guidance to be fused into the agent’s training loop, further improving the efficiency and performance of DRL. The developed method is validated in human-in-the-loop experiments with 40 subjects and compared with other state-of-the-art learning approaches. The results suggest that the proposed method can effectively enhance the training efficiency and performance of the DRL algorithm under human guidance, without imposing specific requirements on participants’ expertise or experience.
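The mechanism the abstract describes — an off-policy actor-critic agent acting in the environment, with a human able to take over control so that the overriding action is fused into the training loop — can be illustrated roughly as follows. This is a minimal sketch under assumed interfaces (agent, env, human, and replay_buffer are all hypothetical names), not the paper’s actual architecture or update rule:

```python
# Minimal sketch of one human-in-the-loop (Hug-DRL-style) training step.
# All interfaces here (agent, env, human, replay_buffer) are hypothetical
# stand-ins, not the paper's actual implementation.

def human_guided_step(agent, env, human, replay_buffer, state):
    """Run one environment step, letting a human override the agent."""
    agent_action = agent.act(state)

    # Control transfer: a human intervention (e.g., the participant
    # turning the steering wheel) overrides the agent's proposed action.
    human_action = human.poll()              # returns None if no input
    intervened = human_action is not None
    action = human_action if intervened else agent_action

    next_state, reward, done, info = env.step(action)

    # Record the *executed* action together with an intervention flag,
    # so the update can treat human-guided transitions specially
    # (e.g., an extra supervised loss pulling the policy toward the
    # human's action, on top of the usual actor-critic objectives).
    replay_buffer.add(state, action, reward, next_state, done, intervened)
    agent.update(replay_buffer)

    return next_state, done
```

The design point the abstract emphasizes is that the executed (possibly human-chosen) action, not the agent’s proposal, is what enters the training data, which is what lets real-time human corrections reshape the learned policy.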