Efficiently Combining Human Demonstrations and Interventions for Safe Training of Autonomous Systems in Real-Time

Bibliographic Details
Published in	arXiv.org
Main Authors	Goecks, Vinicius G; Gremillion, Gregory M; Lawhern, Vernon J; Valasek, John; Waytowich, Nicholas R
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 28 November 2018
Subjects	Human behavior; Imagery; Learning; Military technology; Pitch (inclination); Real time; Rolling motion; Throttles; Training; Yaw
Abstract	This paper investigates how to use different forms of human interaction to safely train autonomous systems in real time by learning from both human demonstrations and interventions. We implement two components of the Cycle-of-Learning for Autonomous Systems, our framework for combining multiple modalities of human interaction. The current effort first uses human demonstrations to teach a desired behavior via imitation learning, then uses intervention data to correct undesired behaviors produced by the imitation learner. This allows novel tasks to be taught to an autonomous agent safely, after only minutes of training. We demonstrate the method in an autonomous perching task using a quadrotor with continuous roll, pitch, yaw, and throttle commands and imagery captured from a downward-facing camera in a high-fidelity simulated environment. For the same amount of human interaction, our method improves task completion performance compared to learning from demonstrations alone, while also requiring on average 32% less data to achieve that performance. This provides evidence that combining multiple modes of human interaction can increase both the training speed and overall performance of policies for autonomous systems.
Copyright	2018. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline	Physics
EISSN	2331-8422
Genre	Working Paper/Pre-Print
IsOpenAccess	true
IsPeerReviewed	false
IsScholarly	false
URI	https://www.proquest.com/docview/2127079117