Adversarial recovery of agent rewards from latent spaces of the limit order book

Bibliographic Details
Published in IDEAS Working Paper Series from RePEc
Main Authors Roa-Vicens, Jacobo; Wang, Yuanbo; Mison, Virgile; Gal, Yarin; Silva, Ricardo
Format Paper
Language English
Published St. Louis: Federal Reserve Bank of St. Louis, 01.01.2019
Abstract Inverse reinforcement learning has proved its ability to explain state-action trajectories of expert agents by recovering their underlying reward functions in increasingly challenging environments. Recent advances in adversarial learning have allowed extending inverse RL to applications with non-stationary environment dynamics unknown to the agents, arbitrary structures of reward functions, and improved handling of the ambiguities inherent to the ill-posed nature of inverse RL. This is particularly relevant in real-time applications in stochastic environments involving risk, like volatile financial markets. Moreover, recent work on simulation of complex environments enables learning algorithms to engage with real market data through simulations of its latent space representations, avoiding a costly exploration of the original environment. In this paper, we explore whether adversarial inverse RL algorithms can be adapted and trained within such latent space simulations from real market data, while maintaining their ability to recover agent rewards robust to variations in the underlying dynamics, and transfer them to new regimes of the original environment.
Copyright 2019. Notwithstanding the ProQuest Terms and conditions, you may use this content in accordance with the associated terms available at https://research.stlouisfed.org/research_terms.html .
Subjects Algorithms
URI https://www.proquest.com/docview/2588407832