Random-TD Function Approximator

Bibliographic Details
Published in Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 13, no. 2, pp. 155-161
Main Author Osman, Hassab Elgawi
Format Journal Article
Language English
Published 01.03.2009

Abstract In this paper, an adaptive controller architecture based on a combination of temporal-difference (TD) learning and an on-line variant of the Random Forest (RF) classifier is proposed. We call this implementation Random-TD. The approach iteratively improves its control strategies by exploiting only the relevant parts of the action and is able to learn completely in on-line mode. Such a capability for on-line adaptation would take us closer to the goal of more robust and adaptable control. To illustrate this and to demonstrate the applicability of the approach, it has been applied to a non-linear, non-stationary control task, Cart-Pole balancing, and to high-dimensional control problems (Ailerons, Elevator, Kinematics, and Friedman). The results demonstrate that our hybrid approach is adaptable and can significantly improve the performance of TD methods while speeding up the learning process.
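
The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of pairing a TD(0) update with an ensemble function approximator that learns on-line. It is not the paper's Random-TD algorithm: the class `RandomSubspaceEnsemble`, the function `td0_step`, and all parameter values are hypothetical, and an average of linear models on random feature subsets stands in for the on-line Random Forest.

```python
import random


class RandomSubspaceEnsemble:
    """Average of linear models, each seeing a random subset of the features.

    Assumption: a simple stand-in for an on-line forest, NOT the paper's learner.
    """

    def __init__(self, n_features, n_members=5, subset_size=3, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # Each member gets its own random feature subset (random subspace idea).
        self.subsets = [rng.sample(range(n_features), subset_size)
                        for _ in range(n_members)]
        self.weights = [[0.0] * subset_size for _ in range(n_members)]

    def predict(self, x):
        # Ensemble value estimate: mean of the members' linear predictions.
        total = 0.0
        for idx, w in zip(self.subsets, self.weights):
            total += sum(wk * x[i] for wk, i in zip(w, idx))
        return total / len(self.subsets)

    def update(self, x, td_error):
        # Semi-gradient step: nudge every member along the shared TD error.
        for idx, w in zip(self.subsets, self.weights):
            for k, i in enumerate(idx):
                w[k] += self.lr * td_error * x[i]


def td0_step(model, x, reward, x_next, gamma=0.99, terminal=False):
    """One on-line TD(0) update; returns the TD error that was applied."""
    target = reward if terminal else reward + gamma * model.predict(x_next)
    td_error = target - model.predict(x)
    model.update(x, td_error)
    return td_error


if __name__ == "__main__":
    model = RandomSubspaceEnsemble(n_features=4)
    state = [1.0, 1.0, 1.0, 1.0]
    for step in range(3):
        err = td0_step(model, state, reward=1.0, x_next=state, terminal=True)
        print(step, round(err, 4), round(model.predict(state), 4))
```

Each call to `td0_step` consumes a single transition and discards it, which is what learning "completely in on-line mode" amounts to in this sketch; a tree-based learner would replace only the `predict` and `update` methods.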
Corporate Author Image Science and Engineering Lab, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan
DOI 10.20965/jaciii.2009.p0155
Discipline Computer Science
EISSN 1883-8014
ISSN 1343-0130