Random-TD Function Approximator

Bibliographic Details
Published in	Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 13, no. 2, pp. 155-161
Main Author	Osman, Hassab Elgawi
Format	Journal Article
Language	English
Published	2009-03-01

Abstract	In this paper, an adaptive controller architecture based on a combination of temporal-difference (TD) learning and an on-line variant of the Random Forest (RF) classifier is proposed. We call this implementation Random-TD. The approach iteratively improves its control strategies by exploiting only relevant parts of action and is able to learn completely in on-line mode. Such a capability for on-line adaptation would take us closer to the goal of more robust and adaptable control. To illustrate this and to demonstrate the applicability of the approach, it has been applied to a non-linear, non-stationary control task, Cart-Pole balancing, and to high-dimensional control problems (Ailerons, Elevator, Kinematics, and Friedman). The results demonstrate that our hybrid approach is adaptable and can significantly improve the performance of TD methods while speeding up the learning process.
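
The abstract only summarizes the method, so the following is a minimal sketch of the general idea rather than the paper's Random-TD algorithm. It assumes a toy five-state random walk in place of Cart-Pole, and it swaps in an ensemble of fixed random partitions as a crude stand-in for an on-line Random Forest. All names (`RandomPartitionTree`, `EnsembleValueFunction`, `train_random_walk`) and all parameter values are hypothetical. The point it illustrates is that a TD(0) error can update an averaged tree-like ensemble one transition at a time, with no batch retraining.

```python
import random


class RandomPartitionTree:
    """One random partition of [0, 1); a simplified stand-in for an on-line tree."""

    def __init__(self, rng, n_splits=3):
        # Random split points define n_splits + 1 leaf cells.
        self.thresholds = sorted(rng.random() for _ in range(n_splits))
        self.leaf_values = [0.0] * (n_splits + 1)

    def leaf(self, x):
        # Leaf index is the number of thresholds at or below x.
        return sum(1 for t in self.thresholds if t <= x)


class EnsembleValueFunction:
    """Value estimate = mean of the leaf values selected in each tree."""

    def __init__(self, n_trees=10, seed=0):
        rng = random.Random(seed)
        self.trees = [RandomPartitionTree(rng) for _ in range(n_trees)]

    def predict(self, x):
        total = sum(t.leaf_values[t.leaf(x)] for t in self.trees)
        return total / len(self.trees)

    def update(self, x, td_error, alpha):
        # Move every active leaf along the TD error (on-line, one sample).
        for t in self.trees:
            t.leaf_values[t.leaf(x)] += alpha * td_error


def train_random_walk(episodes=3000, n_states=5, alpha=0.05, gamma=1.0, seed=0):
    """TD(0) on a random walk: reward 1 on exiting right, 0 on exiting left."""
    rng = random.Random(seed)
    vf = EnsembleValueFunction(seed=seed)

    def feature(s):
        # Map a discrete state to a point in (0, 1).
        return (s + 0.5) / n_states

    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= n_states
            reward = 1.0 if s_next >= n_states else 0.0
            v_next = 0.0 if done else vf.predict(feature(s_next))
            td_error = reward + gamma * v_next - vf.predict(feature(s))
            vf.update(feature(s), td_error, alpha)
            if done:
                break
            s = s_next
    return vf, feature


if __name__ == "__main__":
    vf, feature = train_random_walk()
    for s in range(5):
        print(s, round(vf.predict(feature(s)), 3))
```

In this sketch the partitions are fixed at construction and only the leaf values adapt. An on-line forest as described in the abstract would presumably also grow or revise its trees as data arrives, which is omitted here.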
CorporateAuthor	Image Science and Engineering Lab, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan
DOI	10.20965/jaciii.2009.p0155
Discipline	Computer Science
EISSN	1883-8014
ISSN	1343-0130
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
OpenAccessLink	https://doi.org/10.20965/jaciii.2009.p0155