Random-TD Function Approximator
In this paper, an adaptive controller architecture based on a combination of temporal-difference (TD) learning and an on-line variant of the Random Forest (RF) classifier is proposed. We call this implementation Random-TD. The approach iteratively improves its control strategies by exploiting only relevant...
Published in | Journal of advanced computational intelligence and intelligent informatics Vol. 13; no. 2; pp. 155 - 161 |
---|---|
Main Author | Osman, Hassab Elgawi |
Format | Journal Article |
Language | English |
Published | 01.03.2009 |
Abstract | In this paper, an adaptive controller architecture based on a combination of temporal-difference (TD) learning and an on-line variant of the Random Forest (RF) classifier is proposed. We call this implementation Random-TD. The approach iteratively improves its control strategies by exploiting only relevant parts of action and is able to learn completely in on-line mode. Such a capability for on-line adaptation would take us closer to the goal of more robust and adaptable control. To illustrate this and to demonstrate the applicability of the approach, it has been applied to a non-linear, non-stationary control task, Cart-Pole balancing, and to high-dimensional control problems (Ailerons, Elevator, Kinematics, and Friedman). The results demonstrate that our hybrid approach is adaptable and can significantly improve the performance of TD methods while speeding up the learning process. |
---|---|
Author | Osman, Hassab Elgawi |
Author_xml | – sequence: 1 givenname: Hassab Elgawi surname: Osman fullname: Osman, Hassab Elgawi |
CitedBy_id | crossref_primary_10_1016_j_neucom_2016_08_155 |
Cites_doi | 10.1023/A:1010933404324 10.1109/TNN.1998.712192 10.1109/CVPRW.2008.4563065 10.1145/1143844.1143901 10.1177/105971230501300301 10.1016/j.neucom.2007.11.026 10.1007/BF00115009 10.1214/aos/1176347963 10.1109/TSMC.1983.6313077 |
ContentType | Journal Article |
CorporateAuthor | Image Science and Engineering Lab, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan |
CorporateAuthor_xml | – name: Image Science and Engineering Lab, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan |
DBID | AAYXX CITATION |
DOI | 10.20965/jaciii.2009.p0155 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISSN | 1883-8014 |
EndPage | 161 |
ExternalDocumentID | 10_20965_jaciii_2009_p0155 |
GroupedDBID | AAYXX ALMA_UNASSIGNED_HOLDINGS CITATION GROUPED_DOAJ P2P |
ID | FETCH-LOGICAL-c335t-cd924d586c53a40e54503d3bb7cb575086f6208d2b6c69ed7421e9fd7df513233 |
ISSN | 1343-0130 |
IngestDate | Tue Jul 01 04:30:39 EDT 2025 Thu Apr 24 22:59:36 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Language | English |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c335t-cd924d586c53a40e54503d3bb7cb575086f6208d2b6c69ed7421e9fd7df513233 |
OpenAccessLink | https://doi.org/10.20965/jaciii.2009.p0155 |
PageCount | 7 |
ParticipantIDs | crossref_primary_10_20965_jaciii_2009_p0155 crossref_citationtrail_10_20965_jaciii_2009_p0155 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2009-03-01 |
PublicationDateYYYYMMDD | 2009-03-01 |
PublicationDate_xml | – month: 03 year: 2009 text: 2009-03-01 day: 01 |
PublicationDecade | 2000 |
PublicationTitle | Journal of advanced computational intelligence and intelligent informatics |
PublicationYear | 2009 |
SSID | ssj0001326041 ssib051641541 |
SourceID | crossref |
SourceType | Enrichment Source Index Database |
StartPage | 155 |
Title | Random-TD Function Approximator |
Volume | 13 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | Directory of Open Access Journals |
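
The abstract describes pairing temporal-difference learning with an ensemble of randomized trees that is updated on-line. The record gives no algorithmic detail, so the following is only a rough sketch of that general idea, not the paper's Random-TD method. It assumes a stand-in approximator (RandomPartitionEnsemble, a hypothetical name) that averages several piecewise-constant "trees" with random split points, and it uses a small five-state random-walk task instead of Cart-Pole. All names, parameters, and the task itself are illustrative assumptions.

```python
import random


class RandomPartitionEnsemble:
    """Hypothetical stand-in for an on-line forest: an average of
    piecewise-constant 'trees', each splitting [0, 1) at random thresholds."""

    def __init__(self, n_trees=10, n_splits=4, seed=0):
        rng = random.Random(seed)
        # Each tree is a sorted list of random thresholds plus one value per cell.
        self.thresholds = [
            sorted(rng.random() for _ in range(n_splits)) for _ in range(n_trees)
        ]
        self.values = [[0.0] * (n_splits + 1) for _ in range(n_trees)]

    def _cell(self, tree, x):
        # Index of the cell containing x: the number of thresholds <= x.
        return sum(1 for t in self.thresholds[tree] if t <= x)

    def predict(self, x):
        n_trees = len(self.values)
        return sum(self.values[i][self._cell(i, x)] for i in range(n_trees)) / n_trees

    def update(self, x, td_error, lr):
        # Shift the active cell of every tree, so the prediction at x
        # moves by lr * td_error.
        for i in range(len(self.values)):
            self.values[i][self._cell(i, x)] += lr * td_error


def td0_random_walk(episodes=2000, lr=0.05, gamma=1.0, seed=0):
    """TD(0) value estimation on a random walk over states 1..5.
    States 0 and 6 are terminal; only reaching state 6 gives reward 1."""
    rng = random.Random(seed)
    approx = RandomPartitionEnsemble(seed=seed)
    last = 6  # right-hand terminal state
    for _ in range(episodes):
        s = 3  # start in the middle of the chain
        while 0 < s < last:
            s_next = s + rng.choice((-1, 1))
            reward = 1.0 if s_next == last else 0.0
            terminal = s_next in (0, last)
            v_next = 0.0 if terminal else approx.predict(s_next / last)
            # TD error: bootstrapped target minus the current estimate.
            td_error = reward + gamma * v_next - approx.predict(s / last)
            approx.update(s / last, td_error, lr)
            s = s_next
    return [approx.predict(s / last) for s in range(1, last)]


if __name__ == "__main__":
    for state, value in enumerate(td0_random_walk(), start=1):
        print(f"V({state}) ~ {value:.2f}")
```

Every update happens one transition at a time, which mirrors the on-line learning mode the abstract emphasizes. A real on-line Random Forest would also grow and adapt its splits from data, which this fixed-partition sketch does not attempt.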