Ensembles of Neural Networks for Robust Reinforcement Learning

Bibliographic Details
Published in	2010 International Conference on Machine Learning and Applications pp. 401 - 406
Main Authors	Hans, A; Udluft, S
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2010
Subjects	Approximation algorithms; Artificial neural networks; ensemble methods; Function approximation; Network topology; neural fitted Q-iteration; neural networks; Neurons; reinforcement learning; robustness; Training
ISBN	1424492114 9781424492114
DOI	10.1109/ICMLA.2010.66

Abstract	Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, training them and validating the final policies can be cumbersome, as neural networks can suffer from problems such as local minima or overfitting. With iterative methods such as neural fitted Q-iteration, the problem becomes even more pronounced: the network has to be trained multiple times, and the training process in each iteration builds on the network trained in the previous iteration, so errors can accumulate. In this paper we propose using ensembles of networks to make the learning process more robust and to produce near-optimal policies more reliably. We name various ways of combining single networks into an ensemble that yields a final ensemble policy, and we show the potential of the approach using a benchmark application. Our experiments indicate that majority voting is superior to Q-averaging and that using heterogeneous ensembles (different network topologies) is advisable.
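
The two combination rules compared in the abstract, Q-averaging and majority voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule (lowest action index), and the Q-values in `example_q` are all assumptions made up for illustration. Each ensemble member is represented only by the list of Q-values it assigns to the available actions in one state.

```python
from collections import Counter


def q_averaging_action(q_values):
    """Pick the action with the highest mean Q-value across the ensemble.

    q_values[i][a] is the Q-value that ensemble member i assigns to action a.
    """
    n_members = len(q_values)
    n_actions = len(q_values[0])
    mean_q = [sum(member[a] for member in q_values) / n_members
              for a in range(n_actions)]
    # max() returns the first maximum, so ties go to the lowest action index.
    return max(range(n_actions), key=lambda a: mean_q[a])


def majority_voting_action(q_values):
    """Let each member vote for its own greedy action; return the most-voted one."""
    votes = [max(range(len(member)), key=lambda a: member[a])
             for member in q_values]
    counts = Counter(votes)
    top = max(counts.values())
    # Assumed tie-break: the lowest action index among the most-voted actions.
    return min(action for action, count in counts.items() if count == top)


# Hypothetical Q-values for one state, three networks, two actions.
# The third network is an outlier that strongly overestimates action 0.
example_q = [
    [1.0, 1.2],
    [0.9, 1.1],
    [9.0, 0.5],
]

print("Q-averaging picks action", q_averaging_action(example_q))
print("Majority voting picks action", majority_voting_action(example_q))
```

In this made-up example a single network with an inflated estimate is enough to decide the averaged policy, while the vote follows the two networks that agree. This is only one plausible intuition for why voting could be more robust; the record itself does not state the authors' explanation.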
Author	Hans, A (alexander.hans.ext@siemens.com), Neuroinformatics & Cognitive Robot. Lab., Ilmenau Univ. of Technol., Ilmenau, Germany
Author	Udluft, S (steffen.udluft@siemens.com), Corp. Technol., Intell. Syst. & Control, Siemens AG, Munich, Germany
URI	https://ieeexplore.ieee.org/document/5708863