Adaptive Learning in Tracking Control Based on the Dual Critic Network Design

Bibliographic Details
Published in	IEEE Transactions on Neural Networks and Learning Systems Vol. 24, no. 6, pp. 913-928
Main Authors	Ni, Zhen; He, Haibo; Wen, Jinyu
Format	Journal Article
Language	English
Published	New York, NY: Institute of Electrical and Electronics Engineers (IEEE), 01.06.2013
Subjects	Adaptive critic design (ACD) | Adaptive dynamic programming (ADP) | Adaptive systems | Applied sciences | Artificial intelligence | Computer science; control theory; systems | Computer simulation | Computer systems and distributed systems. User interface | Connectionism. Neural networks | Design engineering | Dynamic programming | Dynamical systems | Erbium | Exact sciences and technology | Internal goal | Learning | Lyapunov methods | Lyapunov stability analysis | Networks | Nickel | Online learning | Optimization | Reinforcement | Reinforcement learning | Representations | Simulation | Software | Studies | Tracking control | Vectors | Virtual reality | Cloud computing | Network structure | Tracking | Lyapunov method | Adaptive control | Adaptive method | User interface | User assistance | Generic programming | Stability | Neural network | Distributed system | Real time | Value function | Heuristic method | Lyapunov function

More Information
Summary:	In this paper, we present a new adaptive dynamic programming approach by integrating a reference network that provides an internal goal representation to help the system's learning and optimization. Specifically, we build the reference network on top of the critic network to form a dual critic network design that contains the detailed internal goal representation to help approximate the value function. This internal goal signal, working as the reinforcement signal for the critic network in our design, is adaptively generated by the reference network and can also be adjusted automatically. In this way, we provide an alternative to crafting the reinforcement signal manually from prior knowledge. We adopt the online action-dependent heuristic dynamic programming (ADHDP) design and provide the detailed design of the dual critic network structure. A detailed Lyapunov stability analysis of the proposed approach is presented to support the structure from a theoretical point of view. Furthermore, we develop a virtual reality platform to demonstrate the real-time simulation of our approach under different disturbance situations. The overall adaptive learning performance has been tested on two tracking control benchmarks with a tracking filter. For comparative studies, we also present the tracking performance of the typical ADHDP, and the simulation results demonstrate the improved performance of our approach.
ISSN:	2162-237X, 2162-2388
DOI:	10.1109/TNNLS.2013.2247627