Bionic autonomous learning control of a two-wheeled self-balancing flexible robot

Bibliographic Details
Published in	Journal of control theory and applications Vol. 9; no. 4; pp. 521-528
Main Authors	Cai, Jianxian; Ruan, Xiaogang
Format	Journal Article
Language	English
Published	Heidelberg: South China University of Technology and Academy of Mathematics and Systems Science, CAS, 01.11.2011
Author Affiliations	Institute of Disaster Prevention, Sanhe, Hebei 065201, China; School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China
Subjects	Autonomous; Balancing; Complexity; Computational Intelligence; Conditioning; Control; Control and Systems Theory; Control systems; Engineering; Entropy; Learning; Mechatronics; Optimization; Posture; Robotics; Robots; Systems Theory; Three-wheeled; Bionic; Operant learning control; Learning system; Balance control; Flexible robot; Probabilistic automaton; Operant conditioning; Bionic autonomous learning; Two-wheeled flexible robot; Posture balance control

More Information
Summary:	This paper presents an operant conditioning probabilistic automaton (OCPA) bionic autonomous learning system, based on Skinner's operant conditioning theory, for solving the balance control problem of a two-wheeled flexible robot. The OCPA learning system works in two stages: in the first stage, an operant action is selected stochastically from a set of operant actions and used as the input of the control system; in the second stage, the learning system gathers the orientation information of the system and uses it for optimization until the control target is achieved. During learning, the size of the operant action set is automatically reduced to avoid low-probability events. A theoretical analysis proves the convergence of the operant conditioning learning mechanism in the OCPA learning system: the operant action entropy converges to a minimum as learning proceeds. The OCPA learning system is then applied to posture balance control of a two-wheeled flexible self-balancing robot. In the initial state, the robot has no posture balancing skill and every operant in the operant set has an equal selection probability. As learning proceeds, the selection probability of the optimal operant gradually tends to one and the operant action entropy gradually tends to its minimum, so the robot gradually learns the posture balancing skill.
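
Illustrative note: the two-stage loop described in the summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' algorithm: the record does not give the OCPA update rule, so a standard linear reward-inaction update is swapped in, and the one-step tilt check `reduces_tilt`, the parameters `rate` and `prune_below`, and the integer torque values are all hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a discrete selection distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reduces_tilt(action, tilt=-1.0):
    """Hypothetical orientation check: does this torque shrink the tilt?"""
    return abs(tilt + action) < abs(tilt)


def learn(actions, reward_fn, steps=2000, rate=0.05, prune_below=0.01, seed=0):
    """Toy operant-conditioning loop (linear reward-inaction, an assumption).

    Returns the surviving actions and their selection probabilities.
    """
    rng = random.Random(seed)
    actions = list(actions)
    # Initial state: no skill, so every operant action is equally likely.
    probs = [1.0 / len(actions)] * len(actions)
    for _ in range(steps):
        # Stage 1: select an operant action stochastically.
        i = rng.choices(range(len(actions)), weights=probs)[0]
        # Stage 2: observe the outcome; reinforce the action if it helped.
        if reward_fn(actions[i]):
            probs = [p + rate * (1.0 - p) if j == i else p * (1.0 - rate)
                     for j, p in enumerate(probs)]
        # Shrink the action set: drop actions that have become very unlikely.
        keep = [j for j, p in enumerate(probs) if p >= prune_below]
        if len(keep) < len(actions):
            total = sum(probs[j] for j in keep)
            actions = [actions[j] for j in keep]
            probs = [probs[j] / total for j in keep]
    return actions, probs


actions, probs = learn([-2, -1, 0, 1, 2], reduces_tilt)
print(actions, probs, entropy(probs))
```

In this toy setting only one torque is ever rewarded, so its selection probability grows while the others decay and are pruned, and the entropy of the selection distribution falls from its uniform starting value. The sketch only mirrors the qualitative behaviour stated in the summary; it does not model the flexible robot's dynamics.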
Bibliography:	44-1600/TP
ISSN:	1672-6340 1993-0623
DOI:	10.1007/s11768-011-9277-1