Kernel-Based Actor-Critic Learning Framework for Autonomous Brain Control on Trajectory

Bibliographic Details
Published in	IEEE Transactions on Cognitive and Developmental Systems, Vol. 17, no. 3, pp. 554-563
Main Authors	Song, Zhiwei; Zhang, Xiang; Chen, Shuhang; Tan, Jieyuan; Wang, Yiwen
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2025
Subjects	Aerospace electronics; Algorithms; Brain; Brain–machine interfaces (BMIs); Decoders; Decoding; Firing; Kernel; kernel-based actor–critic learning framework; Man-machine interfaces; Movement; multistep autonomous brain control (BC); Neural activity; Neurons; Prosthetics; Rats; Reinforcement learning; Target acquisition; Trajectory; Trajectory control
Online Access	Get full text
ISSN	2379-8920, 2379-8939
DOI	10.1109/TCDS.2024.3485078

More Information
Summary:	Reinforcement learning (RL)-based brain–machine interfaces (BMIs) hold promise for restoring motor function in paralyzed individuals. These interfaces interpret neural activity to control external devices through trial and error. In brain control (BC) tasks, subjects move a device continuously through space by imagining their own limb movement, and they can change direction at any position before reaching the target. Such multistep BC tasks span a large space, both in neural state and over the sequence of movements. Conventional RL decoders, however, struggle to explore this space efficiently and receive only limited guidance from delayed rewards. In this article, we propose a kernel-based actor–critic learning framework for multistep BC tasks. Our framework integrates continuous trajectory control (the actor) and internal continuous state-value estimation (the critic) from medial prefrontal cortex (mPFC) activity. We evaluate the algorithm's performance in a three-lever discrimination BC task using data from two rats, comparing it with a kernel RL decoder with internal binary rewards and delayed external rewards. Experimental results show that our approach achieves faster convergence, shorter target-acquisition times, and shorter distances to targets. These findings highlight the potential of our algorithm for clinical applications in multistep BC tasks.