Hybrid Reinforcement Learning for STAR-RISs: A Coupled Phase-Shift Model Based Beamformer

A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user downlink multiple-input single-output (MISO) communication system is investigated. In contrast to the existing ideal STAR-RIS model assuming an independent transmission and reflection phase-s...

Bibliographic Details
Published in	IEEE Journal on Selected Areas in Communications, Vol. 40, no. 9, pp. 2556-2569
Main Authors	Zhong, Ruikang; Liu, Yuanwei; Mu, Xidong; Chen, Yue; Wang, Xianbin; Hanzo, Lajos
Format	Journal Article
Language	English
Published	New York: IEEE / The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2022
Subjects	Algorithms; Array signal processing; Beamforming; Channel estimation; Communications systems; Computational modeling; deep reinforcement learning (DRL); Energy consumption; Hybrid control; Machine learning; Markov processes; MISO (control systems); Optimization; Phase shift; Power consumption; reconfigurable intelligent surfaces (RISs); Reinforcement learning; simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs); Stars; Surface waves

Abstract	A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user downlink multiple-input single-output (MISO) communication system is investigated. In contrast to the existing ideal STAR-RIS model assuming an independent transmission and reflection phase-shift control, a practical coupled phase-shift model is considered. Then, a joint active and passive beamforming optimization problem is formulated for minimizing the long-term transmission power consumption, subject to the coupled phase-shift constraint and the minimum data rate constraint. Despite the coupled nature of the phase-shift model, the formulated problem is solved by invoking a hybrid continuous and discrete phase-shift control policy. Inspired by this observation, a pair of hybrid reinforcement learning (RL) algorithms, namely the hybrid deep deterministic policy gradient (hybrid DDPG) algorithm and the joint DDPG & deep-Q network (DDPG-DQN) based algorithm are proposed. The hybrid DDPG algorithm controls the associated high-dimensional continuous and discrete actions by relying on the hybrid action mapping. By contrast, the joint DDPG-DQN algorithm constructs two Markov decision processes (MDPs) relying on an inner and an outer environment, thereby amalgamating the two agents to accomplish a joint hybrid control. Simulation results demonstrate that the STAR-RIS has superiority over other conventional RISs in terms of its energy consumption. Furthermore, both the proposed algorithms outperform the baseline DDPG algorithm, and the joint DDPG-DQN algorithm achieves a superior performance, albeit at an increased computational complexity.
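The hybrid action mapping mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a commonly used form of the coupled model, in which each element's transmission and reflection phase shifts differ by π/2 or 3π/2 and the squared amplitudes sum to one; the abstract itself does not state these details. The function name `map_hybrid_action` and the layout of the raw action vector are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def map_hybrid_action(raw: Sequence[float]) -> Tuple[List[complex], List[complex]]:
    """Map a raw actor output in [-1, 1]^(3N) to per-element STAR-RIS coefficients.

    Hypothetical layout of `raw` for N elements (an assumption for illustration):
      raw[0:N]   -> continuous reflection phase
      raw[N:2N]  -> continuous reflection amplitude
      raw[2N:3N] -> discrete branch, chosen by sign
    """
    if len(raw) % 3 != 0:
        raise ValueError("raw action length must be a multiple of 3")
    n = len(raw) // 3
    reflection: List[complex] = []
    transmission: List[complex] = []
    for i in range(n):
        # Continuous part: scale [-1, 1] to a phase in [0, 2*pi].
        theta_r = (raw[i] + 1.0) * math.pi
        # Continuous part: scale [-1, 1] to an amplitude in [0, 1].
        beta_r = (raw[n + i] + 1.0) / 2.0
        # Assumed energy conservation: beta_r^2 + beta_t^2 = 1.
        beta_t = math.sqrt(max(0.0, 1.0 - beta_r ** 2))
        # Discrete part: the sign selects one of the two assumed phase offsets.
        offset = math.pi / 2 if raw[2 * n + i] >= 0 else 3 * math.pi / 2
        theta_t = (theta_r + offset) % (2 * math.pi)
        reflection.append(cmath.rect(beta_r, theta_r))
        transmission.append(cmath.rect(beta_t, theta_t))
    return reflection, transmission


if __name__ == "__main__":
    # Two elements: phases, amplitudes, then branch selectors.
    refl, tran = map_hybrid_action([0.0, 0.5, -0.2, 0.9, 1.0, -1.0])
    for r, t in zip(refl, tran):
        print(r, t)
```

Under these assumptions a single continuous actor output carries both the continuous and the discrete decisions, so the coupled constraint holds by construction rather than being enforced through a penalty.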
Author	Mu, Xidong; Zhong, Ruikang; Hanzo, Lajos; Liu, Yuanwei; Chen, Yue; Wang, Xianbin
Author_xml
– sequence: 1; fullname: Zhong, Ruikang; orcidid: 0000-0003-4914-6425; email: r.zhong@qmul.ac.uk; organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K.
– sequence: 2; fullname: Liu, Yuanwei; orcidid: 0000-0002-6389-8941; email: yuanwei.liu@qmul.ac.uk; organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K.
– sequence: 3; fullname: Mu, Xidong; orcidid: 0000-0001-8351-360X; email: xidong.mu@qmul.ac.uk; organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K.
– sequence: 4; fullname: Chen, Yue; email: yue.chen@qmul.ac.uk; organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K.
– sequence: 5; fullname: Wang, Xianbin; orcidid: 0000-0003-4890-0748; email: xianbin.wang@uwo.ca; organization: Department of Electrical and Computer Engineering, Western University, London, Canada
– sequence: 6; fullname: Hanzo, Lajos; orcidid: 0000-0002-2636-5214; email: lh@ecs.soton.ac.uk; organization: School of Electronics and Computer Science, University of Southampton, Southampton, U.K.
CODEN	ISACEM
CitedBy_id	crossref_primary_10_1109_JIOT_2024_3376543 crossref_primary_10_1109_TCOMM_2024_3418910 crossref_primary_10_1109_JPROC_2024_3405351 crossref_primary_10_1109_TCOMM_2024_3364988 crossref_primary_10_1109_TSP_2024_3413017 crossref_primary_10_1109_LCOMM_2023_3324488 crossref_primary_10_1109_TWC_2023_3321395 crossref_primary_10_1016_j_adhoc_2023_103370 crossref_primary_10_1109_TVT_2024_3349509 crossref_primary_10_1109_MNET_129_2200389 crossref_primary_10_3390_e27020210 crossref_primary_10_1016_j_comnet_2024_110960 crossref_primary_10_1109_LWC_2023_3242449 crossref_primary_10_1109_JSTSP_2024_3449124 crossref_primary_10_1016_j_jksuci_2024_102215 crossref_primary_10_1109_TVT_2023_3336260 crossref_primary_10_1145_3571072 crossref_primary_10_1109_LCOMM_2024_3462798 crossref_primary_10_1109_JIOT_2023_3309859 crossref_primary_10_1109_MNET_004_2300271 crossref_primary_10_1109_JIOT_2023_3279357 crossref_primary_10_1109_TWC_2024_3476383 crossref_primary_10_1109_TVT_2024_3419554 crossref_primary_10_1109_LWC_2023_3251357 crossref_primary_10_1109_TWC_2023_3349230 crossref_primary_10_26599_TST_2024_9010086 crossref_primary_10_1109_JIOT_2024_3416334 crossref_primary_10_1109_TCCN_2024_3384500 crossref_primary_10_1109_JIOT_2023_3297241 crossref_primary_10_1016_j_jiixd_2023_06_003 crossref_primary_10_1109_TCOMM_2023_3335411
Cites_doi	10.1109/TVT.2021.3058995 10.1109/LCOMM.2021.3063464 10.1109/JSAC.2020.3018823 10.1109/TWC.2019.2922609 10.1109/COMST.2020.3004197 10.1109/JSAC.2020.3000814 10.1038/s41598-021-99722-x 10.1109/TAP.2015.2481479 10.1109/COMST.2021.3063822 10.1109/TCOMM.2020.3001125 10.1109/TCOMM.2021.3106686 10.1109/LCOMM.2020.3025345 10.1109/LWC.2021.3107547 10.1038/srep04971 10.1109/icc45855.2022.9838767 10.1109/MWC.011.2100016 10.1109/LCOMM.2021.3082214 10.1109/JIOT.2019.2921159 10.1109/TWC.2020.3006915 10.1109/LCOMM.2020.3041510 10.1109/JSAC.2020.3000835 10.1109/TWC.2020.3024860 10.1109/TVT.2021.3109786 10.1109/LCOMM.2021.3091807 10.1109/MSP.2017.2743240 10.1109/TVT.2020.3024756 10.1109/ACCESS.2019.2957706 10.1109/TCCN.2020.2992604 10.1109/TASE.2020.3043636 10.1109/TWC.2021.3118225 10.1109/TVT.2021.3063953 10.1109/SPAWC51858.2021.9593172 10.1109/COMST.2020.2965856
ContentType	Journal Article
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml	– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DBID	97E RIA RIE AAYXX CITATION 7SP 8FD L7M
DOI	10.1109/JSAC.2022.3192053
DatabaseName	IEEE All-Society Periodicals Package (ASPP) 2005–Present; IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE/IET Electronic Library; CrossRef; Electronics & Communications Abstracts; Technology Research Database; Advanced Technologies Database with Aerospace
DatabaseTitle	CrossRef; Technology Research Database; Advanced Technologies Database with Aerospace; Electronics & Communications Abstracts
DatabaseTitleList	Technology Research Database
Database_xml	– sequence: 1 dbid: RIE name: IEEE/IET Electronic Library url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISSN	1558-0008
EndPage	2569
ExternalDocumentID	10_1109_JSAC_2022_3192053 9837935
Genre	orig-research
GrantInformation_xml
– fundername: China Scholarship Council; grantid: 201908610187; funderid: 10.13039/501100004543
– fundername: Engineering and Physical Sciences Research Council; grantid: EP/P034284/1; EP/P003990/1 (COALESCE); funderid: 10.13039/501100000266
– fundername: European Research Council’s Advanced Fellow; grantid: QuantCom (789028); funderid: 10.13039/501100000781
– fundername: Engineering and Physical Sciences Research Council; grantid: EP/W035588/1; funderid: 10.13039/501100000266
ID	FETCH-LOGICAL-c293t-2c896ac9b860197027203824a8e3fbffb6bebfa6001e2a4a2d2902ed9d8f46b03
IEDL.DBID	RIE
ISSN	0733-8716
IngestDate	Mon Jun 30 10:20:21 EDT 2025 Tue Jul 01 02:06:32 EDT 2025 Thu Apr 24 23:06:29 EDT 2025 Wed Aug 27 02:22:58 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	9
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-c293t-2c896ac9b860197027203824a8e3fbffb6bebfa6001e2a4a2d2902ed9d8f46b03
ORCID	0000-0002-6389-8941 0000-0003-4914-6425 0000-0001-8351-360X 0000-0002-2636-5214 0000-0003-4890-0748
PQID	2704097564
PQPubID	85481
PageCount	14
ParticipantIDs	ieee_primary_9837935 crossref_primary_10_1109_JSAC_2022_3192053 crossref_citationtrail_10_1109_JSAC_2022_3192053 proquest_journals_2704097564
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2022-09-01
PublicationDateYYYYMMDD	2022-09-01
PublicationDate_xml	– month: 09 year: 2022 text: 2022-09-01 day: 01
PublicationDecade	2020
PublicationPlace	New York
PublicationPlace_xml	– name: New York
PublicationTitle	IEEE journal on selected areas in communications
PublicationTitleAbbrev	J-SAC
PublicationYear	2022
Publisher	IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml	– name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID	ssj0014482
Score	2.6034582
SourceID	proquest crossref ieee
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	2556
Title	Hybrid Reinforcement Learning for STAR-RISs: A Coupled Phase-Shift Model Based Beamformer
URI	https://ieeexplore.ieee.org/document/9837935 https://www.proquest.com/docview/2704097564
Volume	40