Hybrid Reinforcement Learning for STAR-RISs: A Coupled Phase-Shift Model Based Beamformer
A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user downlink multiple-input single-output (MISO) communication system is investigated. In contrast to the existing ideal STAR-RIS model assuming an independent transmission and reflection phase-s...
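The abstract in this record describes a hybrid continuous and discrete phase-shift control policy for the coupled STAR-RIS model. The following is a minimal sketch of what such a hybrid action mapping could look like for a single element; it is not the authors' implementation. It assumes, as the record itself does not state the constraint, that the coupled model means the transmission and reflection energy fractions sum to one and that the two phase shifts differ by either π/2 or 3π/2. The function name `map_hybrid_action` and its three raw inputs in [-1, 1] are hypothetical.

```python
import cmath
import math


def map_hybrid_action(raw_amp, raw_phase, raw_sign):
    """Map three raw actor outputs to one element's (reflection, transmission) coefficients.

    Assumed (hypothetical) conventions, not taken from the paper:
      raw_amp   in [-1, 1] -> continuous energy split between reflection and transmission
      raw_phase in [-1, 1] -> continuous reflection phase in [0, 2*pi]
      raw_sign  any float  -> discrete choice of the phase offset (pi/2 or 3*pi/2)
    """
    # Clip the continuous outputs so the mapping stays inside its valid range.
    raw_amp = max(-1.0, min(1.0, raw_amp))
    raw_phase = max(-1.0, min(1.0, raw_phase))

    # Continuous part: energy fractions that always sum to one (assumed constraint).
    beta_r_sq = (raw_amp + 1.0) / 2.0
    beta_t_sq = 1.0 - beta_r_sq

    # Continuous part: reflection phase.
    theta_r = (raw_phase + 1.0) * math.pi

    # Discrete part: the transmission phase is tied to the reflection phase
    # by one of two offsets (assumed form of the coupled phase-shift model).
    offset = math.pi / 2.0 if raw_sign >= 0.0 else 3.0 * math.pi / 2.0
    theta_t = (theta_r + offset) % (2.0 * math.pi)

    reflection = math.sqrt(beta_r_sq) * cmath.exp(1j * theta_r)
    transmission = math.sqrt(beta_t_sq) * cmath.exp(1j * theta_t)
    return reflection, transmission


if __name__ == "__main__":
    r, t = map_hybrid_action(0.2, -0.4, 1.0)
    print("reflection coefficient:", r)
    print("transmission coefficient:", t)
```

In this sketch only the reflection phase and the energy split are free continuous quantities; the transmission phase is fully determined once the binary offset is chosen, which is one way to read why a mixed continuous and discrete action space arises.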
Published in | IEEE journal on selected areas in communications Vol. 40; no. 9; pp. 2556 - 2569 |
---|---|
Main Authors | Zhong, Ruikang; Liu, Yuanwei; Mu, Xidong; Chen, Yue; Wang, Xianbin; Hanzo, Lajos |
Format | Journal Article |
Language | English |
Published | New York; IEEE; 01.09.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
Abstract | A simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user downlink multiple-input single-output (MISO) communication system is investigated. In contrast to the existing ideal STAR-RIS model assuming an independent transmission and reflection phase-shift control, a practical coupled phase-shift model is considered. Then, a joint active and passive beamforming optimization problem is formulated for minimizing the long-term transmission power consumption, subject to the coupled phase-shift constraint and the minimum data rate constraint. Despite the coupled nature of the phase-shift model, the formulated problem is solved by invoking a hybrid continuous and discrete phase-shift control policy. Inspired by this observation, a pair of hybrid reinforcement learning (RL) algorithms, namely the hybrid deep deterministic policy gradient (hybrid DDPG) algorithm and the joint DDPG & deep-Q network (DDPG-DQN) based algorithm are proposed. The hybrid DDPG algorithm controls the associated high-dimensional continuous and discrete actions by relying on the hybrid action mapping. By contrast, the joint DDPG-DQN algorithm constructs two Markov decision processes (MDPs) relying on an inner and an outer environment, thereby amalgamating the two agents to accomplish a joint hybrid control. Simulation results demonstrate that the STAR-RIS has superiority over other conventional RISs in terms of its energy consumption. Furthermore, both the proposed algorithms outperform the baseline DDPG algorithm, and the joint DDPG-DQN algorithm achieves a superior performance, albeit at an increased computational complexity. |
---|---|
Author | Mu, Xidong Zhong, Ruikang Hanzo, Lajos Liu, Yuanwei Chen, Yue Wang, Xianbin |
Author_xml | – sequence: 1 givenname: Ruikang orcidid: 0000-0003-4914-6425 surname: Zhong fullname: Zhong, Ruikang email: r.zhong@qmul.ac.uk organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K – sequence: 2 givenname: Yuanwei orcidid: 0000-0002-6389-8941 surname: Liu fullname: Liu, Yuanwei email: yuanwei.liu@qmul.ac.uk organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K – sequence: 3 givenname: Xidong orcidid: 0000-0001-8351-360X surname: Mu fullname: Mu, Xidong email: xidong.mu@qmul.ac.uk organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K – sequence: 4 givenname: Yue surname: Chen fullname: Chen, Yue email: yue.chen@qmul.ac.uk organization: School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K – sequence: 5 givenname: Xianbin orcidid: 0000-0003-4890-0748 surname: Wang fullname: Wang, Xianbin email: xianbin.wang@uwo.ca organization: Department of Electrical and Computer Engineering, Western University, London, Canada – sequence: 6 givenname: Lajos orcidid: 0000-0002-2636-5214 surname: Hanzo fullname: Hanzo, Lajos email: lh@ecs.soton.ac.uk organization: School of Electronics and Computer Science, University of Southampton, Southampton, U.K |
CODEN | ISACEM |
CitedBy_id | crossref_primary_10_1109_JIOT_2024_3376543 crossref_primary_10_1109_TCOMM_2024_3418910 crossref_primary_10_1109_JPROC_2024_3405351 crossref_primary_10_1109_TCOMM_2024_3364988 crossref_primary_10_1109_TSP_2024_3413017 crossref_primary_10_1109_LCOMM_2023_3324488 crossref_primary_10_1109_TWC_2023_3321395 crossref_primary_10_1016_j_adhoc_2023_103370 crossref_primary_10_1109_TVT_2024_3349509 crossref_primary_10_1109_MNET_129_2200389 crossref_primary_10_3390_e27020210 crossref_primary_10_1016_j_comnet_2024_110960 crossref_primary_10_1109_LWC_2023_3242449 crossref_primary_10_1109_JSTSP_2024_3449124 crossref_primary_10_1016_j_jksuci_2024_102215 crossref_primary_10_1109_TVT_2023_3336260 crossref_primary_10_1145_3571072 crossref_primary_10_1109_LCOMM_2024_3462798 crossref_primary_10_1109_JIOT_2023_3309859 crossref_primary_10_1109_MNET_004_2300271 crossref_primary_10_1109_JIOT_2023_3279357 crossref_primary_10_1109_TWC_2024_3476383 crossref_primary_10_1109_TVT_2024_3419554 crossref_primary_10_1109_LWC_2023_3251357 crossref_primary_10_1109_TWC_2023_3349230 crossref_primary_10_26599_TST_2024_9010086 crossref_primary_10_1109_JIOT_2024_3416334 crossref_primary_10_1109_TCCN_2024_3384500 crossref_primary_10_1109_JIOT_2023_3297241 crossref_primary_10_1016_j_jiixd_2023_06_003 crossref_primary_10_1109_TCOMM_2023_3335411 |
Cites_doi | 10.1109/TVT.2021.3058995 10.1109/LCOMM.2021.3063464 10.1109/JSAC.2020.3018823 10.1109/TWC.2019.2922609 10.1109/COMST.2020.3004197 10.1109/JSAC.2020.3000814 10.1038/s41598-021-99722-x 10.1109/TAP.2015.2481479 10.1109/COMST.2021.3063822 10.1109/TCOMM.2020.3001125 10.1109/TCOMM.2021.3106686 10.1109/LCOMM.2020.3025345 10.1109/LWC.2021.3107547 10.1038/srep04971 10.1109/icc45855.2022.9838767 10.1109/MWC.011.2100016 10.1109/LCOMM.2021.3082214 10.1109/JIOT.2019.2921159 10.1109/TWC.2020.3006915 10.1109/LCOMM.2020.3041510 10.1109/JSAC.2020.3000835 10.1109/TWC.2020.3024860 10.1109/TVT.2021.3109786 10.1109/LCOMM.2021.3091807 10.1109/MSP.2017.2743240 10.1109/TVT.2020.3024756 10.1109/ACCESS.2019.2957706 10.1109/TCCN.2020.2992604 10.1109/TASE.2020.3043636 10.1109/TWC.2021.3118225 10.1109/TVT.2021.3063953 10.1109/SPAWC51858.2021.9593172 10.1109/COMST.2020.2965856 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
DBID | 97E RIA RIE AAYXX CITATION 7SP 8FD L7M |
DOI | 10.1109/JSAC.2022.3192053 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE/IET Electronic Library CrossRef Electronics & Communications Abstracts Technology Research Database Advanced Technologies Database with Aerospace |
DatabaseTitle | CrossRef Technology Research Database Advanced Technologies Database with Aerospace Electronics & Communications Abstracts |
DatabaseTitleList | Technology Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 1558-0008 |
EndPage | 2569 |
ExternalDocumentID | 10_1109_JSAC_2022_3192053 9837935 |
Genre | orig-research |
GrantInformation_xml | – fundername: China Scholarship Council grantid: 201908610187 funderid: 10.13039/501100004543 – fundername: Engineering and Physical Sciences Research Council grantid: EP/P034284/1; EP/P003990/1 (COALESCE) funderid: 10.13039/501100000266 – fundername: European Research Council’s Advanced Fellow grantid: QuantCom (789028) funderid: 10.13039/501100000781 – fundername: Engineering and Physical Sciences Research Council grantid: EP/W035588/1 funderid: 10.13039/501100000266 |
ISSN | 0733-8716 |
IngestDate | Mon Jun 30 10:20:21 EDT 2025 Tue Jul 01 02:06:32 EDT 2025 Thu Apr 24 23:06:29 EDT 2025 Wed Aug 27 02:22:58 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 9 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
LinkModel | DirectLink |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ORCID | 0000-0002-6389-8941 0000-0003-4914-6425 0000-0001-8351-360X 0000-0002-2636-5214 0000-0003-4890-0748 |
PQID | 2704097564 |
PQPubID | 85481 |
PageCount | 14 |
ParticipantIDs | ieee_primary_9837935 crossref_primary_10_1109_JSAC_2022_3192053 crossref_citationtrail_10_1109_JSAC_2022_3192053 proquest_journals_2704097564 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2022-09-01 |
PublicationDateYYYYMMDD | 2022-09-01 |
PublicationDate_xml | – month: 09 year: 2022 text: 2022-09-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE journal on selected areas in communications |
PublicationTitleAbbrev | J-SAC |
PublicationYear | 2022 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0014482 |
Score | 2.6034582 |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database Enrichment Source Index Database Publisher |
StartPage | 2556 |
SubjectTerms | Algorithms Array signal processing Beamforming Channel estimation Communications systems Computational modeling deep reinforcement learning (DRL) Energy consumption Hybrid control Machine learning Markov processes MISO (control systems) Optimization Phase shift Power consumption reconfigurable intelligent surfaces (RISs) Reinforcement learning simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) Stars Surface waves |
Title | Hybrid Reinforcement Learning for STAR-RISs: A Coupled Phase-Shift Model Based Beamformer |
URI | https://ieeexplore.ieee.org/document/9837935 https://www.proquest.com/docview/2704097564 |
Volume | 40 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |