Adaptive Cyber Defense Against Multi-Stage Attacks Using Learning-Based POMDP

Bibliographic Details
Published in ACM Transactions on Privacy and Security, Vol. 24, No. 1, pp. 1-25
Main Authors Hu, Zhisheng; Zhu, Minghui; Liu, Peng
Format Journal Article
Language English
Published 28.02.2021
Abstract Growing multi-stage attacks in computer networks impose significant security risks and necessitate the development of effective defense schemes that are able to autonomously respond to intrusions during vulnerability windows. However, the defender faces several real-world challenges, e.g., unknown likelihoods and unknown impacts of successful exploits. In this article, we leverage reinforcement learning to develop an innovative adaptive cyber defense to maximize the cost-effectiveness subject to the aforementioned challenges. In particular, we use Bayesian attack graphs to model the interactions between the attacker and networks. Then we formulate the defense problem of interest as a partially observable Markov decision process problem where the defender maintains belief states to estimate system states, leverages Thompson sampling to estimate transition probabilities, and utilizes reinforcement learning to choose optimal defense actions using measured utility values. The algorithm performance is verified via numerical simulations based on real-world attacks.
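The three ingredients named in the abstract (a belief state over hidden system states, Thompson sampling over unknown transition probabilities, and a learned choice of defense action) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single attack-graph node, a single exploit with an unknown success probability, a Beta prior, and a binary intrusion alert with known detection and false-alarm rates. The names `BetaPosterior`, `predict`, `correct`, `p_detect`, and `p_false_alarm` are hypothetical, and the sketch further assumes that the outcome of each exploit attempt is eventually revealed so the posterior can be updated.

```python
import random


class BetaPosterior:
    """Beta(alpha, beta) posterior over an unknown exploit success probability.

    Assumption for illustration: exploit outcomes are eventually observed.
    """

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def sample(self, rng):
        # Thompson sampling: draw one plausible value from the posterior
        # and act as if it were the true transition probability.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success):
        # Conjugate update: count one more success or one more failure.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        return self.alpha / (self.alpha + self.beta)


def predict(belief, p_exploit):
    """Prediction step: the node is compromised if it already was,
    or if it was clean and the exploit succeeds in this step."""
    return belief + (1.0 - belief) * p_exploit


def correct(belief, alert, p_detect, p_false_alarm):
    """Correction step: Bayes' rule on a binary alert observation."""
    like_compromised = p_detect if alert else 1.0 - p_detect
    like_clean = p_false_alarm if alert else 1.0 - p_false_alarm
    numerator = like_compromised * belief
    denominator = numerator + like_clean * (1.0 - belief)
    return numerator / denominator if denominator > 0 else belief


if __name__ == "__main__":
    rng = random.Random(0)
    posterior = BetaPosterior()
    belief = 0.0  # probability that the node is currently compromised
    for alert in (False, True, True):
        p_sampled = posterior.sample(rng)  # sampled transition probability
        belief = correct(predict(belief, p_sampled), alert, 0.9, 0.1)
        print(f"sampled p={p_sampled:.3f}, belief={belief:.3f}")
```

In a fuller treatment the belief would range over all nodes of the Bayesian attack graph, and the action would be chosen by a reinforcement-learning policy evaluated on the belief rather than fixed in advance.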
Author Hu, Zhisheng (Baidu Security, Sunnyvale, CA; ORCID 0000-0003-1940-9829)
Zhu, Minghui (Pennsylvania State University, PA)
Liu, Peng (Pennsylvania State University, PA)
DOI 10.1145/3418897
Discipline Computer Science
EISSN 2471-2574
EndPage 25
ISSN 2471-2566
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
ORCID 0000-0003-1940-9829
OpenAccessLink https://dl.acm.org/doi/pdf/10.1145/3418897
PageCount 25
PublicationDateYYYYMMDD 2021-02-28
PublicationDate_xml – month: 02
  year: 2021
  text: 2021-02-28
  day: 28
PublicationDecade 2020
PublicationTitle ACM transactions on privacy and security
PublicationYear 2021
StartPage 1
Title Adaptive Cyber Defense Against Multi-Stage Attacks Using Learning-Based POMDP
Volume 24