Adaptive Cyber Defense Against Multi-Stage Attacks Using Learning-Based POMDP

Bibliographic Details
Published in	ACM transactions on privacy and security Vol. 24; no. 1; pp. 1 - 25
Main Authors	Hu, Zhisheng; Zhu, Minghui; Liu, Peng
Format	Journal Article
Language	English
Published	28.02.2021
Abstract	Growing multi-stage attacks in computer networks impose significant security risks and necessitate the development of effective defense schemes that are able to autonomously respond to intrusions during vulnerability windows. However, the defender faces several real-world challenges, e.g., unknown likelihoods and unknown impacts of successful exploits. In this article, we leverage reinforcement learning to develop an innovative adaptive cyber defense to maximize the cost-effectiveness subject to the aforementioned challenges. In particular, we use Bayesian attack graphs to model the interactions between the attacker and networks. Then we formulate the defense problem of interest as a partially observable Markov decision process problem where the defender maintains belief states to estimate system states, leverages Thompson sampling to estimate transition probabilities, and utilizes reinforcement learning to choose optimal defense actions using measured utility values. The algorithm performance is verified via numerical simulations based on real-world attacks.
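The abstract names two building blocks: a belief state that estimates the hidden system state, and Thompson sampling over unknown transition probabilities. The following is a minimal sketch of how these two pieces could fit together, not the authors' algorithm. It assumes a toy two-state network (safe, compromised), a single unknown exploit success probability with a Beta posterior, and made-up alert likelihoods; the names `belief_update` and `BetaPosterior` are hypothetical.

```python
import random


def belief_update(belief, transition, observation_likelihood):
    """One Bayes-filter step: predict with the transition model, then correct."""
    n = len(belief)
    # Prediction: b'(s') = sum_s b(s) * T[s][s']
    predicted = [
        sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correction: weight each state by P(observation | s') and renormalize.
    unnormalized = [predicted[s2] * observation_likelihood[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


class BetaPosterior:
    """Beta(alpha, beta) posterior over one unknown exploit success probability.

    Assumption: exploit outcomes are modeled as Bernoulli trials, so the Beta
    distribution is the conjugate prior. The abstract does not state the prior.
    """

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def sample(self, rng):
        # Thompson sampling: draw one plausible value from the posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success):
        # Conjugate update after observing one exploit attempt.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    posterior = BetaPosterior()
    belief = [0.9, 0.1]  # [P(safe), P(compromised)], illustrative values
    alert_likelihood = [0.1, 0.8]  # P(alert | state), illustrative values

    p = posterior.sample(rng)  # sampled exploit success probability
    transition = [[1.0 - p, p], [0.0, 1.0]]  # compromised state is absorbing
    belief = belief_update(belief, transition, alert_likelihood)
    print(belief)
```

In a full defense loop, the sampled transition model would feed a planner or reinforcement learner that picks the defense action, and the posterior would be updated whenever an exploit outcome becomes observable.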
Author	Liu, Peng; Zhu, Minghui; Hu, Zhisheng
Author_xml	1. Hu, Zhisheng (ORCID 0000-0003-1940-9829), Baidu Security, Sunnyvale, CA; 2. Zhu, Minghui, Pennsylvania State University, PA; 3. Liu, Peng, Pennsylvania State University, PA
CitedBy_id	crossref_primary_10_1109_JIOT_2024_3423022 crossref_primary_10_1109_TNSM_2023_3293413 crossref_primary_10_1109_TNSM_2022_3176781 crossref_primary_10_1111_risa_13837 crossref_primary_10_1016_j_cosrev_2023_100544 crossref_primary_10_1109_TNSM_2024_3481662
Cites_doi	10.1145/586110.586130 10.1109/TDSC.2011.34 10.1145/2810103.2813691 10.1109/MSP.2017.2743240 10.5555/1622519.1622525 10.1109/TIFS.2018.2819967 10.1007/s10458-012-9200-2 10.1145/1456362.1456368 10.1145/3140549.3140562 10.1109/CDC.2009.5399894 10.1145/2808475.2808482 10.1007/978-3-319-13841-1_1 10.1093/biomet/25.3-4.285 10.1287/opre.26.2.282 10.1007/BF00992698 10.1016/j.automatica.2019.02.032 10.1109/SP.2014.25 10.5555/1949317.1949338 10.1137/110843332 10.1049/iet-ifs.2014.0272 10.1145/3140549.3140556 10.1109/TAC.1985.1103963 10.1016/0022-247X(65)90154-X 10.1186/s42400-018-0003-x 10.1007/978-3-319-25594-1_1 10.1117/12.604240 10.1145/2663474.2663481 10.1109/MC.2008.295 10.1145/1813654.1813655
ContentType	Journal Article
DBID	AAYXX CITATION
DOI	10.1145/3418897
DatabaseName	CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList	CrossRef
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISSN	2471-2574
EndPage	25
ExternalDocumentID	10_1145_3418897
ISSN	2471-2566
IngestDate	Thu Apr 24 23:09:36 EDT 2025 Thu Jul 03 08:40:12 EDT 2025
IsDoiOpenAccess	false
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	1
Language	English
LinkModel	OpenURL
ORCID	0000-0003-1940-9829
OpenAccessLink	https://dl.acm.org/doi/pdf/10.1145/3418897
PageCount	25
ParticipantIDs	crossref_citationtrail_10_1145_3418897 crossref_primary_10_1145_3418897
PublicationCentury	2000
PublicationDate	2021-02-28
PublicationDateYYYYMMDD	2021-02-28
PublicationDate_xml	– month: 02 year: 2021 text: 2021-02-28 day: 28
PublicationDecade	2020
PublicationTitle	ACM transactions on privacy and security
PublicationYear	2021
SourceID	crossref
SourceType	Enrichment Source Index Database
StartPage	1
Title	Adaptive Cyber Defense Against Multi-Stage Attacks Using Learning-Based POMDP
Volume	24