Improving Deep Learning Based Password Guessing Models Using Pre-processing
Published in | Information and Communications Security Vol. 13407; pp. 163 - 183 |
---|---|
Main Authors | Wu, Yuxuan; Wang, Ding; Zou, Yunkai; Huang, Ziyi |
Format | Book Chapter |
Language | English |
Published | Switzerland: Springer International Publishing AG; Springer International Publishing, 2022 |
Series | Lecture Notes in Computer Science |
Subjects | |
Abstract | Passwords are the most widely used authentication method and play an important role in users’ digital lives. Password guessing models are generally used to understand password security, yet statistics-based password models (such as the Markov model and probabilistic context-free grammars (PCFG)) are subject to the inherent limitations of overfitting and sparsity. With the improvement of computing power, deep-learning-based models with higher crack rates are emerging. Since neural networks are generally used as black boxes for learning password features, a key challenge for deep-learning-based password guessing models is to choose appropriate preprocessing methods so that the networks learn more effective features.
To fill this gap, this paper explores three new preprocessing methods and applies them to two promising deep-learning networks, i.e., Long Short-Term Memory (LSTM) neural networks and Generative Adversarial Networks (GAN). First, we propose a character-feature-based encoding method to replace the canonical one-hot encoding. Second, we add what are, to date, the most comprehensive recognition rules for words, keyboard patterns, years, and website names to the basic PCFG, and find that the frequency distribution of the extracted segments follows Zipf’s law. Third, we adopt Xu et al.’s PCFG improvement with chunk segmentation (CCS’21) and study the performance of the Chunk+PCFG preprocessing method when applied to LSTM and GAN.
Extensive experiments on six large real-world password datasets show the effectiveness of our preprocessing methods. Results show that within 50 million guesses: 1) when we apply the PCFG preprocessing method to PassGAN (a GAN-based password model proposed by Hitaj et al. at ACNS’19), 13.83%–38.81% (26.79% on average) more passwords can be cracked; 2) our LSTM-based model using PCFG for preprocessing (PL for short) outperforms Wang et al.’s original PL model by 0.35%–3.94% (1.36% on average). Overall, our preprocessing methods improve the attack success rates in four out of seven tested cases. We believe this work provides new feasible directions for guessing optimization and contributes to a better understanding of deep-learning-based models. |
---|---|
Author | Zou, Yunkai; Wang, Ding; Huang, Ziyi; Wu, Yuxuan |
Author_xml | – sequence: 1 givenname: Yuxuan surname: Wu fullname: Wu, Yuxuan – sequence: 2 givenname: Ding surname: Wang fullname: Wang, Ding email: wangding@nankai.edu.cn – sequence: 3 givenname: Yunkai surname: Zou fullname: Zou, Yunkai – sequence: 4 givenname: Ziyi surname: Huang fullname: Huang, Ziyi |
ContentType | Book Chapter |
Copyright | Springer Nature Switzerland AG 2022 |
Copyright_xml | – notice: Springer Nature Switzerland AG 2022 |
DBID | FFUUA |
DEWEY | 005.8 |
DOI | 10.1007/978-3-031-15777-6_10 |
DatabaseName | ProQuest Ebook Central - Book Chapters - Demo use only |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISBN | 303115777X 9783031157776 |
EISSN | 1611-3349 |
Editor | Li, Shujun; Alcaraz, Cristina; Chen, Liqun; Samarati, Pierangela |
Editor_xml | – sequence: 1 fullname: Chen, Liqun – sequence: 2 fullname: Li, Shujun – sequence: 3 fullname: Alcaraz, Cristina – sequence: 4 fullname: Samarati, Pierangela |
EndPage | 183 |
ExternalDocumentID | EBC7077207_160_179 |
ISBN | 9783031157769 3031157761 |
ISSN | 0302-9743 |
IngestDate | Tue Jul 29 20:12:40 EDT 2025 Fri Aug 15 21:41:29 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
LCCallNum | QA76.9.D35 |
Language | English |
LinkModel | OpenURL |
OCLC | 1342502097 |
PQID | EBC7077207_160_179 |
PageCount | 21 |
ParticipantIDs | springer_books_10_1007_978_3_031_15777_6_10 proquest_ebookcentralchapters_7077207_160_179 |
PublicationCentury | 2000 |
PublicationDate | 2022 |
PublicationDateYYYYMMDD | 2022-01-01 |
PublicationDate_xml | – year: 2022 text: 2022 |
PublicationDecade | 2020 |
PublicationPlace | Switzerland |
PublicationPlace_xml | – name: Switzerland – name: Cham |
PublicationSeriesTitle | Lecture Notes in Computer Science |
PublicationSeriesTitleAlternate | Lect.Notes Computer |
PublicationSubtitle | 24th International Conference, ICICS 2022, Canterbury, UK, September 5-8, 2022, Proceedings |
PublicationTitle | Information and Communications Security |
PublicationYear | 2022 |
Publisher | Springer International Publishing AG Springer International Publishing |
Publisher_xml | – name: Springer International Publishing AG – name: Springer International Publishing |
RelatedPersons | Hartmanis, Juris; Gao, Wen; Steffen, Bernhard; Bertino, Elisa; Goos, Gerhard; Yung, Moti |
RelatedPersons_xml | – sequence: 1 givenname: Gerhard surname: Goos fullname: Goos, Gerhard – sequence: 2 givenname: Juris surname: Hartmanis fullname: Hartmanis, Juris – sequence: 3 givenname: Elisa surname: Bertino fullname: Bertino, Elisa – sequence: 4 givenname: Wen surname: Gao fullname: Gao, Wen – sequence: 5 givenname: Bernhard orcidid: 0000-0001-9619-1558 surname: Steffen fullname: Steffen, Bernhard – sequence: 6 givenname: Moti orcidid: 0000-0003-0848-0873 surname: Yung fullname: Yung, Moti |
SSID | ssj0002732318 ssj0002792 |
Score | 2.0371034 |
SourceID | springer proquest |
SourceType | Publisher |
StartPage | 163 |
SubjectTerms | Deep learning Generative Adversarial Networks Long Short-Term Memory neural networks Password Preprocessing |
Title | Improving Deep Learning Based Password Guessing Models Using Pre-processing |
URI | http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=7077207&ppg=179&c=UERG http://link.springer.com/10.1007/978-3-031-15777-6_10 |
Volume | 13407 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
thumbnail_s | http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Febookcentral.proquest.com%2Fcovers%2F7077207-l.jpg |
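
The abstract refers to a "basic PCFG" preprocessing step whose extracted segments are then counted by frequency. As a rough illustration only, the sketch below shows the classic letter/digit/symbol segmentation that basic PCFG models are commonly built on, together with a ranked segment count of the kind one would inspect for a Zipf-like distribution. The function names (`segment`, `structure`, `segment_frequencies`) and the toy passwords are hypothetical, and the chapter's extended recognition rules (words, keyboard patterns, years, website names) and chunk segmentation are not reproduced here.

```python
import re
from collections import Counter

# Assumption: "basic PCFG" segmentation means maximal runs of ASCII letters (L),
# digits (D), and everything else (S). Non-ASCII characters fall under S here.
_SEGMENT = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def segment(password):
    """Split a password into (tag, text) pairs, one per maximal run."""
    # Exactly one named group matches per run, so lastgroup is its tag.
    return [(m.lastgroup, m.group()) for m in _SEGMENT.finditer(password)]


def structure(password):
    """Return the base structure string, e.g. 'L4S1D4' for 'love@2022'."""
    return "".join(f"{tag}{len(text)}" for tag, text in segment(password))


def segment_frequencies(passwords):
    """Count how often each segment text occurs, most frequent first."""
    counts = Counter()
    for pw in passwords:
        counts.update(text for _, text in segment(pw))
    return counts.most_common()


if __name__ == "__main__":
    toy = ["love2022", "love@123", "abc123"]  # illustrative data, not from the paper
    for pw in toy:
        print(pw, segment(pw), structure(pw))
    # Rank/frequency pairs; on a real dataset one would plot these on log-log axes.
    for rank, (text, freq) in enumerate(segment_frequencies(toy), start=1):
        print(rank, text, freq)
```

On a real password dataset, the ranked counts from `segment_frequencies` could be plotted on log-log axes to check visually whether they fall close to a straight line, which is the usual informal test for Zipf-like behaviour.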