Prediction of Software Security Vulnerabilities from Source Code Using Machine Learning Methods
Published in | 2023 Innovations in Intelligent Systems and Applications Conference (ASYU), pp. 1-6 |
---|---|
Main Authors | Mandal, Dilek; Kösesoy, Irfan |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 11.10.2023 |
Abstract | Security vulnerabilities are among the most significant problems in software engineering. Attackers can exploit them to gain unauthorized access to systems, leak information, corrupt data, and interrupt services. Detecting existing vulnerabilities is therefore an important research topic alongside developing secure software. In this study, security vulnerabilities in source code were predicted using machine learning methods. The OWASP Benchmark test suite, which consists of Java code, was used as the dataset to train five machine learning models: Logistic Regression, Decision Tree, Support Vector Machine (SVM), K-Nearest Neighbors, and Random Forest. TF-IDF and Doc2Vec were used to extract feature vectors from the source code. In the experiments, the highest prediction accuracy (0.97) was achieved with TF-IDF features combined with the Decision Tree, SVM, and Logistic Regression algorithms. |
---|---|
Author | Mandal, Dilek; Kösesoy, Irfan |
Author_xml | 1. Dilek Mandal (mandaldilek@gmail.com), Kocaeli University, Computer Engineering, Kocaeli, Turkey; 2. Irfan Kösesoy (irfan.kosesoy@kocaeli.edu.tr), Kocaeli University, Software Engineering, Kocaeli, Turkey |
ContentType | Conference Proceeding |
DOI | 10.1109/ASYU58738.2023.10296747 |
EISBN | 9798350306590 |
EISSN | 2770-7946 |
EndPage | 6 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PageCount | 6 |
PublicationDate | 2023-10-11 |
PublicationTitle | 2023 Innovations in Intelligent Systems and Applications Conference (ASYU) |
PublicationTitleAbbrev | ASYU |
PublicationYear | 2023 |
Publisher | IEEE |
StartPage | 1 |
SubjectTerms | AST Tree; Doc2Vec; Feature extraction; Logistic regression; Machine Learning Algorithms; Prediction algorithms; Software; Software algorithms; Software Vulnerability; Source coding; Support vector machines; TF-IDF |
Title | Prediction of Software Security Vulnerabilities from Source Code Using Machine Learning Methods |
URI | https://ieeexplore.ieee.org/document/10296747 |
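
The abstract describes a pipeline that turns Java source code into TF-IDF feature vectors and feeds them to a classifier. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' implementation: the four training snippets and their labels are hypothetical toy data rather than OWASP Benchmark cases, the regex tokenizer is a crude stand-in for a real lexer, the smoothed IDF formula is an assumption (the abstract does not say which variant was used), and a 1-nearest-neighbor rule stands in for the K-Nearest Neighbors model named in the abstract.

```python
import math
import re
from collections import Counter


def tokenize(code):
    """Split source code into identifier-like tokens (crude, illustrative lexer)."""
    return re.findall(r"[A-Za-z_]\w*", code)


def fit_idf(token_lists):
    """Smoothed inverse document frequency for each term in the training set."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors (a plain dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


# Hypothetical toy training set: (Java snippet, label), 1 = vulnerable, 0 = safe.
TRAIN = [
    ('String sql = "SELECT * FROM users WHERE name=" + param; '
     'statement.executeQuery(sql);', 1),
    ('PreparedStatement ps = connection.prepareStatement(query); '
     'ps.setString(1, param);', 0),
    ('Runtime.getRuntime().exec(cmd + param);', 1),
    ('response.getWriter().println(Encoder.encodeForHTML(param));', 0),
]

TOKENS = [tokenize(code) for code, _ in TRAIN]
LABELS = [label for _, label in TRAIN]
IDF = fit_idf(TOKENS)
VECTORS = [tfidf_vector(tokens, IDF) for tokens in TOKENS]


def classify(snippet):
    """Return the label of the most similar training snippet (1-nearest neighbor)."""
    query = tfidf_vector(tokenize(snippet), IDF)
    best = max(range(len(VECTORS)), key=lambda i: cosine(query, VECTORS[i]))
    return LABELS[best]
```

Calling `classify` on a new snippet returns the label of its closest training example by cosine similarity. With only four training snippets the result says nothing about real-world accuracy; the sketch is meant only to make the feature-extraction and prediction steps concrete.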