Cross-Site Scripting Guardian: A Static XSS Detector Based on Data Stream Input-Output Association Mining

Bibliographic Details
Published in Applied Sciences Vol. 10; no. 14
Main Authors Li, Chenghao; Wang, Yiding; Miao, Changwei; Huang, Cheng
Format Journal Article
Language English
Published MDPI AG 15.07.2020
Summary: Web applications are the target of the largest number of cybersecurity attacks, and Cross-Site Scripting (XSS) is the most popular attack method. Code auditing is the main way to prevent XSS damage at the source code level. However, both manual audits and rule-based audit tools have numerous limitations. In the age of big data, using machine learning to assist manual auditing is a new research field. In this paper, we propose a new method for auditing XSS vulnerabilities in PHP source code snippets, based on a PHP code parsing tool and a machine learning algorithm. We analyze the operation sequence of the source code and build a model that extracts the information in the data stream most closely related to XSS attacks. The proposed method significantly improves the recall rate for vulnerable samples. Compared with related audit methods, it offers high reusability and excellent performance. On the test dataset, our classification model achieved an F1 score of 0.92, a recall rate of 0.98 (vulnerable samples), and an area under the curve (AUC) of 0.97.
Keywords: vulnerability detection; code audit; cross-site scripting; machine learning
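Note on the reported metrics: the summary quotes an F1 score, a recall rate, and an AUC without defining them. The following is a minimal Python sketch of the standard definitions, not the authors' evaluation code. The function names and the toy counts and scores are hypothetical, chosen only for illustration, and do not reproduce the paper's data.

```python
def recall(tp, fn):
    """Fraction of truly vulnerable samples that the classifier flags."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of flagged samples that are truly vulnerable."""
    return tp / (tp + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r)


def auc(pos_scores, neg_scores):
    """Probability that a random vulnerable sample scores higher than a
    random safe sample; ties count as half a win."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy numbers (NOT from the paper):
# 8 true positives, 2 false positives, 2 false negatives.
toy_recall = recall(tp=8, fn=2)
toy_f1 = f1_score(tp=8, fp=2, fn=2)
# Hypothetical classifier scores for vulnerable and safe samples.
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

A high recall on the vulnerable class, as the summary emphasizes, means few vulnerable snippets are missed; F1 balances that against false alarms, and AUC measures ranking quality independent of any single decision threshold.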
ISSN: 2076-3417
DOI: 10.3390/app10144740