Semi-Supervised Encrypted Traffic Classification With Deep Convolutional Generative Adversarial Networks

Network traffic classification serves as a building block for important tasks such as security and quality of service management. The field has been studied for a long time, with many techniques such as classical machine learning and deep learning methods currently available. However, the emergence...

Full description

Saved in:
Bibliographic Details
Published inIEEE access Vol. 8; pp. 118 - 126
Main Authors Iliyasu, Auwal Sani, Deng, Huifang
Format Journal Article
LanguageEnglish
Published Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Network traffic classification serves as a building block for important tasks such as security and quality of service management. The field has been studied for a long time, with many techniques such as classical machine learning and deep learning methods currently available. However, the emergence of stronger encryption protocols has led to the rise of new challenges. One of the challenges is capturing and labeling a large amount of encrypted traffic data especially for training deep learning classifiers, as current techniques rely on deep packet inspection tools (DPI) which perform poorly on encrypted traffic. In this paper, we propose a semi-supervised learning approach using Deep Convolutional Generative Adversarial Network (DCGAN). The basic idea is to utilize the samples generated by DCGAN generators as well as unlabeled data to improve the performance of a classifier trained on a few labeled samples. Thus, alleviating the difficulties associated with large dataset collecting and labeling. To demonstrate the efficacy of our approach, we evaluated our model using a self-collected dataset of the recently established QUIC protocol as well as publicly available ISCX VPN-NonVPN dataset. Our approach is able to achieve 89% and 78% accuracy with a very small number of labeled samples (just 10% of the dataset) on both QUIC and ISCX VPN-NonVPN datasets respectively.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2019.2962106