dc.description.abstract |
In the symphony of human interaction, the voice remains a fundamental medium of communication between people. With the rapid increase in Internet traffic, the complexity of encrypted
protocols has also increased. Voice over IP (VoIP) underwent a similar technological transition to encryption as other protocols. Meanwhile, identifying specific classes of network traffic
is an important part of maintaining information security and net neutrality for ISPs. Categorizing network data packets or flows based on their content is called Internet traffic classification. Nowadays, more than 95% of Internet traffic is secured with encryption [1], which
renders traditional classification methods ineffective. These traditional methods,
often referred to as deep packet inspection (DPI), require inspecting each packet, which is time-consuming and computationally expensive. Recent advancements in the field of AI have led
researchers to utilize deep learning in the realm of networks. This includes the use of deep
learning techniques for accurate pattern recognition between network packets. Internet traffic
is captured in PCAP format. Analyzing traffic patterns and extracting relevant information requires domain knowledge and extensive investigation of PCAP dumps. Each PCAP contains
metadata such as IP addresses, ports and header information specific to each packet type. Consequently, the statistical properties of traffic flows, including packet length and burst size, must
be analyzed in conjunction with metadata to reveal information about the underlying encrypted
traffic in transit. The classification task becomes tedious and time-consuming if done manually. With these limitations in mind, this work leverages the pattern recognition power of deep
learning to classify Internet traffic as either VoIP or non-VoIP. The study includes improving
data diversification by collecting and integrating network traces of VoIP and non-VoIP traffic from different origins, as well as a comprehensive analysis and comparison of deep learning
models, including LSTM, MLP, and 1D-CNN, using both manual and automatic feature
extraction. Ultimately, among the examined models, this study presents a model suitable for
real-time VoIP classification. The results of the experimental studies show how feature extraction techniques affect the performance of the model. Specifically, when features were manually extracted, the accuracy rates achieved by LSTM, MLP, and 1D-CNN were 94.2%, 93.8%,
and 67.5%, respectively. In contrast, automatic feature extraction yielded accuracies of 65.6%,
69.3%, and 26% for the same models. Therefore, considering accuracy and training time, MLP
performs better and is more suitable for real-time VoIP classification. |
en_US |