Intrusion Detection System using Machine Learning Models for IoT Novel Dataset

Zafar, Bilal

dc.contributor.author	Zafar, Bilal
dc.date.accessioned	2024-07-05T05:41:26Z
dc.date.available	2024-07-05T05:41:26Z
dc.date.issued	2024
dc.identifier.other	363971
dc.identifier.uri	http://10.250.8.41:8080/xmlui/handle/123456789/44569
dc.description	Supervisor: Dr. Muhammad Zeeshan	en_US
dc.description.abstract	The IoT has revolutionized the technology landscape and changed human lifestyles for the better. Its interconnected miniature devices consume energy resources and transmit data back and forth over the internet. However, vulnerabilities in IoT networks lead to data breaches, which has raised concerns about human privacy. Intruders break into the network and invade users' privacy by accessing their accounts, or halt the network through botnet attacks that bring businesses down and force them to pay large ransoms. To secure the data in an IoT network, NIDS should be enhanced and deployed to detect intrusions at runtime. Building an efficient and effective IDS that handles breaches and protects human privacy is a challenge in itself. To create an optimal IDS, many researchers have generated datasets in IoT network environments and experimented with different ML classifiers. By combining the IoTID20 and HIKARI 2022 datasets, we propose a newly merged dataset, which was first preprocessed and then balanced using different imbalance removal techniques, including SMOTE, GAN, and Manual Chunks. Binary and multi-class classification were then performed; LR and NB did not perform well compared with Random Forest (RF), KNN, ANN, and DAE. Among the imbalance removal techniques, Manual Chunks achieved reliable results. In binary classification, the average accuracy achieved was LR 97%, NB 98%, KNN 99.98%, RF 100%, ANN 99.90%, and DAE 94%. In multi-class classification, the average accuracy achieved was LR 73.96%, NB 71.80%, KNN 97.51%, RF 97.76%, and ANN 91.44%. The RF and KNN classifiers were checked for overfitting to confirm that their results are genuine. Layer-wise and top-feature clusters were also classified. The proposed work is compared with existing ML-based approaches on the IoTID20 dataset, and the results achieved by all the classifiers are analyzed.	en_US
dc.language.iso	en	en_US
dc.publisher	School of Electrical Engineering & Computer Science (SEECS), NUST	en_US
dc.title	Intrusion Detection System using Machine Learning Models for IoT Novel Dataset	en_US
dc.type	Thesis	en_US