Paper Title

NetFlow Datasets for Machine Learning-based Network Intrusion Detection Systems

Authors

Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, Marius Portmann

Abstract

Machine Learning (ML)-based Network Intrusion Detection Systems (NIDSs) have proven to be a reliable, intelligent tool for protecting networks against cyberattacks. Network data features have a great impact on the performance of ML-based NIDSs. However, the evaluation of ML models is often unreliable, as each ML-enabled NIDS is trained and validated using different data features that may not contain security events. Therefore, a common ground feature set from multiple datasets is required to evaluate an ML model's detection accuracy and its ability to generalise across datasets. This paper presents NetFlow features from four benchmark NIDS datasets, known as UNSW-NB15, BoT-IoT, ToN-IoT, and CSE-CIC-IDS2018, generated from their publicly available packet capture files. In a real-world scenario, NetFlow features are relatively easy to extract from network traffic compared to the complex features used in the original datasets, as they are usually extracted from packet headers. The generated NetFlow datasets have been labelled for solving binary- and multi-class learning challenges. Preliminary results indicate that, compared to their respective original feature sets, NetFlow features lead to similar binary-class results and lower multi-class classification results across the four datasets. The NetFlow datasets, named NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT, NF-CSE-CIC-IDS2018 and NF-UQ-NIDS, are published at http://staff.itee.uq.edu.au/marius/NIDS_datasets/ for research purposes.
