Construction of Two Statistical Anomaly Features for Small-Sample APT Attack Traffic Classification

Paper title

Construction of Two Statistical Anomaly Features for Small-Sample APT Attack Traffic Classification

Authors

Zhang, Ru; Sun, Wenxin; Liu, Jianyi; Li, Jingwen; Lei, Guan; Guo, Han

Abstract

An Advanced Persistent Threat (APT) attack, also known as a directed threat attack, is a sustained and effective attack campaign carried out by an organization against a specific target. Such attacks are covert, persistent, and targeted, which makes them difficult for a traditional intrusion detection system (IDS) to capture. Traffic generated by an APT organization, that is, the organization that launches the APT attack, is highly similar, especially in the Command and Control (C2) stage. Adding features that characterize APT organizations can therefore effectively improve the accuracy of APT attack traffic detection. This paper analyzes the DNS and TCP traffic of APT attacks and constructs two new features, C2Load_fluct (response packet load fluctuation) and Bad_rate (bad packet rate). The analysis shows that APT attacks exhibit clear statistical regularities in these two features. The paper combines the two new features with common features to classify APT attack traffic. To address the problems of data loss and boundary samples, we improve the Adaptive Synthetic (ADASYN) sampling approach and propose the PADASYN algorithm to balance the data. A traffic classification scheme is designed based on the AdaBoost algorithm. Experiments on two datasets show that adding the new features improves the classification accuracy of APT attack traffic; the resulting feature set consists of 10 DNS features and 11 TCP and HTTP/HTTPS features. On the two datasets, the F1-score reaches above 0.98 and 0.94, respectively, which shows that the two new features are effective for APT traffic detection.
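The abstract names the two features but does not give their formulas. As a rough illustration only, the Python sketch below assumes that C2Load_fluct can be approximated by the coefficient of variation of response payload sizes within a flow, and that Bad_rate is the fraction of packets flagged as bad. Both definitions, and the function and parameter names, are assumptions made for this example and may differ from the definitions used in the paper.

```python
from statistics import mean, pstdev


def c2load_fluct(response_payload_sizes):
    """Assumed definition: coefficient of variation (population standard
    deviation divided by mean) of response payload sizes in one flow.
    The paper's actual formula may differ."""
    sizes = list(response_payload_sizes)
    if not sizes:
        return 0.0
    avg = mean(sizes)
    if avg == 0:
        return 0.0
    return pstdev(sizes) / avg


def bad_rate(packet_is_bad):
    """Assumed definition: fraction of packets flagged as bad. What counts
    as a bad packet is left to the caller; the abstract does not say."""
    flags = list(packet_is_bad)
    if not flags:
        return 0.0
    return sum(1 for flag in flags if flag) / len(flags)


if __name__ == "__main__":
    # Hypothetical flow: payload sizes in bytes and per-packet bad flags.
    print(c2load_fluct([50, 150]))
    print(bad_rate([True, False, False, False]))
```

Under these assumed definitions, a flow whose responses all have the same size gets a fluctuation of zero, and a flow with no flagged packets gets a bad rate of zero. Values like these would be appended to the common DNS and TCP features before oversampling and classification.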
