Paper Title
Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis
Paper Authors
Paper Abstract
Deep learning is a state-of-the-art method for many applications. A major issue is that most real-time data is highly imbalanced. To avoid bias during training, a cost-sensitive approach can be used. In this paper, we propose cost-sensitive deep learning based frameworks and evaluate their performance on three Cyber Security use cases: Domain Generation Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL). Various experiments were performed using both cost-insensitive and cost-sensitive methods, with the parameters of both set through hyperparameter tuning. In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches. This is mainly because the cost-sensitive approach gives more importance during training to classes that have very few samples, which helps the model learn all classes more effectively.
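The idea of giving more importance to under-represented classes can be illustrated with a minimal sketch. The abstract does not state how the costs are derived, so the inverse-frequency weighting below is an assumption (a common heuristic), and the function names, class labels, and toy predictions are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

def class_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs[i] maps each class to the predicted probability for sample i.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)

# Toy imbalanced data set: 9 benign domains, 1 DGA domain (hypothetical).
labels = ["benign"] * 9 + ["dga"]
weights = class_weights(labels)

# A model that always leans towards the majority class.
probs = [{"benign": 0.9, "dga": 0.1} for _ in labels]

uniform = {c: 1.0 for c in weights}
plain_loss = weighted_cross_entropy(probs, labels, uniform)
cost_sensitive_loss = weighted_cross_entropy(probs, labels, weights)
```

Under these assumptions the minority class receives the larger weight, so a model that ignores it incurs a higher cost-sensitive loss than plain loss, which is what pushes training to learn every class.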