Paper Title
Phishing Detection through Email Embeddings
Paper Authors
Paper Abstract
The problem of detecting phishing emails with machine learning techniques has been discussed extensively in the literature. Both conventional and state-of-the-art machine learning algorithms have shown that classifiers with high accuracy can be built. Existing studies, however, treat phishing and genuine emails through general indicators, so it is not clear which phishing features contribute to variations in the classifiers. In this paper, we crafted a set of phishing and legitimate emails with similar indicators to investigate whether these cues are captured or disregarded by email embeddings, i.e., vectorizations. We then fed the carefully crafted emails to machine learning classifiers to evaluate the performance of the resulting email embeddings. Our results show that, using these indicators, email embedding techniques are effective for classifying emails as phishing or legitimate.
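The pipeline the abstract outlines (vectorize each email, then hand the vectors to a classifier) can be illustrated with a small sketch. The abstract does not name the embedding techniques or classifiers used, so everything here is an assumption made for illustration: a bag-of-words count vector stands in for the email embedding, a nearest-centroid rule with cosine similarity stands in for the classifier, and the four training emails and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy emails; these are NOT the crafted emails from the paper.
TRAIN_EMAILS = [
    ("verify your account now or it will be suspended", "phishing"),
    ("urgent click this link to confirm your password", "phishing"),
    ("meeting notes attached for the project review", "legitimate"),
    ("lunch on friday to discuss the quarterly report", "legitimate"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_vocab(emails):
    """Map every token seen in the training emails to a vector index."""
    tokens = sorted({tok for email in emails for tok in tokenize(email)})
    return {tok: i for i, tok in enumerate(tokens)}


def embed(email, vocab):
    """Assumed stand-in embedding: an L2-normalised bag-of-words vector."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(email)).items():
        if tok in vocab:  # tokens unseen in training are ignored
            vec[vocab[tok]] = float(count)
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples):
    """Build the vocabulary and one mean vector (centroid) per label."""
    vocab = build_vocab([email for email, _ in samples])
    by_label = {}
    for email, label in samples:
        by_label.setdefault(label, []).append(embed(email, vocab))
    centroids = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }
    return vocab, centroids


def classify(email, vocab, centroids):
    """Return the label whose centroid is most similar to the email vector."""
    vec = embed(email, vocab)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


vocab, centroids = train(TRAIN_EMAILS)
print(classify("please confirm your password now", vocab, centroids))
print(classify("project review meeting on friday", vocab, centroids))
```

A count-based vector like this one only sees which words occur, so two emails that share a cue but differ in intent can land close together. That limitation is the kind of question the abstract raises about whether a given embedding captures or disregards a phishing indicator.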