Paper Title

Interpreting Machine Learning Malware Detectors Which Leverage N-gram Analysis

Paper Authors

William Briguglio, Sherif Saad

Paper Abstract

In cyberattack detection and prevention systems, cybersecurity analysts always prefer solutions that are as interpretable and understandable as rule-based or signature-based detection. This is because of the need to tune and optimize these solutions to mitigate and control the effects of false positives and false negatives. Interpreting machine learning models is a new and open challenge. However, it is expected that interpretable machine learning solutions will be domain-specific. For instance, interpretable solutions for machine learning models in healthcare differ from those in malware detection. This is because the models are complex, and most of them work as black boxes. Recently, the increased ability of malware authors to bypass antimalware systems has forced security specialists to look to machine learning to create robust detection systems. If these systems are to be relied on in industry, then, among other challenges, they must also explain their predictions. The objective of this paper is to evaluate current state-of-the-art ML model interpretability techniques when applied to ML-based malware detectors. We demonstrate these interpretability techniques in practice and evaluate their effectiveness in the malware analysis domain.
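To make the title's setup concrete, below is a minimal sketch (not the authors' implementation) of byte n-gram feature extraction for a malware classifier, followed by one simple interpretability step: inspecting the learned weights of a linear model. It assumes scikit-learn; the toy samples, labels, and the choice of n are hypothetical and not taken from the paper.

```python
from collections import Counter

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count overlapping byte n-grams, keyed by their hex representation."""
    return Counter(data[i:i + n].hex() for i in range(len(data) - n + 1))


# Hypothetical toy corpus: raw bytes standing in for executable samples.
samples = [
    b"\x4d\x5a\x90\x00...benign bytes...",
    b"\x4d\x5a\x90\x00...malicious bytes...",
]
labels = [0, 1]  # 0 = benign, 1 = malware

# Turn per-sample n-gram counts into a sparse feature matrix.
vectorizer = DictVectorizer()
X = vectorizer.fit_transform(byte_ngrams(s) for s in samples)

model = LogisticRegression().fit(X, labels)

# A rough, model-specific explanation: the n-grams whose weights most
# strongly push a prediction toward the "malware" class.
ranked = sorted(
    zip(vectorizer.get_feature_names_out(), model.coef_[0]),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked[:5])
```

Weight inspection only works for inherently transparent models like this one; the paper's focus is on evaluating interpretability techniques for the complex, black-box detectors where such direct readouts are unavailable.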
