Paper Title
Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers
Paper Authors
Abstract
Malware classifiers are subject to training-time exploitation due to the need to regularly retrain using samples collected from the wild. Recent work has demonstrated the feasibility of backdoor attacks against malware classifiers, and yet the stealthiness of such attacks is not well understood. In this paper, we investigate this phenomenon under the clean-label setting (i.e., attackers do not have complete control over the training or labeling process). Empirically, we show that existing backdoor attacks in malware classifiers are still detectable by recent defenses such as MNTD. To improve stealthiness, we propose a new attack, Jigsaw Puzzle (JP), based on the key observation that malware authors have little to no incentive to protect any other authors' malware but their own. As such, Jigsaw Puzzle learns a trigger to complement the latent patterns of the malware author's samples, and activates the backdoor only when the trigger and the latent pattern are pieced together in a sample. We further focus on realizable triggers in the problem space (e.g., software code) using bytecode gadgets broadly harvested from benign software. Our evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains stealthy against state-of-the-art defenses, and is a threat in realistic settings that depart from reasoning about feature-space-only attacks. We conclude by exploring promising approaches to improve backdoor defenses.
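The selective activation idea in the abstract (the backdoor fires only when the trigger and the author's latent pattern appear together) can be illustrated with a toy sketch. This is not the paper's method: the binary feature vectors, the index sets `LATENT` and `TRIGGER`, and the helper names `apply_trigger` and `backdoor_fires` are all hypothetical, and a hard-coded rule stands in for what would in practice be behavior learned by the poisoned classifier.

```python
from typing import List, Sequence, Set


def apply_trigger(features: Sequence[int], trigger: Set[int]) -> List[int]:
    """Return a copy of a binary feature vector with the trigger features set to 1."""
    return [1 if i in trigger else v for i, v in enumerate(features)]


def backdoor_fires(features: Sequence[int], trigger: Set[int], latent: Set[int]) -> bool:
    """Toy activation rule: fire only if the trigger AND the latent pattern are fully present."""
    return all(features[i] == 1 for i in trigger | latent)


# Hypothetical 8-dimensional binary feature space (assumption for illustration).
LATENT = {0, 3}   # assumed pattern shared by the attacker's own malware samples
TRIGGER = {5, 6}  # assumed trigger features that complement the latent pattern

own_sample = [1, 0, 0, 1, 0, 0, 0, 1]    # contains the latent pattern
other_sample = [0, 1, 0, 1, 0, 0, 0, 1]  # another author's malware, pattern incomplete

# Trigger + latent pattern: the backdoor activates.
own_triggered = backdoor_fires(apply_trigger(own_sample, TRIGGER), TRIGGER, LATENT)
# Trigger on another author's sample: no activation.
other_triggered = backdoor_fires(apply_trigger(other_sample, TRIGGER), TRIGGER, LATENT)
# Latent pattern without the trigger: no activation.
own_untriggered = backdoor_fires(own_sample, TRIGGER, LATENT)

print(own_triggered, other_triggered, own_untriggered)
```

In this toy setting, only the attacker's own sample with the trigger applied satisfies the rule; the trigger alone does not protect other authors' samples, which mirrors the selectivity the abstract describes.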