Paper Title

Towards Frequency-Based Explanation for Robust CNN

Authors

Wang, Zifan; Yang, Yilin; Shrivastava, Ankit; Rawal, Varun; Ding, Zihao

Abstract

Current explanation techniques towards a transparent Convolutional Neural Network (CNN) mainly focus on building connections between human-understandable input features and the model's predictions, overlooking an alternative representation of the input: its decomposition into frequency components. In this work, we present an analysis of the connection between the distribution of frequency components in the input dataset and the reasoning process the model learns from the data. We further provide a quantitative analysis of the contribution of different frequency components toward the model's prediction. We show that the vulnerability of the model to tiny distortions is a result of the model relying on high-frequency features, the target features of adversarial (black-box and white-box) attackers, to make its predictions. We further show that if the model develops a stronger association between the low-frequency components and the true labels, the model is more robust, which explains why adversarially trained models are more robust against tiny distortions.
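
To make the idea of a frequency components decomposition concrete, here is a minimal sketch of splitting a small grayscale image into a low-frequency part and a high-frequency part. It is an illustration under stated assumptions, not the authors' procedure: the hard radial cutoff, the naive pure-Python DFT, and the names `dft2`, `split_frequencies`, and `radius` are all hypothetical choices made for this example.

```python
import cmath
import math


def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform of a square list-of-lists."""
    n = len(img)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = sign * 2 * math.pi * (u * x + v * y) / n
                    total += img[x][y] * cmath.exp(1j * angle)
            # The inverse transform carries the 1/n^2 normalisation.
            out[u][v] = total / (n * n) if inverse else total
    return out


def split_frequencies(img, radius):
    """Split img into low- and high-frequency parts (assumed hard radial mask)."""
    n = len(img)
    spectrum = dft2(img)
    low_spec = [[0j] * n for _ in range(n)]
    high_spec = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            # Wrap indices so that index n-1 counts as frequency -1.
            fu, fv = min(u, n - u), min(v, n - v)
            if math.hypot(fu, fv) <= radius:
                low_spec[u][v] = spectrum[u][v]
            else:
                high_spec[u][v] = spectrum[u][v]
    low = [[c.real for c in row] for row in dft2(low_spec, inverse=True)]
    high = [[c.real for c in row] for row in dft2(high_spec, inverse=True)]
    return low, high


# Toy 8x8 image: a smooth cosine along x plus a faint checkerboard texture.
n = 8
img = [[math.cos(2 * math.pi * x / n) + 0.1 * (-1) ** (x + y)
        for y in range(n)] for x in range(n)]
low, high = split_frequencies(img, radius=2)
```

In this toy image the smooth cosine lands in `low` and the checkerboard lands in `high`, and the two parts sum back to the original image because the transform is linear. A real experiment would more likely use an FFT library and real images; the naive transform here only keeps the sketch dependency-free.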
