Mechanistic Mode Connectivity

Paper title

Mechanistic Mode Connectivity

Authors

Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David Krueger, Hidenori Tanaka

Abstract

We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss. Specifically, we ask the following question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss? We provide a definition of mechanistic similarity as shared invariances to input transformations and demonstrate that lack of linear connectivity between two models implies they use dissimilar mechanisms for making their predictions. Relevant to practice, this result helps us demonstrate that naive fine-tuning on a downstream dataset can fail to alter a model's mechanisms, e.g., fine-tuning can fail to eliminate a model's reliance on spurious attributes. Our analysis also motivates a method for targeted alteration of a model's mechanisms, named connectivity-based fine-tuning (CBFT), which we analyze using several synthetic datasets for the task of reducing a model's reliance on spurious attributes.
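To make the notion of linear connectivity concrete, here is a minimal sketch that evaluates a loss along the straight line between two parameter vectors and reports the largest rise above the linearly interpolated endpoint losses. This barrier convention is one common choice and is an assumption here, not necessarily the definition used in the paper. The function names (`interpolate`, `loss_barrier`) and both toy losses are hypothetical and purely illustrative.

```python
from typing import Callable, List, Sequence


def interpolate(theta_a: Sequence[float], theta_b: Sequence[float], t: float) -> List[float]:
    """Return the point (1 - t) * theta_a + t * theta_b on the linear path."""
    return [(1.0 - t) * a + t * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(
    loss_fn: Callable[[Sequence[float]], float],
    theta_a: Sequence[float],
    theta_b: Sequence[float],
    num_points: int = 11,
) -> float:
    """Largest gap between the loss on the linear path and the
    linear interpolation of the two endpoint losses (assumed convention)."""
    ts = [i / (num_points - 1) for i in range(num_points)]
    losses = [loss_fn(interpolate(theta_a, theta_b, t)) for t in ts]
    loss_a, loss_b = losses[0], losses[-1]
    return max(l - ((1.0 - t) * loss_a + t * loss_b) for t, l in zip(ts, losses))


def double_well_loss(theta: Sequence[float]) -> float:
    """Toy loss with two separated minima at (-1, 0) and (1, 0)."""
    w0, w1 = theta
    return (w0 ** 2 - 1.0) ** 2 + w1 ** 2


def valley_loss(theta: Sequence[float]) -> float:
    """Toy loss with a flat valley of minima along the w0 axis."""
    _, w1 = theta
    return w1 ** 2


if __name__ == "__main__":
    a, b = [-1.0, 0.0], [1.0, 0.0]
    print("double well barrier:", loss_barrier(double_well_loss, a, b))
    print("valley barrier:", loss_barrier(valley_loss, a, b))
```

In this toy setting the two minima of the double-well loss are separated by a barrier on the straight line between them, while the same two points are linearly connected under the valley loss. For real networks, `loss_fn` would evaluate a model on a dataset with the interpolated weights loaded.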
