Paper Title
On Neural Architecture Inductive Biases for Relational Tasks
Paper Authors
Paper Abstract
Current deep learning approaches have shown good in-distribution generalization performance, but struggle with out-of-distribution generalization. This is especially true in the case of tasks involving abstract relations, such as recognizing rules in sequences, as we find in many intelligence tests. Recent work has explored how forcing relational representations to remain distinct from sensory representations, as seems to be the case in the brain, can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by 'partitioned' representations of relations and sensory details, and how this inductive bias can help recompose learned relational structure in newly encountered settings. We introduce a simple architecture based on similarity scores which we name Compositional Relational Network (CoRelNet). Using this model, we investigate a series of inductive biases that ensure abstract relations are learned and represented distinctly from sensory data, and explore their effects on out-of-distribution generalization for a series of relational psychophysics tasks. We find that simple architectural choices can outperform existing models in out-of-distribution generalization. Together, these results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing out-of-distribution relational computations.
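The abstract describes CoRelNet only as "a simple architecture based on similarity scores" that keeps relational representations separate from sensory ones. A minimal sketch of that idea might look like the following; the function names, the inner-product similarity, and the row-wise softmax normalization are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relational_scores(encodings):
    """Build a purely relational representation from object encodings.

    The output depends only on pairwise similarities (inner products)
    between objects, not on the raw sensory features themselves --
    a simple way to 'partition' relations from sensory details.
    (Illustrative sketch, not the paper's exact architecture.)
    """
    sims = encodings @ encodings.T          # pairwise similarity matrix
    return softmax(sims, axis=-1)           # normalize each row

# Toy example: 3 objects with 4-dimensional encodings.
rng = np.random.default_rng(0)
enc = rng.standard_normal((3, 4))
R = relational_scores(enc)
print(R.shape)  # (3, 3): one similarity distribution per object
```

The relation matrix `R` could then be passed to a downstream decoder (e.g. a small MLP) to solve the relational task, so the decoder never sees the sensory encodings directly.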