Paper Title
Explanatory Learning: Beyond Empiricism in Neural Networks
Paper Authors
Paper Abstract
We introduce Explanatory Learning (EL), a framework to let machines use existing knowledge buried in symbolic sequences -- e.g. explanations written in hieroglyphics -- by autonomously learning to interpret them. In EL, the burden of interpreting symbols is not left to humans or rigid human-coded compilers, as done in Program Synthesis. Rather, EL calls for a learned interpreter, built upon a limited collection of symbolic sequences paired with observations of several phenomena. This interpreter can be used to make predictions on a novel phenomenon given its explanation, and even to find that explanation using only a handful of observations, as human scientists do. We formulate the EL problem as a simple binary classification task, so that common end-to-end approaches aligned with the dominant empiricist view of machine learning could, in principle, solve it. To these models, we oppose Critical Rationalist Networks (CRNs), which instead embrace a rationalist view on the acquisition of knowledge. CRNs express several desired properties by construction: they are truly explainable, can adjust their processing at test time for harder inferences, and can offer strong confidence guarantees on their predictions. As a final contribution, we introduce Odeen, a basic EL environment that simulates a small flatland-style universe full of phenomena to explain. Using Odeen as a testbed, we show how CRNs outperform empiricist end-to-end approaches of similar size and architecture (Transformers) in discovering explanations for novel phenomena.