Title
Semi-Parametric Inducing Point Networks and Neural Processes
Authors
Abstract
We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner. Semi-parametric architectures are typically more compact than parametric models, but their computational complexity is often quadratic. In contrast, SPIN attains linear complexity via a cross-attention mechanism between datapoints inspired by inducing point methods. Querying large training sets can be particularly useful in meta-learning, as it unlocks additional training signal, but often exceeds the scaling limits of existing models. We use SPIN as the basis of the Inducing Point Neural Process, a probabilistic model which supports large contexts in meta-learning and achieves high accuracy where existing models fail. In our experiments, SPIN reduces memory requirements, improves accuracy across a range of meta-learning tasks, and improves state-of-the-art performance on an important practical problem, genotype imputation.
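The complexity claim in the abstract can be illustrated with a minimal sketch of cross-attention onto a small, fixed set of inducing points. This is not the authors' implementation; the shapes, names, and single-head formulation are illustrative assumptions. The key point is that attending from m inducing points to n datapoints costs O(n·m), linear in n, versus O(n²) for self-attention among datapoints.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def inducing_cross_attention(X, H, d_k):
    # X: (n, d) datapoint embeddings; H: (m, d) inducing points, m << n.
    # The (m, n) score matrix is the only term that scales with n,
    # so the cost is O(n * m) rather than O(n^2).
    scores = H @ X.T / np.sqrt(d_k)      # (m, n)
    return softmax(scores, axis=-1) @ X  # (m, d) summary of the dataset

rng = np.random.default_rng(0)
n, m, d = 1000, 16, 32                   # illustrative sizes
X = rng.normal(size=(n, d))
H = rng.normal(size=(m, d))
out = inducing_cross_attention(X, H, d)
print(out.shape)  # (16, 32)
```

In this sketch the m inducing points act as a learned, compressed view of the n training points, which is what lets a model query a large context without quadratic attention cost.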