Paper Title
Fine-Tuning Language Models via Epistemic Neural Networks
Paper Authors
Paper Abstract
Language models often pre-train on large unsupervised text corpora, then fine-tune on additional task-specific data. However, typical fine-tuning schemes do not prioritize the examples that they tune on. We show that, if you can prioritize informative training data, you can achieve better performance while using fewer labels. To do this we augment a language model with an epinet: a small additional network that helps to estimate model uncertainty and forms an epistemic neural network (ENN). ENNs are neural networks that can know what they don't know. Using an epinet to prioritize uncertain data, we can fine-tune BERT on GLUE tasks to the same performance while using 2x less data than training without prioritization. We also investigate performance in synthetic neural network generative models designed to build understanding. In each setting, using an epinet outperforms heuristic active learning schemes.
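To make the idea concrete, the sketch below illustrates (under assumptions not taken from the paper) how an epinet-style head could score unlabeled examples for prioritized fine-tuning: a small network takes the base model's features plus a random epistemic index z, and disagreement across draws of z is used as an uncertainty score. All names and dimensions (FEATURE_DIM, INDEX_DIM, epinet_logits, uncertainty) are hypothetical, and the toy numpy head stands in for BERT features and the actual ENN architecture.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's implementation): an "epinet" is a
# small network conditioned on the base model's features and a random epistemic
# index z. Variability of its predictions across draws of z serves as an
# uncertainty estimate, which is then used to prioritize which examples to label
# and fine-tune on.

rng = np.random.default_rng(0)

FEATURE_DIM = 16   # hypothetical dimensionality of base-model features
INDEX_DIM = 8      # hypothetical dimensionality of the epistemic index z
NUM_CLASSES = 2

# Hypothetical epinet parameters: one hidden layer over [features, z].
W1 = rng.normal(scale=0.1, size=(FEATURE_DIM + INDEX_DIM, 32))
W2 = rng.normal(scale=0.1, size=(32, NUM_CLASSES))


def epinet_logits(features: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Epinet head: class logits conditioned on the epistemic index z."""
    hidden = np.tanh(np.concatenate([features, z]) @ W1)
    return hidden @ W2


def uncertainty(features: np.ndarray, num_index_samples: int = 10) -> float:
    """Mean variance of epinet logits across sampled epistemic indices."""
    logits = np.stack([
        epinet_logits(features, rng.normal(size=INDEX_DIM))
        for _ in range(num_index_samples)
    ])
    return float(logits.var(axis=0).mean())


# Prioritize the unlabeled examples the ENN is most uncertain about.
pool = [rng.normal(size=FEATURE_DIM) for _ in range(100)]  # stand-in for BERT features
scores = [uncertainty(x) for x in pool]
most_informative = np.argsort(scores)[::-1][:10]  # candidate indices to label next
print(most_informative)
```

In this toy version the highest-variance examples are selected first; the paper's claim is that prioritizing such uncertain examples lets fine-tuning reach a given GLUE performance with roughly half the labeled data.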