Title


Prototypical Calibration for Few-shot Learning of Language Models

Authors

Zhixiong Han, Yaru Hao, Li Dong, Yutao Sun, Furu Wei

Abstract


In-context learning of GPT-like models has been recognized as fragile across different hand-crafted templates, and demonstration permutations. In this work, we propose prototypical calibration to adaptively learn a more robust decision boundary for zero- and few-shot classification, instead of greedy decoding. Concretely, our method first adopts Gaussian mixture distribution to estimate the prototypical clusters for all categories. Then we assign each cluster to the corresponding label by solving a weighted bipartite matching problem. Given an example, its prediction is calibrated by the likelihood of prototypical clusters. Experimental results show that prototypical calibration yields a substantial improvement on a diverse set of tasks. Extensive analysis across different scales also indicates that our method calibrates the decision boundary as expected, greatly improving the robustness of GPT to templates, permutations, and class imbalance.
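The abstract outlines a three-step procedure: estimate one prototypical cluster per class with a Gaussian mixture over the model's label-word probabilities, align clusters with labels via weighted bipartite matching, and predict by cluster likelihood. The sketch below is only a minimal illustration of that idea, not the authors' reference implementation; it assumes scikit-learn's GaussianMixture and SciPy's linear_sum_assignment, and uses the probability mass each cluster mean places on each label dimension as an assumed matching weight.

```python
# Minimal sketch of the calibration procedure described in the abstract.
# Assumptions (not from the paper's code): features are the LM's label-word
# probabilities, and the bipartite-matching weight between a cluster and a
# label is the cluster mean's mass on that label dimension.
import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.optimize import linear_sum_assignment


def fit_prototypical_calibration(label_word_probs, num_classes, seed=0):
    """label_word_probs: (n_examples, num_classes) array of the LM's
    probabilities for each label word on unlabeled in-domain examples."""
    # 1. Estimate the prototypical clusters with a Gaussian mixture.
    gmm = GaussianMixture(n_components=num_classes,
                          covariance_type="full",
                          random_state=seed).fit(label_word_probs)

    # 2. Assign each cluster to a label by weighted bipartite matching
    #    (maximize the total weight, i.e. minimize its negation).
    cluster_idx, label_idx = linear_sum_assignment(-gmm.means_)
    cluster_to_label = dict(zip(cluster_idx, label_idx))

    # 3. Calibrated prediction: the label of the most likely cluster.
    def predict(probs):
        clusters = gmm.predict(np.atleast_2d(probs))
        return np.array([cluster_to_label[c] for c in clusters])

    return predict
```

For instance, `predict = fit_prototypical_calibration(probs_unlabeled, num_classes=2)` would then map a new example's label-word probabilities to a calibrated class, replacing the greedy argmax over the raw probabilities.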
