Paper Title
Explanations from Large Language Models Make Small Reasoners Better
Paper Authors
Paper Abstract
Integrating free-text explanations into the in-context learning of large language models (LLMs) has been shown to elicit strong reasoning capabilities along with reasonable explanations. In this paper, we consider the problem of leveraging explanations generated by LLMs to improve the training of small reasoners, which are more favorable for real-world production deployment due to their low cost. We systematically explore three approaches for generating explanations from LLMs and utilize a multi-task learning framework to help small models acquire strong reasoning power together with explanation generation capabilities. Experiments on multiple reasoning tasks show that our method consistently and significantly outperforms finetuning baselines across different settings, and even performs better than finetuning/prompting a 60x larger GPT-3 (175B) model by up to 9.5% in accuracy. As a side benefit, human evaluation further shows that our method can generate high-quality explanations to justify its predictions, moving towards the goal of explainable AI.
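A minimal sketch of the recipe the abstract describes: explanations are first sampled from a large LLM, then a small seq2seq model is finetuned with multi-task learning over both answer prediction and explanation generation. This is an illustration under assumptions, not the paper's released code; the T5 backbone, the `predict:`/`explain:` task prefixes, the toy example, and the equal loss weighting are all hypothetical choices made for demonstration.

```python
# Hypothetical multi-task distillation sketch: a small model (T5 here) learns
# both to answer and to explain, using an explanation previously generated by
# a large LLM. Task prefixes and field names are illustrative assumptions.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One training example: question, gold answer, and an LLM-generated explanation.
example = {
    "question": "Is the sun larger than the moon?",
    "answer": "yes",
    "explanation": "The sun's diameter is roughly 400 times that of the moon.",
}

def encode(source: str, target: str):
    """Tokenize a (source, target) pair for seq2seq training."""
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    return inputs, labels

# Task 1: predict the answer.  Task 2: generate the explanation.
# A task prefix tells the shared model which output is expected.
tasks = [
    ("predict: " + example["question"], example["answer"]),
    ("explain: " + example["question"], example["explanation"]),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for source, target in tasks:
    inputs, labels = encode(source, target)
    loss = model(**inputs, labels=labels).loss  # standard cross-entropy loss
    loss.backward()  # gradients from both tasks accumulate on the shared model
optimizer.step()
optimizer.zero_grad()
```

At inference time, the same checkpoint can be prompted with either prefix, so one small model both predicts answers and, on request, generates an explanation to justify them.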