Paper Title
Complementary Explanations for Effective In-Context Learning
Paper Authors
Paper Abstract
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to better understand the mechanisms by which explanations are used for in-context learning. We first study the impact of two different factors on the performance of prompts with explanations: the computation trace (the way the solution is decomposed) and the natural language used to express the prompt. By perturbing explanations on three controlled tasks, we show that both factors contribute to the effectiveness of explanations. We further study how to form maximally effective sets of explanations for solving a given test query. We find that LLMs can benefit from the complementarity of the explanation set: diverse reasoning skills shown by different exemplars can lead to better performance. Therefore, we propose a maximal-marginal-relevance-based exemplar selection approach for constructing exemplar sets that are both relevant and complementary, which successfully improves in-context learning performance across three real-world tasks on multiple LLMs.
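The abstract names maximal marginal relevance (MMR) as the basis for exemplar selection. The general MMR technique scores each candidate as a trade-off between relevance to the test query and redundancy with already-selected exemplars. The sketch below illustrates that generic scheme only; the function names, the cosine similarity on plain vectors, and the trade-off weight `lam` are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of maximal-marginal-relevance (MMR) exemplar selection.
# Assumes exemplars and the query are already embedded as vectors; how the
# paper embeds them is not specified here.
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr_select(query_vec, candidates, k, lam=0.5):
    """Greedily pick k exemplar indices, weighting relevance to the query
    by lam and penalizing redundancy with picks made so far by (1 - lam)."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_i, best_score = None, -float("inf")
        for i in remaining:
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        remaining.remove(best_i)
    return selected
```

With a low `lam`, the redundancy penalty dominates, so after picking the exemplar most similar to the query the procedure prefers a dissimilar one over a near-duplicate, which matches the abstract's point that complementary exemplar sets help.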