Paper Title
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Paper Authors
Paper Abstract
Instruction tuning is an emergent paradigm in NLP wherein natural language instructions are leveraged with language models to induce zero-shot performance on unseen tasks. Instructions have been shown to enable good performance on unseen tasks and datasets in both large and small language models. Dialogue is an especially interesting area in which to explore instruction tuning because dialogue systems perform multiple kinds of tasks related to language (e.g., natural language understanding and generation, domain-specific interaction), yet instruction tuning has not been systematically explored for dialogue-related tasks. We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. Next, we explore the cross-task generalization ability of models tuned on InstructDial across diverse dialogue tasks. Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting. To ensure that models adhere to instructions, we introduce novel meta-tasks. We establish benchmark zero-shot and few-shot performance of models trained using the proposed framework on multiple dialogue tasks.
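As a rough illustration of what a unified text-to-text instruction format can look like, the sketch below serializes a single dialogue task instance (here, intent detection) into an instruction-prefixed input string and a plain-text target. It is a minimal sketch only: the function name `format_instruction_example`, the field layout, and the instruction wording are illustrative assumptions, not the exact format used by InstructDial.

```python
# Minimal sketch (not the authors' exact format) of serializing a dialogue
# task instance into a unified text-to-text example for instruction tuning.
# All names and the prompt layout below are illustrative assumptions.

def format_instruction_example(instruction: str, dialogue_context: list[str], query: str) -> str:
    """Build a single text-to-text input: instruction + dialogue turns + task query."""
    # Alternate speakers for the dialogue history.
    context = "\n".join(
        f"Speaker {i % 2 + 1}: {turn}" for i, turn in enumerate(dialogue_context)
    )
    return f"Instruction: {instruction}\n\nDialogue:\n{context}\n\n{query}"


# Example: an intent-detection instance rendered as an input/target text pair.
source = format_instruction_example(
    instruction="Read the dialogue and identify the intent of the last user utterance.",
    dialogue_context=[
        "I'd like to book a table for two tonight.",
        "Sure, what time works for you?",
        "Around 7 pm, please.",
    ],
    query="Intent:",
)
target = "book_restaurant"  # gold label, also expressed as plain text

print(source)
print(target)
```

Because every task (generation, classification, evaluation, etc.) is rendered into the same input/target text shape, a single sequence-to-sequence model can be tuned on many dialogue tasks at once and prompted zero-shot on unseen ones by swapping the instruction.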