Paper Title

Fixing Model Bugs with Natural Language Patches

Authors

Shikhar Murty, Christopher D. Manning, Scott Lundberg, Marco Tulio Ribeiro

Abstract

Current approaches for fixing systematic problems in NLP models (e.g. regex patches, finetuning on more data) are either brittle, or labor-intensive and liable to shortcuts. In contrast, humans often provide corrections to each other through natural language. Taking inspiration from this, we explore natural language patches -- declarative statements that allow developers to provide corrective feedback at the right level of abstraction, either overriding the model ("if a review gives 2 stars, the sentiment is negative") or providing additional information the model may lack ("if something is described as the bomb, then it is good"). We model the task of determining if a patch applies separately from the task of integrating patch information, and show that with a small amount of synthetic data, we can teach models to effectively use real patches on real data -- 1 to 7 patches improve accuracy by ~1-4 accuracy points on different slices of a sentiment analysis dataset, and F1 by 7 points on a relation extraction dataset. Finally, we show that finetuning on as many as 100 labeled examples may be needed to match the performance of a small set of language patches.
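To make the two-stage setup in the abstract concrete, below is a minimal sketch of how a gated patching scheme of this kind might be wired together: one component scores whether a patch's condition applies to the input, and another produces a patch-conditioned prediction that is softly interpolated with the base model. The `Patch` structure, the model callables, and the interpolation rule are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Conceptual sketch of gated natural language patching (illustrative only).
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class Patch:
    condition: str    # e.g. "a review gives 2 stars"
    consequence: str  # e.g. "the sentiment is negative"


def patched_predict(
    x: str,
    patches: List[Patch],
    base_model: Callable[[str], np.ndarray],        # p(y | x)
    gate: Callable[[str, str], float],              # P(condition applies to x)
    interpreter: Callable[[str, str], np.ndarray],  # p(y | x, patch text)
) -> np.ndarray:
    """Softly route between the base model and patch-conditioned
    predictions, weighting each patch by how likely its condition
    is to apply to the input."""
    probs = base_model(x)
    for patch in patches:
        g = gate(x, patch.condition)
        patch_text = f"if {patch.condition}, then {patch.consequence}"
        patched = interpreter(x, patch_text)
        # Interpolate: when the gate is confident the condition holds,
        # the patch-conditioned prediction dominates.
        probs = g * patched + (1.0 - g) * probs
    return probs
```

The key design point this sketch tries to capture is the separation of concerns the abstract describes: the gate only has to decide *whether* a patch is relevant, while the interpreter only has to decide *what the patch implies*, which is what lets a small amount of synthetic data teach the model to use real patches.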
