Paper Title
Mass-Editing Memory in a Transformer
Paper Authors
Paper Abstract
Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge. However, this line of work is predominantly limited to updating single associations. We develop MEMIT, a method for directly updating a language model with many memories, demonstrating experimentally that it can scale up to thousands of associations for GPT-J (6B) and GPT-NeoX (20B), exceeding prior work by orders of magnitude. Our code and data are at https://memit.baulab.info.
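The abstract describes MEMIT only at a high level, so the following is a minimal, hypothetical sketch of the general idea behind batched model editing: treat a single MLP projection as a linear associative memory and solve a regularized least-squares problem so that many new key-value memories are written in one weight update. All variable names, the regularizer `lambda_reg`, and the identity stand-in for the key covariance `C` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical toy example of a batched weight edit in the spirit of
# mass-editing methods: one closed-form update inserts many (key, value)
# memories into an MLP projection at once. Names are illustrative.

rng = np.random.default_rng(0)
d_in, d_out, n_edits = 64, 32, 100

W = rng.normal(size=(d_out, d_in))    # existing projection weights
K = rng.normal(size=(d_in, n_edits))  # keys: one column per new memory
V = rng.normal(size=(d_out, n_edits)) # target values for those keys

# C stands in for the covariance of keys the layer already stores;
# lambda_reg trades off preserving old behavior vs. fitting the edits.
# Here we assume an identity covariance purely for illustration.
C = np.eye(d_in)
lambda_reg = 1.0

# Residual error of the current weights on the new memories, and the
# ridge-regression-style update that reduces it for all edits jointly.
R = V - W @ K
delta = R @ K.T @ np.linalg.inv(lambda_reg * C + K @ K.T)
W_new = W + delta

# The edited layer should map each key much closer to its target value.
print("pre-edit max error: ", np.max(np.abs(W @ K - V)))
print("post-edit max error:", np.max(np.abs(W_new @ K - V)))
```

Because the update is computed jointly over all `n_edits` columns of `K`, the cost of editing scales with one linear solve in the layer's input dimension rather than with repeated single-association updates, which is what makes batching thousands of memories plausible.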