Paper title
Katana: Dual Slicing-Based Context for Learning Bug Fixes
Authors
Abstract
Contextual information plays a vital role for software developers when understanding and fixing a bug. Consequently, deep learning-based program repair techniques leverage context for bug fixes. However, existing techniques treat context in an arbitrary manner, extracting code in close proximity to the buggy statement within the enclosing file, class, or method, without any analysis to find actual relations with the bug. To reduce noise, they impose a predefined maximum limit on the number of tokens used as context. We present a program slicing-based approach in which, instead of arbitrarily including code as context, we analyze statements that have a control or data dependency on the buggy statement. We propose a novel concept called dual slicing, which leverages the context of both the buggy and fixed versions of the code to capture relevant repair ingredients. We present our technique and tool, called Katana, the first to apply slicing-based context to a program repair task. The results show that Katana effectively preserves sufficient information for a model to choose contextual information while reducing noise. We compare against four recent state-of-the-art context-aware program repair techniques. Our results show that Katana fixes between 1.5 and 3.7 times more bugs than existing techniques.
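To make the idea of slicing-based context concrete, the following is a minimal sketch in Python and not Katana's implementation. It assumes a hand-built, hypothetical dependency graph in which each statement id maps to the statements it is linked to by a control or data dependency, and it assumes a backward traversal from the slicing criterion. The names `backward_slice`, `buggy_deps`, and `fixed_deps` are illustrative only. The slice is computed once for the buggy version and once for the fixed version, which is the sense in which the context is "dual".

```python
from collections import deque


def backward_slice(deps, criterion):
    """Collect every statement reachable from `criterion` via dependency edges.

    `deps` maps a statement id to the set of statement ids it is linked to
    by a control or data dependency (a hypothetical, hand-built graph).
    """
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        stmt = queue.popleft()
        for dep in deps.get(stmt, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Toy example (assumed): statement 4 is the changed statement in both versions.
buggy_deps = {
    2: {1},
    4: {2},     # buggy statement depends only on statement 2
    5: {3},     # unrelated to the bug
}
fixed_deps = {
    2: {1},
    4: {2, 3},  # the fix additionally uses a value from statement 3
    5: {3},
}

buggy_slice = backward_slice(buggy_deps, 4)
fixed_slice = backward_slice(fixed_deps, 4)

print(sorted(buggy_slice))  # context drawn from the buggy version
print(sorted(fixed_slice))  # context drawn from the fixed version
```

In this toy graph, statement 5 appears in neither slice, which illustrates how unrelated code can be dropped as noise, while statement 3 appears only in the slice of the fixed version, which illustrates how that version can contribute an extra repair ingredient.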