Paper Title

Graph-based, Self-Supervised Program Repair from Diagnostic Feedback

Authors

Michihiro Yasunaga, Percy Liang

Abstract

We consider the problem of learning to repair programs from diagnostic feedback (e.g., compiler error messages). Program repair is challenging for two reasons: First, it requires reasoning and tracking symbols across source code and diagnostic feedback. Second, labeled datasets available for program repair are relatively small. In this work, we propose novel solutions to these two challenges. First, we introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback, and then apply a graph neural network on top to model the reasoning process. Second, we present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online to create a large amount of extra program repair examples, which we use to pre-train our models. We evaluate our proposed approach on two applications: correcting introductory programming assignments (DeepFix dataset) and correcting the outputs of program synthesis (SPoC dataset). Our final system, DrRepair, significantly outperforms prior work, achieving 68.2% full repair rate on DeepFix (+22.9% over the prior best), and 48.4% synthesis success rate on SPoC (+3.7% over the prior best).
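The abstract's first idea, the program-feedback graph, connects occurrences of the same symbol across the source code and the diagnostic message so that a graph neural network can reason over them jointly. A minimal sketch of such connectivity is shown below; the node and edge scheme here is a simplification for illustration (the actual model operates on learned token representations), and the sample program and error message are invented:

```python
import re
from collections import defaultdict

def build_program_feedback_graph(code_lines, error_message):
    """Connect occurrences of the same identifier across code and feedback.

    Simplified sketch: nodes are identifier occurrences, and an edge links
    every pair of occurrences of the same identifier. (Assumption: this only
    illustrates the connectivity, not the paper's exact graph construction.)
    """
    nodes = []  # (source, line_no, token)
    for i, line in enumerate(code_lines):
        for tok in re.findall(r"[A-Za-z_]\w*", line):
            nodes.append(("code", i, tok))
    for tok in re.findall(r"[A-Za-z_]\w*", error_message):
        nodes.append(("feedback", None, tok))

    by_token = defaultdict(list)
    for idx, (_, _, tok) in enumerate(nodes):
        by_token[tok].append(idx)

    # Fully connect all occurrences of each shared identifier.
    edges = [(a, b) for ids in by_token.values()
             for a in ids for b in ids if a < b]
    return nodes, edges

# Hypothetical broken program and compiler message:
code = ["int main() {", "  y = 0;", "  return y;", "}"]
err = "error: 'y' undeclared (first use in this function)"
nodes, edges = build_program_feedback_graph(code, err)
```

Here the identifier `y` appears twice in the code and once in the error message, so the graph links the compiler's complaint directly to the code lines a repair model must attend to.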
