Paper Title
Better Together? An Evaluation of AI-Supported Code Translation
Paper Authors
Paper Abstract
Generative machine learning models have recently been applied to source code, for use cases including translating code between programming languages, creating documentation from code, and auto-completing methods. Yet, state-of-the-art models often produce code that is erroneous or incomplete. In a controlled study with 32 software engineers, we examined whether such imperfect outputs are helpful in the context of Java-to-Python code translation. When aided by the outputs of a code translation model, participants produced code with fewer errors than when working alone. We also examined how the quality and quantity of AI translations affected the work process and quality of outcomes, and observed that providing multiple translations had a larger impact on the translation process than varying the quality of provided translations. Our results tell a complex, nuanced story about the benefits of generative code models and the challenges software engineers face when working with their outputs. Our work motivates the need for intelligent user interfaces that help software engineers effectively work with generative code models in order to understand and evaluate their outputs and achieve superior outcomes to working alone.