AixBench: A Code Generation Benchmark Dataset

Paper Title

AixBench: A Code Generation Benchmark Dataset

Authors

Yiyang Hao, Ge Li, Yongqiang Liu, Xiaowei Miao, He Zong, Siyuan Jiang, Yang Liu, He Wei

Abstract

We present a benchmark dataset for evaluating the method-level code generation task. The benchmark contains a dataset of 175 samples for automated evaluation and a dataset of 161 samples for manual evaluation. We also present a new metric for automatically evaluating the correctness of the generated code, and a set of criteria for manually evaluating the overall quality of the generated code.
