Paper Title
Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL
Authors
Abstract
Can non-programmers annotate natural language utterances with complex programs that represent their meaning? We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex). Since they cannot understand the candidate programs, we ask them to select indirectly by examining the programs' input-output examples. For each utterance, APEL actively searches for a simple input on which the candidate programs tend to produce different outputs. It then asks the non-programmers only to choose the appropriate output, which allows us to infer which program is correct; the inferred program can then be used to fine-tune the parser. As a first case study, we recruited human non-programmers to use APEL to re-annotate SPIDER, a text-to-SQL dataset. Our approach achieved the same annotation accuracy as the original expert annotators (75%) and exposed many subtle errors in the original annotations.
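The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the APEL implementation. It assumes a toy `pets` table, three made-up candidate SQL queries with made-up prior probabilities, and a hand-written pool of small inputs. The scoring rule (entropy of the candidates' output distribution, with ties broken toward smaller inputs) is an assumption standing in for "an input on which the candidate programs tend to produce different outputs"; the function names `run`, `output_entropy`, `pick_input`, and `update` are hypothetical.

```python
import math
import sqlite3
from collections import defaultdict


def run(sql, rows):
    """Run one candidate query on a tiny in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE pets (name TEXT, age INTEGER)")
        conn.executemany("INSERT INTO pets VALUES (?, ?)", rows)
        result = conn.execute(sql).fetchall()
    finally:
        conn.close()
    # Sort so that outputs compare equal regardless of row order.
    return tuple(sorted(result))


def output_entropy(candidates, rows):
    """Entropy in bits of the outputs the candidates produce on `rows`."""
    mass = defaultdict(float)
    for sql, prob in candidates.items():
        mass[run(sql, rows)] += prob
    return -sum(p * math.log2(p) for p in mass.values() if p > 0)


def pick_input(candidates, inputs):
    """Assumed criterion: most disagreement first, then the smaller input."""
    return max(inputs, key=lambda rows: (output_entropy(candidates, rows), -len(rows)))


def update(candidates, rows, chosen_output):
    """Keep only candidates whose output matches the annotator's choice."""
    kept = {sql: p for sql, p in candidates.items() if run(sql, rows) == chosen_output}
    total = sum(kept.values())
    if total == 0:
        return {}  # no candidate agrees with the annotator
    return {sql: p / total for sql, p in kept.items()}


# Utterance (illustrative): "How many pets are older than 3?"
candidates = {
    "SELECT COUNT(*) FROM pets WHERE age > 3": 0.5,
    "SELECT COUNT(*) FROM pets WHERE age >= 3": 0.3,
    "SELECT COUNT(*) FROM pets WHERE age < 3": 0.2,
}
inputs = [
    [("Rex", 5)],
    [("Rex", 5), ("Tom", 3)],
    [("Rex", 5), ("Tom", 3), ("Ann", 1)],
]

best = pick_input(candidates, inputs)
# The annotator sees only the table and the possible answers, and picks "1".
posterior = update(candidates, best, ((1,),))
```

In this toy setup the two-row input is selected because all three candidates return different counts on it, so a single answer from the annotator is enough to single out one query. A non-programmer never needs to read the SQL; they only judge which output fits the utterance.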