Autoformalization with Large Language Models

Paper title

Autoformalization with Large Language Models

Authors

Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

Abstract

Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion ($25.3\%$) of mathematical competition problems perfectly to formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover via training on these autoformalized theorems. Our methodology results in a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from $29.6\%$ to $35.2\%$.
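To make the idea of autoformalization concrete, the following is a toy illustration and is not taken from the paper. The paper targets Isabelle/HOL; this sketch uses Lean 4 instead, and both the informal problem and the theorem name `toy_linear` are hypothetical. It shows an informal statement (in the comment) next to one possible formal specification and a short proof.

```lean
-- Informal statement (hypothetical example):
--   "If 2x + 3 = 7 for a natural number x, then x = 2."
-- One possible formalization. The name `toy_linear` is an assumption
-- made for illustration only.
theorem toy_linear (x : Nat) (h : 2 * x + 3 = 7) : x = 2 := by
  omega
```

In this framing, an autoformalization system would be expected to produce the `theorem` line from the informal sentence, and a theorem prover would then search for the proof.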

Paper title