Paper Title
Configuring Test Generators using Bug Reports: A Case Study of GCC Compiler and Csmith
Authors
Abstract
The correctness of compilers is instrumental in the safety and reliability of other software systems, as bugs in compilers can produce executables that do not reflect the intent of programmers. Such errors are difficult to identify and debug. Random test program generators are commonly used in testing compilers, and they have been effective in uncovering bugs. However, guiding these test generators to produce test programs that are more likely to find bugs remains challenging. In this paper, we use the code snippets in bug reports to guide test generation. The main idea of this work is to extract insights from the bug reports about the language features that are more prone to inadequate implementation and to use those insights to guide the test generators. We use the GCC C compiler to evaluate the effectiveness of this approach. In particular, we first cluster the test programs in the GCC bug reports based on their features. We then use the centroids of the clusters to compute configurations for Csmith, a popular test generator for C compilers. We evaluated this approach on eight versions of GCC and found that it provides higher coverage and triggers more miscompilation failures than the state-of-the-art test generation techniques for GCC.
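The cluster-then-configure idea in the abstract can be illustrated with a small sketch. The abstract does not say which clustering algorithm or feature set is used, so everything here is an assumption: the feature names, the sample counts, the naive k-means seeded with the first k points, and the normalization of a centroid into weights. The resulting dictionaries are illustrative only and are not Csmith's actual configuration format.

```python
# Hypothetical language features counted in each bug-report snippet.
FEATURES = ["pointer", "array", "struct", "loop"]

# Hypothetical feature-count vectors, one per bug-report test program.
SNIPPETS = [
    [8, 1, 0, 1],
    [0, 2, 7, 1],
    [9, 0, 1, 0],
    [1, 1, 8, 0],
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=50):
    """Naive k-means; centroids are seeded with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the column-wise mean of its members;
        # keep the old centroid if a cluster ends up empty.
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def centroid_to_weights(centroid, names):
    """Normalize a centroid into per-feature weights that sum to 1."""
    total = sum(centroid)
    if total == 0:
        return {name: 1.0 / len(names) for name in names}
    return {name: value / total for name, value in zip(names, centroid)}


centroids = kmeans(SNIPPETS, k=2)
configs = [centroid_to_weights(c, FEATURES) for c in centroids]
for config in configs:
    print(config)
```

Each resulting dictionary stands for one candidate generator configuration that emphasizes the features dominating its cluster. Translating such weights into real Csmith options would need the tool's own documentation.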