ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search

Paper Title

ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search

Authors

Pan, Rongqi; Ghaleb, Taher A.; Briand, Lionel

Abstract

Executing large test suites is time- and resource-consuming, sometimes impossible, and such test suites typically contain many redundant test cases. Hence, test case minimization is used to remove redundant test cases that are unlikely to detect new faults. However, most test case (suite) minimization techniques rely on code coverage (white-box), model-based features, or requirements specifications, which are not always accessible to test engineers. Recently, a set of novel techniques called FAST-R was proposed. These techniques rely solely on test case code for test case minimization and appeared to be much more efficient than white-box techniques. However, FAST-R achieved comparatively low fault detection capability for Java projects, making its application challenging in practice. This paper proposes ATM (AST-based Test case Minimizer), a similarity-based, search-based test case minimization technique that takes a specific budget as input. ATM also relies exclusively on the source code of test cases but attempts to achieve higher fault detection through finer-grained similarity analysis and a dedicated search algorithm. ATM transforms test case code into Abstract Syntax Trees (ASTs) and relies on four tree-based similarity measures to apply evolutionary search, specifically genetic algorithms, to minimize test cases. We evaluated the effectiveness and efficiency of ATM on a large dataset of 16 Java projects with 661 faulty versions using three budgets ranging from 25% to 75% of test suites. When running only 50% of the test cases, ATM achieved significantly higher fault detection rates (0.82 on average) than FAST-R (0.61 on average) and random minimization (0.52 on average), within practically acceptable time (1.1-4.3 hours, on average), given that minimization is only occasionally applied when many new test cases are created (major releases). Results achieved for other budgets were consistent.
