Paper Title
How Far Does the Predictive Decision Impact the Software Project? The Cost, Service Time, and Failure Analysis from a Cross-Project Defect Prediction Model
Authors
Abstract
Context: Cross-project defect prediction (CPDP) models are being developed to optimize testing resources. Objectives: To propose an ensemble classification framework for CPDP, because many existing models fall short in performance, and to analyse the main objectives of CPDP using the outcomes of the proposed framework. Method: For the classification task, we propose a bootstrap-aggregation-based hybrid-inducer ensemble learning (HIEL) technique that uses a probabilistic weighted majority voting (PWMV) strategy. To assess the impact of HIEL on the software project, we propose three project-specific performance measures computed from the predictions: percent of perfect cleans (PPC), percent of non-perfect cleans (PNPC), and false omission rate (FOR). These are used to calculate the amount of saved cost, the remaining service time, and the percent of failures in the target project. Results: On many target projects from the PROMISE, NASA, and AEEEM repositories, the proposed model outperformed recent works such as TDS, TCA+, HYDRA, TPTL, and CODEP in terms of F-measure. In terms of AUC, the TCA+ and HYDRA models are strong competitors to the HIEL model. Conclusion: For better predictions, we recommend ensemble learning approaches for CPDP models. To estimate the benefits of CPDP models, we recommend the above project-specific performance measures.
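To make one of the measures named in the abstract concrete, the sketch below computes the false omission rate using its standard confusion-matrix definition, FOR = FN / (FN + TN), that is, the share of modules predicted clean that are in fact defective. The abstract does not give the formulas for PPC or PNPC, so they are not reproduced here. The function name, the 1 = defective / 0 = clean label encoding, the return value when no module is predicted clean, and the sample labels are all assumptions made for illustration, not taken from the paper.

```python
def false_omission_rate(y_true, y_pred):
    """Standard FOR = FN / (FN + TN).

    Assumed encoding: 1 = defective module, 0 = clean module.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # False negatives: defective modules that were predicted clean.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # True negatives: clean modules that were predicted clean.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    predicted_clean = fn + tn
    if predicted_clean == 0:
        # Assumption: report 0.0 when nothing is predicted clean,
        # since no defect can have been omitted in that case.
        return 0.0
    return fn / predicted_clean


# Hypothetical predictions for eight modules of a target project.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

rate = false_omission_rate(y_true, y_pred)
print(f"FOR = {rate:.2f}")
```

In this made-up example, five modules are predicted clean and one of them is actually defective, so one in five modules that would skip testing still contains a defect.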