Paper title
Testing the Stationarity Assumption in Software Effort Estimation Datasets
Paper authors
Paper abstract
Software effort estimation (SEE) models are typically developed based on an underlying assumption that all data points are equally relevant to the prediction of effort for future projects. The dynamic nature of several aspects of the software engineering process could mean that this assumption does not hold in at least some cases. This study employs three kernel estimator functions to test the stationarity assumption in three software engineering datasets that have been used in the construction of software effort estimation models. The kernel estimators are used to generate non-uniform weights, which are subsequently employed in weighted linear regression modeling. Prediction errors are compared to those obtained from uniform models. Our results indicate that, for datasets that exhibit underlying non-stationary processes, uniform models are more accurate than non-uniform models. In contrast, for datasets that exhibit stationary processes, the accuracy of uniform and non-uniform models is essentially equivalent. The results of our study also confirm prior findings that the accuracy of effort estimation models is independent of the type of kernel estimator function used in model development.
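The weighting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the three kernel functions, so the Gaussian, Epanechnikov, and triangular kernels below are assumptions, as are the use of project age as the kernel argument, the `bandwidth` parameter, and the helper names `kernel_weights` and `fit_weighted_linear`.

```python
import numpy as np


def kernel_weights(age, bandwidth, kind="gaussian"):
    """Map project age (0 = most recent) to a non-negative weight.

    Kernels are scaled so that age 0 gets weight 1; a constant factor
    does not change the weighted least-squares solution.
    The kernel choices here are illustrative assumptions.
    """
    u = np.asarray(age, dtype=float) / bandwidth
    if kind == "uniform":
        # Every project counts equally (the stationarity assumption).
        return np.ones_like(u)
    if kind == "gaussian":
        return np.exp(-0.5 * u**2)
    if kind == "epanechnikov":
        return np.clip(1.0 - u**2, 0.0, None)
    if kind == "triangular":
        return np.clip(1.0 - np.abs(u), 0.0, None)
    raise ValueError(f"unknown kernel: {kind}")


def fit_weighted_linear(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - b0 - x_i . b)^2.

    Returns the coefficient vector [intercept, slope_1, ...].
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(X)), X])  # add intercept column
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling rows by sqrt(w) turns the weighted problem into ordinary LS.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], np.asarray(y, dtype=float) * sw, rcond=None)
    return coef


# Hypothetical toy data: project size versus effort, ordered oldest to newest.
size = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
effort = np.array([35.0, 60.0, 95.0, 120.0, 155.0])
age = np.array([4.0, 3.0, 2.0, 1.0, 0.0])

for kind in ("uniform", "gaussian", "epanechnikov", "triangular"):
    coef = fit_weighted_linear(size, effort, kernel_weights(age, 3.0, kind))
    print(kind, coef)
```

Comparing the prediction errors of the `uniform` fit against the kernel-weighted fits on held-out recent projects is the kind of comparison the abstract describes. The toy data above is invented for illustration and says nothing about the paper's results.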