Title
Overcoming the inconsistences of the variance inflation factor: a redefined VIF and a test to detect statistical troubling multicollinearity
Authors
Abstract
Multicollinearity is relevant to many different fields where linear regression models are applied, and its existence may affect the analysis of ordinary least squares (OLS) estimators from both the numerical and statistical points of view. Thus, multicollinearity can lead to incoherence between the statistical significance of the independent variables and the global significance of the model. The variance inflation factor (VIF) is traditionally applied to diagnose the possible existence of multicollinearity, but a degree of multicollinearity that the VIF flags as troubling does not always correspond to negative effects on the statistical analysis. The reason for this lack of specificity of the VIF is that there are other factors, such as the size of the sample and the variance of the random disturbance, that can lead to high values of the VIF but not to problematic variance in the OLS estimators (see O'Brien 2007). This paper presents a new variance inflation factor (TVIF) that considers all these additional factors. Thresholds for this new measure and for the index provided by Stewart (1987) are also provided. These thresholds are reinterpreted and presented as a new statistical test to diagnose the existence of statistically troubling multicollinearity. The contributions of this paper are illustrated with two real data examples previously used in the scientific literature.
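For readers unfamiliar with the classical diagnostic that the abstract starts from, the following is a minimal sketch of the traditional VIF, computed as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors plus an intercept. It is not the TVIF proposed in the paper, whose definition the abstract does not give. The function names `vif` and `_solve` and the toy data are assumptions made for illustration only.

```python
def _solve(a, b):
    """Solve the square system a x = b by Gaussian elimination (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def vif(columns):
    """Classical VIF of each predictor; `columns` is a list of equal-length lists.

    Assumes no column is constant and no column is an exact linear
    combination of the others (otherwise a division by zero occurs).
    """
    n = len(columns[0])
    result = []
    for j, y in enumerate(columns):
        others = [c for k, c in enumerate(columns) if k != j]
        # Auxiliary regression: predictor j on an intercept and the other predictors.
        rows = [[1.0] + [c[i] for c in others] for i in range(n)]
        p = len(rows[0])
        xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
        xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(p)]
        beta = _solve(xtx, xty)
        fitted = [sum(bk * rk for bk, rk in zip(beta, r)) for r in rows]
        mean_y = sum(y) / n
        ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
        ss_tot = sum((yi - mean_y) ** 2 for yi in y)
        r2 = 1.0 - ss_res / ss_tot
        result.append(1.0 / (1.0 - r2))
    return result


if __name__ == "__main__":
    # Toy data (made up): x2 is nearly a copy of x1, so both VIFs are large.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
    print(vif([x1, x2]))
```

Note that this quantity depends only on the predictors. The sample size and the variance of the random disturbance, which the abstract names as additional factors, enter the variance of the OLS estimators but not the classical VIF itself.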