Paper Title
Asymptotic Inference for Infinitely Imbalanced Logistic Regression
Authors
Abstract
In this paper, we extend the work of Owen (2007) by deriving a second-order expansion for the slope parameter in logistic regression when the size of the majority class is unbounded and the minority class is finite. More precisely, we demonstrate that the second-order term converges to a normal distribution and explicitly compute its variance, which, surprisingly, once again depends only on the mean of the minority class points and not on their arrangement, under mild regularity assumptions. In the case where the majority class is normally distributed, we illustrate that the variance of the limiting slope depends exponentially on the z-score of the mean of the minority class points with respect to the majority class distribution. We confirm our results by Monte Carlo simulations.
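The first-order behavior that the abstract builds on can be checked numerically with a minimal sketch. It is not the authors' simulation code. It assumes a one-dimensional normal majority class N(mu, sigma^2), for which Owen's (2007) limiting slope reduces to (x_bar - mu) / sigma^2, where x_bar is the mean of the minority points. The helper name `fit_logistic_1d`, the sample size, the seed, and the minority points are all illustrative choices, not values from the paper.

```python
import numpy as np

def fit_logistic_1d(x, y, n_iter=30):
    """Fit intercept and slope of a 1-D logistic regression by Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    # Start from the intercept-only fit, which behaves well under extreme imbalance.
    theta = np.array([np.log(y.mean() / (1.0 - y.mean())), 0.0])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))           # fitted probabilities
        grad = X.T @ (y - p)                              # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X       # observed information
        theta = theta + np.linalg.solve(hess, grad)       # Newton update
    return theta

# Illustrative setup (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
mu, sigma, N = 0.0, 1.0, 200_000
minority = np.array([0.5, 1.0, 1.5])          # only the mean should matter in the limit
majority = rng.normal(mu, sigma, size=N)

x = np.concatenate([majority, minority])
y = np.concatenate([np.zeros(N), np.ones(minority.size)])

intercept, slope = fit_logistic_1d(x, y)
limit_slope = (minority.mean() - mu) / sigma**2   # closed form for a normal majority

print(f"fitted slope = {slope:.3f}, limiting slope = {limit_slope:.3f}")
```

Repeating the fit over many seeds and recording `slope - limit_slope` would give a rough empirical view of the fluctuation that the second-order expansion describes. Moving the minority mean further from `mu` should make that fluctuation noticeably larger.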