Paper Title

Robustified Multivariate Regression and Classification Using Distributionally Robust Optimization under the Wasserstein Metric

Paper Authors

Chen, Ruidi, Paschalidis, Ioannis Ch.

Paper Abstract

We develop Distributionally Robust Optimization (DRO) formulations for Multivariate Linear Regression (MLR) and Multiclass Logistic Regression (MLG) when both the covariates and responses/labels may be contaminated by outliers. The DRO framework uses a probabilistic ambiguity set defined as a ball of distributions that are close to the empirical distribution of the training set in the sense of the Wasserstein metric. We relax the DRO formulation into a regularized learning problem whose regularizer is a norm of the coefficient matrix. We establish out-of-sample performance guarantees for the solutions to our model, offering insights on the role of the regularizer in controlling the prediction error. Experimental results show that our approach improves the predictive error by 7% -- 37% for MLR, and a metric of robustness by 100% for MLG.
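As a rough illustration (not the authors' implementation), the relaxed MLR problem described above can be written as an empirical residual loss plus a norm penalty on the coefficient matrix, with the Wasserstein ball radius playing the role of the regularization weight. The sketch below uses cvxpy; the unsquared l2 residual loss and the Frobenius penalty are illustrative assumptions, since the abstract does not specify the exact loss and matrix norm used in the paper.

```python
# Minimal sketch of a norm-regularized multivariate linear regression,
# in the spirit of the Wasserstein DRO relaxation described in the abstract.
# Loss and penalty choices here are assumptions for illustration only.
import cvxpy as cp
import numpy as np

def robust_mlr(X, Y, eps=0.1):
    """X: (N, p) covariates, Y: (N, m) responses, eps: regularization weight
    (plays the role of the Wasserstein radius in the relaxation)."""
    N, p = X.shape
    m = Y.shape[1]
    B = cp.Variable((p, m))                       # coefficient matrix
    residuals = Y - X @ B                         # (N, m) residual matrix
    # Average unsquared l2 norm of each sample's residual (assumed loss).
    loss = cp.sum(cp.norm(residuals, 2, axis=1)) / N
    penalty = eps * cp.norm(B, "fro")             # norm of the coefficient matrix
    cp.Problem(cp.Minimize(loss + penalty)).solve()
    return B.value

# Usage sketch on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
B_true = rng.normal(size=(5, 3))
Y = X @ B_true + 0.1 * rng.normal(size=(100, 3))
B_hat = robust_mlr(X, Y, eps=0.05)
```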
