Paper Title
Inverse Classification with Limited Budget and Maximum Number of Perturbed Samples
Paper Authors
Paper Abstract
Most recent machine learning research focuses on developing new classifiers to improve classification accuracy. With many well-performing state-of-the-art classifiers available, there is a growing need to understand the interpretability of a classifier for practical purposes, such as finding the best diet recommendation for a diabetes patient. Inverse classification is a post-modeling process that finds changes to the input features of samples that alter the initially predicted class. It is useful in many business applications for determining how to adjust a sample's input data so that the classifier predicts it to be in a desired class. In real-world applications, a budget on perturbations of samples corresponding to customers or patients is usually considered, and in this setting the number of successfully perturbed samples is key to increasing benefits. In this study, we propose a new framework for inverse classification that maximizes the number of perturbed samples subject to per-feature budget limits and favorable classification classes of the perturbed samples. We design algorithms to solve this optimization problem based on gradient methods, stochastic processes, Lagrangian relaxations, and the Gumbel trick. In experiments, we find that our algorithms based on stochastic processes perform excellently across different budget settings and scale well.
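To make the problem setting concrete, the following is a minimal sketch of inverse classification under a per-feature budget. It is not the paper's algorithm. It assumes a linear logistic classifier with hypothetical weights, reads the per-feature budget as a box around each sample's original feature values, and perturbs each sample independently by projected gradient ascent on the log-odds of the desired class. The names `perturb` and `count_successes` are illustrative only.

```python
import math


def predict_proba(w, b, x):
    """Probability of the desired class under an assumed logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def perturb(w, b, x, budget, step=0.1, iters=200):
    """Projected gradient ascent on the log-odds of the desired class.

    Feature j may move at most budget[j] away from its original value
    (an assumed box reading of a per-feature budget).
    """
    x0 = list(x)
    x = list(x)
    for _ in range(iters):
        if predict_proba(w, b, x) > 0.5:
            break  # sample is already classified into the desired class
        # For a linear model the gradient of the log-odds w.r.t. x is w.
        x = [xi + step * wi for xi, wi in zip(x, w)]
        # Project back onto the box [x0_j - budget_j, x0_j + budget_j].
        x = [min(max(xi, oj - bj), oj + bj)
             for xi, oj, bj in zip(x, x0, budget)]
    return x


def count_successes(w, b, samples, budget):
    """Number of samples that end up in the desired class after perturbation."""
    return sum(
        predict_proba(w, b, perturb(w, b, x, budget)) > 0.5 for x in samples
    )


# Hypothetical model and data, chosen only for illustration.
w, b = [1.0, -2.0], -1.0
budget = [0.5, 0.25]
samples = [[0.5, 0.0], [-2.0, 1.0], [3.0, 0.0]]
print(count_successes(w, b, samples, budget))
```

In this toy setup the first sample can be moved into the desired class within its budget, the second cannot, and the third already belongs to it. The quantity being counted is the one the abstract proposes to maximize; the sketch does not model any coupling between samples or the stochastic, Lagrangian, or Gumbel-based methods the abstract mentions.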