论文标题
流行决策树算法可证明噪声
Popular decision tree algorithms are provably noise tolerant
论文作者
论文摘要
使用增强的框架,我们证明所有基于杂质的决策树学习算法(包括经典ID3,C4.5和CART)都具有很高的噪音耐受性。我们的保证在令人讨厌的噪声的最强噪声模型下保持,我们在允许的噪声速率上提供了近乎匹配的上和下限。我们进一步表明,这些算法很简单,长期以来一直是日常机器学习的核心,在嘈杂的环境中享有可证明的保证,这些环境是由关于决策树学习的理论文献中现有算法无与伦比的。综上所述,我们的结果增加了一项持续的研究线,该研究旨在将这些实际决策树算法的经验成功放在牢固的理论基础上。
Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing.