Paper Title
Better Boosting with Bandits for Online Learning
Paper Authors
Paper Abstract
Probability estimates generated by boosting ensembles are poorly calibrated because of the margin-maximizing nature of the algorithm. The outputs of the ensemble need to be properly calibrated before they can be used as probability estimates. In this work, we demonstrate that online boosting is also prone to producing distorted probability estimates. In batch learning, calibration is achieved by reserving part of the training data for training the calibrator function. In the online setting, a decision needs to be made on each round: should the new example(s) be used to update the parameters of the ensemble or those of the calibrator? We resolve this decision with the aid of bandit optimization algorithms. We demonstrate superior performance, in terms of probability estimation, over uncalibrated and naively calibrated online boosting ensembles. Our proposed mechanism can be easily adapted to other tasks (e.g., cost-sensitive classification) and is robust to the choice of hyperparameters of both the calibrator and the ensemble.
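The abstract does not specify which bandit algorithm, ensemble, or calibrator is used, so the following is only a minimal illustrative sketch of the routing idea: a two-armed EXP3 bandit decides on each round whether the incoming example updates the model or the calibrator, using the log-loss of the calibrated probability as (inverted) reward. The `OnlineLearner` (a toy online logistic model standing in for the boosting ensemble) and the Platt-style `PlattCalibrator` are hypothetical stand-ins, not the paper's components.

```python
import math
import random

def _sigmoid(z):
    # Clip to avoid overflow in math.exp for extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

class OnlineLearner:
    """Toy online logistic model; a stand-in for the boosting ensemble."""
    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr
    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))
    def update(self, x, y):
        p = _sigmoid(self.score(x))
        for i, xi in enumerate(x):
            self.w[i] += self.lr * (y - p) * xi

class PlattCalibrator:
    """Maps raw scores s to calibrated probabilities sigmoid(a*s + b)."""
    def __init__(self, lr=0.1):
        self.a, self.b, self.lr = 1.0, 0.0, lr
    def prob(self, s):
        return _sigmoid(self.a * s + self.b)
    def update(self, s, y):
        p = self.prob(s)
        self.a += self.lr * (y - p) * s
        self.b += self.lr * (y - p)

def exp3_probs(weights, gamma):
    """Exponential weights mixed with uniform exploration (EXP3)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]

def run(rounds=2000, gamma=0.1, seed=0):
    rng = random.Random(seed)
    learner, cal = OnlineLearner(dim=2), PlattCalibrator()
    weights = [1.0, 1.0]  # arm 0: update ensemble, arm 1: update calibrator
    for _ in range(rounds):
        # Synthetic stream: label is a noiseless linear rule over x.
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        y = 1 if x[0] + 0.5 * x[1] > 0 else 0
        probs = exp3_probs(weights, gamma)
        arm = 0 if rng.random() < probs[0] else 1
        s = learner.score(x)
        p = cal.prob(s)
        # Reward in [0, 1]: one minus the (clipped) log-loss of the
        # calibrated probability estimate on this round's example.
        loss = -(y * math.log(max(p, 1e-6))
                 + (1 - y) * math.log(max(1.0 - p, 1e-6)))
        reward = max(0.0, 1.0 - min(loss, 1.0))
        # Importance-weighted EXP3 update for the played arm only.
        weights[arm] *= math.exp(gamma * (reward / probs[arm]) / len(weights))
        m = max(weights)  # renormalize to keep weights bounded
        weights = [w / m for w in weights]
        # Route the example to whichever component the bandit chose.
        if arm == 0:
            learner.update(x, y)
        else:
            cal.update(s, y)
    return weights, learner, cal
```

With two arms the routing decision is a plain adversarial-bandit problem, so any no-regret bandit algorithm could replace EXP3 here; the choice above is only for concreteness.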