Paper title
Online Boosting with Bandit Feedback
Paper authors
Paper abstract
We consider the problem of online boosting for regression tasks when only limited information is available to the learner. We give an efficient regret minimization method that has two implications: an online boosting algorithm with noisy multi-point bandit feedback, and a new projection-free online convex optimization algorithm with stochastic gradients that improves on state-of-the-art guarantees in terms of efficiency.
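As a rough illustration of what "projection-free" means in this setting, the sketch below shows a generic Frank-Wolfe-style update driven by a noisy gradient. It is not the algorithm from the paper. The L1-ball feasible set, the quadratic loss, the Gaussian noise model, the step size, and all function names are assumptions made only for this example.

```python
import random


def linear_minimizer_l1(grad, radius=1.0):
    """Minimize <grad, v> over the L1 ball: a signed, scaled basis vector."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    v = [0.0] * len(grad)
    if grad[i] != 0.0:
        v[i] = -radius if grad[i] > 0 else radius
    return v


def frank_wolfe_step(x, grad, step, radius=1.0):
    """Move toward the linear minimizer; no projection is needed because
    a convex combination of feasible points stays feasible."""
    v = linear_minimizer_l1(grad, radius)
    return [(1 - step) * xi + step * vi for xi, vi in zip(x, v)]


def run(target, rounds=2000, noise=0.1, seed=0):
    """Hypothetical driver: noisy gradients of f(x) = 0.5 * ||x - target||^2."""
    rng = random.Random(seed)
    x = [0.0] * len(target)
    for t in range(1, rounds + 1):
        # Assumed noise model: true gradient plus independent Gaussian noise.
        g = [xi - ti + rng.gauss(0.0, noise) for xi, ti in zip(x, target)]
        x = frank_wolfe_step(x, g, step=2.0 / (t + 2))
    return x
```

Each round needs only a linear minimization over the feasible set rather than a projection onto it, which is the usual source of the efficiency gain in projection-free methods.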