Online Boosting with Bandit Feedback

Paper Title

Online Boosting with Bandit Feedback

Authors

Nataly Brukhim, Elad Hazan

Abstract

We consider the problem of online boosting for regression tasks, when only limited information is available to the learner. We give an efficient regret minimization method that has two implications: an online boosting algorithm with noisy multi-point bandit feedback, and a new projection-free online convex optimization algorithm with stochastic gradient, that improves state-of-the-art guarantees in terms of efficiency.
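
The abstract names two ingredients: gradient information recovered from multi-point bandit feedback, and projection-free updates. The sketch below is only a rough illustration of how those two ideas are commonly combined, not the algorithm from the paper. It assumes a standard two-point gradient estimator and a Frank-Wolfe style step over an L1 ball; the feasible set, the quadratic loss, the step size schedule, and all function names are hypothetical choices made for this example.

```python
import math
import random


def sample_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def two_point_gradient_estimate(loss, x, delta, rng):
    """Estimate the gradient at x from two loss evaluations (bandit feedback)."""
    dim = len(x)
    u = sample_unit_vector(dim, rng)
    plus = loss([xi + delta * ui for xi, ui in zip(x, u)])
    minus = loss([xi - delta * ui for xi, ui in zip(x, u)])
    scale = dim * (plus - minus) / (2.0 * delta)
    return [scale * ui for ui in u]


def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: a vertex of the L1 ball minimizing <grad, v>."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    v = [0.0] * len(grad)
    if grad[i] != 0.0:
        v[i] = -radius * math.copysign(1.0, grad[i])
    return v


def frank_wolfe_step(x, grad, radius, step):
    """Move toward the oracle's vertex; a convex combination needs no projection."""
    v = lmo_l1_ball(grad, radius)
    return [(1.0 - step) * xi + step * vi for xi, vi in zip(x, v)]


def run_sketch(rounds=200, seed=0):
    """Run the illustrative loop on a fixed quadratic loss (an assumption)."""
    rng = random.Random(seed)
    target = [0.3, -0.2, 0.1]  # hypothetical minimizer inside the unit L1 ball

    def loss(x):
        return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

    x = [0.0, 0.0, 0.0]
    for t in range(1, rounds + 1):
        g = two_point_gradient_estimate(loss, x, delta=0.01, rng=rng)
        x = frank_wolfe_step(x, g, radius=1.0, step=1.0 / (t + 1))
    return x
```

Because each update is a convex combination of the current point and a vertex of the feasible set, the iterate stays feasible without a projection step, which is the usual motivation for projection-free methods. The learner here only ever queries loss values, never gradients.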
