论文标题

高维稀疏线性匪徒

High-Dimensional Sparse Linear Bandits

论文作者

Hao, Botao, Lattimore, Tor, Wang, Mengdi

论文摘要

具有高维稀疏特征的随机线性匪徒是各种领域的实用模型,包括个性化医学和在线广告。我们得出了一个新颖的$ω(n^{2/3})$无尺寸的minimax后悔的下限,用于稀疏的线性匪徒在数据贫乏的状态下的地平线小于环境维度,而特征矢量则可以接受良好的条件探索。这是通过近乎匹配的上限进行探索的算法的补充,表明$θ(n^{2/3})$是数据罚款制度中的最佳速率。结果补充了数据丰富的制度的现有界限,并提供了另一个示例,即需要精心平衡信息和遗憾之间的权衡。最后,我们证明了一个无维的$ o(\ sqrt {n})$遗憾的上限,在相关功能的信号大小的附加假设下。

Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising. We derive a novel $Ω(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm showing that that $Θ(n^{2/3})$ is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源