高维稀疏线性匪徒

论文标题

高维稀疏线性匪徒

High-Dimensional Sparse Linear Bandits

论文作者

Hao, Botao, Lattimore, Tor, Wang, Mengdi

论文摘要

具有高维稀疏特征的随机线性匪徒是各种领域的实用模型，包括个性化医学和在线广告。我们得出了一个新颖的$ω（n^{2/3}）$无尺寸的minimax后悔的下限，用于稀疏的线性匪徒在数据贫乏的状态下的地平线小于环境维度，而特征矢量则可以接受良好的条件探索。这是通过近乎匹配的上限进行探索的算法的补充，表明$θ（n^{2/3}）$是数据罚款制度中的最佳速率。结果补充了数据丰富的制度的现有界限，并提供了另一个示例，即需要精心平衡信息和遗憾之间的权衡。最后，我们证明了一个无维的$ o（\ sqrt {n}）$遗憾的上限，在相关功能的信号大小的附加假设下。

Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising. We derive a novel $Ω(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm showing that that $Θ(n^{2/3})$ is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.

下载PDF全文

下载文献需遵守相关版权规定

论文标题