论文标题
高维稀疏线性匪徒
High-Dimensional Sparse Linear Bandits
论文作者
论文摘要
具有高维稀疏特征的随机线性匪徒是各种领域的实用模型,包括个性化医学和在线广告。我们得出了一个新颖的$ω(n^{2/3})$无尺寸的minimax后悔的下限,用于稀疏的线性匪徒在数据贫乏的状态下的地平线小于环境维度,而特征矢量则可以接受良好的条件探索。这是通过近乎匹配的上限进行探索的算法的补充,表明$θ(n^{2/3})$是数据罚款制度中的最佳速率。结果补充了数据丰富的制度的现有界限,并提供了另一个示例,即需要精心平衡信息和遗憾之间的权衡。最后,我们证明了一个无维的$ o(\ sqrt {n})$遗憾的上限,在相关功能的信号大小的附加假设下。
Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising. We derive a novel $Ω(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm showing that that $Θ(n^{2/3})$ is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.