线性随机匪徒在一个有点受限的通道上

论文标题

线性随机匪徒在一个有点受限的通道上

Linear Stochastic Bandits over a Bit-Constrained Channel

论文作者

Mitra, Aritra, Hassani, Hamed, Pappas, George J.

论文摘要

大规模分布式学习中的主要挑战之一是严格的沟通限制。尽管最近的几项工作针对静态优化问题解决了这一挑战，但在这方面，不确定性下的顺序决策仍然少得多。在这个差距的激励下，我们在一个有点受约束的通道上引入了一种新的线性随机匪徒配方。具体而言，在我们的设置中，通过有限容量的通信通道，与环境相互作用的代理将未知模型参数的编码估计值传输到服务器。服务器的目的是根据这些估算采取措施，以最大程度地减少累积后悔。为此，我们开发了一种新颖而通用的算法框架，该框架取决于两个主要组成部分：（i）一种自适应编码机制，利用统计浓度界限，以及（ii）基于置信度的决策原则，这些决策原理是说明编码错误的置信度。作为我们的主要结果，我们证明，当未知模型为$ d $二维时，渠道容量为$ o（d）$ lits就足以实现订单最佳的遗憾。为了证明我们的方法的通用性，我们表明，对于满足标准规律性条件的非线性观察模型，相同的结果继续存在。最后，我们确定，对于更简单的非结构化多军强盗问题，$ 1 $ BIT CHANNEL-ACTAILE足以实现最佳的遗憾界限。总体而言，我们的工作迈出了重要的第一步，以铺平有限能力渠道的统计决策。

One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is $d$-dimensional, a channel capacity of $O(d)$ bits suffices to achieve order-optimal regret. To demonstrate the generality of our approach, we then show that the same result continues to hold for non-linear observation models satisfying standard regularity conditions. Finally, we establish that for the simpler unstructured multi-armed bandit problem, $1$ bit channel-capacity is sufficient for achieving optimal regret bounds. Overall, our work takes a significant first step towards paving the way for statistical decision-making over finite-capacity channels.

下载PDF全文

下载文献需遵守相关版权规定

论文标题