论文标题
线性随机匪徒在一个有点受限的通道上
Linear Stochastic Bandits over a Bit-Constrained Channel
论文作者
论文摘要
大规模分布式学习中的主要挑战之一是严格的沟通限制。尽管最近的几项工作针对静态优化问题解决了这一挑战,但在这方面,不确定性下的顺序决策仍然少得多。在这个差距的激励下,我们在一个有点受约束的通道上引入了一种新的线性随机匪徒配方。具体而言,在我们的设置中,通过有限容量的通信通道,与环境相互作用的代理将未知模型参数的编码估计值传输到服务器。服务器的目的是根据这些估算采取措施,以最大程度地减少累积后悔。为此,我们开发了一种新颖而通用的算法框架,该框架取决于两个主要组成部分:(i)一种自适应编码机制,利用统计浓度界限,以及(ii)基于置信度的决策原则,这些决策原理是说明编码错误的置信度。作为我们的主要结果,我们证明,当未知模型为$ d $二维时,渠道容量为$ o(d)$ lits就足以实现订单最佳的遗憾。为了证明我们的方法的通用性,我们表明,对于满足标准规律性条件的非线性观察模型,相同的结果继续存在。最后,我们确定,对于更简单的非结构化多军强盗问题,$ 1 $ BIT CHANNEL-ACTAILE足以实现最佳的遗憾界限。总体而言,我们的工作迈出了重要的第一步,以铺平有限能力渠道的统计决策。
One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is $d$-dimensional, a channel capacity of $O(d)$ bits suffices to achieve order-optimal regret. To demonstrate the generality of our approach, we then show that the same result continues to hold for non-linear observation models satisfying standard regularity conditions. Finally, we establish that for the simpler unstructured multi-armed bandit problem, $1$ bit channel-capacity is sufficient for achieving optimal regret bounds. Overall, our work takes a significant first step towards paving the way for statistical decision-making over finite-capacity channels.