论文标题
互动有限的协作顶级分销标识
Collaborative Top Distribution Identifications with Limited Interaction
论文作者
论文摘要
我们在本文中考虑以下问题:鉴于一组$ n $分布,请找到最大手段的顶部$ m $。在强化学习文献中,此问题也称为{\ em top- $ m $ harm标识},并且具有许多应用。我们在协作学习模型中研究了问题,在该模型中,我们有多个代理可以并行从$ n $分布中绘制样本。我们的目标是表征学习过程的运行时间与代理之间的交互作用数量之间的权衡,这在各种情况下都非常昂贵。我们给出了最佳的时光权衡,并在$ M $的一般$ M $以及固定时间和固定信心变体之间表明了$ 1 $ ARM标识和顶部$ M $ ARM标识之间的复杂性分离。作为副产品,我们还提供了一种算法,用于选择合作学习模型中最大的$ M $ nee发行版。
We consider the following problem in this paper: given a set of $n$ distributions, find the top-$m$ ones with the largest means. This problem is also called {\em top-$m$ arm identifications} in the literature of reinforcement learning, and has numerous applications. We study the problem in the collaborative learning model where we have multiple agents who can draw samples from the $n$ distributions in parallel. Our goal is to characterize the tradeoffs between the running time of learning process and the number of rounds of interaction between agents, which is very expensive in various scenarios. We give optimal time-round tradeoffs, as well as demonstrate complexity separations between top-$1$ arm identification and top-$m$ arm identifications for general $m$ and between fixed-time and fixed-confidence variants. As a byproduct, we also give an algorithm for selecting the distribution with the $m$-th largest mean in the collaborative learning model.