论文标题
图形双线匪徒中的最佳手臂识别
Best Arm Identification in Graphical Bilinear Bandits
论文作者
论文摘要
我们引入了一个新的图形双线性匪徒问题,其中学习者(或\ emph {中央实体})将臂分配给图形节点,并观察到每个边缘的嘈杂双线性奖励,代表两个末端节点之间的相互作用。我们研究了学习者希望找到最大化双线性奖励总和的图形分配的最佳手臂识别问题。通过有效利用此匪徒问题的几何形状,我们提出了基于随机抽样的理论保证,提出了\ emph {分散}分配策略。特别是,我们表征了图结构(例如恒星,完整或圆圈)对收敛速率的影响,并提出了证实这种依赖性的经验实验。
We introduce a new graphical bilinear bandit problem where a learner (or a \emph{central entity}) allocates arms to the nodes of a graph and observes for each edge a noisy bilinear reward representing the interaction between the two end nodes. We study the best arm identification problem in which the learner wants to find the graph allocation maximizing the sum of the bilinear rewards. By efficiently exploiting the geometry of this bandit problem, we propose a \emph{decentralized} allocation strategy based on random sampling with theoretical guarantees. In particular, we characterize the influence of the graph structure (e.g. star, complete or circle) on the convergence rate and propose empirical experiments that confirm this dependency.