Paper Title

Accelerating Distributed Online Meta-Learning via Multi-Agent Collaboration under Limited Communication

Paper Authors

Sen Lin, Mehmet Dedeoglu, Junshan Zhang

Paper Abstract

Online meta-learning is emerging as an enabling technique for achieving edge intelligence in the IoT ecosystem. Nevertheless, to learn a good meta-model for within-task fast adaptation, a single agent alone has to learn over many tasks, which is the so-called 'cold-start' problem. Observing that in a multi-agent network the learning tasks across different agents often share some model similarity, we ask the following fundamental question: "Is it possible to accelerate online meta-learning across agents via limited communication, and if so, how much benefit can be achieved?" To answer this question, we propose a multi-agent online meta-learning framework and cast it as an equivalent two-level nested online convex optimization (OCO) problem. By characterizing an upper bound on the agent-task-averaged regret, we show that the performance of multi-agent online meta-learning depends heavily on how much an agent can benefit from the distributed network-level OCO for meta-model updates via limited communication, which, however, is not yet well understood. To tackle this challenge, we devise a distributed online gradient descent algorithm with gradient tracking, in which each agent tracks the global gradient using only one communication step with its neighbors per iteration; this yields an average regret of $O(\sqrt{T/N})$ per agent, i.e., a $\sqrt{1/N}$-factor speedup over the optimal single-agent regret $O(\sqrt{T})$ after $T$ iterations, where $N$ is the number of agents. Building on this sharp performance speedup, we next develop a multi-agent online meta-learning algorithm and show that it achieves the optimal task-average regret at the faster rate of $O(1/\sqrt{NT})$ via limited communication, compared to single-agent online meta-learning. Extensive experiments corroborate the theoretical results.
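To make the gradient-tracking mechanism described in the abstract concrete, here is a minimal Python/NumPy sketch of distributed online gradient descent with gradient tracking. It is an illustration under stated assumptions, not the paper's implementation: the function name distributed_ogd_gt, the gradient oracle grad(t, i, x), and the doubly stochastic mixing matrix W are all hypothetical. Each round, every agent mixes its model and its gradient tracker with its neighbors once (one communication step) and corrects the tracker by the change in its local gradient, so the tracker follows the network-average gradient.

import numpy as np

def distributed_ogd_gt(grad, W, x0, alpha, T):
    # grad(t, i, x): gradient of agent i's round-t loss at x (hypothetical oracle)
    # W: (N, N) doubly stochastic mixing matrix, W[i, j] > 0 only for neighbors
    # x0: (N, d) initial models, one row per agent
    N, d = x0.shape
    x = x0.copy()
    g_prev = np.stack([grad(0, i, x[i]) for i in range(N)])
    y = g_prev.copy()               # each tracker starts from the local gradient
    for t in range(1, T):
        x = W @ x - alpha * y       # one neighbor-averaging + descent step
        g = np.stack([grad(t, i, x[i]) for i in range(N)])
        y = W @ y + g - g_prev      # gradient-tracking correction
        g_prev = g
    return x

# Toy usage with static quadratic losses f_i(x) = 0.5 * ||x - c_i||^2,
# whose network-average minimizer is the mean of the c_i:
rng = np.random.default_rng(0)
N, d = 8, 5
c = rng.normal(size=(N, d))
W = np.full((N, N), 1.0 / N)        # complete graph with uniform weights
x = distributed_ogd_gt(lambda t, i, z: z - c[i], W, np.zeros((N, d)), 0.1, 200)
print(np.allclose(x, c.mean(axis=0), atol=1e-2))   # all agents near the mean

Under these assumptions, each tracker y[i] approximates the global gradient $\frac{1}{N}\sum_j \nabla f_j^t$, which is the mechanism the abstract credits for the $O(\sqrt{T/N})$ per-agent regret with only one communication step per iteration.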
