Paper Title

Multi-agent Dynamic Algorithm Configuration

Authors

Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu

Abstract

Automated algorithm configuration relieves users from tedious, trial-and-error tuning tasks. A popular algorithm configuration tuning paradigm is dynamic algorithm configuration (DAC), in which an agent learns dynamic configuration policies across instances by reinforcement learning (RL). However, in many complex algorithms, there may exist different types of configuration hyperparameters, and such heterogeneity may bring difficulties for classic DAC which uses a single-agent RL policy. In this paper, we aim to address this issue and propose multi-agent DAC (MA-DAC), with one agent working for one type of configuration hyperparameter. MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm. To instantiate, we apply MA-DAC to a well-known optimization algorithm for multi-objective optimization problems. Experimental results show the effectiveness of MA-DAC in not only achieving superior performance compared with other configuration tuning approaches based on heuristic rules, multi-armed bandits, and single-agent RL, but also being capable of generalizing to different problem classes. Furthermore, we release the environments in this paper as a benchmark for testing MARL algorithms, with the hope of facilitating the application of MARL.
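The idea of "one agent working for one type of configuration hyperparameter" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy target algorithm (an elitist local search minimizing x²), the two hyperparameter types in `ACTIONS`, the coarse shared state, and the use of independent tabular Q-learning with a shared team reward as a simple stand-in for a cooperative MARL algorithm.

```python
import random

# Hypothetical heterogeneous hyperparameters: one agent per type.
ACTIONS = {
    "step_size": [0.01, 0.1, 1.0],        # numeric hyperparameter
    "operator": ["gaussian", "uniform"],  # categorical hyperparameter
}


class ToyTargetAlgorithm:
    """Stand-in for the configured algorithm: elitist local search on x**2."""

    def __init__(self, x0, horizon, rng):
        self.x, self.t, self.horizon, self.rng = x0, 0, horizon, rng

    def state(self):
        # Shared observation: coarse time bucket and coarse magnitude of |x|.
        mag = 0 if abs(self.x) < 0.1 else 1 if abs(self.x) < 1.0 else 2
        return (self.t * 3 // self.horizon, mag)

    def step(self, config):
        sigma = config["step_size"]
        if config["operator"] == "gaussian":
            delta = self.rng.gauss(0.0, sigma)
        else:
            delta = self.rng.uniform(-sigma, sigma)
        old = self.x ** 2
        candidate = self.x + delta
        if candidate ** 2 < old:  # accept only improvements
            self.x = candidate
        self.t += 1
        reward = old - self.x ** 2  # team reward: objective improvement
        return self.state(), reward, self.t >= self.horizon


class QAgent:
    """Tabular epsilon-greedy Q-learner controlling one hyperparameter type."""

    def __init__(self, n_actions, rng, alpha=0.2, gamma=0.95, eps=0.1):
        self.q, self.n, self.rng = {}, n_actions, rng
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def values(self, s):
        return self.q.setdefault(s, [0.0] * self.n)

    def act(self, s, greedy=False):
        if not greedy and self.rng.random() < self.eps:
            return self.rng.randrange(self.n)
        v = self.values(s)
        return v.index(max(v))

    def update(self, s, a, r, s2, done):
        target = r if done else r + self.gamma * max(self.values(s2))
        v = self.values(s)
        v[a] += self.alpha * (target - v[a])


def train(episodes=200, horizon=20, seed=0):
    rng = random.Random(seed)
    agents = {name: QAgent(len(c), rng) for name, c in ACTIONS.items()}
    for _ in range(episodes):
        # Each episode plays the role of one problem instance (context).
        env = ToyTargetAlgorithm(rng.uniform(-5.0, 5.0), horizon, rng)
        s, done = env.state(), False
        while not done:
            joint = {name: agent.act(s) for name, agent in agents.items()}
            config = {name: ACTIONS[name][a] for name, a in joint.items()}
            s2, r, done = env.step(config)
            for name, agent in agents.items():
                agent.update(s, joint[name], r, s2, done)  # same team reward
            s = s2
    return agents
```

Calling `train()` returns one learned Q-table per hyperparameter type. Because every agent receives the same reward, the setting is fully cooperative; a real application would replace the toy search with the target algorithm and the independent learners with a proper cooperative MARL method.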
