Paper Title
Meta-Learning in Games
Authors
Abstract
In the literature on game-theoretic equilibrium finding, the focus has mainly been on solving a single game in isolation. In practice, however, strategic interactions (ranging from routing problems to online advertising auctions) evolve dynamically, giving rise to many similar games that must be solved. To address this gap, we introduce meta-learning for equilibrium finding and learning to play games. We establish the first meta-learning guarantees for a variety of fundamental and well-studied classes of games, including two-player zero-sum games, general-sum games, and Stackelberg games. In particular, we obtain rates of convergence to different game-theoretic equilibria that depend on natural notions of similarity between the games in the sequence encountered, while at the same time recovering the known single-game guarantees when the sequence of games is arbitrary. Along the way, we prove a number of new results in the single-game regime through a simple and unified framework, which may be of independent interest. Finally, we evaluate our meta-learning algorithms on endgames faced by the poker agent Libratus against top human professionals. The experiments show that games with varying stack sizes can be solved significantly faster using our meta-learning techniques than by solving them separately, often by an order of magnitude.
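The core idea in the abstract, reusing what was learned on earlier games to speed up equilibrium finding on later, similar ones, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes plain Hedge (multiplicative weights) self-play on small two-player zero-sum matrix games, and it warm-starts each game from the previous game's average strategies mixed with the uniform distribution. The function names, the 0.9/0.1 mixing weights, the step size, and the randomly perturbed payoff matrices are all hypothetical choices made for illustration.

```python
import math
import random


def normalize(weights):
    """Rescale positive weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def payoffs(A, x, y):
    """Return (A y, x^T A): per-action losses for the row player
    and per-action gains for the column player."""
    n, m = len(A), len(A[0])
    Ay = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    xA = [sum(x[i] * A[i][j] for i in range(n)) for j in range(m)]
    return Ay, xA


def duality_gap(A, x, y):
    """Gap of (x, y) when the row player minimizes x^T A y; zero at equilibrium."""
    Ay, xA = payoffs(A, x, y)
    return max(xA) - min(Ay)


def hedge_self_play(A, x0, y0, steps=200, eta=0.1):
    """Run Hedge for both players from (x0, y0); return the average strategies."""
    x, y = list(x0), list(y0)
    x_avg, y_avg = [0.0] * len(x), [0.0] * len(y)
    for _ in range(steps):
        x_avg = [a + p / steps for a, p in zip(x_avg, x)]
        y_avg = [a + p / steps for a, p in zip(y_avg, y)]
        Ay, xA = payoffs(A, x, y)
        # Row player down-weights costly actions, column player up-weights profitable ones.
        x = normalize([p * math.exp(-eta * g) for p, g in zip(x, Ay)])
        y = normalize([p * math.exp(eta * g) for p, g in zip(y, xA)])
    return x_avg, y_avg


def similar_games(num_games=5, n=3, m=3, noise=0.05, seed=0):
    """Hypothetical game sequence: one random base matrix plus small perturbations."""
    rng = random.Random(seed)
    base = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
    return [
        [[base[i][j] + rng.uniform(-noise, noise) for j in range(m)] for i in range(n)]
        for _ in range(num_games)
    ]


def solve_sequence(games, warm_start, steps=200, eta=0.1):
    """Solve the games in order and return the duality gap reached on each one."""
    n, m = len(games[0]), len(games[0][0])
    uniform_x, uniform_y = [1.0 / n] * n, [1.0 / m] * m
    x0, y0 = uniform_x, uniform_y
    gaps = []
    for A in games:
        x, y = hedge_self_play(A, x0, y0, steps, eta)
        gaps.append(duality_gap(A, x, y))
        if warm_start:
            # Assumption: mix with uniform so no action starts with zero weight.
            x0 = [0.9 * p + 0.1 * u for p, u in zip(x, uniform_x)]
            y0 = [0.9 * p + 0.1 * u for p, u in zip(y, uniform_y)]
    return gaps


if __name__ == "__main__":
    games = similar_games()
    print("cold start gaps:", solve_sequence(games, warm_start=False))
    print("warm start gaps:", solve_sequence(games, warm_start=True))
```

Running the script prints the duality gap reached on each game with and without warm-starting, for the same iteration budget. Whether warm-starting helps in this toy setting depends on how similar the games are and on the step size. The paper's guarantees concern its own algorithms and similarity measures, not this sketch.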