Paper title
Social diversity and social preferences in mixed-motive reinforcement learning
Paper authors
Paper abstract
Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity. In contrast, studies of reinforcement learning in mixed-motive games have primarily leveraged homogeneous approaches. Given the defining characteristic of mixed-motive games--the imperfect correlation of incentives between group members--we study the effect of population heterogeneity on mixed-motive reinforcement learning. We draw on interdependence theory from social psychology and imbue reinforcement learning agents with Social Value Orientation (SVO), a flexible formalization of preferences over group outcome distributions. We subsequently explore the effects of diversity in SVO on populations of reinforcement learning agents in two mixed-motive Markov games. We demonstrate that heterogeneity in SVO generates meaningful and complex behavioral variation among agents similar to that suggested by interdependence theory. Empirical results in these mixed-motive dilemmas suggest agents trained in heterogeneous populations develop particularly generalized, high-performing policies relative to those trained in homogeneous populations.
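To make the idea of SVO as a preference over group outcome distributions concrete, here is a minimal Python sketch. It assumes the common ring-style formulation from social psychology, in which an outcome is summarized by the angle between an agent's own reward and the mean reward of the other group members. The abstract does not state the paper's exact formulation, so the function names (`reward_angle`, `svo_utility`), the linear penalty, and the `weight` parameter are illustrative assumptions rather than the authors' method.

```python
import math


def reward_angle(own_reward, others_rewards):
    """Angle in degrees of the point (own reward, mean reward of others).

    0 degrees means only the agent itself is rewarded; 45 degrees means
    the agent and the others are rewarded equally.
    """
    mean_others = sum(others_rewards) / len(others_rewards)
    return math.degrees(math.atan2(mean_others, own_reward))


def svo_utility(own_reward, others_rewards, target_angle_deg, weight):
    """Hypothetical SVO-shaped utility (an assumption, not the paper's formula).

    The agent keeps its own reward but pays a penalty proportional to how
    far the realized reward angle is from its preferred (target) angle.
    """
    deviation = abs(target_angle_deg - reward_angle(own_reward, others_rewards))
    return own_reward - weight * deviation


if __name__ == "__main__":
    # Same outcome, two agents with different hypothetical preferred angles.
    outcome = (1.0, [1.0, 1.0])
    for target in (0.0, 45.0):
        print(target, svo_utility(*outcome, target_angle_deg=target, weight=0.01))
```

Under these assumptions, a heterogeneous population is simply one whose agents are assigned different `target_angle_deg` values, while a homogeneous population shares a single value.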