Paper Title

Properties of Winning Iterated Prisoner's Dilemma Strategies

Paper Authors

Nikoleta E. Glynatsi, Vincent Knight, Marc Harper

Paper Abstract

Researchers have explored the performance of Iterated Prisoner's Dilemma strategies for decades, from the celebrated performance of Tit for Tat to the introduction of the zero-determinant strategies and the use of sophisticated learning structures such as neural networks. Many new strategies have been introduced and tested in a variety of tournaments and population dynamics. Typical results in the literature, however, rely on performance against a small number of somewhat arbitrarily selected strategies in a small number of tournaments, casting doubt on the generalizability of conclusions. In this work, we analyze a large collection of 195 strategies in thousands of computer tournaments, present the top performing strategies across multiple tournament types, and distill their salient features. The results show that there is not yet a single strategy that performs well in diverse Iterated Prisoner's Dilemma scenarios; nevertheless, there are several properties that heavily influence the best performing strategies. This refines the properties described by Axelrod in light of recent and more diverse opponent populations to: be nice, be provocable and generous, be a little envious, be clever, and adapt to the environment. More precisely, we find that strategies perform best when their probability of cooperation matches the total tournament population's aggregate cooperation probabilities. The features of high performing strategies help cast some light on why strategies such as Tit For Tat performed historically well in tournaments and why zero-determinant strategies typically do not fare well in tournament settings. Furthermore, our findings have implications for the future training of autonomous agents, as understanding the crucial features for incorporation into these agents becomes essential.
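
As a rough illustration of the kind of computer tournament the abstract describes, the sketch below uses the open-source Axelrod-Python library, which the authors maintain. The five-player population and the tournament parameters (200 turns, 10 repetitions) are illustrative assumptions for this sketch, not the paper's full experimental setup of 195 strategies across thousands of tournaments.

```python
# Minimal sketch of a round-robin Iterated Prisoner's Dilemma
# tournament with the Axelrod-Python library. The player list and
# parameters are illustrative, not the paper's actual configuration.
import axelrod as axl

players = [
    axl.TitForTat(),   # cooperates first, then mirrors the opponent's last move
    axl.Cooperator(),  # always cooperates
    axl.Defector(),    # always defects
    axl.Grudger(),     # cooperates until the opponent defects once
    axl.Random(),      # cooperates with probability 0.5
]

# Every pair plays a 200-turn match; 10 repetitions average out
# the stochastic strategies.
tournament = axl.Tournament(players, turns=200, repetitions=10)
results = tournament.play(progress_bar=False)

# Rank strategies by score and report each strategy's cooperation
# rate, the quantity the paper compares against the population's
# aggregate cooperation probability.
print("Ranking:", results.ranked_names)
print("Cooperation rates:", results.cooperating_rating)
```

Comparing each entry of `cooperating_rating` against the population-wide average gives a small-scale version of the paper's central observation: strategies whose cooperation rate tracks the aggregate tend to rank near the top.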
