The Good Shepherd: An Oracle Agent for Mechanism Design

Paper title

The Good Shepherd: An Oracle Agent for Mechanism Design

Authors

Jan Balaguer, Raphael Koster, Christopher Summerfield, Andrea Tacchetti

Abstract

From social networks to traffic routing, artificial learning agents are playing a central role in modern institutions. We must therefore understand how to leverage these systems to foster outcomes and behaviors that align with our own values and aspirations. While multiagent learning has received considerable attention in recent years, artificial agents have been primarily evaluated when interacting with fixed, non-learning co-players. Although this evaluation scheme has merit, it fails to capture the dynamics faced by institutions that must deal with adaptive and continually learning constituents. Here we address this limitation, and construct agents ("mechanisms") that perform well when evaluated over the learning trajectory of their adaptive co-players ("participants"). The algorithm we propose consists of two nested learning loops: an inner loop where participants learn to best respond to fixed mechanisms; and an outer loop where the mechanism agent updates its policy based on experience. We report the performance of our mechanism agents when paired with both artificial learning agents and humans as co-players. Our results show that our mechanisms are able to shepherd the participants' strategies towards favorable outcomes, indicating a path for modern institutions to effectively and automatically influence the strategies and behaviors of their constituents.
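
The nested-loop structure described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or environment: the scalar mechanism parameter `theta`, the quadratic participant payoff, the `target` action, the `cost` term, and the finite-difference outer update are all assumptions chosen only to keep the example small and deterministic. What it does share with the abstract is the shape of the computation: the participant learns against a fixed mechanism in an inner loop, and the mechanism is scored over the participant's whole learning trajectory and updated in an outer loop.

```python
def inner_loop(theta, steps=50, lr=0.2):
    """Participant learns against a FIXED mechanism parameter theta.

    Assumed (hypothetical) participant payoff: theta * a - 0.5 * a**2,
    whose best response is a = theta. Returns the full action trajectory.
    """
    a = 0.0
    trajectory = []
    for _ in range(steps):
        # Gradient ascent step on the participant's own payoff.
        a += lr * (theta - a)
        trajectory.append(a)
    return trajectory


def mechanism_score(theta, target=1.0, cost=0.05):
    """Score a mechanism over the participant's learning trajectory.

    Assumed objective: keep the participant's action close to `target`
    at every learning step, minus a quadratic cost for a large theta.
    """
    trajectory = inner_loop(theta)
    avg_loss = sum((a - target) ** 2 for a in trajectory) / len(trajectory)
    return -avg_loss - cost * theta ** 2


def outer_loop(theta=0.0, iterations=200, lr=0.1, eps=1e-4):
    """Mechanism updates theta from experience.

    Uses a central finite-difference gradient estimate as a stand-in
    for whatever update rule a real mechanism agent would use.
    """
    for _ in range(iterations):
        grad = (mechanism_score(theta + eps) - mechanism_score(theta - eps)) / (2 * eps)
        theta += lr * grad
    return theta


if __name__ == "__main__":
    learned_theta = outer_loop()
    print("learned theta:", learned_theta)
    print("score before:", mechanism_score(0.0))
    print("score after:", mechanism_score(learned_theta))
```

Because the score averages over every step of the inner loop rather than only the final action, the mechanism in this toy is rewarded for how quickly the participant's behavior moves toward the target, not only for where it ends up.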
