Counterfactual Data Augmentation using Locally Factored Dynamics

Paper title

Counterfactual Data Augmentation using Locally Factored Dynamics

Authors

Silviu Pitis, Elliot Creager, Animesh Garg

Abstract

Many dynamic processes, including common scenarios in robotic control and reinforcement learning (RL), involve a set of interacting subprocesses. Though the subprocesses are not independent, their interactions are often sparse, and the dynamics at any given time step can often be decomposed into locally independent causal mechanisms. Such local causal structures can be leveraged to improve the sample efficiency of sequence prediction and off-policy reinforcement learning. We formalize this by introducing local causal models (LCMs), which are induced from a global causal model by conditioning on a subset of the state space. We propose an approach to inferring these structures given an object-oriented state representation, as well as a novel algorithm for Counterfactual Data Augmentation (CoDA). CoDA uses local structures and an experience replay to generate counterfactual experiences that are causally valid in the global model. We find that CoDA significantly improves the performance of RL agents in locally factored tasks, including the batch-constrained and goal-conditioned settings.
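The swapping idea behind CoDA can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the locally independent components of each transition are already known (the abstract describes inferring them, which is omitted here), it ignores actions and rewards, and the names `coda_swap`, `Transition`, and `Components` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Slot = Tuple[float, ...]              # one object's state
State = Dict[str, Slot]               # object name -> object state
Transition = Tuple[State, State]      # (state, next_state)
Components = Set[FrozenSet[str]]      # assumed partition of the object names


def coda_swap(t1: Transition, comps1: Components,
              t2: Transition, comps2: Components) -> List[Transition]:
    """Return copies of t1 in which one independent component comes from t2."""
    augmented = []
    # Only a component that is independent in BOTH transitions may be swapped.
    shared = comps1 & comps2
    for comp in sorted(shared, key=sorted):  # sorted for a stable order
        if len(comp) == len(t1[0]):
            continue  # swapping the whole state would just reproduce t2
        state, next_state = dict(t1[0]), dict(t1[1])
        for name in comp:
            state[name] = t2[0][name]
            next_state[name] = t2[1][name]
        augmented.append((state, next_state))
    return augmented


# Hypothetical 1-D example: robot and ball do not interact in either transition.
t1 = ({"robot": (0.0,), "ball": (5.0,)}, {"robot": (1.0,), "ball": (5.0,)})
t2 = ({"robot": (2.0,), "ball": (7.0,)}, {"robot": (3.0,), "ball": (8.0,)})
comps = {frozenset({"robot"}), frozenset({"ball"})}
samples = coda_swap(t1, comps, t2, comps)
```

Each element of `samples` combines the robot's motion from one observed transition with the ball's motion from the other. Such a combination is consistent with the dynamics only because, under the stated assumption, the two objects did not influence each other in either transition.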
