Paper Title
Distilling Deep RL Models Into Interpretable Neuro-Fuzzy Systems
Paper Authors
Paper Abstract
Deep Reinforcement Learning uses a deep neural network to encode a policy, which achieves very good performance in a wide range of applications but is widely regarded as a black box model. A more interpretable alternative to deep networks is given by neuro-fuzzy controllers. Unfortunately, neuro-fuzzy controllers often need a large number of rules to solve relatively simple tasks, making them difficult to interpret. In this work, we present an algorithm to distill the policy from a deep Q-network into a compact neuro-fuzzy controller. This allows us to train compact neuro-fuzzy controllers through distillation to solve tasks that they are unable to solve directly, combining the flexibility of deep reinforcement learning and the interpretability of compact rule bases. We demonstrate the algorithm on three well-known environments from OpenAI Gym, where we nearly match the performance of a DQN agent using only 2 to 6 fuzzy rules.
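Below is a minimal sketch of the distillation idea described in the abstract: a small Takagi–Sugeno style fuzzy controller with Gaussian membership functions is fit by regression to the Q-values of a trained DQN teacher. The `teacher_q` callable, the rule count, the loss, and the training loop are illustrative assumptions, not the authors' exact procedure.

```python
import torch
import torch.nn as nn

class FuzzyController(nn.Module):
    """Tiny Takagi-Sugeno style fuzzy student: each rule has a Gaussian
    membership function over the state and a constant consequent per action."""
    def __init__(self, state_dim, n_actions, n_rules=4):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_rules, state_dim))      # rule centers
        self.log_widths = nn.Parameter(torch.zeros(n_rules, state_dim))   # log of per-dimension rule widths
        self.consequents = nn.Parameter(torch.zeros(n_rules, n_actions))  # per-rule output (one value per action)

    def forward(self, s):                                      # s: (batch, state_dim)
        d = (s.unsqueeze(1) - self.centers) / self.log_widths.exp()
        firing = torch.exp(-0.5 * (d ** 2).sum(-1))            # rule firing strengths, (batch, n_rules)
        weights = firing / (firing.sum(-1, keepdim=True) + 1e-8)
        return weights @ self.consequents                      # normalized weighted sum of rule outputs

def distill(teacher_q, states, state_dim, n_actions, n_rules=4, epochs=200):
    """Fit the fuzzy student to the teacher's Q-values on a batch of visited states.
    `teacher_q` is a hypothetical callable returning the trained DQN's Q-values."""
    student = FuzzyController(state_dim, n_actions, n_rules)
    opt = torch.optim.Adam(student.parameters(), lr=1e-2)
    targets = teacher_q(states).detach()                       # teacher targets, no gradient
    for _ in range(epochs):
        loss = nn.functional.mse_loss(student(states), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student
```

The distilled controller acts greedily by taking the argmax over its outputs; each rule's center, width, and consequent can then be read off as an interpretable if-then rule over the state variables.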