Spatio-Temporal Graph Convolutional Neural Networks for Physics-Aware Grid Learning Algorithms

Paper Title

Spatio-Temporal Graph Convolutional Neural Networks for Physics-Aware Grid Learning Algorithms

Authors

Tong Wu, Ignacio Losada Carreno, Anna Scaglione, Daniel Arnold

Abstract

This paper proposes a model-free Volt-VAR control (VVC) algorithm built on a spatio-temporal graph ConvNet-based deep reinforcement learning (STGCN-DRL) framework, whose goal is to control smart inverters in an unbalanced distribution system. We first identify the graph shift operator (GSO) based on the power flow equations. Then, we develop a spatio-temporal graph ConvNet (STGCN), testing both recurrent graph ConvNet (RGCN) and convolutional graph ConvNet (CGCN) architectures, aimed at capturing the spatio-temporal correlation of voltage phasors. The STGCN layer performs the feature extraction task for both the policy function and the value function of the reinforcement learning architecture. We then use proximal policy optimization (PPO) to search the action space for an optimal policy function and to approximate the optimal value function. We further exploit the low-pass property of voltage graph signals to introduce a GCN architecture for the policy whose input is a decimated state vector, i.e., a partial observation. Case studies on the unbalanced 123-bus system show that the proposed method mitigates instabilities and keeps nodal voltage profiles within a desirable range.
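To make the role of the graph shift operator concrete, the following is a minimal sketch of a polynomial graph filter, the basic building block of a graph convolutional layer. It is not the authors' implementation. The 4-node line feeder, the use of a graph Laplacian as a stand-in GSO (the paper derives its GSO from the power flow equations), the real-valued voltage magnitudes, and the filter taps `h` are all assumptions chosen for illustration.

```python
import numpy as np

def graph_filter(S, x, h):
    """Apply the polynomial graph filter sum_k h[k] * S^k to the signal x."""
    y = np.zeros_like(x, dtype=float)
    z = x.astype(float)  # z holds S^k x, starting at k = 0
    for hk in h:
        y += hk * z      # accumulate the k-th term
        z = S @ z        # shift the signal once more over the graph
    return y

# Hypothetical 4-node line feeder (assumption, not from the paper).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Stand-in GSO: the combinatorial graph Laplacian of the feeder.
S = np.diag(A.sum(axis=1)) - A

# Illustrative per-unit voltage magnitudes at the four nodes.
x = np.array([1.00, 0.98, 1.02, 0.97])

# Illustrative filter taps; in a trained GCN these would be learned.
h = [1.0, -0.25]

y = graph_filter(S, x, h)
print(y)
```

With these taps the filter computes x - 0.25 S x, which pulls each node's value toward its neighbors and leaves a constant signal unchanged. That is a simple case of the low-pass behavior the abstract refers to.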
