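Multi-Agent Reinforcement Learning with Graph Convolutional Neural Networks for Optimal Bidding Strategies of Generation Units in Electricity Markets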

Paper title

Multi-Agent Reinforcement Learning with Graph Convolutional Neural Networks for Optimal Bidding Strategies of Generation Units in Electricity Markets

Authors

Pegah Rokhforoz, Olga Fink

Abstract

Finding optimal bidding strategies for generation units in electricity markets leads to higher profit. The problem is challenging, however, because of system uncertainty arising from the unknown strategies of the other generation units. Distributed optimization, in which each entity or agent decides on its bid individually, has become the state of the art, but it cannot overcome the challenge of system uncertainty. Deep reinforcement learning (DRL) is a promising approach for learning the optimal strategy in uncertain environments, yet it cannot integrate information on the spatial system topology into the learning process. This paper proposes a distributed learning algorithm based on DRL combined with a graph convolutional neural network (GCN). The proposed framework helps the agents update their decisions using feedback from the environment, which allows it to overcome the challenge of uncertainty. In the proposed algorithm, the node states and the connections between nodes are the inputs of the GCN, which makes the agents aware of the structure of the system. This information on the system topology helps the agents improve their bidding strategies and increase their profit. We evaluate the proposed algorithm on the IEEE 30-bus system under different scenarios. To investigate the generalization ability of the proposed approach, we also test the trained model on the IEEE 39-bus system. The results show that the proposed algorithm generalizes better than DRL and can yield higher profit when the topology of the system changes.
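The abstract states that node states and the connections between nodes are the inputs of the GCN. As a rough illustration of what that means, the sketch below applies one graph-convolution step in the commonly used symmetric-normalization form, ReLU(D^{-1/2} (A + I) D^{-1/2} X W). This is an assumption for illustration only: the abstract does not specify the paper's layer type, state features, or network sizes. The 4-bus network, the two per-bus features, and the names `normalized_adjacency` and `gcn_layer` are all hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric 0/1 adjacency matrix."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each node keeps its own state
    deg = a_hat.sum(axis=1)              # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalized_adjacency(adj) @ features @ weights, 0.0)


# Hypothetical 4-bus network with lines 0-1, 1-2, 2-3 (not an IEEE test case).
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

# Hypothetical per-bus state, e.g. [demand, previous bid]; the values are made up.
features = np.array(
    [[1.0, 0.5],
     [0.8, 0.2],
     [1.2, 0.7],
     [0.9, 0.4]]
)

# Random weights stand in for parameters that would be learned during training.
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 3))

# Each row is a topology-aware embedding of one bus, mixing in its neighbors' states.
embeddings = gcn_layer(adj, features, weights)
```

In a setup like the one the abstract describes, embeddings of this kind would feed each agent's policy network, so that a change in the adjacency matrix (a different grid topology) changes what the agent observes without changing the learned weights.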
