Paper Title
Performance Comparison of Deep RL Algorithms for Energy Systems Optimal Scheduling
Paper Authors
Paper Abstract
Taking advantage of their data-driven and model-free features, Deep Reinforcement Learning (DRL) algorithms have the potential to deal with the increasing level of uncertainty introduced by renewable-based generation. To handle the energy system's operational cost and technical constraints (e.g., the generation-demand power balance) simultaneously, DRL algorithms must consider a trade-off when designing the reward function. This trade-off introduces extra hyperparameters that impact the DRL algorithms' performance and their ability to provide feasible solutions. In this paper, a performance comparison of different DRL algorithms, including DDPG, TD3, SAC, and PPO, is presented. We aim to provide a fair comparison of these DRL algorithms for the energy system optimal scheduling problem. Results show the DRL algorithms' capability of providing good-quality solutions in real time, even in unseen operational scenarios, when compared against a mathematical programming model of the energy system optimal scheduling problem. Nevertheless, in the case of large peak consumption, these algorithms failed to provide feasible solutions, which can impede their practical implementation.
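The reward-function trade-off mentioned in the abstract is commonly realized as a weighted penalty on constraint violations. The sketch below is a minimal illustration under that assumption, not the paper's actual formulation: the function name, the cost and imbalance arguments, and the penalty weight `w_penalty` are all hypothetical names introduced here for exposition.

```python
# Minimal sketch of a penalized reward for energy-system scheduling.
# Assumption: feasibility is enforced via a weighted penalty term; this
# is NOT the paper's reward design. `operational_cost`, `power_imbalance`,
# and `w_penalty` are hypothetical names used only for illustration.

def reward(operational_cost: float, power_imbalance: float,
           w_penalty: float = 100.0) -> float:
    """Trade off cost minimization against constraint satisfaction.

    operational_cost: cost of the chosen dispatch over one step.
    power_imbalance:  violation of the generation-demand balance.
    w_penalty:        extra hyperparameter weighting feasibility; its
                      value affects both solution quality and the
                      agent's ability to return feasible schedules.
    """
    return -(operational_cost + w_penalty * abs(power_imbalance))
```

Under this kind of design, a small `w_penalty` favors low-cost but possibly infeasible schedules, while a large one pushes the agent toward feasibility at the expense of cost, which is the hyperparameter sensitivity the abstract refers to.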