Paper Title

A Relearning Approach to Reinforcement Learning for Control of Smart Buildings

Authors

Avisek Naug, Marcos Quiñones-Grueiro, Gautam Biswas

Abstract

This paper demonstrates that continual relearning of control policies using incremental deep reinforcement learning (RL) can improve policy learning for non-stationary processes. We demonstrate this approach for a data-driven 'smart building environment' that we use as a test-bed for developing HVAC controllers for reducing energy consumption of large buildings on our university campus. The non-stationarity in building operations and weather patterns makes it imperative to develop control strategies that are adaptive to changing conditions. On-policy RL algorithms, such as Proximal Policy Optimization (PPO), represent an approach for addressing this non-stationarity, but exploration on the actual system is not an option for safety-critical systems. As an alternative, we develop an incremental RL technique that reduces building energy consumption without sacrificing overall comfort. We compare the performance of our incremental RL controller to that of a static RL controller that does not implement the relearning function. The performance of the static controller diminishes significantly over time, but the relearning controller adjusts to changing conditions while ensuring comfort and optimal energy performance.
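The core idea of the abstract (a static policy degrades under non-stationarity, while a policy that is periodically retrained on recent data keeps up) can be illustrated with a deliberately tiny toy sketch. This is not the paper's method: the authors use PPO on a data-driven building model, whereas here the "environment" is just a drifting scalar target and "training" is a sliding-window average. All names (`drifting_target`, `relearn_every`, `window`) are hypothetical.

```python
# Toy sketch: static controller vs. one that periodically relearns
# on a sliding window of recent data, in a non-stationary setting.
# Hypothetical illustration only; not the paper's PPO-based method.

def drifting_target(t):
    # Non-stationary process: the "optimal" control value drifts over
    # time, loosely analogous to seasonal weather/building-load changes.
    return 20.0 + 0.01 * t

def fit(history):
    # "Training": estimate the best constant action from observed data.
    return sum(history) / len(history)

def run(relearn_every=None, horizon=1000, window=50):
    history = [drifting_target(t) for t in range(window)]  # warm-up data
    policy = fit(history)                                  # initial training
    total_error = 0.0
    for t in range(window, horizon):
        target = drifting_target(t)
        total_error += abs(policy - target)                # control error
        history.append(target)
        if relearn_every and t % relearn_every == 0:
            policy = fit(history[-window:])                # incremental relearning
    return total_error

static_error = run(relearn_every=None)   # never retrained
relearn_error = run(relearn_every=25)    # retrained every 25 steps
print(static_error, relearn_error)
```

Because the target drifts steadily, the static policy's error grows with the drift, while the relearning policy only ever lags by roughly half a window, mirroring the qualitative result the abstract reports.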
