Paper title
Building power consumption datasets: Survey, taxonomy and future directions
Paper authors
Paper abstract
Over the last decade, extensive effort has gone into improving energy efficiency. Several energy consumption datasets have since been published, each differing in properties, uses and limitations. For instance, building energy consumption patterns depend on several factors, including ambient conditions, user occupancy, weather conditions and consumer preferences. Thus, a proper understanding of the available datasets provides a strong basis for improving energy efficiency. Motivated by the need for a comprehensive review of existing databases, this work surveys, studies and visualizes the numerical and methodological nature of building energy consumption datasets. A total of thirty-one databases are examined and compared in terms of several features, such as geographical location, period of collection, number of monitored households, sampling rate of the collected data, number of sub-metered appliances, extracted features and release date. Furthermore, the data collection platforms and related modules for data transmission, data storage and privacy concerns used in the different datasets are analyzed and compared. Based on this analysis, a novel dataset is presented, namely the Qatar University dataset, an annotated power consumption dataset for anomaly detection. It is intended for training and testing anomaly detection algorithms, and hence for reducing wasted energy. Moving forward, a set of recommendations is derived to improve dataset collection, such as the adoption of multi-modal data collection, smart Internet of Things data collection, low-cost hardware platforms, and privacy and security mechanisms. In addition, future directions to improve dataset exploitation and utilization are identified, including the use of novel machine learning solutions, innovative visualization tools and explainable recommender systems.