论文标题
BSDGAN:平衡传感器数据生成的对抗网络,用于人类活动识别
BSDGAN: Balancing Sensor Data Generative Adversarial Networks for Human Activity Recognition
论文作者
论文摘要
物联网技术的开发使各种传感器可以集成到移动设备中。基于传感器数据的人类活动识别(HAR)已成为机器学习和无处不在计算领域的积极研究主题。但是,由于人类活动的频率不一致,人类活动数据集中每个活动的数据量都会失衡。考虑到有限的传感器资源和手动标记的传感器数据的高成本,人类活动识别正面临着高度不平衡的活动数据集的挑战。在本文中,我们提出平衡传感器数据生成对抗网络(BSDGAN),以生成少数人类活动的传感器数据。提出的BSDGAN由生成器模型和一个歧视模型组成。考虑到人类活动数据集的极端失衡,使用自动编码器来初始化BSDGAN的训练过程,并确保可以学习每个活动的数据特征。生成的活动数据与原始数据集结合在一起,以平衡人类活动类别的活动数据量。我们在两个公开可用的不平衡人类活动数据集WISDM和UNIMIB上部署了多个人类活动识别模型。实验结果表明,所提出的BSDGAN可以有效地捕获实际人类活动传感器数据的数据特征,并生成逼真的合成传感器数据。同时,平衡的活动数据集可以有效地帮助活动识别模型提高识别精度。
The development of IoT technology enables a variety of sensors can be integrated into mobile devices. Human Activity Recognition (HAR) based on sensor data has become an active research topic in the field of machine learning and ubiquitous computing. However, due to the inconsistent frequency of human activities, the amount of data for each activity in the human activity dataset is imbalanced. Considering the limited sensor resources and the high cost of manually labeled sensor data, human activity recognition is facing the challenge of highly imbalanced activity datasets. In this paper, we propose Balancing Sensor Data Generative Adversarial Networks (BSDGAN) to generate sensor data for minority human activities. The proposed BSDGAN consists of a generator model and a discriminator model. Considering the extreme imbalance of human activity dataset, an autoencoder is employed to initialize the training process of BSDGAN, ensure the data features of each activity can be learned. The generated activity data is combined with the original dataset to balance the amount of activity data across human activity classes. We deployed multiple human activity recognition models on two publicly available imbalanced human activity datasets, WISDM and UNIMIB. Experimental results show that the proposed BSDGAN can effectively capture the data features of real human activity sensor data, and generate realistic synthetic sensor data. Meanwhile, the balanced activity dataset can effectively help the activity recognition model to improve the recognition accuracy.