Paper Title
Robust Semi-supervised Federated Learning for Images Automatic Recognition in Internet of Drones
Authors
Abstract
Air access networks have been recognized as a significant driver of various Internet of Things (IoT) services and applications. In particular, the aerial computing network infrastructure centered on the Internet of Drones has set off a new revolution in automatic image recognition. This emerging technology relies on sharing ground-truth labeled data among Unmanned Aerial Vehicle (UAV) swarms to train a high-quality automatic image recognition model. However, such an approach raises data privacy and data availability challenges. To address these issues, we first present a Semi-supervised Federated Learning (SSFL) framework for privacy-preserving UAV image recognition. Specifically, we propose a model-parameter mixing strategy, referred to as Federated Mixing (FedMix), to improve on the naive combination of federated learning (FL) and semi-supervised learning methods under two realistic scenarios (labels-at-client and labels-at-server). Furthermore, the local data collected by UAVs using different camera modules in different environments differ significantly in number, features, and distribution, i.e., they exhibit statistical heterogeneity. To alleviate this problem, we propose an aggregation rule based on the frequency of each client's participation in training, namely the FedFreq aggregation rule, which adjusts the weight of the corresponding local model according to that frequency. Numerical results demonstrate that the proposed method performs significantly better than current baselines and is robust to different non-IID levels of client data.
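To make the idea of a participation-frequency-based aggregation rule concrete, the following is a minimal sketch, not the paper's FedFreq rule itself. The abstract states only that a local model's weight is adjusted according to how often its client has participated in training; the inverse-frequency form `1 / (1 + count)`, the function names `frequency_weights` and `aggregate`, and the flat-list model representation are all assumptions made for illustration.

```python
def frequency_weights(participation_counts):
    """Map per-client participation counts to normalized aggregation weights.

    ASSUMPTION: weights shrink as a client participates more often
    (1 / (1 + count)). The abstract does not specify the actual formula.
    """
    raw = {cid: 1.0 / (1 + n) for cid, n in participation_counts.items()}
    total = sum(raw.values())
    return {cid: w / total for cid, w in raw.items()}


def aggregate(local_models, weights):
    """Weighted average of local parameter vectors (plain lists of floats)."""
    dim = len(next(iter(local_models.values())))
    global_model = [0.0] * dim
    for cid, params in local_models.items():
        for i, p in enumerate(params):
            global_model[i] += weights[cid] * p
    return global_model


# Hypothetical round: two UAV clients, one of which has trained more often.
counts = {"uav_a": 1, "uav_b": 3}
models = {"uav_a": [0.0, 3.0], "uav_b": [3.0, 0.0]}

weights = frequency_weights(counts)
global_model = aggregate(models, weights)
```

Under this assumed rule, the client that has participated less often (`uav_a`) receives the larger weight, so the aggregated model sits closer to its parameters. A rule that instead favors frequent participants would only change the expression inside `frequency_weights`.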