Paper Title
Distillation-Based Semi-Supervised Federated Learning for Communication-Efficient Collaborative Training with Non-IID Private Data
Paper Authors
Abstract
This study develops a federated learning (FL) framework that overcomes the communication costs that grow substantially with model size in typical frameworks, without compromising model performance. To this end, building on the idea of leveraging an unlabeled open dataset, we propose a distillation-based semi-supervised FL (DS-FL) algorithm in which mobile devices exchange the outputs of their local models rather than the model parameters exchanged in typical frameworks. In DS-FL, the communication cost depends only on the output dimensions of the models and does not scale with model size. The exchanged model outputs are used to label each sample of the open dataset, which creates an additional labeled dataset. The local models are then trained further on this new dataset, and model performance improves owing to the data augmentation effect. We further highlight that in DS-FL, heterogeneity among the devices' datasets makes the label for each data sample ambiguous and slows training convergence. To prevent this, we propose entropy reduction averaging, in which the aggregated model outputs are intentionally sharpened. Moreover, extensive experiments show that DS-FL reduces communication costs by up to 99% relative to the FL benchmark while achieving similar or higher classification accuracy.
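The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract says only that the averaged outputs are "intentionally sharpened"; the temperature-scaled softmax used here is an assumed way of doing that, and the function names, the temperature value of 0.1, and the toy numbers are all hypothetical. The sketch also shows why the exchanged payload depends on the output dimension and not on the number of model parameters.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of each probability vector."""
    return -(p * np.log(p + eps)).sum(axis=axis)


def aggregate_outputs(local_probs, temperature=None):
    """Average per-device class probabilities on the open dataset.

    local_probs has shape (num_devices, num_open_samples, num_classes).
    With temperature=None this is plain averaging. With a small
    temperature (assumption: a temperature-scaled softmax) the averaged
    outputs are sharpened, which lowers their entropy.
    """
    mean_probs = local_probs.mean(axis=0)
    if temperature is None:
        return mean_probs
    return softmax(mean_probs / temperature, axis=-1)


def output_payload_floats(num_open_samples, num_classes):
    """Floats one device uploads per round when it shares outputs only."""
    return num_open_samples * num_classes


# Toy example: two devices with non-IID data disagree on one open sample.
local_probs = np.array([
    [[0.7, 0.2, 0.1]],   # device A is fairly confident in class 0
    [[0.3, 0.4, 0.3]],   # device B is unsure
])

plain = aggregate_outputs(local_probs)                     # simple average
sharp = aggregate_outputs(local_probs, temperature=0.1)    # sharpened average

print("plain average:", plain[0], "entropy:", entropy(plain)[0])
print("sharpened    :", sharp[0], "entropy:", entropy(sharp)[0])

# Hypothetical sizes: the payload is independent of the parameter count.
print("floats uploaded per round:", output_payload_floats(1000, 10))
```

In this sketch the sharpened distribution keeps the same most likely class as the plain average but assigns it more probability mass, so the label attached to the open-data sample is less ambiguous. The upload size is the number of open samples times the number of classes, regardless of how large the local model is.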