Paper Title
Online Model Compression for Federated Learning with Large Models
Paper Authors
Paper Abstract
This paper addresses the challenges of training large neural network models under federated learning settings: high on-device memory usage and communication cost. The proposed Online Model Compression (OMC) provides a framework that stores model parameters in a compressed format and decompresses them only when needed. We use quantization as the compression method in this paper and propose three methods to minimize the impact on model accuracy: (1) per-variable transformation, (2) weight-matrices-only quantization, and (3) partial parameter quantization. According to our experiments on two recent neural networks for speech recognition and two different datasets, OMC can reduce the memory usage and communication cost of model parameters by up to 59% while attaining accuracy and training speed comparable to full-precision training.
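To make the store-compressed/decompress-on-demand idea concrete, below is a minimal sketch of a per-variable affine quantization round trip. The function names, the 8-bit unsigned format, and the use of NumPy are illustrative assumptions, not the paper's actual implementation; the paper's scheme may differ in format and details.

```python
# Sketch of per-variable transformation quantization (assumed details):
# each variable keeps its own scale and offset, so weights are stored
# as uint8 plus two scalars and restored to float32 only when needed.
import numpy as np

def quantize(weights: np.ndarray, num_bits: int = 8):
    """Compress one variable with a per-variable affine transformation."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min  # stored instead of the float32 weights

def dequantize(q: np.ndarray, scale: float, offset: float) -> np.ndarray:
    """Decompress back to float32 on demand."""
    return q.astype(np.float32) * scale + offset

# Round trip: a float32 weight matrix is held at roughly 1/4 the memory
# and reconstructed when the layer is actually used.
w = np.random.randn(256, 256).astype(np.float32)
q, scale, offset = quantize(w)
w_hat = dequantize(q, scale, offset)
assert np.abs(w - w_hat).max() <= scale  # rounding error bound
```

Keeping a separate scale and offset per variable, rather than one global pair, bounds the quantization error by each variable's own dynamic range, which is the motivation the abstract gives for the per-variable transformation.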