Paper Title

Transfer learning for ensembles: reducing computation time and keeping the diversity

Paper Authors

Ilya Shashkov, Nikita Balabin, Evgeny Burnaev, Alexey Zaytsev

Paper Abstract

Transferring a deep neural network trained on one problem to another requires only a small amount of data and little additional computation time. The same behaviour holds for ensembles of deep learning models, which are typically superior to a single model. However, transferring an ensemble of deep neural networks demands relatively high computational expenses, and the probability of overfitting also increases. Our approach to transfer learning of ensembles consists of two steps: (a) shifting the weights of the encoders of all models in the ensemble by a single shift vector, and (b) doing a tiny fine-tuning of each individual model afterwards. This strategy speeds up the training process and makes it possible to add models to an ensemble with significantly reduced training time by reusing the shift vector. We compare different strategies by computation time, ensemble accuracy, uncertainty estimation and disagreement, and conclude that our approach gives competitive results at the same computational complexity as the traditional approach. Our method also keeps the diversity of the ensemble's models higher.
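The two-step procedure described in the abstract (a single shift vector applied to the encoder weights of every ensemble member, followed by a brief per-model fine-tuning) can be sketched in code. The snippet below is a minimal illustrative sketch, not the authors' implementation: the function name `transfer_ensemble`, the assumed model interface (`.encoder` and `.head` submodules), the use of `torch.func.functional_call`, and all hyperparameters are assumptions made for the example.

```python
import torch
from torch.func import functional_call  # requires PyTorch >= 2.0


def transfer_ensemble(ensemble, target_loader, loss_fn,
                      shift_epochs=3, finetune_epochs=1, lr=1e-3):
    """Two-step transfer of an ensemble to a new task:
    (a) learn one shift vector shared by the encoders of all members,
    (b) briefly fine-tune each member individually afterwards.
    Each member is assumed to expose `.encoder` and `.head` submodules."""

    # (a) One shift tensor per encoder parameter, shared across the ensemble.
    # Only the shift is optimized in this step; member weights stay frozen.
    names = [n for n, _ in ensemble[0].encoder.named_parameters()]
    shift = [torch.zeros_like(p, requires_grad=True)
             for p in ensemble[0].encoder.parameters()]
    opt = torch.optim.Adam(shift, lr=lr)

    for _ in range(shift_epochs):
        for x, y in target_loader:
            opt.zero_grad()
            loss = 0.0
            for model in ensemble:
                # Run this member's encoder with its frozen weights plus the shared shift.
                shifted = {n: p.detach() + s
                           for n, p, s in zip(names, model.encoder.parameters(), shift)}
                feats = functional_call(model.encoder, shifted, (x,))
                loss = loss + loss_fn(model.head(feats), y)
            loss.backward()
            opt.step()

    # Bake the learned shift into every member's encoder weights.
    with torch.no_grad():
        for model in ensemble:
            for p, s in zip(model.encoder.parameters(), shift):
                p.add_(s)

    # (b) Tiny individual fine-tuning, which keeps the members diverse.
    for model in ensemble:
        opt_m = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(finetune_epochs):
            for x, y in target_loader:
                opt_m.zero_grad()
                loss_fn(model.head(model.encoder(x)), y).backward()
                opt_m.step()
    return ensemble
```

Because step (a) trains a single shared vector rather than every member's weights, its cost does not grow with the ensemble size, and new members can later be adapted by reusing the already-learned shift before their own short fine-tuning.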
