Population Based Training for Data Augmentation and Regularization in Speech Recognition

Paper title

Population Based Training for Data Augmentation and Regularization in Speech Recognition

Authors

Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

Abstract

Varying data augmentation policies and regularization over the course of optimization has led to performance improvements over using fixed values. We show that population based training is a useful tool to continuously search those hyperparameters, within a fixed budget. This greatly simplifies the experimental burden and computational cost of finding such optimal schedules. We experiment in speech recognition by optimizing SpecAugment this way, as well as dropout. It compares favorably to a baseline that does not change those hyperparameters over the course of training, with an 8% relative WER improvement. We obtain 5.18% word error rate on LibriSpeech's test-other.
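For readers unfamiliar with the technique, the following is a minimal sketch of one exploit/explore step of population based training, plus the relative-improvement arithmetic behind a figure such as "8% relative". It is not the authors' implementation: the names `pbt_step` and `relative_improvement`, the dictionary layout of a population member, the perturbation factors 0.8 and 1.2, and the replacement fraction of 25% are all assumptions chosen for illustration.

```python
import random


def pbt_step(population, rng, perturb=(0.8, 1.2), frac=0.25):
    """One exploit/explore step (illustrative sketch, not the paper's code).

    Each member is a dict with hypothetical keys:
      "score"   - validation metric, lower is better (e.g. WER)
      "weights" - model parameters (a plain dict here)
      "hparams" - tunable values such as dropout or augmentation strength
    """
    ranked = sorted(population, key=lambda member: member["score"])
    k = max(1, int(len(ranked) * frac))
    top, bottom = ranked[:k], ranked[-k:]
    for loser in bottom:
        winner = rng.choice(top)
        # Exploit: the weak member restarts from a strong member's weights.
        loser["weights"] = dict(winner["weights"])
        # Explore: perturb the copied hyperparameters by a random factor.
        # A real setup would also clip values to a valid range.
        loser["hparams"] = {
            name: value * rng.choice(perturb)
            for name, value in winner["hparams"].items()
        }
        # The loser's "score" is stale until it is trained and evaluated again.
    return population


def relative_improvement(baseline_wer, new_wer):
    """Relative WER reduction, e.g. 0.08 for an 8% relative improvement."""
    return (baseline_wer - new_wer) / baseline_wer
```

In a full training loop, every member would train for a fixed number of steps, be evaluated, and then pass through a step like this one, so the hyperparameter schedule emerges over the course of a single training budget rather than from separate runs.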
