Paper Title

Reducing Model Jitter: Stable Re-training of Semantic Parsers in Production Environments

Paper Authors

Christopher Hidey, Fei Liu, Rahul Goel

Paper Abstract

Retraining modern deep learning systems can lead to variations in model performance even when trained using the same data and hyper-parameters by simply using different random seeds. We call this phenomenon model jitter. This issue is often exacerbated in production settings, where models are retrained on noisy data. In this work we tackle the problem of stable retraining with a focus on conversational semantic parsers. We first quantify the model jitter problem by introducing the model agreement metric and showing the variation with dataset noise and model sizes. We then demonstrate the effectiveness of various jitter reduction techniques such as ensembling and distillation. Lastly, we discuss practical trade-offs between such techniques and show that co-distillation provides a sweet spot in terms of jitter reduction for semantic parsing systems with only a modest increase in resource usage.
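
The abstract names a model agreement metric without defining it. As a minimal sketch, one plausible formulation is the fraction of held-out utterances on which two independently retrained parsers produce exactly the same parse; the function name model_agreement and the exact-match criterion below are illustrative assumptions, not the paper's implementation.

from typing import Callable, Iterable

def model_agreement(
    parse_a: Callable[[str], str],
    parse_b: Callable[[str], str],
    utterances: Iterable[str],
) -> float:
    """Fraction of utterances on which two retrained parsers emit identical parses.

    Hypothetical exact-match formulation; the paper's metric may instead
    compare parses at the tree or span level.
    """
    utterances = list(utterances)
    if not utterances:
        raise ValueError("need at least one utterance")
    matches = sum(parse_a(u) == parse_b(u) for u in utterances)
    return matches / len(utterances)

# Usage: measure jitter between two runs that differ only in random seed.
# agreement = model_agreement(model_seed_0.predict, model_seed_1.predict, dev_utterances)

Similarly, co-distillation is only named, not specified. The sketch below shows one common form of the idea, assuming two peer models trained in parallel, where each peer combines a gold cross-entropy term with a KL term toward the other's detached, temperature-softened predictions. The temperature and alpha hyper-parameters and the detach-based gradient stopping are assumptions for illustration, not details from the paper.

import torch.nn.functional as F
from torch import Tensor

def co_distillation_losses(
    logits_a: Tensor,
    logits_b: Tensor,
    targets: Tensor,
    temperature: float = 2.0,
    alpha: float = 0.5,
) -> tuple[Tensor, Tensor]:
    """Per-peer losses: gold cross-entropy plus distillation toward the other peer."""
    t = temperature
    # Supervised term for each peer.
    ce_a = F.cross_entropy(logits_a, targets)
    ce_b = F.cross_entropy(logits_b, targets)
    # Each peer matches the other's temperature-softened distribution;
    # detach() keeps gradients from flowing into the peer being imitated.
    kd_a = F.kl_div(
        F.log_softmax(logits_a / t, dim=-1),
        F.softmax(logits_b.detach() / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
    kd_b = F.kl_div(
        F.log_softmax(logits_b / t, dim=-1),
        F.softmax(logits_a.detach() / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
    return (1 - alpha) * ce_a + alpha * kd_a, (1 - alpha) * ce_b + alpha * kd_b

Under this formulation, each peer is the same size as a single production model, so only one needs to be served; that is consistent with the abstract's claim that co-distillation reduces jitter with only a modest increase in resource usage, paid mainly at training time.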
