Implicit Training of Energy Model for Structure Prediction

Paper Title

Implicit Training of Energy Model for Structure Prediction

Authors

Shiv Shankar, Vihari Piratla

Abstract

Most deep learning research has focused on developing new models and training procedures. The training objective, on the other hand, has usually been restricted to combinations of standard losses. When the objective aligns well with the evaluation metric, this is not a major issue. However, when dealing with complex structured outputs, the ideal objective can be hard to optimize, and the efficacy of the usual objectives as a proxy for the true objective can be questionable. In this work, we argue that the existing inference-network-based structure prediction methods (Tu and Gimpel 2018; Tu, Pang, and Gimpel 2020) are indirectly learning to optimize a dynamic loss objective parameterized by the energy model. We then explore using implicit-gradient-based techniques to learn the corresponding dynamic objectives. Our experiments show that implicitly learning a dynamic loss landscape is an effective method for improving model performance in structure prediction.
