Paper Title
Semi-supervised Autoencoding Projective Dependency Parsing
Paper Authors
Paper Abstract
We describe two end-to-end autoencoding models for semi-supervised graph-based projective dependency parsing. The first model is a Locally Autoencoding Parser (LAP) that encodes the input using continuous latent variables in a sequential manner; the second model is a Globally Autoencoding Parser (GAP) that encodes the input into dependency trees as latent variables, with exact inference. Both models consist of two parts: an encoder enhanced by deep neural networks (DNNs) that exploits contextual information to encode the input into latent variables, and a decoder, a generative model able to reconstruct the input. Both LAP and GAP admit a unified structure with shared parameters and different loss functions for labeled and unlabeled data. We conducted experiments on the WSJ and UD dependency parsing datasets, showing that our models can exploit unlabeled data to improve performance given a limited amount of labeled data, and outperform a previously proposed semi-supervised model.
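The abstract's unified objective can be illustrated schematically: a shared encoder produces latent variables for every sentence, a decoder scores the reconstruction of the input, and only labeled sentences add a supervised parsing term. The sketch below is a hypothetical toy illustration of this loss structure, not the paper's actual model; `encode`, `reconstruction_nll`, and `supervised_nll` are invented stand-ins for the DNN encoder, generative decoder, and tree-scoring head.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def encode(tokens, emb):
    # Shared encoder (toy stand-in for a DNN): one scalar latent per token.
    return [emb.get(t, 0.1) for t in tokens]

def reconstruction_nll(tokens, latents, vocab_probs):
    # Decoder term: negative log-likelihood of reconstructing each token
    # from its latent variable. Used for BOTH labeled and unlabeled data.
    nll = 0.0
    for t, z in zip(tokens, latents):
        p = vocab_probs.get(t, 1e-6) * sigmoid(z)
        nll -= math.log(p)
    return nll

def supervised_nll(latents, gold_heads):
    # Supervised term (toy arc score): penalize unlikely head attachments.
    # Only applied when gold dependency heads are available.
    nll = 0.0
    for i, h in enumerate(gold_heads):
        score = sigmoid(latents[i] - 0.5 * abs(i - h))
        nll -= math.log(score)
    return nll

def semi_supervised_loss(batch, emb, vocab_probs):
    # Unified structure: every sentence contributes a reconstruction loss
    # through the shared encoder; labeled sentences (gold_heads is not
    # None) additionally contribute the supervised parsing loss.
    total = 0.0
    for tokens, gold_heads in batch:
        z = encode(tokens, emb)
        total += reconstruction_nll(tokens, z, vocab_probs)
        if gold_heads is not None:
            total += supervised_nll(z, gold_heads)
    return total
```

Because the encoder parameters (`emb` here) are shared across both terms, gradients from unlabeled reconstruction alone can still improve the representations used for supervised parsing, which is the mechanism the abstract credits for the gains with limited labeled data.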