Paper Title


Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing

Paper Authors

Zhaoxin Luo, Michael Zhu

Paper Abstract


How to obtain hierarchical representations with an increasing level of abstraction is one of the key issues in learning with deep neural networks. A variety of RNN models have recently been proposed in the literature to incorporate both explicit and implicit hierarchical information into language modeling. In this paper, we propose a novel approach called the latent indicator layer to identify and learn implicit hierarchical information (e.g., phrases), and we further develop an EM algorithm to handle the latent indicator layer during training. The latent indicator layer further simplifies a text's hierarchical structure, which allows us to seamlessly integrate different levels of attention mechanisms into the structure. We call the resulting architecture the EM-HRNN model. Furthermore, we develop two bootstrap strategies to train the EM-HRNN model effectively and efficiently on long text documents. Simulation studies and real data applications demonstrate that the EM-HRNN model with bootstrap training outperforms other RNN-based models in document classification tasks. Its performance is comparable to that of a Transformer-based method, Bert-base, even though the former is a much smaller model and does not require pre-training.
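
To make the abstract's idea of an EM algorithm over a latent indicator layer more concrete, below is a minimal, hypothetical PyTorch sketch of one EM-style training step for document classification: each token carries a binary latent indicator (e.g., "ends a phrase"), whose posterior is estimated in the E-step and held fixed while the RNN and classifier parameters are updated in the M-step. The class name IndicatorRNNClassifier, the network sizes, and the exact E-step/M-step objectives are illustrative assumptions only; the paper's actual EM-HRNN architecture, attention mechanisms, and bootstrap strategies are not reproduced here.

    # Hypothetical sketch of EM-style training with a per-token latent indicator.
    # All layer names and objectives below are assumptions for illustration,
    # not the EM-HRNN model itself.
    import torch
    import torch.nn as nn

    class IndicatorRNNClassifier(nn.Module):
        def __init__(self, vocab_size, emb_dim=64, hid_dim=128, num_classes=5):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
            self.indicator_head = nn.Linear(hid_dim, 1)   # logits for z_t = 1 (phrase boundary)
            self.classifier = nn.Linear(hid_dim, num_classes)

        def forward(self, tokens):
            h, _ = self.word_rnn(self.emb(tokens))                 # (B, T, hid_dim)
            boundary_logits = self.indicator_head(h).squeeze(-1)   # (B, T)
            return h, boundary_logits

    def em_step(model, optimizer, tokens, labels):
        h, boundary_logits = model(tokens)
        # E-step: posterior probability that each token closes a phrase,
        # computed from the current parameters and detached (held fixed).
        with torch.no_grad():
            q = torch.sigmoid(boundary_logits)                      # (B, T)
        # M-step: maximize an expected complete-data objective given q:
        # a q-weighted document vector feeds the classifier, and the
        # indicator head is fit to the fixed posterior (soft targets).
        doc_vec = (q.unsqueeze(-1) * h).sum(1) / q.sum(1, keepdim=True).clamp_min(1e-6)
        cls_loss = nn.functional.cross_entropy(model.classifier(doc_vec), labels)
        ind_loss = nn.functional.binary_cross_entropy_with_logits(boundary_logits, q)
        loss = cls_loss + ind_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # Toy usage on random data.
    model = IndicatorRNNClassifier(vocab_size=1000)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    tokens = torch.randint(0, 1000, (8, 20))
    labels = torch.randint(0, 5, (8,))
    for _ in range(3):
        em_step(model, opt, tokens, labels)

In this toy loop the alternation between fixing the indicator posterior (E-step) and updating the network parameters against the resulting expected objective (M-step) is the generic EM pattern the abstract refers to; the paper's specific formulation may differ.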
