CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations

Paper Title

CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations

Authors

Borun Chen, Hongyin Tang, Jiahao Bu, Kai Zhang, Jingang Wang, Qifan Wang, Hai-Tao Zheng, Wei Wu, Liqian Yu

Abstract

Pre-trained Language Models (PLMs) have achieved remarkable performance gains across numerous downstream tasks in natural language understanding. Various Chinese PLMs have been proposed in succession to learn better Chinese language representations. However, most current models take Chinese characters as input and cannot encode the semantic information contained in Chinese words. Although recent pre-trained models incorporate words and characters simultaneously, they usually suffer from deficient semantic interaction and fail to capture the semantic relation between words and characters. To address these issues, we propose CLOWER, a simple yet effective PLM that adopts Contrastive Learning Over Word and charactER representations. In particular, CLOWER implicitly encodes coarse-grained information (i.e., words) into fine-grained representations (i.e., characters) through contrastive learning on multi-grained information. CLOWER is of great value in realistic scenarios because it can be easily incorporated into any existing fine-grained PLM without modifying the production pipeline. Extensive experiments on a range of downstream tasks demonstrate the superior performance of CLOWER over several state-of-the-art baselines.
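The abstract does not spell out the training objective, so the following is only a minimal sketch of what a contrastive loss between character-level and word-level representations could look like. It assumes a generic symmetric InfoNCE loss with in-batch negatives; the function names (`info_nce`, `l2_normalize`, `log_softmax`), the temperature value, and the use of one pooled vector per sentence are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def info_nce(char_repr, word_repr, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed stand-in, not the paper's exact objective).

    char_repr, word_repr: arrays of shape (batch, dim). Row i of each array is
    assumed to be the pooled character-level / word-level encoding of the same
    sentence i, so (i, i) is the positive pair and every (i, j != i) pair in
    the batch acts as a negative.
    """
    c = l2_normalize(np.asarray(char_repr, dtype=float))
    w = l2_normalize(np.asarray(word_repr, dtype=float))
    logits = c @ w.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    loss_c2w = -log_softmax(logits)[idx, idx].mean()    # characters -> words
    loss_w2c = -log_softmax(logits.T)[idx, idx].mean()  # words -> characters
    return 0.5 * (loss_c2w + loss_w2c)
```

Under these assumptions, minimizing the loss pulls the character-level encoding of a sentence toward its own word-level encoding and away from those of other sentences in the batch. Only the character-level encoder would then be needed at inference time, which is one plausible reading of the claim that the production pipeline stays unchanged.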
