Paper Title
Aggressive Language Detection with Joint Text Normalization via Adversarial Multi-task Learning
Paper Authors
Paper Abstract
Aggressive language detection (ALD), i.e., detecting abusive and offensive language in text, is one of the crucial applications in the NLP community. Most existing works treat ALD as a regular classification task with neural models, while ignoring an inherent property of social media text: it is highly unnormalized and irregular. In this work, we improve ALD by jointly performing text normalization (TN) via an adversarial multi-task learning framework. Private encoders for ALD and TN focus on retrieving task-specific features, while a shared encoder learns the underlying features common to both tasks. During adversarial training, a task discriminator attempts to identify which task the shared features come from, pushing the shared encoder toward task-invariant representations. Experimental results on four ALD datasets show that our model outperforms all baselines by large margins under different settings, demonstrating the necessity of jointly learning TN with ALD. Further analysis is conducted for a better understanding of our method.
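To make the shared-private adversarial setup concrete, below is a minimal sketch of one common way to realize it in PyTorch. All module names, layer sizes, and the use of a gradient-reversal layer here are illustrative assumptions (the abstract does not specify the encoder type, the pooling, or how the adversarial signal is applied; the authors may instead train the discriminator with a separate adversarial loss, and TN is simplified here to token-level tagging).

```python
# A hypothetical sketch of the shared-private adversarial multi-task setup
# described in the abstract; not the authors' released implementation.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class AdversarialSharedPrivateModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, num_ald_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Task-specific (private) encoders for ALD and TN.
        self.ald_encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.tn_encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # Shared encoder learning features common to both tasks.
        self.shared_encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # ALD classifier over concatenated private + shared features.
        self.ald_head = nn.Linear(4 * hid_dim, num_ald_labels)
        # TN head: per-token normalization treated as tagging (a simplification).
        self.tn_head = nn.Linear(4 * hid_dim, vocab_size)
        # Task discriminator: guesses which task the shared features came from.
        self.discriminator = nn.Linear(2 * hid_dim, 2)

    def forward(self, tokens, task, lambd=1.0):
        emb = self.embed(tokens)
        shared_out, _ = self.shared_encoder(emb)
        shared_feat = shared_out.mean(dim=1)  # mean-pool over time
        # Gradient reversal: the discriminator trains normally, while the
        # shared encoder is pushed to produce task-invariant features.
        task_logits = self.discriminator(GradReverse.apply(shared_feat, lambd))
        if task == "ald":
            priv_out, _ = self.ald_encoder(emb)
            feat = torch.cat([priv_out.mean(dim=1), shared_feat], dim=-1)
            return self.ald_head(feat), task_logits
        else:  # "tn": per-token prediction
            priv_out, _ = self.tn_encoder(emb)
            feat = torch.cat([priv_out, shared_out], dim=-1)
            return self.tn_head(feat), task_logits
```

In a training loop of this shape, one would typically alternate batches between the two tasks, adding a cross-entropy loss on `task_logits` (against the task identity) to each task's own loss; the reversed gradient then discourages the shared encoder from encoding task-specific information.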