Title

Exploiting Vietnamese Social Media Characteristics for Textual Emotion Recognition in Vietnamese

Authors

Khang Phuoc-Quy Nguyen, Kiet Van Nguyen

Abstract

Textual emotion recognition has been a promising research topic in recent years. Many researchers aim to build more accurate and robust emotion detection systems. In this paper, we conduct several experiments to show how data pre-processing affects machine learning methods for textual emotion recognition. These experiments are performed on the Vietnamese Social Media Emotion Corpus (UIT-VSMEC) as the benchmark dataset. We explore Vietnamese social media characteristics to propose different pre-processing techniques, together with key-clause extraction with emotional context, to improve machine performance on UIT-VSMEC. Our experimental evaluation shows that, with appropriate pre-processing techniques based on Vietnamese social media characteristics, Multinomial Logistic Regression (MLR) achieves the best F1-score of 64.40%, a significant improvement of 4.66 percentage points over the CNN model built by the authors of UIT-VSMEC (59.74%).
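The abstract does not list the individual pre-processing rules, so the following is only a minimal sketch of what social-media-aware normalisation might look like before text is passed to a classifier such as MLR. The emoticon map, the placeholder tokens `emo_pos` and `emo_neg`, the function name `normalize`, and the choice of rules (emoticon replacement, lowercasing, collapsing elongated characters) are all assumptions made for illustration and are not taken from the paper.

```python
import re
import unicodedata

# Hypothetical emoticon-to-token map; the paper's actual rules are not
# stated in the abstract, so these entries are illustrative only.
EMOTICONS = {":)": "emo_pos", ":D": "emo_pos", ":(": "emo_neg"}


def normalize(text: str) -> str:
    """Apply a few illustrative normalisation rules to one comment."""
    # Replace emoticons first, while their original casing is intact.
    for emoticon, token in EMOTICONS.items():
        text = text.replace(emoticon, f" {token} ")
    # Compose diacritics (NFC) so each Vietnamese letter is one code point.
    text = unicodedata.normalize("NFC", text).lower()
    # Collapse runs of three or more identical characters ("vuiiii" -> "vui").
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # Squeeze whitespace left over from the replacements above.
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize("Vuiiii  quá!!! :D"))
    print(normalize("Buồnnnn :("))
```

In a full pipeline, the normalised strings would then be turned into features and used to train a multinomial logistic regression model; which features and rules actually produce the reported F1-score is described in the paper itself, not in this sketch.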
