Paper Title
Exploiting Vietnamese Social Media Characteristics for Textual Emotion Recognition in Vietnamese
Paper Authors
Paper Abstract
Textual emotion recognition has been a promising research topic in recent years, and many researchers aim to build more accurate and robust emotion detection systems. In this paper, we conduct several experiments to show how data pre-processing affects the performance of a machine learning method on textual emotion recognition. These experiments use the Vietnamese Social Media Emotion Corpus (UIT-VSMEC) as the benchmark dataset. We explore Vietnamese social media characteristics to propose different pre-processing techniques, together with key-clause extraction with emotional context, to improve model performance on UIT-VSMEC. Our experimental evaluation shows that, with appropriate pre-processing techniques based on Vietnamese social media characteristics, Multinomial Logistic Regression (MLR) achieves the best F1-score of 64.40%, a significant improvement of 4.66 percentage points over the CNN model built by the authors of UIT-VSMEC (59.74%).
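To make the idea of pre-processing driven by social media characteristics more concrete, here is a minimal Python sketch of the kind of text normalization such a pipeline might include. The abstract does not list the actual techniques, so every rule below (Unicode NFC normalization, lowercasing, URL removal, collapsing elongated letters) and the function name `normalize` are illustrative assumptions, not the authors' method. Key-clause extraction and the MLR classifier itself are not covered.

```python
import re
import unicodedata

# Hypothetical clean-up rules for illustration only; the paper's real
# pre-processing steps are not given in the abstract.
URL_RE = re.compile(r"https?://\S+")
# A single letter (any script, no digits or underscore) repeated 3+ times.
ELONGATED_RE = re.compile(r"([^\W\d_])\1{2,}")


def normalize(text: str) -> str:
    """Apply illustrative social-media clean-up steps to one comment."""
    # Compose base letters and combining diacritics into single code points,
    # so that accented Vietnamese letters compare consistently.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    # Replace links with a space; they rarely carry emotional content.
    text = URL_RE.sub(" ", text)
    # Collapse elongated letters, e.g. "vuiiii" -> "vui".
    text = ELONGATED_RE.sub(r"\1", text)
    # Squeeze runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()
```

Under these assumptions, the normalized comments would then be turned into features for a classifier such as MLR. Which rules actually help on UIT-VSMEC is an empirical question that the paper's experiments address.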