Paper title
Transgender Community Sentiment Analysis from Social Media Data: A Natural Language Processing Approach
Paper authors
Paper abstract
The transgender community experiences a large disparity in mental health conditions compared with the general population. Interpreting the social media data posted by transgender people may help us better understand the sentiments of these sexual minority groups and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people into negative, positive, and neutral sentiment. Five machine learning algorithms and two deep neural networks are used to build sentiment analysis classifiers based on the annotated data. Results show that our annotations are reliable, with a Cohen's Kappa score over 0.8 across all three classes. The LSTM model performs best, with an accuracy over 0.85 and an AUC of 0.876. Our next step will focus on using advanced natural language processing algorithms on a larger annotated dataset.
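The abstract reports Cohen's Kappa per sentiment class but does not say how it was computed. One common reading, assumed here, is a one-vs-rest score: for each class, both annotators' labels are reduced to "this class or not" and kappa is computed on the binary labels. The sketch below illustrates that reading with the standard formula kappa = (p_o - p_e) / (1 - p_e). The function names, label strings, and toy annotations are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # p_o: fraction of items on which the two annotators agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # p_e: agreement expected by chance from each annotator's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both annotators use one identical label;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def per_class_kappa(a, b, classes):
    """One-vs-rest kappa per class (an assumed reading of 'across all classes')."""
    return {
        c: cohen_kappa([x == c for x in a], [y == c for y in b])
        for c in classes
    }


# Hypothetical toy annotations, for illustration only.
annotator_1 = ["neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos"]
annotator_2 = ["neg", "neg", "pos", "neu", "neu", "neg", "neu", "pos"]
CLASSES = ["neg", "pos", "neu"]

overall = cohen_kappa(annotator_1, annotator_2)
by_class = per_class_kappa(annotator_1, annotator_2, CLASSES)
```

With real annotations, `by_class` would hold one score per sentiment class, which is the kind of figure that could be compared against the 0.8 level the abstract mentions.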