Paper Title

A Framework for Pre-processing of Social Media Feeds based on Integrated Local Knowledge Base

Authors

Taiwo Kolajo, Olawande Daramola, Ayodele Adebiyi, Aaditeshwar Seth

Abstract

Most previous studies on the semantic analysis of social media feeds have not considered the ambiguity associated with the slang, abbreviations, and acronyms embedded in social media posts. These noisy terms carry implicit meanings and form part of the rich semantic context that must be analysed to gain complete insights from social media feeds. This paper proposes an improved framework for pre-processing social media feeds for better performance. To do this, an integrated knowledge base (IKB), which comprises a local knowledge source (Naijalingo), Urban Dictionary, and Internet Slang, was combined with the adapted Lesk algorithm to facilitate semantic analysis of social media feeds. Experimental results showed that the proposed approach performed better than existing methods when tested on three machine learning models: support vector machines, multilayer perceptron, and convolutional neural networks. The framework achieved an accuracy of 94.07% on a standardised dataset and 99.78% on a localised dataset when used to extract sentiments from tweets. The improved performance on the localised dataset reveals the advantage of integrating local knowledge sources into the analysis of social media feeds, particularly for interpreting slang, acronyms, and abbreviations that have contextually rooted meanings.
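To make the idea of knowledge-base lookup plus Lesk-style disambiguation concrete, here is a minimal sketch. It is not the authors' implementation: it uses a simplified gloss-overlap score rather than the adapted Lesk algorithm, and the knowledge base `TOY_IKB`, its glosses, the stopword list, and the function names are all hypothetical stand-ins for the real IKB built from Naijalingo, Urban Dictionary, and Internet Slang.

```python
import re

# Assumption: a tiny stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "the", "is", "to", "of", "in", "on",
             "for", "and", "or", "at", "my", "me", "i", "this"}

# Hypothetical toy knowledge base: term -> list of (expansion, gloss).
# The glosses are written for this example, not taken from any real source.
TOY_IKB = {
    "ama": [
        ("ask me anything",
         "an online question and answer session where users ask a host questions"),
        ("American Music Awards",
         "an annual music awards show with artists and live performances"),
    ],
    "smh": [
        ("shaking my head", "an expression of disappointment or disbelief"),
    ],
}


def tokenize(text):
    """Lowercase, split into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9']+", text.lower())
            if t not in STOPWORDS]


def disambiguate(term, context_tokens, kb):
    """Pick the expansion whose gloss shares the most words with the context.

    Returns None when the term is not in the knowledge base. Ties go to the
    first candidate listed.
    """
    candidates = kb.get(term)
    if not candidates:
        return None
    context = set(context_tokens) - {term}
    best_expansion, best_score = None, -1
    for expansion, gloss in candidates:
        score = len(context & set(tokenize(gloss)))
        if score > best_score:
            best_expansion, best_score = expansion, score
    return best_expansion


def normalize(post, kb=TOY_IKB):
    """Replace known slang/acronyms in a post with their best-matching expansion."""
    context_tokens = tokenize(post)
    out = []
    for word in post.split():
        key = re.sub(r"[^a-z0-9']", "", word.lower())
        expansion = disambiguate(key, context_tokens, kb)
        out.append(expansion if expansion else word)
    return " ".join(out)


print(normalize("Watching the AMA tonight, great music and performances"))
print(normalize("Doing an AMA today, ask questions"))
```

In this sketch the same acronym resolves differently depending on the surrounding words, which is the kind of context-dependent interpretation the abstract attributes to combining a knowledge base with Lesk-style disambiguation. The normalized text would then be passed on to a downstream sentiment classifier.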
