Paper title
Population-based metaheuristics for Association Rule Text Mining
Paper authors
Paper abstract
Nowadays, most data on the Internet is held in unstructured formats, such as websites and e-mails, and the importance of analyzing these data grows day by day. Similar to data mining on structured data, text mining methods for handling unstructured data have received increasing attention from the research community. This paper deals with the problem of Association Rule Text Mining. To solve the problem, the PSO-ARTM method is proposed, which consists of three steps: text preprocessing, Association Rule Text Mining using population-based metaheuristics, and text postprocessing. The method was applied to a transaction database obtained from professional triathlon athletes' blogs and news posted on their websites. The obtained results indicate that the proposed method is suitable for Association Rule Text Mining and therefore offers a promising direction for further development.
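
As a rough illustration of what a transaction database built from text might look like, the sketch below turns a few short documents into term sets and computes the standard support and confidence of one association rule. It is not the PSO-ARTM method: the stopword list, the sample documents, and the helper names (`to_transaction`, `support`, `confidence`) are all assumptions made for this example, and the abstract does not specify the preprocessing or the measures the metaheuristic optimizes.

```python
from typing import FrozenSet, List

# Illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "and", "in", "on", "of", "to", "for"}


def to_transaction(text: str) -> FrozenSet[str]:
    """Preprocess one document into a transaction: a set of lowercase terms."""
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return frozenset(t for t in tokens if t and t not in STOPWORDS)


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every term in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Made-up documents standing in for blog posts (hypothetical data).
docs = [
    "Swim training in the morning.",
    "Bike and swim training for the race.",
    "Race report: swim, bike, run.",
    "Recovery run on Sunday.",
]
transactions = [to_transaction(d) for d in docs]

# Rule under inspection: {swim} => {training}
rule_support = support(frozenset({"swim", "training"}), transactions)
rule_confidence = confidence(frozenset({"swim"}), frozenset({"training"}), transactions)
print(rule_support, rule_confidence)
```

A population-based metaheuristic would typically search over candidate rules and score each one with a fitness function; combining measures such as these into that fitness is a common choice, but whether PSO-ARTM does so is not stated in the abstract.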