Paper Title

Studying Ranking-Incentivized Web Dynamics

Authors

Vasilisky, Ziv; Tennenholtz, Moshe; Kurland, Oren

Abstract

The ranking incentives of many authors of Web pages play an important role in Web dynamics. That is, authors who want their pages to be highly ranked for queries of interest often respond to rankings for these queries by manipulating their pages; the goal is to improve the pages' future rankings. Various theoretical aspects of these dynamics have recently been studied using game theory. However, empirical analysis of the dynamics is highly constrained due to the lack of publicly available datasets. We present an initial such dataset that is based on TREC's ClueWeb09 dataset. Specifically, we used the Wayback Machine of the Internet Archive to build a document collection that contains past snapshots of ClueWeb documents which are highly ranked by some initial search performed for ClueWeb queries. Temporal analysis of document changes in this dataset reveals that findings recently presented for small-scale controlled ranking competitions between documents' authors also hold for Web data. Specifically, documents' authors tend to mimic the content of documents that were highly ranked in the past, and this practice can result in improved ranking.
