Paper title
On Missing Labels, Long-tails and Propensities in Extreme Multi-label Classification
Paper authors
Paper abstract
The propensity model introduced by Jain et al. (2016) has become a standard approach for dealing with missing and long-tail labels in extreme multi-label classification (XMLC). In this paper, we critically re-examine this approach, showing that despite its theoretical soundness, its application in contemporary XMLC work is debatable. We discuss the flaws of the propensity-based approach in detail and present several recipes, some of them related to solutions used in search engines and recommender systems, that we believe constitute promising alternatives for XMLC.