Paper Title

Interaction Matching for Long-Tail Multi-Label Classification

Authors

Sean MacAvaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder

Abstract

We present an elegant and effective approach for addressing limitations in existing multi-label classification models by incorporating interaction matching, a concept shown to be useful for ad-hoc search result ranking. By performing soft n-gram interaction matching, we match labels with natural language descriptions (which are commonly available in most multi-label tasks). Our approach can be used to enhance existing multi-label classification approaches, which are biased toward frequently occurring labels. We evaluate our approach on two challenging tasks: automatic medical coding of clinical notes and automatic labeling of entities from software tutorial text. Our results show that our method can yield up to an 11% relative improvement in macro performance, with most of the gains stemming from labels that appear infrequently in the training set (i.e., the long tail of labels).
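
The abstract reports its gains in macro performance. As a rough illustration of why macro-averaged metrics are sensitive to long-tail labels, here is a minimal sketch comparing micro- and macro-averaged F1 on toy data. The choice of F1, the label names, and the helper functions `f1` and `micro_macro_f1` are assumptions made for illustration; they are not taken from the paper.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, labels):
    """Return (micro_f1, macro_f1) for lists of gold and predicted label sets."""
    # Per-label [true positives, false positives, false negatives].
    counts = {label: [0, 0, 0] for label in labels}
    for g, p in zip(gold, pred):
        for label in labels:
            if label in p and label in g:
                counts[label][0] += 1
            elif label in p:
                counts[label][1] += 1
            elif label in g:
                counts[label][2] += 1
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts first, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn), macro


# Hypothetical toy data: nine documents carry a frequent label, one a rare label.
labels = ["common", "rare"]
gold = [{"common"}] * 9 + [{"rare"}]
# A classifier biased toward the frequent label predicts it for every document.
pred = [{"common"}] * 10

micro, macro = micro_macro_f1(gold, pred, labels)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy setup the biased classifier still scores well on the micro average, while the macro average drops sharply because the rare label is never predicted correctly. This is why improvements on infrequent labels show up mainly in macro metrics.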
