Paper title

Discovering New Classes in Tabular Data

Découvrir de nouvelles classes dans des données tabulaires

Authors

Troisemaine, Colin; Flocon-Cholet, Joachim; Gosselin, Stéphane; Vaton, Sandrine; Reiffers-Masson, Alexandre; Lemaire, Vincent

Abstract

In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes. While NCD has recently gained attention from the community, no framework has yet been proposed for heterogeneous tabular data, even though it is a very common data representation. In this paper, we propose TabularNCD, a new method for discovering novel classes in tabular data. We show a way to extract knowledge from already known classes to guide the discovery of novel classes in tabular data that contains heterogeneous variables. Part of this process relies on a new method for defining pseudo labels, and we follow recent findings in Multi-Task Learning to optimize a joint objective function. Our method demonstrates that NCD is applicable not only to images but also to heterogeneous tabular data.
