论文标题

除PAC框架以外的多类可学习性:通用费率和部分概念类别

Multiclass Learnability Beyond the PAC Framework: Universal Rates and Partial Concept Classes

论文作者

Kalavasis, Alkis, Velegkas, Grigoris, Karbasi, Amin

论文摘要

在本文中,我们研究了在可实现的环境中使用有限数量的不同标签$ k $的多类分类的问题。我们将传统的PAC模型扩展到a)与分布相关的学习率,b)在数据依赖性假设下学习率。首先,我们考虑通用学习环境(Bousquet,Hanneke,Moran,Van Handel和Yehudayoff,Stoc '21),为此,我们为每个固定分布所持有的可实现的学习率提供了完整的表征。特别是,我们显示以下三分法:对于任何概念类别,最佳学习率是指数级,线性或任意缓慢的。此外,我们还提供了基本假设类别的复杂度度量,这些假设类别是何时发生的。其次,我们考虑使用结构化数据(例如位于低维歧管或满足边缘条件下的数据)的多类分类问题,该设置由部分概念类别(Alon,Hanneke,Holzman和Moran,focs '21)捕获。部分概念是在输入空间的某些部分中不确定的函数。我们将总概念类别的传统PAC可学习性扩展到多类环境中的部分概念类,并研究部分概念和总概念之间的差异。

In this paper we study the problem of multiclass classification with a bounded number of different labels $k$, in the realizable setting. We extend the traditional PAC model to a) distribution-dependent learning rates, and b) learning rates under data-dependent assumptions. First, we consider the universal learning setting (Bousquet, Hanneke, Moran, van Handel and Yehudayoff, STOC '21), for which we provide a complete characterization of the achievable learning rates that holds for every fixed distribution. In particular, we show the following trichotomy: for any concept class, the optimal learning rate is either exponential, linear or arbitrarily slow. Additionally, we provide complexity measures of the underlying hypothesis class that characterize when these rates occur. Second, we consider the problem of multiclass classification with structured data (such as data lying on a low dimensional manifold or satisfying margin conditions), a setting which is captured by partial concept classes (Alon, Hanneke, Holzman and Moran, FOCS '21). Partial concepts are functions that can be undefined in certain parts of the input space. We extend the traditional PAC learnability of total concept classes to partial concept classes in the multiclass setting and investigate differences between partial and total concepts.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源