论文标题

部分可观测时空混沌系统的无模型预测

ProtoBandit: Efficient Prototype Selection via Multi-Armed Bandits

论文作者

Chaudhuri, Arghya Roy, Jawanpuria, Pratik, Mishra, Bamdev

论文摘要

在这项工作中,我们提出了一个基于多臂匪徒的框架,用于从源数据集$ s $识别一套紧凑的信息数据实例(即原型),该实例最能代表给定的目标集合$ t $。给定数据集的原型示例提供了对基本数据分布的可解释见解,并协助基于示例的推理,从而影响了人类决策的每个领域。当前的最新原型选择方法需要$ O(| s || T |)$相似性比较源和目标数据点之间的相似性比较,这对于大规模设置而言变得非常昂贵。我们建议通过在典型示例和多武器匪徒的空间中采用随机贪婪的搜索来减轻这种限制,以减少相似性比较的数量。我们的随机算法Protebandit确定了一组$ k $原型的$ O(k^3 | s |)$相似性比较,该比较与目标集的大小无关。我们分析的一个有趣的结果是对于$ K $ - 中间的聚类问题$ t = S $设置),其中我们表明我们的算法Protebandit近似于围绕Medoids(PAM)方法的构建步骤解决方案(k^3 | S |)$复杂性。从经验上讲,我们观察到,原始体可以减少几个大小订单($ 100-1000美元)的相似性计算调用的数量,同时获得与最先进方法相似的解决方案。

In this work, we propose a multi-armed bandit-based framework for identifying a compact set of informative data instances (i.e., the prototypes) from a source dataset $S$ that best represents a given target set $T$. Prototypical examples of a given dataset offer interpretable insights into the underlying data distribution and assist in example-based reasoning, thereby influencing every sphere of human decision-making. Current state-of-the-art prototype selection approaches require $O(|S||T|)$ similarity comparisons between source and target data points, which becomes prohibitively expensive for large-scale settings. We propose to mitigate this limitation by employing stochastic greedy search in the space of prototypical examples and multi-armed bandits for reducing the number of similarity comparisons. Our randomized algorithm, ProtoBandit, identifies a set of $k$ prototypes incurring $O(k^3|S|)$ similarity comparisons, which is independent of the size of the target set. An interesting outcome of our analysis is for the $k$-medoids clustering problem $T = S$ setting) in which we show that our algorithm ProtoBandit approximates the BUILD step solution of the partitioning around medoids (PAM) method in $O(k^3|S|)$ complexity. Empirically, we observe that ProtoBandit reduces the number of similarity computation calls by several orders of magnitudes ($100-1000$ times) while obtaining solutions similar in quality to those from state-of-the-art approaches.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源