Paper Title
SIMPLE: A Gradient Estimator for $k$-Subset Sampling
Paper Authors
Paper Abstract
$k$-subset sampling is ubiquitous in machine learning, enabling regularization and interpretability through sparsity. The challenge lies in rendering $k$-subset sampling amenable to end-to-end learning. This has typically involved relaxing the reparameterized samples to allow for backpropagation, with the risk of introducing high bias and high variance. In this work, we fall back to discrete $k$-subset sampling on the forward pass. This is coupled with using the gradient with respect to the exact marginals, computed efficiently, as a proxy for the true gradient. We show that our gradient estimator, SIMPLE, exhibits lower bias and variance compared to state-of-the-art estimators, including the straight-through Gumbel estimator when $k = 1$. Empirical results show improved performance on learning to explain and sparse linear regression. We provide an algorithm for computing the exact ELBO for the $k$-subset distribution, obtaining significantly lower loss compared to SOTA.
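To make the high-level recipe concrete, below is a minimal PyTorch sketch of the idea the abstract describes: sample a discrete $k$-subset on the forward pass, and route gradients through the exact inclusion marginals on the backward pass. It relies on two standard facts: the normalizer of the $k$-subset distribution is the elementary symmetric polynomial $e_k$ of the item weights (computable with an $O(nk)$ dynamic program), and the marginals satisfy $\mu = \partial \log Z / \partial \theta$. This is an illustrative sketch, not the authors' released implementation; every name here (`log_esp_suffix`, `exact_marginals`, `sample_k_subset`, `simple_sample`, the `NEG` constant) is an assumption of this example.

```python
import torch

# A large negative constant standing in for log(0); keeping it finite avoids
# -inf - (-inf) = NaN artifacts in the backward pass.
NEG = -1e9

def log_esp_suffix(theta, k):
    """DP table E with E[i][j] = log e_j(exp(theta[i:])), where e_j is the
    j-th elementary symmetric polynomial, computed in O(n k) log-space."""
    n = theta.shape[0]
    row = [theta.new_zeros(())] + [theta.new_full((), NEG)] * k  # empty suffix
    table = [row]
    for i in reversed(range(n)):
        prev = table[0]
        row = [theta.new_zeros(())]
        for j in range(1, k + 1):
            # e_j(w_i:) = e_j(w_{i+1}:) + w_i * e_{j-1}(w_{i+1}:)
            row.append(torch.logaddexp(prev[j], theta[i] + prev[j - 1]))
        table.insert(0, row)
    return table

def exact_marginals(theta, k):
    """Inclusion probabilities mu_i = P(i in S). Since log Z = log e_k(w),
    mu = d log Z / d theta; create_graph keeps mu differentiable in theta."""
    log_z = log_esp_suffix(theta, k)[0][k]
    (mu,) = torch.autograd.grad(log_z, theta, create_graph=True)
    return mu

def sample_k_subset(theta, k):
    """Exact sample from p(S) proportional to prod_{i in S} exp(theta_i),
    |S| = k, via sequential conditionals read off the suffix DP."""
    with torch.no_grad():
        E = log_esp_suffix(theta, k)
        b, c = torch.zeros_like(theta), k  # c = remaining budget
        for i in range(theta.shape[0]):
            if c == 0:
                break
            # P(i in S | picks so far) = w_i e_{c-1}(w_{i+1:}) / e_c(w_{i:})
            p = torch.exp(theta[i] + E[i + 1][c - 1] - E[i][c])
            if torch.rand(()) < p:
                b[i], c = 1.0, c - 1
        return b

def simple_sample(theta, k):
    """Forward: a discrete k-hot sample. Backward: gradients flow through
    the exact marginals, the proxy-gradient idea sketched in the abstract."""
    mu = exact_marginals(theta, k)
    return sample_k_subset(theta, k) + (mu - mu.detach())

theta = torch.randn(10, requires_grad=True)
b = simple_sample(theta, 3)                 # discrete 3-hot vector
(b * torch.arange(10.0)).sum().backward()   # grads reach theta via mu
```

The straight-through composition `b + (mu - mu.detach())` is what makes the sketch work: numerically it equals the discrete sample `b`, but in the autograd graph only the `mu` term carries gradient, so downstream losses differentiate through the exact marginals rather than through the non-differentiable sampling step.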