Paper Title
Gradient flows and randomised thresholding: sparse inversion and classification
Paper Authors
Paper Abstract
Sparse inversion and classification problems are ubiquitous in modern data science and imaging. They are often formulated as non-smooth minimisation problems. In sparse inversion, we minimise, e.g., the sum of a data fidelity term and an L1/LASSO regulariser. In classification, we consider, e.g., the sum of a data fidelity term and a non-smooth Ginzburg--Landau energy. Standard (sub)gradient descent methods have been shown to be inefficient when approaching such problems. Splitting techniques are much more useful: here, the target function is partitioned into the sum of two subtarget functions, each of which can be optimised efficiently. Splitting proceeds by performing optimisation steps alternately with respect to each of the two subtarget functions. In this work, we study splitting from a stochastic continuous-time perspective. Indeed, we define a differential inclusion that follows the negative subdifferential of one of the two subtarget functions at each point in time. The choice of subtarget function is controlled by a binary continuous-time Markov process. The resulting dynamical system is a stochastic approximation of the underlying subgradient flow. We investigate this stochastic approximation for an L1-regularised sparse inversion flow and for a discrete Allen--Cahn equation minimising a Ginzburg--Landau energy. In both cases, we study the long-time behaviour of the stochastic dynamical system and its ability to approximate the underlying subgradient flow to any accuracy. We illustrate our theoretical findings on a simple sparse estimation problem and on low- and high-dimensional classification problems.
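To make the construction concrete: writing the target as F = f_1 + f_2 (e.g., data fidelity plus L1 regulariser), the abstract describes a switched differential inclusion dθ(t)/dt ∈ −∂f_{i(t)}(θ(t)), where i(t) ∈ {1, 2} is a binary continuous-time Markov process. The following Python sketch (not the authors' code; the function names, step size, switching rate, and discretisation are illustrative assumptions) simulates such dynamics for an L1-regularised least-squares problem, using an Euler step for the smooth fidelity flow and the exact shrinkage flow for the L1 part.

```python
# Minimal sketch of the randomised splitting dynamics described in the
# abstract, for F(theta) = 0.5*||A theta - y||^2 + lam*||theta||_1.
# The active subtarget is chosen by a binary continuous-time Markov
# process with mean holding time eps; all parameter values are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def randomised_splitting_flow(A, y, lam, eps=0.05, T=20.0, h=1e-3):
    """Euler-type simulation of the switched (sub)gradient flow."""
    n = A.shape[1]
    theta = np.zeros(n)
    i = int(rng.integers(2))            # index of the active subtarget
    t, next_switch = 0.0, rng.exponential(eps)
    while t < T:
        if i == 0:
            # gradient flow of the smooth term 0.5*||A theta - y||^2
            theta -= h * A.T @ (A @ theta - y)
        else:
            # exact flow of lam*||theta||_1 over a step of length h:
            # each coordinate moves towards 0 at speed lam and sticks
            # there, i.e. soft thresholding ("randomised thresholding")
            theta = np.sign(theta) * np.maximum(np.abs(theta) - lam * h, 0.0)
        t += h
        if t >= next_switch:            # Markov switch between subtargets
            i = 1 - i
            next_switch += rng.exponential(eps)
    return theta

# toy sparse estimation problem (illustrative data, not from the paper)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60); x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.01 * rng.standard_normal(30)
print(randomised_splitting_flow(A, y, lam=0.5)[:6])
```

With a fast switching rate (small eps), the trajectory is expected to track the underlying subgradient flow, in line with the abstract's approximation claim; slowing the switching makes the random alternation between fidelity descent and thresholding visible.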