Paper Title
Simple Alternating Minimization Provably Solves Complete Dictionary Learning
Paper Authors
Paper Abstract
This paper focuses on the noiseless complete dictionary learning problem, where the goal is to represent a set of given signals as linear combinations of a small number of atoms from a learned dictionary. Theoretical and practical studies of dictionary learning face two main challenges: the lack of theoretical guarantees for practically-used heuristic algorithms, and their poor scalability when dealing with huge-scale datasets. Towards addressing these issues, we propose a simple and efficient algorithm that provably recovers the ground truth when applied to the nonconvex and discrete formulation of the problem in the noiseless setting. We also extend our proposed method to mini-batch and online settings, where the data is huge-scale or arrives continuously over time. At the core of our proposed method lies an efficient preconditioning technique that transforms the unknown dictionary into a near-orthonormal one, for which we prove that a simple alternating minimization technique converges linearly to the ground truth under minimal conditions. Our numerical experiments on synthetic and real datasets showcase the superiority of our method compared with existing techniques.
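The pipeline the abstract describes — whiten the observations so the unknown complete dictionary becomes near-orthonormal, then alternate between hard-thresholding sparse coding and an orthogonal dictionary update — can be sketched as follows. This is an illustrative reconstruction under stated assumptions (Gaussian square dictionary, exactly s-sparse Gaussian codes, a fixed threshold, random initialization, and a Procrustes-style dictionary update), not the paper's exact algorithm or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 20, 10000, 3  # atoms, signals, nonzeros per signal (illustrative sizes)

# Ground-truth complete (square, invertible) dictionary and s-sparse codes.
A = rng.standard_normal((n, n))
X = np.zeros((n, p))
for j in range(p):
    X[rng.choice(n, size=s, replace=False), j] = rng.standard_normal(s)
Y = A @ X  # noiseless observations

# Preconditioning: with unit-variance s-sparse codes, (1/p) Y Y^T ~ (s/n) A A^T,
# so whitening by its inverse square root maps A to a near-orthonormal matrix.
C = (Y @ Y.T) / p
w, V = np.linalg.eigh(C)
P = V @ np.diag(w ** -0.5) @ V.T
Yp = np.sqrt(s / n) * (P @ Y)  # rescaled so the effective dictionary is ~orthonormal

def hard_threshold(Z, tau):
    # Keep entries larger than tau in magnitude, zero out the rest.
    return np.where(np.abs(Z) > tau, Z, 0.0)

tau = 0.5  # assumed threshold level
D = np.linalg.qr(rng.standard_normal((n, n)))[0]  # random orthonormal init

def objective(D):
    # l0-penalized reconstruction cost that both alternating steps decrease.
    Xhat = hard_threshold(D.T @ Yp, tau)
    return np.linalg.norm(Yp - D @ Xhat) ** 2 + tau ** 2 * np.count_nonzero(Xhat)

f0 = objective(D)
for _ in range(30):
    # Sparse coding step: for orthonormal D, hard-thresholding D^T Yp is the
    # exact minimizer of the l0-penalized least-squares subproblem.
    Xhat = hard_threshold(D.T @ Yp, tau)
    # Dictionary step: orthogonal Procrustes update via SVD of Yp Xhat^T.
    U, _, Vt = np.linalg.svd(Yp @ Xhat.T)
    D = U @ Vt
```

Because each step solves its subproblem exactly, the penalized objective is monotonically nonincreasing across iterations; the mini-batch and online variants mentioned in the abstract would replace `Yp` with a stream of preconditioned batches.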