Paper title
The estimation error of general first order methods
Paper authors
Paper abstract
Modern large-scale statistical models require the estimation of thousands to millions of parameters. This is often accomplished by iterative algorithms such as gradient descent, projected gradient descent, or their accelerated versions. What are the fundamental limits of these approaches? This question is well understood from an optimization viewpoint when the underlying objective is convex. Work in this area characterizes the gap to global optimality as a function of the number of iterations. However, these results have only indirect implications in terms of the gap to statistical optimality. Here we consider two families of high-dimensional estimation problems: high-dimensional regression and low-rank matrix estimation, and introduce a class of `general first order methods' that aim at efficiently estimating the underlying parameters. This class of algorithms is broad enough to include classical first order optimization (for convex and non-convex objectives), but also other types of algorithms. Under a random design assumption, we derive lower bounds on the estimation error that hold in the high-dimensional asymptotics in which both the number of observations and the number of parameters diverge. These lower bounds are optimal in the sense that there exist algorithms whose estimation error matches the lower bounds up to asymptotically negligible terms. We illustrate our general results through applications to sparse phase retrieval and sparse principal component analysis.
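As context for the class of algorithms the abstract refers to, the following is a minimal illustrative sketch (not taken from the paper) of one such first order method: projected gradient descent for sparse linear regression under a random Gaussian design, where the projection is hard thresholding onto k-sparse vectors. The function names, step-size rule, and toy problem sizes below are illustrative assumptions, not the authors' construction.

import numpy as np

def hard_threshold(v, k):
    # Keep the k largest-magnitude entries of v, zero out the rest.
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def projected_gradient_descent(X, y, k, step=None, n_iter=200):
    # Minimize ||y - X theta||^2 / (2n) subject to ||theta||_0 <= k.
    # Each iteration uses only a gradient step plus a projection,
    # i.e. a (non-convex) first order method in the sense discussed above.
    n, p = X.shape
    if step is None:
        # Conservative step size 1/L, with L the gradient Lipschitz constant.
        step = n / (np.linalg.norm(X, 2) ** 2)
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        theta = hard_threshold(theta - step * grad, k)
    return theta

# Toy usage: random design, k-sparse ground truth, small noise.
rng = np.random.default_rng(0)
n, p, k = 500, 1000, 10
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[:k] = rng.standard_normal(k)
y = X @ theta_star + 0.1 * rng.standard_normal(n)
theta_hat = projected_gradient_descent(X, y, k)
print("estimation error:", np.linalg.norm(theta_hat - theta_star))

The quantity printed at the end, the distance between the iterate and the true parameter vector, is the estimation error for which the paper derives lower bounds over all general first order methods in the high-dimensional limit.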