Paper title
A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent
Paper authors
Paper abstract
This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on real-world data sets.
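To make the setting concrete, the following is a minimal NumPy sketch of random Fourier features, not the authors' code. It assumes the standard construction with cosine and sine features of Gaussian random projections, a unit-bandwidth Gaussian kernel, and a generic ridge regressor. The function names, the regularization `lam`, the toy data, and the normalization are illustrative assumptions and may differ from the paper's conventions. With $n$ and $p$ held fixed, the RFF Gram matrix approaches the Gaussian kernel matrix as $N$ grows, which is the classical limit the abstract contrasts with the regime where $n$, $p$, $N$ are comparable.

```python
import numpy as np

def rff_features(X, W):
    """Map data X of shape (n, p) to features of shape (n, 2N), given W of shape (N, p)."""
    Z = X @ W.T                               # random projections, shape (n, N)
    return np.hstack([np.cos(Z), np.sin(Z)])  # cosine and sine features

def rff_gram(X, W):
    """RFF Gram matrix, normalized by N (an assumed normalization)."""
    S = rff_features(X, W)
    return S @ S.T / W.shape[0]

def gaussian_kernel(X):
    """Gaussian kernel matrix exp(-||x_i - x_j||^2 / 2), unit bandwidth assumed."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / 2.0)

def rff_ridge_fit(X, y, W, lam):
    """Generic ridge regression on RFF features (illustrative, not the paper's exact form)."""
    S = rff_features(X, W)
    A = S.T @ S + lam * np.eye(S.shape[1])
    return np.linalg.solve(A, S.T @ y)

rng = np.random.default_rng(0)
n, p = 50, 10                                 # toy sizes, chosen for illustration only
X = rng.standard_normal((n, p)) / np.sqrt(p)
y = np.sin(X.sum(axis=1))                     # hypothetical regression target
K = gaussian_kernel(X)

# Compare the RFF Gram matrix to the Gaussian kernel for a small and a large N.
W_small = rng.standard_normal((25, p))
W_large = rng.standard_normal((20000, p))
err_small = np.linalg.norm(rff_gram(X, W_small) - K, 2)
err_large = np.linalg.norm(rff_gram(X, W_large) - K, 2)
print("spectral-norm error, N=25:   ", err_small)
print("spectral-norm error, N=20000:", err_large)

# Fit the ridge regressor on the small feature map and report the training error.
beta = rff_ridge_fit(X, y, W_small, lam=1e-2)
train_mse = np.mean((rff_features(X, W_small) @ beta - y) ** 2)
print("training MSE, N=25:", train_mse)
```

When $N$ is of the same order as $n$, the gap between the RFF Gram matrix and the kernel matrix stays visible in spectral norm; this is the regime the analysis above targets.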