Paper Title
A Dynamic Low-Rank Fast Gaussian Transform
Authors
Abstract
The \emph{Fast Gaussian Transform} (FGT) enables subquadratic-time multiplication of an $n\times n$ Gaussian kernel matrix $\mathsf{K}_{i,j}= \exp ( - \| x_i - x_j \|_2^2 ) $ with an arbitrary vector $h \in \mathbb{R}^n$, where $x_1,\dots, x_n \in \mathbb{R}^d$ are a set of \emph{fixed} source points. This kernel plays a central role in machine learning and random feature maps. Nevertheless, in most modern data analysis applications, datasets are dynamically changing (yet often have low rank), and recomputing the FGT from scratch in (kernel-based) algorithms incurs a major computational overhead ($\gtrsim n$ time for a single source update in $\mathbb{R}^d$). These applications motivate a \emph{dynamic FGT} algorithm, which maintains a dynamic set of sources under \emph{kernel-density estimation} (KDE) queries in \emph{sublinear time} while retaining the accuracy and speed of matrix-vector multiplication. Assuming the dynamic data points $x_i$ lie in a (possibly changing) $k$-dimensional subspace ($k\leq d$), our main result is an efficient dynamic FGT algorithm that supports the following operations in $\log^{O(k)}(n/\varepsilon)$ time: (1) adding or deleting a source point, and (2) estimating the ``kernel-density'' of a query point with respect to the sources, with $\varepsilon$ additive accuracy. The core of the algorithm is a dynamic data structure for maintaining the \emph{projected} ``interaction rank'' between source and target boxes, decoupled into finite truncations of Taylor and Hermite expansions.
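To make the operations in the abstract concrete, the following is a minimal brute-force sketch of the baseline that a dynamic FGT is meant to improve on. It is not the paper's data structure: it evaluates the kernel exactly, spends time linear in $n$ per query, and does not use the low-rank assumption or any Taylor/Hermite truncation. The class name `NaiveDynamicKDE`, the helper names, and the per-source weights (playing the role of the entries of $h$) are assumptions introduced here for illustration only.

```python
import math


def gaussian_kernel(x, y):
    """Return exp(-||x - y||_2^2) for two points of equal dimension."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


class NaiveDynamicKDE:
    """Brute-force baseline (hypothetical, for illustration).

    Updates cost O(1), but every query scans all n sources, i.e. O(n d) time.
    """

    def __init__(self):
        self.sources = {}   # source id -> (point, weight)
        self._next_id = 0

    def insert(self, point, weight=1.0):
        """Add a source point; returns an id that can be used to delete it."""
        sid = self._next_id
        self._next_id += 1
        self.sources[sid] = (tuple(point), weight)
        return sid

    def delete(self, sid):
        """Remove a previously inserted source point."""
        del self.sources[sid]

    def query(self, q):
        """Exact weighted kernel density of q: sum_j w_j * exp(-||x_j - q||^2)."""
        return sum(w * gaussian_kernel(x, q) for x, w in self.sources.values())


def kernel_matvec(points, h):
    """Naive (K h)_i = sum_j exp(-||x_i - x_j||^2) * h_j in O(n^2 d) time."""
    return [
        sum(gaussian_kernel(xi, xj) * hj for xj, hj in zip(points, h))
        for xi in points
    ]


if __name__ == "__main__":
    kde = NaiveDynamicKDE()
    first = kde.insert((0.0, 0.0))
    second = kde.insert((1.0, 0.0))
    print(kde.query((0.0, 0.0)))   # contribution of both sources
    kde.delete(second)
    print(kde.query((0.0, 0.0)))   # only the first source remains
```

The contrast with the abstract is the query cost: this baseline touches every source on each query, whereas the stated goal is $\log^{O(k)}(n/\varepsilon)$ time per update and per query at the price of an $\varepsilon$ additive error.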