Paper title
Faster Approximate Covering of Subcurves under the Fréchet Distance
Authors
Abstract
Subtrajectory clustering is an important variant of the trajectory clustering problem in which the start and end points of trajectory patterns within the collected trajectory data are not known in advance. We study this problem in the form of a set cover problem for a given polygonal curve: find the smallest number $k$ of representative curves such that any point on the input curve is contained in a subcurve that has Fréchet distance at most a given $\Delta$ to a representative curve. We focus on the case where the representative curves are line segments and approach this NP-hard problem with classical techniques from the area of geometric set cover: we use a variant of the multiplicative weights update method, which was first suggested by Brönnimann and Goodrich for set cover instances with small VC-dimension. We obtain a bicriteria approximation algorithm that computes a set of $O(k\log(k))$ line segments that cover a given polygonal curve of $n$ vertices under Fréchet distance at most $O(\Delta)$. We show that the algorithm runs in $\widetilde{O}(k^2 n + k n^3)$ time in expectation and uses $\widetilde{O}(k n + n^3)$ space. For two-dimensional input curves that are $c$-packed, we bound the expected running time by $\widetilde{O}(k^2 c^2 n)$ and the space by $\widetilde{O}(k n + c^2 n)$. In $\mathbb{R}^d$, the dependency on $n$ is instead quadratic. In addition, we present a variant of the algorithm that uses implicit weight updates on the candidate set and thereby achieves near-linear running time in $n$ without any assumptions on the input curve, while keeping the same approximation bounds. This comes at the expense of a small (polylogarithmic) dependency on the relative arclength.
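The reweighting idea behind the multiplicative weights update method mentioned in the abstract can be illustrated on an abstract finite set system. The following is a minimal sketch, not the paper's algorithm: candidate line segments and the parts of the curve they cover under the Fréchet distance are replaced by plain Python sets, and the function name `mwu_set_cover`, the sample size, and the lightness threshold `total / (2 * k)` are illustrative assumptions.

```python
import math
import random


def mwu_set_cover(universe, sets, k, rng, max_rounds=200):
    """Search for a cover of `universe` by sampling sets according to weights.

    `k` is a guess for the optimal cover size. Returns a sorted list of
    indices into `sets`, or None if no cover was found within `max_rounds`.
    The sample size and threshold are illustrative, not tuned constants.
    """
    weights = [1] * len(sets)
    # Draw roughly k*log(k) sets per round (with replacement).
    sample_size = max(1, math.ceil(2 * k * math.log2(k + 1)))
    for _ in range(max_rounds):
        drawn = rng.choices(range(len(sets)), weights=weights, k=sample_size)
        chosen = set(drawn)
        covered = set().union(*(sets[i] for i in chosen))
        missing = [x for x in universe if x not in covered]
        if not missing:
            return sorted(chosen)
        # Pick one uncovered element and look at the sets that contain it.
        x = missing[0]
        containing = [i for i, s in enumerate(sets) if x in s]
        if not containing:
            raise ValueError(f"element {x!r} is in no candidate set")
        total = sum(weights)
        light = sum(weights[i] for i in containing) <= total / (2 * k)
        if light:
            # Multiplicative update: double the weight of every set that
            # could have covered the missed element.
            for i in containing:
                weights[i] *= 2
        # Otherwise the sample was merely unlucky; resample unchanged.
    return None


if __name__ == "__main__":
    universe = list(range(6))
    sets = [{0, 1}, {2, 3}, {4, 5}, {0, 2, 4}, {1, 3, 5}, {5}]
    cover = mwu_set_cover(universe, sets, k=2, rng=random.Random(0))
    print("chosen set indices:", cover)
```

In this toy setting the returned cover may contain more than `k` sets, which mirrors the $O(k\log(k))$ size guarantee only in spirit; the running time and space bounds stated in the abstract depend on the geometric structure of the candidate segments and are not reflected here.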