Paper Title
DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries
Paper Authors
Abstract
We study the problem of $\textit{vector set search}$ with $\textit{vector set queries}$. This task is analogous to traditional near-neighbor search, except that both the query and each element in the collection are $\textit{sets}$ of vectors. We identify this problem as a core subroutine for semantic search applications and find that existing solutions are unacceptably slow. To this end, we present a new approximate search algorithm, DESSERT (${\bf D}$ESSERT ${\bf E}$fficiently ${\bf S}$earches ${\bf S}$ets of ${\bf E}$mbeddings via ${\bf R}$etrieval ${\bf T}$ables). DESSERT is a general tool with strong theoretical guarantees and excellent empirical performance. When we integrate DESSERT into ColBERT, a state-of-the-art semantic search model, we find a 2-5x speedup on the MS MARCO and LoTTE retrieval benchmarks with minimal loss in recall, underscoring the effectiveness and practical applicability of our proposal.
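To make the problem setting concrete, the sketch below shows an exhaustive (brute-force) vector set search. It is not the DESSERT algorithm. It only illustrates the task that an approximate method would aim to speed up. The abstract does not specify a scoring function, so the sketch assumes a ColBERT-style score: for each query vector, take the maximum inner product against the target set, then sum over query vectors. The names `set_score` and `brute_force_search` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]
VectorSet = Sequence[Vector]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def set_score(query: VectorSet, target: VectorSet) -> float:
    """Assumed scoring rule (ColBERT-style sum of maxima): each query
    vector contributes its best inner product against the target set."""
    return sum(max(dot(q, t) for t in target) for q in query)


def brute_force_search(query: VectorSet,
                       collection: Sequence[VectorSet],
                       k: int = 1) -> List[int]:
    """Score every set in the collection and return the indices of the
    k highest-scoring sets (ties broken by lower index)."""
    scored = [(set_score(query, target), idx)
              for idx, target in enumerate(collection)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy collection: three sets of 2-dimensional vectors.
collection = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
query = [[1.0, 0.0], [0.0, 1.0]]

top = brute_force_search(query, collection, k=2)
```

The cost of this baseline grows with the number of sets times the product of the query and target set sizes, which is the kind of expense that motivates an approximate method.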