论文标题
在单词上对MSO逻辑的排名
Ranked enumeration of MSO logic on words
论文作者
论文摘要
在过去的几年中,对延迟有限的枚举算法引起了许多数据管理任务的关注。给定查询和数据,任务是预处理数据,然后列举查询的所有答案。当发现相关解决方案后,或我们想停止枚举时,这种枚举方案通常很有用。但是,使用当前方案,对解决方案的给定顺序没有任何限制,此顺序通常取决于使用的技术,而不是与用户的相关性。 在本文中,我们研究了溶液排名时对单词的列出列出的枚举。我们提出了一个基于MSO成本功能的框架,该框架允许在单词上表达与每个解决方案相关的成本的单词公式。然后,我们演示了我们的框架的一般性,例如,该框架所包含的文档跨度和常规复杂事件处理查询,并将排名添加到其中。本文的主要技术结果是一种算法,用于以成本成本效率的越来越高的顺序列举公式的所有解决方案,即,解决方案之间的线性预处理阶段和对数延迟。该算法的新颖性是基于使用功能数据结构,特别是通过扩展功能性的Brodal队列来适合MSO对单词上的排名枚举。
In the last years, enumeration algorithms with bounded delay have attracted a lot of attention for several data management tasks. Given a query and the data, the task is to preprocess the data and then enumerate all the answers to the query one by one and without repetitions. This enumeration scheme is typically useful when the solutions are treated on the fly or when we want to stop the enumeration once the pertinent solutions have been found. However, with the current schemes, there is no restriction on the order how the solutions are given and this order usually depends on the techniques used and not on the relevance for the user. In this paper we study the enumeration of monadic second order logic (MSO) over words when the solutions are ranked. We present a framework based on MSO cost functions that allows to express MSO formulae on words with a cost associated with each solution. We then demonstrate the generality of our framework which subsumes, for instance, document spanners and regular complex event processing queries and adds ranking to them. The main technical result of the paper is an algorithm for enumerating all the solutions of formulae in increasing order of cost efficiently, namely, with a linear preprocessing phase and logarithmic delay between solutions. The novelty of this algorithm is based on using functional data structures, in particular, by extending functional Brodal queues to suit with the ranked enumeration of MSO on words.