Paper Title

Validating Streaming JSON Documents with Learned VPAs

Authors

Véronique Bruyère, Guillermo A. Perez, Gaëtan Staquet

Abstract

We present a new streaming algorithm to validate JSON documents against a set of constraints given as a JSON schema. Among the possible values a JSON document can hold, objects are unordered collections of key-value pairs while arrays are ordered collections of values. We prove that there always exists a visibly pushdown automaton (VPA) that accepts the same set of JSON documents as a JSON schema. Leveraging this result, our approach relies on learning a VPA for the provided schema. As the learned VPA assumes a fixed order on the key-value pairs of the objects, we abstract its transitions in a special kind of graph, and propose an efficient streaming algorithm using the VPA and its graph to decide whether a JSON document is valid for the schema. We evaluate the implementation of our algorithm on a number of random JSON documents, and compare it to the classical validation algorithm.
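
To give a feel for what "visibly pushdown" means in this setting, here is a minimal sketch. It is not the learned automaton or the graph-based algorithm from the paper; it is a hand-written toy that only checks that object and array delimiters are well nested in a token stream. The names `CALLS`, `RETURNS`, and `accepts` are hypothetical, and the input is assumed to be already tokenized. The defining property it illustrates is that the input symbol alone decides the stack action: opening delimiters always push, closing delimiters always pop, and every other token leaves the stack untouched.

```python
from typing import Iterable, List

# Call symbols push a stack symbol; return symbols pop and compare.
# (Illustrative tables only, not taken from the paper.)
CALLS = {"{": "OBJ", "[": "ARR"}
RETURNS = {"}": "OBJ", "]": "ARR"}


def accepts(tokens: Iterable[str]) -> bool:
    """Consume tokens one at a time and report whether delimiters are well nested."""
    stack: List[str] = []
    for tok in tokens:
        if tok in CALLS:
            # Call symbol: always push.
            stack.append(CALLS[tok])
        elif tok in RETURNS:
            # Return symbol: always pop, and the popped symbol must match.
            if not stack or stack.pop() != RETURNS[tok]:
                return False
        # Any other token is an internal symbol: the stack is left unchanged.
    # Accept only if every opened object or array was closed.
    return not stack
```

Because the tokens are consumed one at a time and only the stack is retained, memory use grows with the nesting depth rather than with the document length, which is what makes this style of automaton suitable for streaming validation.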
