Paper title
REMIX: Efficient Range Query for LSM-trees
Paper authors
Paper abstract
LSM-tree based key-value (KV) stores organize data in a multi-level structure for high-speed writes. Range queries on traditional LSM-trees must seek and sort-merge data from multiple table files on the fly, which is expensive and often leads to mediocre read performance. To improve range query efficiency on LSM-trees, we introduce a space-efficient KV index data structure, named REMIX, that records a globally sorted view of KV data spanning multiple table files. A range query on multiple REMIX-indexed data files can quickly locate the target key using a binary search, and retrieve subsequent keys in sorted order without key comparisons. We build RemixDB, an LSM-tree based KV-store that adopts a write-efficient compaction strategy and employs REMIXes for fast point and range queries. Experimental results show that REMIXes can substantially improve range query performance in a write-optimized LSM-tree based KV-store.
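The query path described in the abstract (one binary search to locate the target key, then sequential retrieval without key comparisons) can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions, not the paper's actual index layout: each table file is modeled as an in-memory sorted list of `(key, value)` pairs, keys are assumed unique across runs, and the names `build_sorted_view`, `seek`, and `scan` are hypothetical.

```python
import heapq


def build_sorted_view(runs):
    """Merge the sorted runs once and record where each key lives.

    Returns a list of (run_id, position) pairs in global key order.
    Assumption: keys are unique across runs.
    """
    tagged = [
        [(key, run_id, pos) for pos, (key, _value) in enumerate(run)]
        for run_id, run in enumerate(runs)
    ]
    return [(run_id, pos) for _key, run_id, pos in heapq.merge(*tagged)]


def seek(runs, view, start_key):
    """Binary search over the global view for the first key >= start_key.

    This is the only step that compares keys.
    """
    lo, hi = 0, len(view)
    while lo < hi:
        mid = (lo + hi) // 2
        run_id, pos = view[mid]
        if runs[run_id][pos][0] < start_key:
            lo = mid + 1
        else:
            hi = mid
    return lo


def scan(runs, view, start_key, count):
    """Return up to `count` pairs in sorted order, starting at start_key.

    After the seek, entries are fetched by following the recorded
    (run_id, position) pairs, so no further key comparisons are needed.
    """
    start = seek(runs, view, start_key)
    return [runs[run_id][pos] for run_id, pos in view[start:start + count]]


# Three sorted runs standing in for overlapping table files.
runs = [
    [("a", 1), ("d", 4), ("g", 7)],
    [("b", 2), ("e", 5)],
    [("c", 3), ("f", 6)],
]
view = build_sorted_view(runs)
result = scan(runs, view, "c", 4)
```

In this sketch the sort-merge cost is paid once, when the view is built, rather than on every range query. A real index would also need a compact on-disk encoding and a way to stay consistent as table files change, which the sketch does not attempt.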