Paper Title

Constructing and Analyzing the LSM Compaction Design Space (Updated Version)

Authors

Subhadeep Sarkar, Dimitris Staratzis, Zichen Zhu, Manos Athanassoulis

Abstract

Log-structured merge (LSM) trees offer efficient ingestion by appending incoming data, and thus, are widely used as the storage layer of production NoSQL data stores. To enable competitive read performance, LSM-trees periodically re-organize data to form a tree with levels of exponentially increasing capacity, through iterative compactions. Compactions fundamentally influence the performance of an LSM-engine in terms of write amplification, write throughput, point and range lookup performance, space amplification, and delete performance. Hence, choosing the appropriate compaction strategy is crucial and, at the same time, hard as the LSM-compaction design space is vast, largely unexplored, and has not been formally defined in the literature. As a result, most LSM-based engines use a fixed compaction strategy, typically hand-picked by an engineer, which decides how and when to compact data. In this paper, we present the design space of LSM-compactions, and evaluate state-of-the-art compaction strategies with respect to key performance metrics. Toward this goal, our first contribution is to introduce a set of four design primitives that can formally define any compaction strategy: (i) the compaction trigger, (ii) the data layout, (iii) the compaction granularity, and (iv) the data movement policy. Together, these primitives can synthesize both existing and completely new compaction strategies. Our second contribution is to experimentally analyze 10 compaction strategies. We present 12 observations and 7 high-level takeaway messages, which show how LSM systems can navigate the compaction design space.
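The idea that four primitives jointly define a compaction strategy can be illustrated with a minimal Python sketch. The abstract names only the four primitives; the enum members below (for example `LEVEL_SATURATION` or `ROUND_ROBIN`) and the class and function names are hypothetical placeholders chosen for illustration, not the paper's taxonomy, and not every combination is necessarily a sensible strategy.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


# Hypothetical example values for each primitive (assumptions, not from the abstract).
class Trigger(Enum):
    LEVEL_SATURATION = "level saturation"
    RUN_COUNT = "number of sorted runs"


class Layout(Enum):
    LEVELING = "leveling"
    TIERING = "tiering"


class Granularity(Enum):
    WHOLE_LEVEL = "whole level"
    SINGLE_FILE = "single file"


class MovementPolicy(Enum):
    ROUND_ROBIN = "round-robin"
    LEAST_OVERLAP = "least overlap"


@dataclass(frozen=True)
class CompactionStrategy:
    """One point in the design space: a choice for each of the four primitives."""
    trigger: Trigger
    layout: Layout
    granularity: Granularity
    movement: MovementPolicy


def enumerate_strategies():
    """Return every combination of the illustrative primitive values."""
    return [
        CompactionStrategy(*choice)
        for choice in product(Trigger, Layout, Granularity, MovementPolicy)
    ]


strategies = enumerate_strategies()
print(len(strategies))  # 2 * 2 * 2 * 2 combinations
```

Modeling a strategy as an immutable tuple of independent choices makes it easy to see how both existing and new strategies fall out of the same four dimensions; a real engine would additionally need to rule out combinations that are invalid in practice.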
