Paper Title

To Migrate or not to Migrate: An Analysis of Operator Migration in Distributed Stream Processing

Authors

Espen Volnes, Thomas Plagemann, Vera Goebel

Abstract

One of the most important issues in data stream processing systems is to use operator migration to handle highly variable workloads in a cost-efficient manner and adapt to the needs at any given time on demand. Operator migration is a complex process that involves changes in the state and stream management of a running query, typically without any loss of data, and with as little disruption to the execution as possible. This survey provides an overview of solutions for operator migration from a historical perspective as well as the perspective of the goal of migration. It introduces a conceptual model of operator migration to establish a unified terminology and classify existing solutions. Existing work in the area is analyzed to separate the mechanism of migration from the decision to migrate the data. For the latter, a cost-benefit analysis is emphasized that is important for operator migration but is often only implicitly addressed, or is neglected altogether. A description of the available solutions provides the reader with a good understanding of the design alternatives from an algorithmic viewpoint. We complement this with an empirical study to provide quantitative insights on the impact of different design alternatives on the mechanisms of migration.
