Paper Title

PaSh: Light-touch Data-Parallel Shell Processing

Authors

Nikos Vasilakis, Konstantinos Kallas, Konstantinos Mamouras, Achilleas Benetopoulos, Lazar Cvetković

Abstract

This paper presents {\scshape PaSh}, a system for parallelizing POSIX shell scripts. Given a script, {\scshape PaSh} converts it to a dataflow graph, performs a series of semantics-preserving program transformations that expose parallelism, and then converts the dataflow graph back into a script -- one that adds POSIX constructs to explicitly guide parallelism coupled with {\scshape PaSh}-provided {\scshape Unix}-aware runtime primitives for addressing performance- and correctness-related issues. A lightweight annotation language allows command developers to express key parallelizability properties about their commands. An accompanying parallelizability study of POSIX and GNU commands -- two large and commonly used groups -- guides the annotation language and optimized aggregator library that {\scshape PaSh} uses. Finally, {\scshape PaSh}'s extensive evaluation over 44 unmodified {\scshape Unix} scripts shows significant speedups ($0.89$--$61.1\times$, avg: $6.7\times$) stemming from the combination of its program transformations and runtime primitives.
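To make the split-then-aggregate idea in the abstract concrete, here is a minimal Python sketch. It is not PaSh's implementation, which operates on shell commands and emits a shell script. All function names (`split_lines`, `line_wise_upper`, `parallel_line_wise`, `parallel_sort`) are hypothetical and chosen only for illustration. The sketch assumes two kinds of pipeline stage: one that handles each line independently, so its per-chunk outputs can simply be concatenated, and one like `sort`, whose per-chunk outputs need an aggregator (here, an ordered merge) to reproduce the sequential result.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def split_lines(lines, n):
    """Split a list of lines into at most n contiguous chunks, preserving order."""
    size = max(1, -(-len(lines) // n))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def line_wise_upper(chunk):
    """Stand-in for a command that treats each line independently (like `tr a-z A-Z`)."""
    return [line.upper() for line in chunk]


def parallel_line_wise(lines, n=4):
    """Run the line-wise stage on each chunk, then concatenate results in order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map yields results in input order, so concatenation is safe.
        parts = list(pool.map(line_wise_upper, split_lines(lines, n)))
    return [line for part in parts for line in part]


def parallel_sort(lines, n=4):
    """Sort each chunk independently, then combine with a merge aggregator."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = list(pool.map(sorted, split_lines(lines, n)))
    # Plain concatenation would be wrong here; an ordered merge is needed.
    return list(merge(*parts))
```

In both cases the parallel version is meant to return exactly what the sequential stage would, which is the sense in which such a transformation is semantics-preserving. The difference between the two functions is only in how the partial outputs are combined.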
