Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows

Paper title

Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows

Authors

Martin Uhrin, Sebastiaan P. Huber, Jusong Yu, Nicola Marzari, Giovanni Pizzi

Abstract

Over the last two decades, the field of computational science has seen a dramatic shift towards incorporating high-throughput computation and big-data analysis as fundamental pillars of the scientific discovery process. This has necessitated the development of tools and techniques to deal with the generation, storage and processing of large amounts of data. In this work we present an in-depth look at the workflow engine powering AiiDA, a widely adopted, highly flexible and database-backed informatics infrastructure with an emphasis on data reproducibility. We detail many of the design choices, which were informed by several important goals: the ability to scale from individual laptops up to high-performance supercomputers, to manage jobs with runtimes spanning from fractions of a second to weeks, and to run thousands of jobs concurrently, all while maximising robustness. In short, AiiDA aims to be a Swiss army knife for high-throughput computational science. As well as the architecture, we outline important API design choices made to give workflow writers a great deal of liberty whilst guiding them towards writing robust and modular workflows, ultimately enabling them to encode their scientific knowledge to the benefit of the wider scientific community.
