Paper Title

A cloud platform for automating and sharing analysis of raw simulation data from high throughput polymer molecular dynamics simulations

Authors

Tian Xie, Ha-Kyung Kwon, Daniel Schweigert, Sheng Gong, Arthur France-Lanord, Arash Khajeh, Emily Crabb, Michael Puzon, Chris Fajardo, Will Powelson, Yang Shao-Horn, Jeffrey C. Grossman

Abstract

Open material databases storing hundreds of thousands of material structures and their corresponding properties have become the cornerstone of modern computational materials science. Yet the raw outputs of the simulations, such as trajectories from molecular dynamics simulations and charge densities from density functional theory calculations, are generally not shared because of their large size. In this work, we describe a cloud-based platform that facilitates the sharing of raw data and enables fast post-processing in the cloud to extract new, user-defined properties. As an initial demonstration, our database currently includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes and 5.7 terabytes of data. We create a public analysis library at https://github.com/TRI-AMDD/htp_md to extract multiple properties from the raw data, using both expert-designed functions and machine learning models. The analysis is run automatically with computation in the cloud, and the results then populate a publicly accessible database. Our platform encourages users to contribute both new trajectory data and analysis functions via public interfaces. Newly analyzed properties will be incorporated into the database. Finally, we create a front-end user interface at https://www.htpmd.matr.io for browsing and visualizing our data. We envision the platform as a new way of sharing raw data and new insights for the computational materials science community.
