Paper Title

A Comparison of HDF5, Zarr, and netCDF4 in Performing Common I/O Operations

Authors

Sriniket Ambatipudi, Suren Byna

Abstract

Scientific data is often stored in files because of the simplicity they provide in managing, transferring, and sharing data. These files are typically structured in a specific arrangement and contain metadata to understand the structure the data is stored in. There are numerous file formats in use in various scientific domains that provide abstractions for storing and retrieving data. With the abundance of file formats aiming to store large amounts of scientific data quickly and easily, a question that arises is, "Which scientific file format is best suited for a general use case?" In this study, we compiled a set of benchmarks for common file operations, i.e., create, open, read, write, and close, and used the results of these benchmarks to compare three popular formats: HDF5, netCDF4, and Zarr.
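
To make the five benchmarked operations concrete, here is a minimal timing sketch. It is not the authors' benchmark suite: the helper names `time_op` and `benchmark_plain_file` are hypothetical, and a plain binary file stands in for HDF5, netCDF4, and Zarr so that the example needs only the Python standard library. A real comparison would replace each step with the corresponding call from the format's own library.

```python
import os
import tempfile
import time


def time_op(fn, *args):
    """Run fn(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def benchmark_plain_file(path, payload):
    """Time create, write, close, open, and read on a plain binary file.

    Assumption: the plain file is only a stand-in; swap in the HDF5,
    netCDF4, or Zarr calls at each step for an actual format comparison.
    """
    timings = {}
    f, timings["create"] = time_op(open, path, "wb")  # create the file
    _, timings["write"] = time_op(f.write, payload)   # write the payload
    _, timings["close"] = time_op(f.close)            # flush and close
    f, timings["open"] = time_op(open, path, "rb")    # reopen for reading
    data, timings["read"] = time_op(f.read)           # read everything back
    f.close()
    return data, timings


if __name__ == "__main__":
    payload = os.urandom(1 << 20)  # 1 MiB of random bytes
    with tempfile.TemporaryDirectory() as tmp:
        data, timings = benchmark_plain_file(os.path.join(tmp, "bench.bin"), payload)
    assert data == payload  # the round trip must preserve the data
    for name, seconds in timings.items():
        print(f"{name:>6}: {seconds * 1e6:.1f} us")
```

A single run like this is noisy. Repeating each operation many times and reporting a median would give steadier numbers, and operating-system caching can make reads look faster than they are on cold storage.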
