Paper title
Atomic structures and orbital energies of 61,489 crystal-forming organic molecules
Paper authors
Paper abstract
Data science and machine learning in materials science require large datasets of technologically relevant molecules or materials. Currently, publicly available molecular datasets with realistic molecular geometries and spectral properties are rare. We here supply a diverse benchmark spectroscopy dataset of 61,489 molecules extracted from organic crystals in the Cambridge Structural Database (CSD), denoted OE62. Molecular equilibrium geometries are reported at the Perdew-Burke-Ernzerhof (PBE) level of density functional theory (DFT) including van der Waals corrections for all 62k molecules. For these geometries, OE62 supplies total energies and orbital eigenvalues at the PBE and the PBE hybrid (PBE0) functional level of DFT for all 62k molecules in vacuum as well as at the PBE0 level for a subset of 30,876 molecules in (implicit) water. For 5,239 molecules in vacuum, the dataset provides quasiparticle energies computed with many-body perturbation theory in the $G_0W_0$ approximation with a PBE0 starting point (denoted GW5000 in analogy to the GW100 benchmark set (M. van Setten et al. J. Chem. Theory Comput. 12, 5076 (2016))).