Paper Title
Bidirectional Generation of Structure and Properties Through a Single Molecular Foundation Model
Authors
Abstract
The recent success of large foundation models in artificial intelligence has prompted the emergence of chemical pre-trained models. Despite growing interest in large molecular pre-trained models that provide informative representations for downstream tasks, attempts at multimodal pre-training in the molecular domain have been limited. To address this, we present a novel multimodal molecular pre-trained model that incorporates the modalities of structure and biochemical properties, drawing inspiration from recent advances in multimodal learning techniques. Our proposed model pipeline of data handling and training objectives aligns the structure and property features in a common embedding space, which enables the model to capture bidirectional information between a molecule's structure and its properties. These contributions give rise to synergistic knowledge, allowing us to tackle both multimodal and unimodal downstream tasks with a single model. Through extensive experiments, we demonstrate that our model shows strong capabilities across a range of chemical challenges, including conditional molecule generation, property prediction, molecule classification, and reaction prediction.