Paper Title
CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
Paper Authors
Abstract
The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme that uses guidance from attribute predictors trained on latent features. To generate novel and optimal drug-like molecules for unseen viral targets, CogMol leverages a protein-molecule binding affinity predictor that is trained using SMILES VAE embeddings and protein sequence embeddings learned unsupervised from a large corpus. The CogMol framework is applied to three SARS-CoV-2 target proteins: main protease, receptor-binding domain of the spike protein, and non-structural protein 9 replicase. The generated candidates are novel at both molecular and chemical scaffold levels when compared to the training data. CogMol also includes in silico screening for assessing toxicity of parent molecules and their metabolites with a multi-task toxicity classifier, synthetic feasibility with a chemical retrosynthesis predictor, and target structure binding with docking simulations. Docking reveals favorable binding of generated molecules to the target protein structure, where 87-95% of high-affinity molecules showed docking free energy < -6 kcal/mol. When compared to approved drugs, the majority of designed compounds show low parent molecule and metabolite toxicity and high synthetic feasibility. In summary, CogMol handles multi-constraint design of synthesizable, low-toxicity, drug-like molecules with high target specificity and selectivity, and does not need target-dependent fine-tuning of the framework or target structure information.
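The abstract's idea of controlled sampling guided by attribute predictors on latent features can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a standard Gaussian latent prior, uses hypothetical logistic predictors (`make_linear_predictor`) in place of trained affinity and selectivity models, and accepts a latent vector with probability equal to the product of the predicted attribute probabilities (plain rejection sampling). A real pipeline would then decode accepted vectors into SMILES strings with the VAE decoder, which is omitted here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_linear_predictor(weights, bias=0.0):
    """Hypothetical attribute predictor: logistic regression on latent features."""
    def predict(z):
        return sigmoid(sum(w * zi for w, zi in zip(weights, z)) + bias)
    return predict


def controlled_sample(predictors, latent_dim, n_samples, rng, max_tries=100_000):
    """Rejection sampling in latent space (illustrative assumption).

    Draw z ~ N(0, I) and accept it with probability equal to the product
    of the attribute probabilities returned by the predictors.
    """
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_samples:
            break
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        accept_prob = 1.0
        for predict in predictors:
            accept_prob *= predict(z)
        if rng.random() < accept_prob:
            accepted.append(z)
    return accepted


# Toy stand-ins: "affinity" favours large z[0], "selectivity" favours small z[1].
affinity = make_linear_predictor([4.0, 0.0, 0.0, 0.0])
selectivity = make_linear_predictor([0.0, -4.0, 0.0, 0.0])

rng = random.Random(0)
samples = controlled_sample([affinity, selectivity], latent_dim=4, n_samples=200, rng=rng)
mean_z0 = sum(z[0] for z in samples) / len(samples)
mean_z1 = sum(z[1] for z in samples) / len(samples)
print(len(samples), mean_z0 > 0, mean_z1 < 0)
```

Accepted samples shift toward regions where both toy predictors are confident, which is the sense in which the predictors guide the sampling. The generative model itself is not retrained per target in this sketch, in line with the abstract's statement that no target-dependent fine-tuning is needed.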