PANDA: Predicting the change in proteins binding affinity upon mutations using sequence information

Paper title

PANDA: Predicting the change in proteins binding affinity upon mutations using sequence information

Authors

Abbasi, Wajid Arshad, Abbas, Syed Ali, Andleeb, Saiqa

Abstract

Accurately determining a change in protein binding affinity upon mutations is important for the discovery and design of novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments that can be aided with computational methods. Most of the computational prediction techniques require protein structures, which limits their applicability to protein complexes with known structures. In this work, we explore the sequence-based prediction of change in protein binding affinity upon mutation. We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation. Our proposed sequence-based novel change in protein binding affinity predictor called PANDA gives better accuracy than existing methods over the same validation set as well as on an external independent test dataset. On an external test dataset, our proposed method gives a maximum Pearson correlation coefficient of 0.52 in comparison to the state-of-the-art existing protein structure-based method called MutaBind which gives a maximum Pearson correlation coefficient of 0.59. Our proposed protein sequence-based method, to predict a change in binding affinity upon mutations, has wide applicability and comparable performance in comparison to existing protein structure-based methods. A cloud-based webserver implementation of PANDA and its Python code are available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/panda.
