Paper Title
Ranking-based Convolutional Neural Network Models for Peptide-MHC Binding Prediction
Authors
Abstract
T-cell receptors can recognize foreign peptides bound to major histocompatibility complex (MHC) class-I proteins and thereby trigger the adaptive immune response. Identifying peptides that can bind to MHC class-I molecules therefore plays a vital role in the design of peptide vaccines. Many computational methods, such as the state-of-the-art allele-specific method MHCflurry, have been developed to predict the binding affinities between peptides and MHC molecules. In this manuscript, we develop two allele-specific Convolutional Neural Network (CNN)-based methods, named ConvM and SpConvM, to tackle the binding prediction problem. Specifically, we formulate the problem as optimizing the rankings of peptide-MHC bindings via ranking-based learning objectives. Because such optimization depends on the relative order of peptides rather than on exact affinity values, it is more robust to inaccuracies in measured binding affinities and therefore enables more accurate prioritization of binding peptides. In addition, we develop a new position encoding method in ConvM and SpConvM to better identify the amino acids most important for binding events. Our experimental results demonstrate that our models significantly outperform state-of-the-art methods, including MHCflurry, with an average percentage improvement of 6.70% on AUC and 17.10% on ROC5 across 128 alleles.
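The abstract does not spell out the ranking-based learning objectives, so the following is only a minimal sketch of one common choice, a pairwise hinge loss, and not necessarily the objective used by ConvM or SpConvM. The function name `pairwise_hinge_loss`, the `margin` parameter, and the toy values are assumptions made for illustration. The sketch shows why a ranking objective can tolerate noisy affinity measurements: only the order of the measured values enters the loss.

```python
from itertools import combinations


def pairwise_hinge_loss(scores, affinities, margin=1.0):
    """Mean hinge loss over all peptide pairs with different measured affinities.

    scores[i]     -- model score for peptide i (higher = predicted stronger binder)
    affinities[i] -- measured binding strength for peptide i (higher = stronger);
                     only the ordering of these values is used.
    This is a hypothetical illustration, not the paper's stated objective.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if affinities[i] == affinities[j]:
            continue  # tied measurements carry no ordering information
        # Orient the pair so that `hi` is the stronger measured binder.
        hi, lo = (i, j) if affinities[i] > affinities[j] else (j, i)
        # Penalize the pair unless `hi` outscores `lo` by at least `margin`.
        total += max(0.0, margin - (scores[hi] - scores[lo]))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy data (made up): three peptides with measured binding strengths.
measured = [0.9, 0.1, 0.5]
well_ranked = [3.0, 0.0, 1.5]   # same order as the measurements
mis_ranked = [0.0, 3.0, 1.5]    # strongest and weakest binders swapped

print(pairwise_hinge_loss(well_ranked, measured))  # 0.0
print(pairwise_hinge_loss(mis_ranked, measured))   # 3.0
```

Rescaling or otherwise monotonically distorting `measured` leaves both losses unchanged, which is the sense in which a ranking objective is insensitive to the exact affinity values.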