Paper Title

DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers

Authors

Yang, Yan; Hossain, Zakir; Asif, Khandaker; Pan, Liyuan; Rahman, Shafin; Stone, Eric

Abstract

De novo peptide sequencing aims to recover the amino acid sequence of a peptide from tandem mass spectrometry (MS) data. Existing approaches to de novo analysis enumerate MS evidence for all amino acid classes during inference. This leads to over-trimming of the receptive fields of MS data and restricts the MS evidence associated with subsequent, not-yet-decoded amino acids. Our approach, DPST, circumvents these limitations with two key components: (1) a confidence value aggregation encoder that sketches spectrum representations according to amino-acid-based connectivity among MS; (2) a global-local fusion decoder that progressively assimilates contextualized spectrum representations with a predefined preconception of localized MS evidence and amino acid priors. Our components originate from a closed-form solution and selectively attend to informative amino-acid-aware MS representations. Through extensive empirical studies, we demonstrate the superiority of DPST, showing that it outperforms state-of-the-art approaches by a margin of 12%-19% in peptide accuracy.
