Paper Title

BASiNETEntropy: an alignment-free method for classification of biological sequences through complex networks and entropy maximization

Authors

Breve, Murilo Montanini; Pimenta-Zanon, Matheus Henrique; Lopes, Fabrício Martins

Abstract

The discovery of nucleic acids and the structure of DNA have brought considerable advances in the understanding of life. The development of next-generation sequencing technologies has led to large-scale data generation, for which computational methods have become essential for analysis and knowledge discovery. In particular, RNAs have received much attention because of the diversity of their functions in the organism and the discovery of different classes with different roles in many biological processes. Therefore, the correct identification of RNA sequences is increasingly important for understanding the functioning of organisms. This work addresses this context by presenting a new method for the classification of biological sequences through complex networks and entropy maximization. The maximum entropy principle is used to identify the edges that are most informative about the RNA class, generating a filtered complex network. The proposed method was evaluated in the classification of different RNA classes from 13 species. It was compared to the PLEK, CPC2 and BASiNET methods and outperformed all of them. BASiNETEntropy classified all RNA sequences with high accuracy and low standard deviation in results, showing assertiveness and robustness. The proposed method is implemented as open-source software in the R language and is freely available at https://cran.r-project.org/web/packages/BASiNETEntropy.
