Paper title
RNA-seq data science: From raw data to effective interpretation
Paper authors
Paper abstract
RNA sequencing (RNA-seq) has become an exemplar technology in modern biology and clinical applications over the past decade. It has gained immense popularity in recent years, driven by the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools. RNA-seq is a method of analyzing the RNA content of a sample using modern sequencing platforms. It generates enormous amounts of transcriptomic data in the form of nucleotide sequences, known as reads. RNA-seq analysis enables the probing of genes and their corresponding transcripts, which is essential for answering important biological questions, such as detecting novel exons and transcripts, quantifying gene expression, and studying alternative splicing structure. However, obtaining meaningful biological signals from raw data using computational methods is challenging due to the limitations of modern sequencing technologies. The need to address these technological challenges has driven the rapid development of many novel computational tools, which have evolved and diversified in step with technological advancements, leading to the current large and varied population of RNA-seq tools. Our review provides a systematic overview of RNA-seq technology and of 235 available RNA-seq tools across various domains published from 2008 to 2020, and discusses the interdisciplinary nature of the bioinformatics involved in RNA sequencing, analysis, and software development.