Paper title
A deep learning classifier for local ancestry inference
Paper authors
Paper abstract
Local ancestry inference (LAI) identifies the ancestry of each segment of an individual's genome and is an important step in medical and population genetic studies of diverse cohorts. Several techniques have been used for LAI, including Hidden Markov Models and Random Forests. Here, we formulate the LAI task as an image segmentation problem and develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture. We train our model using complete genome sequences from 982 unadmixed individuals from each of five continental ancestry groups, and we evaluate it using simulated admixed data derived from an additional 279 individuals selected from the same populations. We show that our model is able to learn admixture as a zero-shot task, yielding ancestry assignments that are nearly as accurate as those from the existing gold standard tool, RFMix.
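To make the segmentation framing concrete, the sketch below builds a simulated admixed haplotype as a mosaic of unadmixed source haplotypes and keeps a per-site ancestry label for it, which plays the role of a segmentation mask. It is only an illustration under stated assumptions: the abstract does not describe the simulation procedure, so the uniform breakpoint sampling, the function names `simulate_admixed` and `per_site_accuracy`, and the ancestry labels `"A"`, `"B"`, `"C"` are all hypothetical and are not taken from the paper or from RFMix.

```python
import random


def simulate_admixed(haplotypes, labels, n_breakpoints, rng):
    """Build one mosaic haplotype from unadmixed source haplotypes.

    haplotypes: list of equal-length allele lists (0/1), one per source.
    labels: ancestry label of each source haplotype (hypothetical names).
    Returns (alleles, ancestry), where ancestry[i] is the label of the
    source that contributed site i, i.e. the per-site ground-truth mask.
    """
    n_sites = len(haplotypes[0])
    # Assumption: breakpoints are distinct interior positions drawn uniformly.
    cuts = sorted(rng.sample(range(1, n_sites), n_breakpoints))
    bounds = [0] + cuts + [n_sites]
    alleles, ancestry = [], []
    for start, end in zip(bounds[:-1], bounds[1:]):
        src = rng.randrange(len(haplotypes))  # pick a source for this segment
        alleles.extend(haplotypes[src][start:end])
        ancestry.extend([labels[src]] * (end - start))
    return alleles, ancestry


def per_site_accuracy(predicted, truth):
    """Fraction of sites whose predicted ancestry matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


rng = random.Random(0)
sources = [[0] * 20, [1] * 20, [0, 1] * 10]  # toy unadmixed haplotypes
source_labels = ["A", "B", "C"]              # hypothetical ancestry groups
alleles, truth = simulate_admixed(sources, source_labels, 3, rng)
```

A classifier that outputs one ancestry label per site can then be scored against `truth` with `per_site_accuracy`, in the same way that a segmentation model is scored per pixel.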