Paper title
Modelling chromosome-wide target search
Paper authors
Paper abstract
The most common mechanism of gene regulation is the binding of a transcription factor protein to a regulatory sequence, which increases or decreases RNA transcription. However, transcription factors face two main challenges when searching for these sequences. First, the sequences are vanishingly short relative to the genome length. Second, many nearly identical sequences are scattered across the genome, causing the proteins to pause their search. But as pointed out in a computational study of LacI regulation in Escherichia coli, such almost-targets may lower search times when DNA looping is taken into account. In this paper, we explore whether this also occurs over chromosome-wide distances. To this end, we developed a cross-scale computational framework that combines established facilitated-diffusion models for basepair-level search with a network model capturing chromosome-wide leaps. To make our model realistic, we used Hi-C data sets as a proxy for 3D proximity between long-ranged DNA segments, together with binding profiles for more than 100 transcription factors. Using our cross-scale model, we found that median search times to individual targets critically depend on a network metric combining node strength (the sum of link weights) and local dissociation rates. Also, by randomizing these rates, we found that some actual 3D target configurations stand out as considerably faster or slower than their random counterparts. This finding hints that the 3D structure of chromosomes funnels essential transcription factors to relevant DNA regions.
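The two ingredients named in the abstract, node strength and local dissociation rates, can be illustrated with a toy network. The sketch below is not the authors' model: it assumes a small symmetric contact matrix `W` standing in for Hi-C link weights, a hypothetical list `k_off` of per-segment dissociation rates, and a simple Gillespie-style walk in which the protein dwells on a segment for an exponential time and then leaps to a neighbour with probability proportional to the link weight. The basepair-level facilitated diffusion is left out, and all function names are illustrative.

```python
import random
import statistics


def node_strength(weights):
    """Node strength: the sum of link weights attached to each node."""
    return [sum(row) for row in weights]


def search_time(weights, k_off, start, target, rng):
    """Time of one random trajectory from `start` until `target` is first reached."""
    node, t = start, 0.0
    while node != target:
        # Dwell on the current segment; the dwell time is exponential with rate k_off.
        t += rng.expovariate(k_off[node])
        # Leap to a linked segment with probability proportional to the link weight.
        neighbours = [j for j, w in enumerate(weights[node]) if j != node and w > 0]
        link_weights = [weights[node][j] for j in neighbours]
        node = rng.choices(neighbours, weights=link_weights)[0]
    return t


def median_search_time(weights, k_off, start, target, runs=2000, seed=0):
    """Median first-passage time over many independent trajectories."""
    rng = random.Random(seed)
    return statistics.median(
        search_time(weights, k_off, start, target, rng) for _ in range(runs)
    )


# Toy data (assumed for illustration, not taken from any Hi-C data set).
W = [
    [0, 5, 1, 0],
    [5, 0, 2, 1],
    [1, 2, 0, 4],
    [0, 1, 4, 0],
]
k_off = [1.0, 0.5, 2.0, 1.0]

if __name__ == "__main__":
    print(node_strength(W))  # [6, 8, 7, 5]
    print(median_search_time(W, k_off, start=0, target=3))
```

In this toy setting, doubling every dissociation rate halves the median search time, which gives a quick sanity check of how dwell times enter the result. How strength and dissociation rates are combined into the single metric mentioned in the abstract is not specified there, so the sketch keeps them separate.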