论文标题

在两个完全标记的树之间的距离方面

On Two Measures of Distance between Fully-Labelled Trees

论文作者

Bernardini, Giulia, Bonizzoni, Paola, Gawrychowski, Paweł

论文摘要

在过去的十年中,数据的量和多种新推理方法的大幅度增加了,用于重建各种癌症的详细进化史。这需要设计有效的程序来比较代表肿瘤系统发育突变演变的根树。 Bernardini等。 [CPM 2019]最近引入了重排距离的概念,该概念是由这种必要性动机的完全标记的树木。该概念源自两个操作:一个缩小节点标签的操作,另一个影响了树的拓扑。每个操作仅定义一个可以在多项式时间计算的距离,而将两者结合的实际重排距离被证明是NP-HARD。 我们回答了以前的工作没有得到答复的两个公开问题。首先,计算置换距离的复杂性是什么?其次,是否存在一种恒定因素近似算法来估计两棵任意树之间的重排距离?我们通过通过双向减少来显示第一个来回答第一个,即计算$ n $ nodes上两棵树之间的排列距离等效,直到多毛因子因素,以在稀疏的两部分图中找到最大的基数匹配。特别是,通过插入Liu和Sidford [Arxiv 2020]的算法,我们获得了一个$ O(N^{4/3+O(1)})$ TIME算法,用于计算$ N $ NODES上两棵树之间的排列距离。然后,我们积极回答第二个问题,并设计一个线性的恒定因子近似算法,该算法不需要在树上任何假设。

The last decade brought a significant increase in the amount of data and a variety of new inference methods for reconstructing the detailed evolutionary history of various cancers. This brings the need of designing efficient procedures for comparing rooted trees representing the evolution of mutations in tumor phylogenies. Bernardini et al. [CPM 2019] recently introduced a notion of the rearrangement distance for fully-labelled trees motivated by this necessity. This notion originates from two operations: one that permutes the labels of the nodes, the other that affects the topology of the tree. Each operation alone defines a distance that can be computed in polynomial time, while the actual rearrangement distance, that combines the two, was proven to be NP-hard. We answer two open question left unanswered by the previous work. First, what is the complexity of computing the permutation distance? Second, is there a constant-factor approximation algorithm for estimating the rearrangement distance between two arbitrary trees? We answer the first one by showing, via a two-way reduction, that calculating the permutation distance between two trees on $n$ nodes is equivalent, up to polylogarithmic factors, to finding the largest cardinality matching in a sparse bipartite graph. In particular, by plugging in the algorithm of Liu and Sidford [ArXiv 2020], we obtain an $O(n^{4/3+o(1)})$ time algorithm for computing the permutation distance between two trees on $n$ nodes. Then we answer the second question positively, and design a linear-time constant-factor approximation algorithm that does not need any assumption on the trees.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源