论文标题
与内存无关的平行矩阵乘法通信下限
Tight Memory-Independent Parallel Matrix Multiplication Communication Lower Bounds
论文作者
论文摘要
长期以来,已建立了用于矩阵乘法算法的通信下限。但是,大多数渐近分析方法都忽略了恒定因子,或者没有获得最紧的可能值。最近的工作表明,更仔细的分析改善了某些经典矩阵乘法下限的最著名常数,并有助于确定更有效的算法,这些算法准确地确切地符合领先术语并提高实际性能。这项工作的主要结果是建立与内存无关的通信下限,并用紧密的常数用于并行矩阵乘法。在三种情况下,我们的常数改善了以前的工作,这些情况取决于矩阵的长宽比的相对大小。
Communication lower bounds have long been established for matrix multiplication algorithms. However, most methods of asymptotic analysis have either ignored the constant factors or not obtained the tightest possible values. Recent work has demonstrated that more careful analysis improves the best known constants for some classical matrix multiplication lower bounds and helps to identify more efficient algorithms that match the leading-order terms in the lower bounds exactly and improve practical performance. The main result of this work is the establishment of memory-independent communication lower bounds with tight constants for parallel matrix multiplication. Our constants improve on previous work in each of three cases that depend on the relative sizes of the aspect ratios of the matrices.