论文标题

部分可观测时空混沌系统的无模型预测

Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices

论文作者

Ferragina, Paolo, Gagie, Travis, Köppl, Dominik, Manzini, Giovanni, Navarro, Gonzalo, Striani, Manuel, Tosoni, Francesco

论文摘要

如今,机器学习(ML)技术正在生成大量数据收集,因此如何有效地设计其存储和操作的问题变得至关重要。在本文中,我们为实价矩阵提出了一种新的无损压缩方案,该方案在压缩率和线性代数操作的时间方面可实现有效的性能。实验表明,作为压缩机,我们的工具显然优于GZIP,并且在压缩比方面通常在XZ的20%以内。此外,我们的压缩格式支持与压缩表示形式大小成比例的矩阵矢量乘法,这与需要完全解压缩压缩矩阵的GZIP和XZ不同。据我们所知,我们的无损压缩机是第一个实现时间和空间复杂性,与输入的$ K $ ther订单统计熵表示的理论限制相匹配。 为了实现进一步的时间/空间减少,我们提出了针对新颖的列相似性评分的列赖算法。我们对ML矩阵各种数据集的实验表明,在适度的预处理时间内,我们的列重新排序可以在矩阵矢量乘法期间进一步降低高达16%的峰值。 最后,我们将提案与最新的压缩线性代数(CLA)方法进行比较,表明我们的提案始终始终至少两次(在多线程设置中),并且在大多数测试的数据集中,我们的运行速度始终更快(在多线程中),并且可以更好地压缩空间占用。该实验证实了我们为压缩矩阵方法显示的可证明有效的理论界限。

As nowadays Machine Learning (ML) techniques are generating huge data collections, the problem of how to efficiently engineer their storage and operations is becoming of paramount importance. In this article we propose a new lossless compression scheme for real-valued matrices which achieves efficient performance in terms of compression ratio and time for linear-algebra operations. Experiments show that, as a compressor, our tool is clearly superior to gzip and it is usually within 20% of xz in terms of compression ratio. In addition, our compressed format supports matrix-vector multiplications in time and space proportional to the size of the compressed representation, unlike gzip and xz that require the full decompression of the compressed matrix. To our knowledge our lossless compressor is the first one achieving time and space complexities which match the theoretical limit expressed by the $k$-th order statistical entropy of the input. To achieve further time/space reductions, we propose column-reordering algorithms hinging on a novel column-similarity score. Our experiments on various data sets of ML matrices show that, with a modest preprocessing time, our column reordering can yield a further reduction of up to 16% in the peak memory usage during matrix-vector multiplication. Finally, we compare our proposal against the state-of-the-art Compressed Linear Algebra (CLA) approach showing that ours runs always at least twice faster (in a multi-thread setting) and achieves better compressed space occupancy for most of the tested data sets. This experimentally confirms the provably effective theoretical bounds we show for our compressed-matrix approach.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源