Paper Title

Distributed Non-Negative Tensor Train Decomposition

Paper Authors

Bhattarai, Manish; Chennupati, Gopinath; Skau, Erik; Vangara, Raviteja; Djidjev, Hristo; Alexandrov, Boian

Paper Abstract

The era of exascale computing opens new venues for innovations and discoveries in many scientific, engineering, and commercial fields. However, with the exaflops come extra-large, high-dimensional data generated by high-performance computing. High-dimensional data is represented as multidimensional arrays, also known as tensors. The presence of latent (not directly observable) structures in the tensor allows a unique representation and compression of the data by classical tensor factorization techniques. However, the classical tensor methods are not always stable, or their memory requirements can grow exponentially, which makes them unsuitable for high-dimensional tensors. Tensor train (TT) is a state-of-the-art tensor network introduced for the factorization of high-dimensional tensors. TT transforms the initial high-dimensional tensor into a network of three-dimensional tensors that requires only linear storage. Many real-world data, such as density, temperature, population, and probability, are non-negative, and for ease of interpretation, algorithms that preserve non-negativity are preferred. Here, we introduce a distributed non-negative tensor train and demonstrate its scalability and compression on synthetic and real-world big datasets.
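The abstract's central claim is that TT rewrites a d-dimensional tensor as a chain of three-dimensional cores whose combined storage is linear in d. As a point of reference only, below is a minimal NumPy sketch of the classical, unconstrained TT-SVD algorithm; it is not the paper's distributed non-negative method, and the name `tt_svd` and the `max_rank` parameter are illustrative.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose `tensor` into a list of 3-D TT cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores, r, c = [], 1, tensor
    for k in range(len(dims) - 1):
        c = c.reshape(r * dims[k], -1)            # unfold the remainder along mode k
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        r_new = min(max_rank, len(s))             # truncate to the allowed TT rank
        cores.append(u[:, :r_new].reshape(r, dims[k], r_new))
        c = s[:r_new, None] * vt[:r_new]          # carry the remainder forward
        r = r_new
    cores.append(c.reshape(r, dims[-1], 1))       # last three-dimensional core
    return cores

# Reconstruct by contracting the chain of cores back together.
x = np.random.rand(4, 5, 6, 7)
cores = tt_svd(x, max_rank=30)                    # ranks are small, so this is exact
full = cores[0]
for g in cores[1:]:
    full = np.tensordot(full, g, axes=([-1], [0]))
print(np.allclose(full.reshape(x.shape), x))      # True: the TT network reproduces x
```

For mode size n and TT rank r, the cores occupy O(d·n·r²) memory versus O(n^d) for the full tensor, which is the linear-versus-exponential trade-off the abstract references. The paper's contribution is a distributed variant that additionally constrains the cores to be non-negative.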
