Paper Title
An efficient tensor regression for high-dimensional data
Paper Authors
Abstract
Most currently used tensor regression models for high-dimensional data are based on Tucker decomposition, which has good properties but loses its efficiency in compressing tensors very quickly as the order of tensors increases, say beyond four or five. However, for even the simplest tensor autoregression in handling time series data, the coefficient tensor already has order six. This paper revises a newly proposed tensor train (TT) decomposition and then applies it to tensor regression such that a nice statistical interpretation can be obtained. The new tensor regression can well match data with hierarchical structures, and it can even lead to a better interpretation for data with factorial structures, which are supposed to be better fitted by models with Tucker decomposition. More importantly, the new tensor regression can easily be applied to cases with higher-order tensors, since TT decomposition compresses the coefficient tensors much more efficiently. The methodology is also extended to tensor autoregression for time series data, and nonasymptotic properties are derived for the ordinary least squares estimators of both tensor regression and autoregression. A new algorithm is introduced to search for the estimators, and its theoretical justification is also discussed. Theoretical and computational properties of the proposed methodology are verified by simulation studies, and the advantages over existing methods are illustrated by two real examples.
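The compression claim at the heart of the abstract can be made concrete with the standard TT-SVD algorithm (sequential truncated SVDs of reshaped unfoldings). The sketch below is not the paper's estimator; it is a minimal NumPy illustration of how a TT decomposition stores an order-d tensor as a chain of small third-order cores, so that the parameter count grows linearly in the order d (for fixed mode sizes and TT ranks) rather than exponentially as for the full tensor. The function names `tt_svd` and `tt_reconstruct` are illustrative, not from the paper.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a tensor into a tensor train (TT) via sequential SVDs.

    Returns a list of order-3 cores; core k has shape (r_k, n_k, r_{k+1}),
    with r_0 = r_d = 1 and all internal ranks capped at max_rank.
    """
    dims = tensor.shape
    d = len(dims)
    cores = []
    rank = 1
    # Unfold: rows indexed by (current rank x first mode), columns by the rest.
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(s))
        cores.append(U[:, :r_new].reshape(rank, dims[k], r_new))
        # Absorb singular values into the remainder and fold in the next mode.
        mat = (s[:r_new, None] * Vt[:r_new]).reshape(r_new * dims[k + 1], -1)
        rank = r_new
    cores.append(mat.reshape(rank, dims[d - 1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

if __name__ == "__main__":
    # Order-6 tensor, as in the simplest tensor autoregression mentioned above.
    X = np.random.randn(3, 3, 3, 3, 3, 3)
    cores = tt_svd(X, max_rank=2)
    tt_params = sum(c.size for c in cores)
    print(f"full tensor entries: {X.size}, TT parameters (rank<=2): {tt_params}")
```

With all mode sizes 3 and TT ranks capped at 2, the cores hold far fewer parameters than the 729 entries of the full order-6 tensor; a rank-constrained Tucker model, by contrast, still carries a core tensor whose size is exponential in the order.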