Paper Title
SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series Forecasting
Paper Authors
Abstract
Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting over the past few decades, owing to their simplicity and mathematical properties. In the forecasting community, meanwhile, general-purpose tree-based regression algorithms (forests, gradient boosting) have recently become popular due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These connections enable us to use the rich methodology from the TAR literature to define a hierarchical TAR model as a regression tree that is trained globally across series, which we call SETAR-Tree. In contrast to general-purpose tree-based models, which do not primarily focus on forecasting and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves, allowing the models to learn cross-series information, and that uses time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as by measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest, in which the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.
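To make the node-splitting idea in the abstract concrete, the following is a minimal sketch of a single split, not the authors' implementation. It pools lagged windows from several simulated series, fits one pooled regression at the parent node, searches for the (lag, threshold) pair whose two child regressions give the lowest total squared error, and then reports the error reduction together with an F-type statistic. All function names (`simulate`, `embed`, `fit_pr`, `best_split`), the quantile grid of candidate thresholds, the minimum leaf size, the simulated two-regime process, and the Chow-style form of the F statistic are assumptions made for illustration; the paper's exact test and stopping thresholds are not reproduced here.

```python
import numpy as np

def simulate(n_series=20, length=200, seed=0):
    """Hypothetical two-regime SETAR(1) data; the regime switches at y[t-1] = 0."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_series):
        y = np.zeros(length)
        for t in range(1, length):
            # slope is +0.9 when y[t-1] < 0 and -0.9 otherwise
            y[t] = 0.7 - 0.9 * abs(y[t - 1]) + rng.normal(0.0, 0.3)
        out.append(y)
    return out

def embed(series_list, n_lags):
    """Stack lagged windows from every series into one pooled (X, y) pair."""
    rows, targets = [], []
    for s in series_list:
        for t in range(n_lags, len(s)):
            rows.append(s[t - n_lags:t][::-1])  # column 0 holds lag 1
            targets.append(s[t])
    return np.array(rows, dtype=float), np.array(targets, dtype=float)

def fit_pr(X, y):
    """Least-squares pooled regression with intercept; returns (coef, sse)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)

def best_split(X, y, min_leaf=10):
    """Search lags and quantile thresholds for the lowest summed child SSE."""
    best = None
    for lag in range(X.shape[1]):
        for thr in np.quantile(X[:, lag], np.linspace(0.1, 0.9, 17)):
            left = X[:, lag] < thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = fit_pr(X[left], y[left])[1] + fit_pr(X[~left], y[~left])[1]
            if best is None or sse < best[2]:
                best = (lag, float(thr), sse)
    return best

series = simulate()
X, y = embed(series, n_lags=2)
_, parent_sse = fit_pr(X, y)
lag, thr, child_sse = best_split(X, y)

# Two quantities a stopping rule could inspect (assumed forms, see lead-in).
reduction = (parent_sse - child_sse) / parent_sse
k = X.shape[1] + 1  # parameters per pooled regression, including intercept
f_stat = ((parent_sse - child_sse) / k) / (child_sse / (len(y) - 2 * k))

print(f"split on lag {lag + 1} at threshold {thr:.3f}")
print(f"error reduction: {100 * reduction:.1f}%  F statistic: {f_stat:.1f}")
```

In a full tree, the same search would be applied recursively to each child, and a node would stop splitting once the linearity test is no longer significant or the error reduction falls below a chosen percentage. A forest along the lines the abstract describes would combine the forecasts of several such trees.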