Paper title
DSLOB: A Synthetic Limit Order Book Dataset for Benchmarking Forecasting Algorithms under Distributional Shift
Paper authors
Paper abstract
In electronic trading markets, limit order books (LOBs) provide information about pending buy/sell orders at various price levels for a given security. Recently, there has been growing interest in using LOB data to solve downstream machine learning tasks (e.g., forecasting). However, dealing with out-of-distribution (OOD) LOB data is challenging because distributional shifts are unlabeled in currently available public LOB datasets. It is therefore critical to build a synthetic LOB dataset with labeled OOD samples that can serve as a testbed for developing models that generalize well to unseen scenarios. In this work, we use a multi-agent market simulator to build a synthetic LOB dataset, named DSLOB, with and without market stress scenarios, which allows for the design of controlled distributional shift benchmarks. Using the proposed synthetic dataset, we provide a holistic analysis of the forecasting performance of three different state-of-the-art forecasting methods. Our results reflect the need for increased research effort on algorithms that are robust to distributional shifts in high-frequency time series data.
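The idea of a controlled distributional shift benchmark can be illustrated with a minimal Python sketch. It is not the DSLOB pipeline: the random-walk price generator, the choice to model "market stress" as higher volatility, the trivial drift forecaster, and every function name below are assumptions made purely for illustration. What it shows is the evaluation pattern the abstract describes: fit on unstressed data, then report error separately on a labeled in-distribution test set and a labeled OOD (stressed) test set.

```python
import random


def simulate_mid_prices(n_steps, volatility, seed, start=100.0):
    """Toy random-walk mid-price series (hypothetical stand-in for simulator output)."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(n_steps):
        prices.append(prices[-1] + rng.gauss(0.0, volatility))
    return prices


def fit_mean_step(prices):
    """'Train' a trivial forecaster: the average one-step price change."""
    steps = [b - a for a, b in zip(prices, prices[1:])]
    return sum(steps) / len(steps)


def mse(prices, drift):
    """Mean squared error of predicting next price = current price + drift."""
    errors = [(b - (a + drift)) ** 2 for a, b in zip(prices, prices[1:])]
    return sum(errors) / len(errors)


# Scenario labels are known by construction, which is what makes the shift "controlled".
train = simulate_mid_prices(2000, volatility=0.05, seed=0)    # no stress
iid_test = simulate_mid_prices(500, volatility=0.05, seed=1)  # no stress
ood_test = simulate_mid_prices(500, volatility=0.25, seed=2)  # stress (assumed: higher volatility)

drift = fit_mean_step(train)
results = {"iid_mse": mse(iid_test, drift), "ood_mse": mse(ood_test, drift)}
print(results)
```

Reporting the two errors side by side makes the gap caused by the shift directly visible; a more robust forecaster would narrow that gap.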