Paper Title


Grouped self-attention mechanism for a memory-efficient Transformer

Authors

Jung, Bumjun, Mukuta, Yusuke, Harada, Tatsuya

Abstract


Time-series data analysis is important because numerous real-world tasks, such as forecasting weather, electricity consumption, and stock markets, involve predicting data that vary over time. Time-series data are generally recorded over a long period of observation, yielding long sequences owing to their periodic characteristics and long-range dependencies over time. Thus, capturing long-range dependencies is an important factor in time-series forecasting. To solve these problems, we propose two novel modules, Grouped Self-Attention (GSA) and Compressed Cross-Attention (CCA). With both modules, we achieve a computational space and time complexity of order $O(l)$ for sequence length $l$ under small hyperparameter limits, and can capture locality while considering global information. Experiments conducted on time-series datasets show that our proposed model exhibits reduced computational complexity and performance comparable to or better than that of existing methods.
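To make the complexity claim concrete, the sketch below shows one way grouping can reduce attention cost: if each token attends only within its own group of fixed size $g$, the cost is $O(l \cdot g)$, i.e., linear in $l$ for fixed $g$, instead of the quadratic $O(l^2)$ of full self-attention. This is an illustrative block-local attention sketch under our own assumptions, not the paper's exact GSA (or CCA) formulation; the function name and single-head, no-projection setup are simplifications for clarity.

```python
import numpy as np

def grouped_self_attention(x, group_size):
    """Illustrative block-local self-attention (hypothetical sketch,
    not the paper's exact GSA module).

    Each token attends only to tokens in its own group of `group_size`
    tokens, so the total cost is O(l * group_size) rather than O(l^2).
    `x` is an (l, d) array; queries, keys, and values are all `x` here
    (no learned projections, single head) to keep the sketch minimal.
    """
    l, d = x.shape
    assert l % group_size == 0, "pad the sequence so group_size divides l"
    groups = x.reshape(l // group_size, group_size, d)
    out = np.empty_like(groups)
    for i, g in enumerate(groups):
        # Scaled dot-product scores within the group only.
        scores = g @ g.T / np.sqrt(d)                    # (g_size, g_size)
        scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        out[i] = weights @ g                             # attend within group
    return out.reshape(l, d)
```

Because attention never crosses group boundaries in this sketch, changing a token in one group leaves every other group's output untouched; this is the locality the abstract refers to, and it is why a complementary mechanism (such as the paper's CCA) is needed to reintroduce global information.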
