Halveformer: A Novel Architecture Combined with Linear Models for Long Sequences Time Series Forecasting


Bibliographic Details
Published in: 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors: Feng, Yuan; Xia, Kai; Qu, Xiaoyu; Hu, Xuwei; Sun, Long; Zhang, Zilong
Format: Conference Proceeding
Language: English
Published: IEEE, 30.06.2024

Summary: In real-life scenarios, long sequence time series forecasting (LSTF) has broad applications, such as electricity demand planning and abnormal weather prediction. In recent years, transformer-based models have achieved significant advancements in LSTF tasks, as they excel at capturing long-term dependencies in time series data. Long-term forecasting capability is particularly crucial in LSTF tasks. However, transformer-based models suffer from high computational costs, extensive memory usage, and limitations of the inherent encoder-decoder architecture, making them inefficient and challenging to apply directly in real-world scenarios, especially for long-horizon forecasting. Moreover, recent research has indicated that cross-attention mechanisms may cause temporal information loss, and sparse attention mechanisms can create information utilization bottlenecks. To address these issues, this paper proposes a novel architecture named "Halveformer," which combines linear models with the encoder to enhance both model performance and efficiency. Through experiments on seven benchmark datasets, we demonstrate that Halveformer significantly outperforms existing advanced methods. We hope this innovative concept can pave the way for new research directions in LSTF tasks and prompt a reevaluation of the effectiveness of solutions based on the inherent encoder-decoder architecture.
ISSN: 2161-4407
DOI: 10.1109/IJCNN60899.2024.10651438
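
As a rough illustration of the general idea described in the summary (an encoder-only transformer whose output is mapped to the forecast horizon by a simple linear model, avoiding the decoder and cross-attention), the following is a minimal sketch. The class name, layer choices, and dimensions (EncoderPlusLinearSketch, seq_len, pred_len, d_model, etc.) are assumptions made for illustration; they do not reproduce the paper's actual Halveformer architecture.

```python
# Minimal, hypothetical sketch: a transformer encoder followed by a
# linear projection over the time dimension for long-horizon forecasting.
# Names and hyperparameters are illustrative assumptions, not the
# authors' actual Halveformer implementation.
import torch
import torch.nn as nn


class EncoderPlusLinearSketch(nn.Module):
    def __init__(self, seq_len=96, pred_len=336, n_features=7,
                 d_model=128, n_heads=8, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Linear model over the time dimension: maps the encoded history
        # of length seq_len directly to a forecast of length pred_len.
        self.time_proj = nn.Linear(seq_len, pred_len)
        self.output_proj = nn.Linear(d_model, n_features)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        h = self.encoder(self.input_proj(x))          # (batch, seq_len, d_model)
        h = self.time_proj(h.transpose(1, 2))         # (batch, d_model, pred_len)
        return self.output_proj(h.transpose(1, 2))    # (batch, pred_len, n_features)


if __name__ == "__main__":
    model = EncoderPlusLinearSketch()
    y = model(torch.randn(4, 96, 7))
    print(y.shape)  # torch.Size([4, 336, 7])
```

Under these assumptions, the forecast is produced in a single forward pass over the encoded history, with no autoregressive decoding or cross-attention; consult the paper itself for the actual Halveformer design and training details.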