Twins transformer: rolling bearing fault diagnosis based on cross-attention fusion of time and frequency domain features

Bibliographic Details
Published in	Measurement science & technology Vol. 35; no. 9; p. 96113
Main Authors	Gao, Zhikang, Wang, Yanxue, Li, Xinming, Yao, Jiachi
Format	Journal Article
Language	English
Published	01.09.2024

Summary:	Current self-attention-based Transformer models for fault diagnosis can only identify correlations within a single sequence, so they cannot capture both the time domain and frequency domain fault characteristics of the original signal. To address this limitation, this research introduces a two-channel Transformer fault diagnosis model that integrates time and frequency domain features through a cross-attention mechanism. First, the original time-domain fault signal is converted to the frequency domain using the Fast Fourier Transform, and global and local features are extracted with a Convolutional Neural Network. Next, the self-attention mechanism in each channel of the two-channel Transformer separately models the long-range dependencies among fault features within each sequence, and the results are fed into the cross-attention feature fusion module. During fusion, the frequency domain features serve as the query sequence Q and the time domain features supply the key-value pairs (K and V). By calculating the attention weights between Q and K, the model extracts deeper fault features of the original signal. The Twins Transformer thus preserves the associations within each sequence learned through self-attention while also modeling the degree of association between the features of different sequences through cross-attention. Finally, the proposed model's performance was validated in four experiments on four bearing datasets, achieving average accuracy rates of 99.67%, 98.76%, 98.47% and 99.41%. These results confirm that the model effectively extracts correlated time and frequency domain features, with fast convergence and high accuracy.
ISSN:	0957-0233 1361-6501
DOI:	10.1088/1361-6501/ad53f1
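
The fusion step described in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN feature extractor and the per-channel self-attention layers are omitted, the projection weights are random rather than learned, and the names `to_tokens`, `cross_attention`, `w_q`, `w_k`, `w_v` and all sizes are hypothetical choices made for illustration. It shows only the data flow the summary states: an FFT produces the frequency-domain view, frequency features form the query Q, and time features form the keys K and values V.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def to_tokens(x, n_tokens):
    # Hypothetical tokenizer: split a 1-D array into n_tokens equal segments.
    usable = (len(x) // n_tokens) * n_tokens
    return x[:usable].reshape(n_tokens, -1)


def cross_attention(freq_tokens, time_tokens, w_q, w_k, w_v):
    # Queries come from the frequency channel, keys and values from the time channel.
    q = freq_tokens @ w_q
    k = time_tokens @ w_k
    v = time_tokens @ w_v
    # Scaled dot-product attention weights between Q and K.
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)            # stand-in for a vibration signal
spectrum = np.abs(np.fft.rfft(signal))[:512]  # magnitude spectrum, Nyquist bin dropped

time_tokens = to_tokens(signal, 16)    # shape (16, 64)
freq_tokens = to_tokens(spectrum, 16)  # shape (16, 32)

d_model = 8  # assumed embedding size
w_q = rng.standard_normal((freq_tokens.shape[1], d_model)) * 0.1
w_k = rng.standard_normal((time_tokens.shape[1], d_model)) * 0.1
w_v = rng.standard_normal((time_tokens.shape[1], d_model)) * 0.1

fused, weights = cross_attention(freq_tokens, time_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` is a probability distribution over the time-domain tokens, so every fused output token is a weighted mix of time-domain values selected by one frequency-domain query.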