TY - JOUR
T1 - LiteFormer
T2 - A Lightweight and Efficient Transformer for Rotating Machine Fault Diagnosis
AU - Sun, Wenjun
AU - Yan, Ruqiang
AU - Jin, Ruibing
AU - Xu, Jiawen
AU - Yang, Yuan
AU - Chen, Zhenghua
N1 - Publisher Copyright:
© 1963-2012 IEEE.
PY - 2024/6/1
Y1 - 2024/6/1
N2 - The Transformer has shown impressive performance in global feature modeling across many applications. However, two drawbacks induced by its intrinsic architecture limit its application, especially in fault diagnosis. First, the quadratic complexity of its self-attention scheme greatly increases the computation cost, which makes it challenging to deploy the Transformer on computationally limited platforms such as industrial systems. In addition, sequence-based modeling in the Transformer increases the training difficulty and requires a large-scale training dataset. This drawback becomes serious when the Transformer is applied to fault diagnosis, where only limited data are available. To mitigate these issues, we rethink this common approach and propose a new Transformer that is more suitable for fault diagnosis. In this article, we first show, both mathematically and experimentally, that the attention module can be replaced with, or even surpassed by, a convolutional layer under certain conditions. We then incorporate convolutions into the Transformer, which alleviates the computational burden and significantly improves fault classification accuracy. Furthermore, to increase computational efficiency, a lightweight Transformer called LiteFormer is developed by utilizing depth-wise convolutional layers. Extensive experiments are carried out on four datasets: the Case Western Reserve University dataset, the Paderborn University dataset, and two gearbox datasets from a drivetrain dynamic simulator. Through our experiments, LiteFormer not only reduces the computation cost of model training but also sets new state-of-the-art results, surpassing its counterparts in both fault classification accuracy and model robustness.
AB - The Transformer has shown impressive performance in global feature modeling across many applications. However, two drawbacks induced by its intrinsic architecture limit its application, especially in fault diagnosis. First, the quadratic complexity of its self-attention scheme greatly increases the computation cost, which makes it challenging to deploy the Transformer on computationally limited platforms such as industrial systems. In addition, sequence-based modeling in the Transformer increases the training difficulty and requires a large-scale training dataset. This drawback becomes serious when the Transformer is applied to fault diagnosis, where only limited data are available. To mitigate these issues, we rethink this common approach and propose a new Transformer that is more suitable for fault diagnosis. In this article, we first show, both mathematically and experimentally, that the attention module can be replaced with, or even surpassed by, a convolutional layer under certain conditions. We then incorporate convolutions into the Transformer, which alleviates the computational burden and significantly improves fault classification accuracy. Furthermore, to increase computational efficiency, a lightweight Transformer called LiteFormer is developed by utilizing depth-wise convolutional layers. Extensive experiments are carried out on four datasets: the Case Western Reserve University dataset, the Paderborn University dataset, and two gearbox datasets from a drivetrain dynamic simulator. Through our experiments, LiteFormer not only reduces the computation cost of model training but also sets new state-of-the-art results, surpassing its counterparts in both fault classification accuracy and model robustness.
KW - Convolution
KW - Transformer
KW - efficient
KW - fault diagnosis
KW - lightweight
UR - https://www.scopus.com/pages/publications/85176358165
U2 - 10.1109/TR.2023.3322860
DO - 10.1109/TR.2023.3322860
M3 - Article
AN - SCOPUS:85176358165
SN - 0018-9529
VL - 73
SP - 1258
EP - 1269
JO - IEEE Transactions on Reliability
JF - IEEE Transactions on Reliability
IS - 2
ER -