Abstract
The Transformer has been widely applied in rotating machinery fault diagnosis research owing to its ability to capture the internal correlations of vibration signals. However, challenges remain despite extensive efforts. In general, the Transformer is more prone to overfitting than a CNN on small-scale datasets. In practical engineering, collecting sufficient fault samples for training is difficult, which results in poor generalization of the Transformer. In addition, the measured signals are often accompanied by severe noise, further reducing the generalization performance of the model. Meanwhile, the collected signals often follow different distributions due to changing operating conditions, which places higher demands on the generalizability of the Transformer. This paper proposes a Bayesian variational Transformer (Bayesformer) to cope with the abovementioned problems. In Bayesformer, all the attention weights are treated as latent random variables rather than deterministic values as in previous studies. This allows an ensemble of networks, instead of a single one, to be trained, enhancing the generalizability of the model. Three experimental studies are conducted to illustrate the developed model, and superior diagnostic performance is demonstrated throughout the experiments.
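The record does not include code, so the sketch below is only a rough illustration of the core idea of treating attention weights as latent random variables: an assumed Gaussian posterior over the attention logits, reparameterized sampling during the forward pass, and Monte-Carlo averaging of several stochastic passes as an ensemble at test time. The names `VariationalAttention`, `logvar_proj`, and `ensemble_predict`, and the particular parameterization, are hypothetical; the actual Bayesformer layer, prior, and KL regularization term may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VariationalAttention(nn.Module):
    """Single-head attention whose weights are sampled rather than deterministic.

    Sketch only: the attention logits are treated as the mean of a Gaussian,
    and an extra (assumed) projection provides their log-variance. Each
    forward pass draws one sample via the reparameterization trick, so
    repeated passes behave like an ensemble of attention patterns.
    """

    def __init__(self, dim):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.logvar_proj = nn.Linear(dim, dim)  # hypothetical head for the logit variance
        self.scale = dim ** -0.5

    def forward(self, x):
        # x: (batch, seq_len, dim)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        mu = torch.matmul(q, k.transpose(-2, -1)) * self.scale              # mean of attention logits
        logvar = torch.matmul(self.logvar_proj(x), k.transpose(-2, -1)) * self.scale
        eps = torch.randn_like(mu)
        logits = mu + eps * torch.exp(0.5 * logvar)                          # reparameterized sample
        attn = F.softmax(logits, dim=-1)
        # mu and logvar would also feed a KL term in the variational loss.
        return torch.matmul(attn, v), mu, logvar


def ensemble_predict(layer, x, n_samples=10):
    """Average several stochastic forward passes (Monte-Carlo ensemble)."""
    with torch.no_grad():
        outs = torch.stack([layer(x)[0] for _ in range(n_samples)])
    return outs.mean(dim=0)
```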
| Original language | English |
|---|---|
| Article number | 110936 |
| Journal | Mechanical Systems and Signal Processing |
| Volume | 207 |
| Early online date | 17 Nov 2023 |
| DOIs | |
| Publication status | Published - 15 Jan 2024 |
Funding
This research is supported by the National Natural Science Foundation of China (No. 52275104), the Science and Technology Innovation Program of Hunan Province (No. 2023RC3097), and the Natural Science Fund for Excellent Young Scholars of Hunan Province (No. 2021JJ20017).
Keywords
- Bayesian variational learning
- ensemble
- generalization
- rotating machinery fault diagnosis
- transformer
- variational attention mechanism