TY - JOUR
T1 - Self-contrastive Learning-optimized General Agent for long-tailed fault diagnosis of shipboard antennas leveraging adaptive data distribution
AU - Cui, Qianwen
AU - He, Shuilong
AU - Hu, Chaofan
AU - Bao, Jiading
AU - Peng, Yanhua
AU - Chen, Jinglong
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2025/2/1
Y1 - 2025/2/1
N2 - To address the challenges of low accuracy and limited generalization in long-tailed fault diagnosis, an adaptive data distribution-based reinforcement learning General Agent is proposed. The method primarily targets more discriminative, domain-invariant feature learning by pre-training the deep Q-network with unlabeled positive samples. Supervisory signals derived from the data's intrinsic structure enhance class boundary detection while maximizing intra-class feature similarity. Next, empirical data prioritization based on state-action values and TD-error enables efficient utilization of rare but critical experiences, significantly improving sampling efficiency. Concurrently, an adaptive distribution strategy refines a hierarchical reward system by dynamically calibrating the reward function according to real-time accuracy feedback. The deep Q-network, structured with ResNet as the backbone, integrates Efficient Channel Attention (ECA) and Global Attention Mechanism (GAM) to enhance decision-making robustness. Tested on a long-tailed shipboard antenna dataset, the proposed method autonomously identifies fault patterns, demonstrating clear advantages in efficiency, robustness, generalization, and interpretability.
AB - To address the challenges of low accuracy and limited generalization in long-tailed fault diagnosis, an adaptive data distribution-based reinforcement learning General Agent is proposed. The method primarily targets more discriminative, domain-invariant feature learning by pre-training the deep Q-network with unlabeled positive samples. Supervisory signals derived from the data's intrinsic structure enhance class boundary detection while maximizing intra-class feature similarity. Next, empirical data prioritization based on state-action values and TD-error enables efficient utilization of rare but critical experiences, significantly improving sampling efficiency. Concurrently, an adaptive distribution strategy refines a hierarchical reward system by dynamically calibrating the reward function according to real-time accuracy feedback. The deep Q-network, structured with ResNet as the backbone, integrates Efficient Channel Attention (ECA) and Global Attention Mechanism (GAM) to enhance decision-making robustness. Tested on a long-tailed shipboard antenna dataset, the proposed method autonomously identifies fault patterns, demonstrating clear advantages in efficiency, robustness, generalization, and interpretability.
KW - D3QN
KW - Long-tailed distributions
KW - Prioritized experience replay
KW - Shipboard antennas
KW - Unsupervised contrastive pretraining
UR - https://www.scopus.com/pages/publications/85203802520
U2 - 10.1016/j.measurement.2024.115726
DO - 10.1016/j.measurement.2024.115726
M3 - Article
AN - SCOPUS:85203802520
SN - 0263-2241
VL - 241
JO - Measurement: Journal of the International Measurement Confederation
JF - Measurement: Journal of the International Measurement Confederation
M1 - 115726
ER -