TY - GEN
T1 - Memory Disagreement
T2 - 33rd ACM Web Conference, WWW 2024
AU - Pei, Hongbin
AU - Xiong, Yuheng
AU - Wang, Pinghui
AU - Tao, Jing
AU - Liu, Jialun
AU - Deng, Huiqi
AU - Ma, Jie
AU - Guan, Xiaohong
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
N2 - In semi-supervised graph learning, pseudo-labeling is a pivotal strategy for using both labeled and unlabeled nodes in model training. Currently, the confidence score is the most frequently used pseudo-labeling measure; however, it suffers from poor calibration and from issues on out-of-distribution data. In this paper, we propose memory disagreement (MoDis for short), a novel uncertainty measure for pseudo-labeling. We find that training dynamics offer significant insight into prediction uncertainty: if a graph model makes consistent predictions for an unlabeled node throughout training, the corresponding predicted label is likely to be correct, and the node is therefore suitable for pseudo-labeling. This basic idea is supported by recent studies on training dynamics. We implement MoDis as the entropy of an accumulated distribution that summarizes the disagreement of the model's predictions throughout training. We further enhance and analyze MoDis in case studies, which show that nodes with low MoDis are suitable for pseudo-labeling because they tend to be distant from boundaries in both graph and representation space. We design a MoDis-based pseudo-label selection algorithm and a corresponding pseudo-labeling algorithm, both applicable to various graph neural networks. We empirically validate MoDis on eight benchmark graph datasets. The experimental results show that pseudo labels given by MoDis are of higher quality in terms of correctness and information gain, and that the algorithm benefits various graph neural networks, achieving an average relative improvement of 3.11% and up to 30.24% over the widely used uncertainty measure, confidence score. Moreover, we demonstrate the efficacy of MoDis on out-of-distribution nodes.
KW - epistemic uncertainty
KW - graph neural networks
KW - self-training
UR - https://www.scopus.com/pages/publications/85194060392
U2 - 10.1145/3589334.3645398
DO - 10.1145/3589334.3645398
M3 - Conference contribution
AN - SCOPUS:85194060392
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 434
EP - 445
BT - WWW 2024 - Proceedings of the ACM Web Conference
PB - Association for Computing Machinery, Inc
Y2 - 13 May 2024 through 17 May 2024
ER -