TY - GEN
T1 - Tackling Instance-Dependent Label Noise with Class Rebalance and Geometric Regularization
AU - Cao, Shuzhi
AU - Ruan, Jianfei
AU - Dong, Bo
AU - Shi, Bin
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/8/24
Y1 - 2024/8/24
N2 - In label-noise learning, accurately identifying the transition matrix is crucial for developing statistically consistent classifiers. This task is complicated by instance-dependent noise, which introduces identifiability challenges in the absence of stringent assumptions. Existing methods use neural networks to estimate the transition matrix by initially extracting confident clean instances. However, this extraction process is hindered by severe inter-class imbalance and a bias toward selecting unambiguous intra-class instances, leading to a distorted understanding of noise patterns. To tackle these challenges, our paper introduces a Class Rebalance and Geometric Regularization-based Framework (CRGR). CRGR employs a smoothed, noise-tolerant reweighting mechanism to equilibrate inter-class representation, thereby mitigating the risk of model overfitting to dominant classes. Additionally, recognizing that instances with similar characteristics often exhibit parallel noise patterns, we propose that the transition matrix should mirror the similarity of the feature space. This insight promotes the inclusion of ambiguous instances in training, serving as a form of geometric regularization. Such a strategy enhances the model's ability to navigate diverse noise patterns and strengthens its generalization capabilities. By addressing both inter-class and intra-class biases, CRGR offers a more balanced and robust classification model. Extensive experiments on both synthetic and real-world datasets demonstrate CRGR's superiority over existing state-of-the-art methods, significantly boosting classification accuracy and showcasing its effectiveness in handling instance-dependent noise.
AB - In label-noise learning, accurately identifying the transition matrix is crucial for developing statistically consistent classifiers. This task is complicated by instance-dependent noise, which introduces identifiability challenges in the absence of stringent assumptions. Existing methods use neural networks to estimate the transition matrix by initially extracting confident clean instances. However, this extraction process is hindered by severe inter-class imbalance and a bias toward selecting unambiguous intra-class instances, leading to a distorted understanding of noise patterns. To tackle these challenges, our paper introduces a Class Rebalance and Geometric Regularization-based Framework (CRGR). CRGR employs a smoothed, noise-tolerant reweighting mechanism to equilibrate inter-class representation, thereby mitigating the risk of model overfitting to dominant classes. Additionally, recognizing that instances with similar characteristics often exhibit parallel noise patterns, we propose that the transition matrix should mirror the similarity of the feature space. This insight promotes the inclusion of ambiguous instances in training, serving as a form of geometric regularization. Such a strategy enhances the model's ability to navigate diverse noise patterns and strengthens its generalization capabilities. By addressing both inter-class and intra-class biases, CRGR offers a more balanced and robust classification model. Extensive experiments on both synthetic and real-world datasets demonstrate CRGR's superiority over existing state-of-the-art methods, significantly boosting classification accuracy and showcasing its effectiveness in handling instance-dependent noise.
KW - class rebalance
KW - confident clean instance
KW - geometric regularization
KW - instance-dependent label noise
UR - https://www.scopus.com/pages/publications/85203718172
U2 - 10.1145/3637528.3671707
DO - 10.1145/3637528.3671707
M3 - 会议稿件
AN - SCOPUS:85203718172
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 211
EP - 221
BT - KDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024
Y2 - 25 August 2024 through 29 August 2024
ER -