TY - JOUR
T1 - Correntropy based label loss for multi-classification on deep neural networks
AU - Deng, Qing
AU - Zhou, Nan
AU - Luo, Wenjun
AU - Du, Yuanhua
AU - Shi, Kaibo
AU - Chen, Badong
N1 - Publisher Copyright:
© 2025
PY - 2025/9/14
Y1 - 2025/9/14
N2 - The success of deep learning relies heavily on large-scale labeled datasets, but manually labeled datasets inevitably contain errors. Neural networks are highly susceptible to such noisy labels, which seriously degrade learning performance; training networks on datasets with noisy labels is therefore a substantial challenge. As a nonlinear and local similarity metric, correntropy is insensitive to outliers. Building on these properties of correntropy, this paper proposes a novel loss function called Correntropy based Label Loss (CLL). CLL connects directly to the output of the Softmax layer and exploits the properties of that output; it is thus suited to multi-classification problems with one-hot encoded labels and applies naturally to networks ending in a Softmax layer. Specifically, when the distance between two random variables exceeds a certain threshold, its influence on the network can be attenuated by an appropriately chosen kernel bandwidth. For data contaminated by noisy labels, the CLL loss therefore alleviates the effect of the noisy labels and lets the network learn effectively from the correctly labeled samples. We provide theoretical and gradient analyses showing that CLL is robust to noisy labels. On the MNIST dataset with 60% symmetric label noise, the model trained with CLL reaches an accuracy of 96.81%, which is 43.43 percentage points higher than that obtained with cross-entropy (CE). Furthermore, experiments on five publicly available datasets show that networks trained with the CLL loss outperform other state-of-the-art robust losses in most cases.
AB - The success of deep learning relies heavily on large-scale labeled datasets, but manually labeled datasets inevitably contain errors. Neural networks are highly susceptible to such noisy labels, which seriously degrade learning performance; training networks on datasets with noisy labels is therefore a substantial challenge. As a nonlinear and local similarity metric, correntropy is insensitive to outliers. Building on these properties of correntropy, this paper proposes a novel loss function called Correntropy based Label Loss (CLL). CLL connects directly to the output of the Softmax layer and exploits the properties of that output; it is thus suited to multi-classification problems with one-hot encoded labels and applies naturally to networks ending in a Softmax layer. Specifically, when the distance between two random variables exceeds a certain threshold, its influence on the network can be attenuated by an appropriately chosen kernel bandwidth. For data contaminated by noisy labels, the CLL loss therefore alleviates the effect of the noisy labels and lets the network learn effectively from the correctly labeled samples. We provide theoretical and gradient analyses showing that CLL is robust to noisy labels. On the MNIST dataset with 60% symmetric label noise, the model trained with CLL reaches an accuracy of 96.81%, which is 43.43 percentage points higher than that obtained with cross-entropy (CE). Furthermore, experiments on five publicly available datasets show that networks trained with the CLL loss outperform other state-of-the-art robust losses in most cases.
KW - Correntropy
KW - Deep learning
KW - Noisy label learning
KW - Robust loss
UR - https://www.scopus.com/pages/publications/105006760878
U2 - 10.1016/j.neucom.2025.130500
DO - 10.1016/j.neucom.2025.130500
M3 - Article
AN - SCOPUS:105006760878
SN - 0925-2312
VL - 646
JO - Neurocomputing
JF - Neurocomputing
M1 - 130500
ER -
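
Editor's note: to make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of a correntropy-style label loss that compares the Softmax output with a one-hot label through a Gaussian kernel. The function name correntropy_label_loss, the default bandwidth sigma, and the exact formulation are illustrative assumptions for exposition, not the paper's definitive CLL.

    import torch
    import torch.nn.functional as F

    def correntropy_label_loss(logits, targets, sigma=1.0):
        # Hedged sketch of a correntropy-style label loss (not the paper's
        # exact CLL): pass the per-class error between Softmax probabilities
        # and one-hot labels through a Gaussian kernel so that large errors
        # saturate instead of dominating the gradient.
        probs = F.softmax(logits, dim=1)                     # (N, C) probabilities
        one_hot = F.one_hot(targets, probs.size(1)).float()  # (N, C) one-hot labels
        err = probs - one_hot                                # per-class error in [-1, 1]
        kernel = torch.exp(-err.pow(2) / (2 * sigma ** 2))   # Gaussian kernel, in (0, 1]
        return (1.0 - kernel).mean()                         # maximizing correntropy

Because the kernel saturates once the error grows well beyond sigma, a mislabeled sample whose prediction is far from its (wrong) one-hot target contributes a nearly constant loss and hence a vanishing gradient, which is the robustness mechanism the abstract attributes to an appropriately chosen kernel bandwidth.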