TY - GEN
T1 - DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning
T2 - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
AU - Gao, Xinyuan
AU - He, Yuhang
AU - Dong, Songlin
AU - Cheng, Jie
AU - Wei, Xing
AU - Gong, Yihong
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In class-incremental learning, deep neural networks are prone to catastrophic forgetting: accuracy on old classes declines substantially as new knowledge is learned. While recent studies have sought to address this issue, most approaches suffer from either the stability-plasticity dilemma or excessive computational and parameter requirements. To tackle these challenges, we propose a novel framework, the Diverse Knowledge Transfer Transformer (DKT), which incorporates two attention-based knowledge transfer mechanisms that convey both task-specific and task-general knowledge to the current task, along with a duplex classifier that addresses the stability-plasticity dilemma. Additionally, we design a loss function that clusters similar categories and discriminates between old and new tasks in the feature space. The proposed method requires only a small number of extra parameters, which remain negligible as the number of tasks grows. Extensive experiments on the CIFAR100, ImageNet100, and ImageNet1000 datasets demonstrate that our method outperforms competitive methods and achieves state-of-the-art performance. Our source code is available at https://github.com/MIVXJTU/DKT.
AB - In class-incremental learning, deep neural networks are prone to catastrophic forgetting: accuracy on old classes declines substantially as new knowledge is learned. While recent studies have sought to address this issue, most approaches suffer from either the stability-plasticity dilemma or excessive computational and parameter requirements. To tackle these challenges, we propose a novel framework, the Diverse Knowledge Transfer Transformer (DKT), which incorporates two attention-based knowledge transfer mechanisms that convey both task-specific and task-general knowledge to the current task, along with a duplex classifier that addresses the stability-plasticity dilemma. Additionally, we design a loss function that clusters similar categories and discriminates between old and new tasks in the feature space. The proposed method requires only a small number of extra parameters, which remain negligible as the number of tasks grows. Extensive experiments on the CIFAR100, ImageNet100, and ImageNet1000 datasets demonstrate that our method outperforms competitive methods and achieves state-of-the-art performance. Our source code is available at https://github.com/MIVXJTU/DKT.
KW - Transfer, continual, low-shot, meta, or long-tail learning
UR - https://www.scopus.com/pages/publications/85172379695
U2 - 10.1109/CVPR52729.2023.02321
DO - 10.1109/CVPR52729.2023.02321
M3 - Conference contribution
AN - SCOPUS:85172379695
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 24236
EP - 24245
BT - Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PB - IEEE Computer Society
Y2 - 18 June 2023 through 22 June 2023
ER -