Abstract
Deep models have been shown to be vulnerable to catastrophic forgetting, a phenomenon in which recognition performance on old data degrades when a pre-trained model is fine-tuned on new data. Knowledge distillation (KD) is a popular approach to alleviating catastrophic forgetting in incremental learning. However, it usually fixes the absolute values of neural responses for isolated historical instances, without considering the intrinsic structure of the responses produced by a convolutional neural network (CNN) model. To overcome this limitation, we recognize the importance of the global property of the whole instance set and treat it as a behavior characteristic of a CNN model relevant to incremental learning. On this basis: 1) we design an instance neighborhood-preserving (INP) loss to maintain the order of pair-wise instance similarities of the old model in the feature space; 2) we devise a label priority-preserving (LPP) loss to preserve the label ranking lists within instance-wise label probability vectors in the output space; and 3) we introduce an efficient differentiable ranking algorithm for calculating the two loss functions. Extensive experiments conducted on CIFAR100 and ImageNet show that our approach achieves state-of-the-art performance.
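The idea of preserving rank order rather than absolute response values can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the ranking procedure, so the code assumes a generic pairwise-sigmoid soft rank as a stand-in, and every name (`soft_rank`, `neighborhood_loss`, `label_priority_loss`, `temperature`) is hypothetical. It uses plain Python for readability; in training, the same expressions would be written in an automatic-differentiation framework.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_rank(values, temperature=0.1):
    """Smooth descending rank: about 0 for the largest value, n-1 for the smallest.

    Assumed stand-in for a differentiable ranking operator: each rank is a sum
    of sigmoids over pairwise differences, so it varies smoothly with the inputs.
    """
    return [
        sum(_sigmoid((v_j - v_i) / temperature)
            for j, v_j in enumerate(values) if j != i)
        for i, v_i in enumerate(values)
    ]


def rank_mismatch(old_scores, new_scores, temperature=0.1):
    # Mean squared difference between the two soft rank vectors.
    old_r = soft_rank(old_scores, temperature)
    new_r = soft_rank(new_scores, temperature)
    return sum((o - n) ** 2 for o, n in zip(old_r, new_r)) / len(old_r)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neighborhood_loss(old_feats, new_feats, temperature=0.1):
    """INP-style penalty (illustrative): for every anchor instance, compare the
    ranking of its similarities to all other instances under both models."""
    n = len(old_feats)
    total = 0.0
    for i in range(n):
        old_sims = [cosine(old_feats[i], old_feats[j]) for j in range(n) if j != i]
        new_sims = [cosine(new_feats[i], new_feats[j]) for j in range(n) if j != i]
        total += rank_mismatch(old_sims, new_sims, temperature)
    return total / n


def label_priority_loss(old_probs, new_probs, temperature=0.1):
    """LPP-style penalty (illustrative): compare the ranking of class
    probabilities for each instance under the old and new models."""
    pairs = list(zip(old_probs, new_probs))
    return sum(rank_mismatch(o, p, temperature) for o, p in pairs) / len(pairs)
```

Under these assumptions, rescaling every feature vector leaves the neighborhood penalty near zero, because cosine similarities and hence their ordering are unchanged, whereas swapping which instance is the nearest neighbor of an anchor yields a positive penalty. That is the contrast with response-matching distillation described above: only the relative order is constrained, not the absolute values.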
| Original language | English |
|---|---|
| Pages (from-to) | 7529-7540 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 34 |
| Issue number | 10 |
| State | Published - 1 Oct 2023 |
Keywords
- Catastrophic forgetting
- continual learning
- continuous learning
- incremental learning
Fingerprint
Dive into the research topics of 'Model Behavior Preserving for Class-Incremental Learning'. Together they form a unique fingerprint.