TY - JOUR
T1 - Switching
T2 - understanding the class-reversed sampling in tail sample memorization
AU - Zhang, Chi
AU - Hu, Benyi
AU - Liuzhang, Yuhang
AU - Wang, Le
AU - Liu, Li
AU - Liu, Yuehu
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.
PY - 2022/3
Y1 - 2022/3
AB - Long-tailed visual recognition poses significant challenges to traditional machine learning and emerging deep networks due to its inherent class imbalance. Existing reweighting and re-sampling methods, although effective, lack a fundamental theory and leave the paradoxical effects of the long tail unsolved, where the network fails with head classes under-represented and tail classes overfitted. In this paper, we investigate long-tailed recognition from a memorization-generalization point of view, which not only unravels the reasons behind previous methods, but also yields a new principled solution. Specifically, we first empirically identify the regularity of classes under long-tailed distributions, finding that the long-tailed challenge is essentially a trade-off between the representation of high-regularity head classes and generalization to low-regularity tail classes. To memorize tail samples without seriously damaging the representation of head samples, we propose a simple yet effective sampling strategy for the ordinary mini-batch SGD optimization process, Switching, which switches from instance-balanced sampling to class-reversed sampling only once, at a small learning rate. By theoretical analysis, we show that, under certain conditions, the upper bound on the generalization error of the proposed sampling strategy is lower than that of instance-balanced sampling. In our experiments, the proposed method can reach feasible performance more efficiently than current methods. Further experiments validate the superiority of the proposed Switching strategy, implying that the long-tailed learning trade-off could be parsimoniously tackled only in the memorization stage with a small learning rate and over-exposure of tail samples.
KW - Class-reversed sampling
KW - Long-tailed classification
KW - Network memorization and generalization
KW - Rademacher complexity
UR - https://www.scopus.com/pages/publications/85122393743
U2 - 10.1007/s10994-021-06087-3
DO - 10.1007/s10994-021-06087-3
M3 - Article
AN - SCOPUS:85122393743
SN - 0885-6125
VL - 111
SP - 1073
EP - 1101
JO - Machine Learning
JF - Machine Learning
IS - 3
ER -