TY - GEN
T1 - CMB
T2 - 2022 International Joint Conference on Neural Networks, IJCNN 2022
AU - Zhou, Hengyi
AU - Liu, Longjun
AU - Zhang, Haonan
AU - He, Hongyi
AU - Zheng, Nanning
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Structural re-parameterization is a rising field that aims to improve the performance of convolutional neural networks (CNNs) by training an over-parameterized model and transforming it into a compact inference model. However, the performance improvements of prior structural re-parameterization works often come at the cost of heavy extra training resources, which increases carbon emissions and limits potential applications to large-scale industrial tasks. To this end, we first conduct experiments with a series of blocks composed of multiple identical branches to investigate the mechanism behind structural re-parameterization, and then provide an interpretation. Moreover, motivated by studies of effective receptive fields in biological visual systems and neural networks, we propose a novel compact block named circular mask block (CMB). Given a neural network, we replace each regular convolutional layer with a CMB to construct a training architecture, which can be trained to gain an accuracy boost with no extra training parameters and limited extra training FLOPs. After training, the training architecture can be transformed into the original architecture for inference. Extensive experiments are performed on CIFAR-10 and ImageNet to evaluate the effectiveness of our method. For example, we improve the top-1 accuracy of ResNet-50 on ImageNet by 0.85% with no extra training parameters and only 11.32M extra training FLOPs, which saves 434x training FLOPs compared with prior works.
AB - Structural re-parameterization is a rising field that aims to improve the performance of convolutional neural networks (CNNs) by training an over-parameterized model and transforming it into a compact inference model. However, the performance improvements of prior structural re-parameterization works often come at the cost of heavy extra training resources, which increases carbon emissions and limits potential applications to large-scale industrial tasks. To this end, we first conduct experiments with a series of blocks composed of multiple identical branches to investigate the mechanism behind structural re-parameterization, and then provide an interpretation. Moreover, motivated by studies of effective receptive fields in biological visual systems and neural networks, we propose a novel compact block named circular mask block (CMB). Given a neural network, we replace each regular convolutional layer with a CMB to construct a training architecture, which can be trained to gain an accuracy boost with no extra training parameters and limited extra training FLOPs. After training, the training architecture can be transformed into the original architecture for inference. Extensive experiments are performed on CIFAR-10 and ImageNet to evaluate the effectiveness of our method. For example, we improve the top-1 accuracy of ResNet-50 on ImageNet by 0.85% with no extra training parameters and only 11.32M extra training FLOPs, which saves 434x training FLOPs compared with prior works.
KW - convolutional layer
KW - neural networks
KW - structural re-parameterization
KW - training parameters
UR - https://www.scopus.com/pages/publications/85140749731
U2 - 10.1109/IJCNN55064.2022.9892874
DO - 10.1109/IJCNN55064.2022.9892874
M3 - Conference contribution
AN - SCOPUS:85140749731
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2022 International Joint Conference on Neural Networks, IJCNN 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 July 2022 through 23 July 2022
ER -