TY - JOUR
T1 - Second-Order Spectral Transform Block for 3D Shape Classification and Retrieval
AU - Yu, Ruixuan
AU - Sun, Jian
AU - Li, Huibin
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020
Y1 - 2020
N2 - In this paper, we propose a novel network block, dubbed as second-order spectral transform block, for 3D shape retrieval and classification. This network block generalizes the second-order pooling to 3D surface by designing a learnable non-linear transform on the spectrum of the pooled descriptor. The proposed block consists of following two components. First, the second-order average (SO-Avr) and max-pooling (SO-Max) operations are designed on 3D surface to aggregate local descriptors, which are shown to be more discriminative than the popular average-pooling or max-pooling. Second, a learnable spectral transform parameterized by mixture of power function is proposed to perform non-linear feature mapping in the space of pooled descriptors, i.e., manifold of symmetric positive definite matrix for SO-Avr, and space of symmetric matrix for SO-Max. The proposed block can be plugged into existing network architectures to aggregate local shape descriptors for boosting their performance. We apply it to a shallow network for non-rigid 3D shape analysis and to existing networks for rigid shape analysis, where it improves the first-tier retrieval accuracy by 7.2% on SHREC'14 Real dataset and achieves state-of-the-art classification accuracy on ModelNet40. As an extension, we apply our block to 2D image classification, showing its superiority compared with traditional second-order pooling methods. We also provide theoretical and experimental analysis on stability of the proposed second-order spectral transform block.
AB - In this paper, we propose a novel network block, dubbed as second-order spectral transform block, for 3D shape retrieval and classification. This network block generalizes the second-order pooling to 3D surface by designing a learnable non-linear transform on the spectrum of the pooled descriptor. The proposed block consists of following two components. First, the second-order average (SO-Avr) and max-pooling (SO-Max) operations are designed on 3D surface to aggregate local descriptors, which are shown to be more discriminative than the popular average-pooling or max-pooling. Second, a learnable spectral transform parameterized by mixture of power function is proposed to perform non-linear feature mapping in the space of pooled descriptors, i.e., manifold of symmetric positive definite matrix for SO-Avr, and space of symmetric matrix for SO-Max. The proposed block can be plugged into existing network architectures to aggregate local shape descriptors for boosting their performance. We apply it to a shallow network for non-rigid 3D shape analysis and to existing networks for rigid shape analysis, where it improves the first-tier retrieval accuracy by 7.2% on SHREC'14 Real dataset and achieves state-of-the-art classification accuracy on ModelNet40. As an extension, we apply our block to 2D image classification, showing its superiority compared with traditional second-order pooling methods. We also provide theoretical and experimental analysis on stability of the proposed second-order spectral transform block.
KW - 3D shape analysis
KW - second-order pooling
KW - shape representation
KW - spectral transform
UR - https://www.scopus.com/pages/publications/85081050507
U2 - 10.1109/TIP.2020.2967579
DO - 10.1109/TIP.2020.2967579
M3 - 文章
AN - SCOPUS:85081050507
SN - 1057-7149
VL - 29
SP - 4530
EP - 4543
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 8967234
ER -