Second-Order Spectral Transform Block for 3D Shape Classification and Retrieval

Research output: Contribution to journalArticlepeer-review

12 Scopus citations

Abstract

In this paper, we propose a novel network block, dubbed as second-order spectral transform block, for 3D shape retrieval and classification. This network block generalizes the second-order pooling to 3D surface by designing a learnable non-linear transform on the spectrum of the pooled descriptor. The proposed block consists of following two components. First, the second-order average (SO-Avr) and max-pooling (SO-Max) operations are designed on 3D surface to aggregate local descriptors, which are shown to be more discriminative than the popular average-pooling or max-pooling. Second, a learnable spectral transform parameterized by mixture of power function is proposed to perform non-linear feature mapping in the space of pooled descriptors, i.e., manifold of symmetric positive definite matrix for SO-Avr, and space of symmetric matrix for SO-Max. The proposed block can be plugged into existing network architectures to aggregate local shape descriptors for boosting their performance. We apply it to a shallow network for non-rigid 3D shape analysis and to existing networks for rigid shape analysis, where it improves the first-tier retrieval accuracy by 7.2% on SHREC'14 Real dataset and achieves state-of-the-art classification accuracy on ModelNet40. As an extension, we apply our block to 2D image classification, showing its superiority compared with traditional second-order pooling methods. We also provide theoretical and experimental analysis on stability of the proposed second-order spectral transform block.

Original languageEnglish
Article number8967234
Pages (from-to)4530-4543
Number of pages14
JournalIEEE Transactions on Image Processing
Volume29
DOIs
StatePublished - 2020

Keywords

  • 3D shape analysis
  • second-order pooling
  • shape representation
  • spectral transform

Fingerprint

Dive into the research topics of 'Second-Order Spectral Transform Block for 3D Shape Classification and Retrieval'. Together they form a unique fingerprint.

Cite this