TY - GEN
T1 - Multi-cue Normalized Non-Negative Sparse Encoder for image classification
AU - Zhang, Shizhou
AU - Wang, Jinjun
AU - Liang, Yudong
AU - Gong, Yihong
AU - Zheng, Nanning
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/8/4
Y1 - 2015/8/4
N2 - Recently, sparse-coding-based image representation has achieved state-of-the-art recognition results on many benchmarks. In this paper, we propose the Multi-cue Normalized Non-Negative Sparse Encoder (MN3SE), which enforces both a non-negative constraint and a shift-invariant constraint on top of the traditional sparse coding criteria, and uses multiple cues to further boost performance. The former constraint reduces the information loss caused by negative coefficients and improves coding stability, and the latter allows the sparseness to be self-adaptive to the local feature. The proposed coding scheme is then approximated by a neural-network-based encoder for speed-up. More importantly, the multi-layer neural network architecture allows us to apply a multi-task learning strategy to fuse information from multiple cues. Specifically, we take one type of descriptor, such as SIFT, as the input, and force the learned encoder to produce a sparse code that can reconstruct not only SIFT but also other types of descriptors, such as color moments. In this way, we achieve not only a 10- to 33-fold speed-up for sparse coding, but the multi-cue enforced learning strategy also gives the image features extracted by MN3SE superior image classification accuracy.
AB - Recently, sparse-coding-based image representation has achieved state-of-the-art recognition results on many benchmarks. In this paper, we propose the Multi-cue Normalized Non-Negative Sparse Encoder (MN3SE), which enforces both a non-negative constraint and a shift-invariant constraint on top of the traditional sparse coding criteria, and uses multiple cues to further boost performance. The former constraint reduces the information loss caused by negative coefficients and improves coding stability, and the latter allows the sparseness to be self-adaptive to the local feature. The proposed coding scheme is then approximated by a neural-network-based encoder for speed-up. More importantly, the multi-layer neural network architecture allows us to apply a multi-task learning strategy to fuse information from multiple cues. Specifically, we take one type of descriptor, such as SIFT, as the input, and force the learned encoder to produce a sparse code that can reconstruct not only SIFT but also other types of descriptors, such as color moments. In this way, we achieve not only a 10- to 33-fold speed-up for sparse coding, but the multi-cue enforced learning strategy also gives the image features extracted by MN3SE superior image classification accuracy.
KW - Image classification
KW - Multi-cue
KW - Non-Negative constraint
KW - Shift-invariant constraint
KW - Sparse Encoder
UR - https://www.scopus.com/pages/publications/84946067225
U2 - 10.1109/ICME.2015.7177531
DO - 10.1109/ICME.2015.7177531
M3 - Conference contribution
AN - SCOPUS:84946067225
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2015 IEEE International Conference on Multimedia and Expo, ICME 2015
PB - IEEE Computer Society
T2 - IEEE International Conference on Multimedia and Expo, ICME 2015
Y2 - 29 June 2015 through 3 July 2015
ER -