Towards efficient learning of optimal spatial Bag-of-Words representations

  • Lu Jiang
  • Wei Tong
  • Deyu Meng
  • Alexander G. Hauptmann

Research output: Contribution to conference › Paper › peer-review

13 Scopus citations

Abstract

Spatial Pyramid Matching (SPM) assumes that the spatial Bag-of-Words (BoW) representation is independent of data. However, evidence has shown that this assumption usually leads to a suboptimal representation. In this paper, we propose a novel method called Jensen-Shannon (JS) Tiling to learn the BoW representation from data directly at the BoW level. The proposed JS Tiling is especially appropriate for large-scale datasets, as it is orders of magnitude faster than existing methods while achieving comparable or even better classification precision. Experimental results on four benchmarks, including two TRECVID12 datasets, validate that JS Tiling outperforms SPM and state-of-the-art methods. The runtime comparison demonstrates that selecting BoW representations by JS Tiling is more than 1,000 times faster than running classifiers. In addition, JS Tiling is an important component contributing to CMU Teams' final submission in TRECVID 2012 Multimedia Event Detection.

Original language: English
Pages: 121-128
Number of pages: 8
State: Published - 2014
Event: 2014 4th ACM International Conference on Multimedia Retrieval, ICMR 2014 - Glasgow, United Kingdom
Duration: 1 Apr 2014 - 4 Apr 2014

Conference

Conference: 2014 4th ACM International Conference on Multimedia Retrieval, ICMR 2014
Country/Territory: United Kingdom
City: Glasgow
Period: 1/04/14 - 4/04/14

Keywords

  • Bag of visual words
  • Feature representation
  • Jensen-Shannon tiling
  • Pooling method
  • Spatial pyramid
  • SPM
