跳到主要导航 跳到搜索 跳到主要内容

Learning to search efficiently in high dimensions

  • Zhen Li
  • , Huazhong Ning
  • , Liangliang Cao
  • , Tong Zhang
  • , Yihong Gong
  • , Thomas S. Huang
  • University of Illinois at Urbana-Champaign
  • Alphabet Inc.
  • IBM
  • Rutgers - The State University of New Jersey, New Brunswick
  • NEC Corporation

科研成果: 书/报告/会议事项章节会议稿件同行评审

18 引用 (Scopus)

摘要

High dimensional similarity search in large scale databases becomes an important challenge due to the advent of Internet. For such applications, specialized data structures are required to achieve computational efficiency. Traditional approaches relied on algorithmic constructions that are often data independent (such as Locality Sensitive Hashing) or weakly dependent (such as kd-trees, κ-means trees). While supervised learning algorithms have been applied to related problems, those proposed in the literature mainly focused on learning hash codes optimized for compact embedding of the data rather than search efficiency. Consequently such an embedding has to be used with linear scan or another search algorithm. Hence learning to hash does not directly address the search efficiency issue. This paper considers a new framework that applies supervised learning to directly optimize a data structure that supports efficient large scale search. Our approach takes both search quality and computational cost into consideration. Specifically, we learn a boosted search forest that is optimized using pair-wise similarity labeled examples. The output of this search forest can be efficiently converted into an inverted indexing data structure, which can leverage modern text search infrastructure to achieve both scalability and efficiency. Experimental results show that our approach significantly outperforms the start-of-the-art learning to hash methods (such as spectral hashing), as well as state-of-the-art high dimensional search algorithms (such as LSH and κ-means trees).

源语言英语
主期刊名Advances in Neural Information Processing Systems 24
主期刊副标题25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
出版商Neural Information Processing Systems
ISBN(印刷版)9781618395993
出版状态已出版 - 2011
已对外发布
活动25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 - Granada, 西班牙
期限: 12 12月 201114 12月 2011

出版系列

姓名Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

会议

会议25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
国家/地区西班牙
Granada
时期12/12/1114/12/11

学术指纹

探究 'Learning to search efficiently in high dimensions' 的科研主题。它们共同构成独一无二的指纹。

引用此