Abstract
Semantic search in video, also called text-to-video search, is a novel and challenging problem in information and multimedia retrieval. Existing solutions are mainly limited to text-to-text matching, in which the query words are matched against user-generated metadata. This kind of text-to-text search, though simple, is of limited functionality because it provides no understanding of the video content. This paper presents a state-of-the-art system for event search without any user-generated metadata or example videos, known as text-to-video search. The system relies on substantial video content understanding and allows for searching complex events over a large collection of videos. The proposed text-to-video search can be used to augment existing text-to-text search for video. Its novelty and practicality are demonstrated by the evaluation in NIST TRECVID 2014, where the proposed system achieves the best performance. We share our observations and lessons from building such a system, which may be instrumental in guiding the design of future systems for video search and analysis.
| Original language | English |
|---|---|
| Pages (from-to) | 3-18 |
| Number of pages | 16 |
| Journal | International Journal of Multimedia Information Retrieval |
| Volume | 5 |
| Issue number | 1 |
| DOI | |
| Publication status | Published - 1 Mar 2016 |
Fingerprint
Dive into the research topics of 'Text-to-video: a semantic search engine for internet videos'. Together they form a unique fingerprint.