Abstract
Automatic construction of content-based indices for video source material requires general semantic interpretation of both images and their accompanying sounds, but such broadly based semantic analysis is beyond the capabilities of current machine vision and audio signal analysis technologies. However, if one can assume a limited and well-demarcated body of domain knowledge for describing the content of a body of video, then it becomes easier to interpret a video source in terms of that domain knowledge. This paper presents our work on using domain knowledge to parse news video programs and to index them on the basis of their visual content. Models based on both the spatial structure of image frames and the temporal structure of the entire program have been developed for news videos, along with algorithms that apply these models by locating and identifying instances of their elements. Experimental results are discussed in detail to evaluate both the models and the algorithms that use them. Finally, proposals for future work are summarized.
| Original language | English |
|---|---|
| Pages (from-to) | 256-266 |
| Number of pages | 11 |
| Journal | Multimedia Systems |
| Volume | 2 |
| Issue number | 6 |
| State | Published - Jan 1995 |
| Externally published | Yes |
Keywords
- Image processing
- Video database
- Video indexing
- Video parsing
- Video retrieval