INTERACTIVE TV RESEARCH - UITV.INFO Hosted by ELTRUN Research Center
Athens University of Economics and Business


Ahmet Ekin (2003). Sports Video Processing for Description, Summarization, and Search. University of Rochester. http://www.ece.rochester.edu/users/tekalp/students/ekin_thesis.pdf

Abstract
This thesis proposes solutions for structural and semantic video modeling, automatic video analysis, and expressive video search and retrieval. We present a structural-semantic video model for effective representation of high- and low-level video information, an automatic, multi-modal sports video processing framework for instantiation of the model attributes and summarization, and, finally, a graph-based query formation and resolution framework for semantic search and retrieval based on the proposed model. Except for the video analysis algorithms, which are specific to sports video, the proposed structural-semantic video model and the graph-based querying framework are generic in the sense that they are applicable to description and querying of any type of video.
We first introduce a structural-semantic video model for efficient description of high-level and low-level video features. The proposed model unifies the shot-based and object-based structural video models that are employed by the video processing and computer vision communities with the entity-relationship (ER) or object-oriented models that are used by the database and information retrieval communities. This unified approach improves on the existing MPEG-7 approach, which uses two description schemes (DS) for the same task.
In order to instantiate model descriptors and generate automatic and real-time summaries of video, we focus on the domain of sports video because the extraction of high-level model entities from low-level video features necessitates the specification of a domain. We propose a multi-modal and scalable sports video processing framework for model descriptor instantiation and fast summarization of broadcast sports video. The proposed framework is multi-modal because it employs visual, audio, and text features, and is scalable because the system may generate descriptors in real-time or online based upon user preferences and requirements. It is also applicable to multiple types of sports.
The scalability of the framework results from the classification of visual features into cinematic and object-based features and efficient processing of them. Because cinematic features are easier to compute, we extract cinematic features, such as shot boundaries, shot types, and slow-motion replays, before object-based analysis that involves object detection and tracking. Real-time descriptors and summaries are computed by using only cinematic visual features as well as some audio and text features. Because some cinematic and object-based algorithms use features extracted from the field region and most sporting events take place on a field with one distinct dominant color, we develop a robust low-level dominant color region detection algorithm that automatically detects the color of the field and adapts to the variations due to the changes in imaging conditions. We demonstrate the efficiency and effectiveness of the proposed low-level, cinematic, and object-based feature extraction algorithms over a large dataset.
In order to give users fast and efficient access to relevant content, we develop a graph-based query formation and resolution framework whereby queries are represented by graph patterns and resolved by applying graph matching concepts. With the proposed querying framework, users may define new concepts when the existing descriptors are incomplete or insufficient to satisfy their requirements. One of the appealing features of the proposed framework is the support of multiple querying methodologies, such as query-by-description (text), query-by-example, and browsing. Users may form query graph patterns from the database schema, the existing descriptions, or, alternatively, from the model templates (abstract graphs).
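To make the idea of resolving a query by graph matching concrete, here is a minimal sketch in Python. It is not the thesis's algorithm: the node types (`Goal`, `Player`, `Shot`), the relation names (`actor_of`, `occurs_in`), and the function `match_pattern` are all hypothetical, and the brute-force search stands in for whatever matching procedure the thesis actually uses. It only illustrates how a query expressed as a small graph pattern can be bound to nodes of a stored description.

```python
from itertools import permutations

# Hypothetical toy description: typed nodes plus labeled, directed relations.
NODE_TYPES = {
    "e1": "Goal",
    "e2": "FreeKick",
    "p1": "Player",
    "p2": "Player",
    "s1": "Shot",
    "s2": "Shot",
}
EDGES = {
    ("p1", "actor_of", "e1"),
    ("p2", "actor_of", "e2"),
    ("e1", "occurs_in", "s1"),
    ("e2", "occurs_in", "s2"),
}


def match_pattern(pattern_nodes, pattern_edges, node_types, edges):
    """Return every assignment of pattern variables to distinct data nodes
    that preserves node types and all pattern edges (brute-force search)."""
    variables = list(pattern_nodes)
    results = []
    for candidate in permutations(node_types, len(variables)):
        binding = dict(zip(variables, candidate))
        # Each variable must bind to a node of the requested type.
        if any(node_types[binding[v]] != pattern_nodes[v] for v in variables):
            continue
        # Every pattern edge must exist between the bound nodes.
        if all((binding[s], rel, binding[t]) in edges for s, rel, t in pattern_edges):
            results.append(binding)
    return results


# Query pattern: "a shot containing a goal event with some player as actor".
query_nodes = {"?player": "Player", "?event": "Goal", "?shot": "Shot"}
query_edges = [("?player", "actor_of", "?event"), ("?event", "occurs_in", "?shot")]

matches = match_pattern(query_nodes, query_edges, NODE_TYPES, EDGES)
print(matches)
```

The exhaustive search over permutations is only practical for very small graphs; it is used here because it keeps the correspondence between pattern and description easy to follow.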



Copyright © 2002-2008 UITV.INFO. All rights are reserved by UITV.INFO or the respective Publishers and Authors.
Reproduction of material from UITV.INFO without permission is strictly prohibited.