Content Coverage and Redundancy Removal in Video Summarization
Intelligent Analysis of Multimedia Information
https://doi.org/10.4018/978-1-5225-0498-6.CH013…
Abstract
Over the past decade, research in the field of Content-Based Video Retrieval Systems (CBVRS) has attracted much attention, as it encompasses processing of all the other media types, i.e., text, image, and audio. Video summarization is one of its most important applications, as it potentially enables efficient and faster browsing of large video collections. A concise version of a video is often required due to constraints on viewing time, storage, communication bandwidth, and power. Thus, the task of video summarization is to effectively extract the most important portions of the video without sacrificing its semantic information. The results of video summarization can be used in many CBVRS applications, such as semantic indexing, video surveillance, copied-video detection, etc. However, the quality of the summarization task depends on two basic aspects: content coverage and redundancy removal. These two aspects are both important and contradictory to each other. This chapter aim...
Related papers
Lecture Notes in Computer Science, 2007
Video summarization approaches have various fields of application, particularly in organizing, browsing, and accessing large video databases. In this paper we propose and evaluate two novel approaches for video summarization, one based on spectral methods and the other on ant-tree clustering. The overall summary creation process is broken down into two steps: detection of similar scenes and extraction of the most representative ones. While clustering approaches are used for scene segmentation, the post-processing logic merges video scenes into a subset of user-relevant scenes. In the case of the spectral approach, representative scenes are extracted following the logic that important parts of the video are related to high motion activity of segments within scenes. In the alternative approach, we estimate a subset of relevant video scenes using ant-tree optimization, and in a supervised scenario certain scenes of no interest to the user are recognized and excluded from the summary. An experimental evaluation validating the feasibility and robustness of these approaches is presented.
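The two-step pipeline this abstract describes (group similar scenes, then pick the most representative scene per group) can be sketched as follows. This is an illustrative stand-in only: it uses a plain k-means loop rather than the paper's spectral or ant-tree methods, and a motion-activity maximum as the representativeness rule; `scene_features` and `motion` are assumed per-scene vectors.

```python
import numpy as np

def summarize(scene_features, motion, k):
    """Toy two-step summary: (1) group scenes into k clusters by
    feature similarity (plain k-means, not the paper's methods),
    (2) pick from each cluster the scene with the highest motion
    activity, following the abstract's selection logic."""
    rng = np.random.default_rng(0)
    centers = scene_features[rng.choice(len(scene_features), k, replace=False)]
    for _ in range(10):  # fixed number of k-means iterations
        labels = np.argmin(
            ((scene_features[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = scene_features[labels == c].mean(0)
    summary = []
    for c in range(k):
        idx = np.where(labels == c)[0]
        if idx.size:
            summary.append(int(idx[np.argmax(motion[idx])]))
    return sorted(summary)
```

With four scenes forming two clear groups, the sketch returns the highest-motion scene of each group.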
2013
Digital storage plays a major role in almost all of our routine applications in today's technology-driven world. It holds a large amount of data, including videos, extracted features, alerts, statistics, etc. Designing systems to manage this extensive data and make it easily accessible for query and search is a very challenging and potentially rewarding problem. However, the vast majority of research in video indexing has taken place in the field of multimedia, in particular for authored or produced video such as news or movies, and for spontaneous and broadcast video such as sporting events. This paper mainly focuses on the analysis of video using shot boundary detection methods. Shot boundary detection is the fundamental step in content-based video analysis. It is also a major research issue, since it serves as an important parameter in the video retrieval process. The analysis in this paper covers two different methods: 1. block-based histogram difference and 2....
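The block-based histogram difference method named above can be sketched roughly as follows: split each grayscale frame into a grid of blocks, compare per-block intensity histograms between consecutive frames, and declare a cut when the average difference is large. The block count, bin count, and threshold here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def shot_boundaries(frames, blocks=4, bins=16, thresh=0.5):
    """Block-based histogram difference shot detection (sketch).
    `frames` is a list of 2-D grayscale arrays with values in [0, 255];
    a boundary is flagged at index i when the mean normalized per-block
    histogram difference between frames i-1 and i exceeds `thresh`."""
    cuts = []
    for i in range(1, len(frames)):
        a, b = frames[i - 1], frames[i]
        h, w = a.shape
        bh, bw = h // blocks, w // blocks
        diffs = []
        for r in range(blocks):
            for c in range(blocks):
                pa = a[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
                pb = b[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
                ha, _ = np.histogram(pa, bins=bins, range=(0, 256))
                hb, _ = np.histogram(pb, bins=bins, range=(0, 256))
                # normalized L1 histogram distance in [0, 1]
                diffs.append(np.abs(ha - hb).sum() / (2 * pa.size))
        if np.mean(diffs) > thresh:
            cuts.append(i)
    return cuts
```

Blockwise comparison makes the measure more robust to local motion than a single global histogram, which is the usual motivation for this family of methods.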
IEEE Access
The fast progress in digital technology has sparked the generation of voluminous data from different social media platforms like Instagram, Facebook, YouTube, etc. Other sources, such as news, CCTV videos, sports, and entertainment, generate large data as well. Lengthy videos typically contain a significant number of duplicate occurrences that are uninteresting to the viewer. Eliminating this unnecessary information and concentrating only on the crucial events is far more advantageous. This produces a summary of a lengthy video, which can save viewers time and enable better memory management. The highlights of a lengthy video are condensed into a video summary. Video summarization is an essential topic today, since many industries have CCTV cameras installed for purposes such as monitoring, security, and tracking. Because surveillance videos are recorded 24 hours a day, enormous amounts of memory and time are required if one wishes to trace any incident or person from a full day's video. A summary generated from multiple views is far more challenging, so more study and advancement in MVS is required. The conceptual basis of video summarization approaches is thoroughly addressed in this paper, which also discusses applications and technology challenges in single-view and multi-view summarization. INDEX TERMS Video summarization survey, video sequence, single view summarization (SVS), multi view summarization (MVS), big data.
A Unified Framework for Video Summarization, Browsing and Retrieval, 2006
In this paper, we present a semantic summarization algorithm that interfaces with metadata and works in the compressed domain, in particular on MPEG-1 and MPEG-2 videos. In enabling a summarization algorithm through high-level semantic content, we address two major problems. First, we present the facility in the DVA system that allows semiautomatic creation of this metadata. Second, we address the main point of the system, which is the use of this metadata to filter out frames, creating an abstract of a video based on a Boolean condition set by the user. Our video summary quality survey indicates that the proposed method performs satisfactorily.
Samriddhi - A Journal of Physical Sciences, Engineering and Technology, 2023
As technology progresses, a gigantic amount of video data is generated day by day. Processing such huge videos takes time and requires increased storage and computational power. It is often more convenient for the user to watch a summary or highlight rather than a complete, time-consuming video. So, a fully automated solution is required to extract important segments from a video. Researchers have proposed multiple approaches and techniques for summarizing videos, which resolve the problem of long videos and summarize them according to the video type. This paper presents a survey and comparative evaluation of video summarization techniques across several domains. Primarily, these methods are classified into categories based on the techniques they use. Then an overview of some of the latest literature is presented, together with the datasets and evaluation approaches used. The review also considers domain-specific directions and concludes by presenting the benefits and difficulties associated with current video summarization techniques.
Multimedia Systems, 2003
Video is increasingly the medium of choice for a variety of communication channels, resulting primarily from increased levels of networked multimedia systems. One way to keep our heads above the video sea is to provide summaries in a more tractable format. Many existing approaches are limited to exploring important low-level feature related units for summarization. Unfortunately, the semantics, content and structure of the video do not correspond to low-level features directly, even with closed-captions, scene detection, and audio signal processing. The drawbacks of existing methods are the following:
Automatic video summarization is indispensable for fast browsing and efficient management of large video libraries. In this paper, we introduce an image feature that we refer to as the heterogeneity image patch (HIP) index. The proposed HIP index provides a new entropy-based measure of the heterogeneity of patches within any picture. By evaluating this index for every frame in a video sequence, we generate a HIP curve for that sequence. We exploit the HIP curve in solving two categories of video summarization applications: key frame extraction and dynamic video skimming. Under the key frame extraction framework, a set of candidate key frames is selected from abundant video frames based on the HIP curve. Then, a proposed patch-based image dissimilarity measure is used to create an affinity matrix of these candidates. Finally, a set of key frames is extracted from the affinity matrix using a min-max-based algorithm. Under video skimming, we propose a method to measure the distance between a video and its skimmed representation. The video skimming problem is then mapped into an optimization framework and solved by minimizing a HIP-based distance for a set of extracted excerpts. The HIP framework is pixel-based and does not require semantic information or complex camera motion estimation. Our simulation results are based on experiments performed on consumer videos and are compared with state-of-the-art methods. It is shown that the HIP approach outperforms other leading methods, while maintaining low complexity.
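The core idea of an entropy-based per-frame heterogeneity score can be sketched in the spirit of the HIP index, though this is not the authors' exact formulation: compute the Shannon entropy of the intensity histogram within each non-overlapping patch and average over the frame. The patch size and bin count below are illustrative assumptions.

```python
import numpy as np

def patch_entropy(frame, patch=8, bins=16):
    """Illustrative heterogeneity score in the spirit of the HIP
    index (not the paper's exact definition): the mean Shannon
    entropy of intensity histograms over non-overlapping patches
    of a 2-D grayscale frame with values in [0, 255]."""
    h, w = frame.shape
    ents = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            hist, _ = np.histogram(frame[r:r + patch, c:c + patch],
                                   bins=bins, range=(0, 256))
            p = hist / hist.sum()
            p = p[p > 0]  # drop empty bins before taking logs
            ents.append(-(p * np.log2(p)).sum())
    return float(np.mean(ents))
```

A flat frame scores zero while a textured frame scores higher, so plotting this value over time yields a curve from which candidate key frames could be selected, as the abstract describes.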
Expert Systems with Applications, 2013
The advances in computer and network infrastructure, together with the fast evolution of multimedia data, have drawn growing attention to digital video. The scientific community has increased the amount of research into new technologies with a view to improving digital video utilization: its archiving, indexing, accessibility, acquisition, storage, and even its processing and usability. All these aspects of video utilization entail extracting all the important information of a video, especially when metadata information is lacking. The main goal of this paper is the construction of a system that automatically generates and provides all the essential information, both in visual and textual form, of a video. Using the visual or the textual information, a user can on the one hand locate a specific video and on the other hand rapidly comprehend its basic points and, generally, its main concept without needing to watch the whole of it. The visual information of the system emanates from a video summarization method, while the textual one derives from a keyword-based video annotation approach. The annotation technique is based on the keyframes that constitute the video abstract; therefore, the first part of the system consists of the new video summarization method. According to the proposed video abstraction technique, each frame of the video is initially described by Compact Composite Descriptors (CCDs) and a visual word histogram. Afterwards, the proposed approach utilizes the Self-Growing and Self-Organized Neural Gas (SGONG) network to classify the frames into clusters. The extraction of a representative key frame from every cluster leads to the generation of the video abstract. The most significant advantage of the video summarization approach is its ability to dynamically calculate the appropriate number of final clusters.
Subsequently, a new video annotation method is applied to the generated video summary, leading to the automatic generation of keywords capable of describing the semantic content of the given video. This approach is based on the recently proposed N-closest Photos Model (NCP). Experimental results on several videos are presented not only to evaluate the proposed system but also to indicate its effectiveness.
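The distinctive claim above, that the number of clusters (and hence key frames) is determined dynamically rather than fixed in advance, can be illustrated with a much simpler stand-in than the SGONG network, which is not reimplemented here: a single online pass that opens a new cluster whenever a frame's feature vector lies farther than a radius from every existing centroid. The `radius` parameter and the first-frame-per-cluster key frame rule are assumptions for this sketch.

```python
import numpy as np

def adaptive_keyframes(frame_feats, radius=1.0):
    """Sketch of dynamic cluster-count selection (a simple stand-in
    for the SGONG network): an online pass opens a new cluster when
    a frame is farther than `radius` from every existing centroid;
    the first frame of each cluster serves as its key frame."""
    centroids, counts, keyframes = [], [], []
    for i, f in enumerate(frame_feats):
        if centroids:
            d = [np.linalg.norm(f - c) for c in centroids]
            j = int(np.argmin(d))
        if not centroids or d[j] > radius:
            centroids.append(f.astype(float))  # open a new cluster
            counts.append(1)
            keyframes.append(i)
        else:
            counts[j] += 1
            centroids[j] += (f - centroids[j]) / counts[j]  # running mean
    return keyframes
```

The number of returned key frames adapts to how varied the footage is, which is the property the abstract highlights as the method's main advantage.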
