An improved sub-optimal video summarization algorithm
2010
Abstract
During the last few years, the amount of digital video content has been increasing exponentially as a result of the proliferation of media sources such as digital TV, video-streaming sites like YouTube, and the wider availability of digital video cameras. The volume of video data is so large that the only practical way for a user to browse these libraries is through time-condensation techniques. Video summarization achieves time-condensation by choosing a subset of frames of the original video, creating a summary that is, ideally, representative of the source video. The frame-selection process can be directed according to different principles, based on either subjective or objective frame-relevance measures. Previous works have used dynamic programming (DP) and greedy approaches to choose the frames that make up the video summary. We present an algorithm that outperforms the greedy solution while retaining comparable simplicity.
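The greedy baseline the abstract compares against can be sketched as follows. This is an illustrative assumption of how such a selector typically works (pick one frame at a time, trading relevance against redundancy with frames already chosen), not the paper's exact algorithm; the names `relevance`, `similarity`, and `budget` are invented for the sketch.

```python
# Hypothetical greedy frame selection: repeatedly take the frame with the
# highest relevance score, penalised by similarity to frames already chosen.
# All names and the penalty weight are illustrative, not the paper's notation.

def greedy_summary(relevance, similarity, budget, redundancy_penalty=0.5):
    """relevance[i]: score of frame i; similarity(i, j): pairwise similarity."""
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < budget:
        def marginal(i):
            redundancy = max((similarity(i, j) for j in selected), default=0.0)
            return relevance[i] - redundancy_penalty * redundancy
        best = max(candidates, key=marginal)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)

# Toy example: 6 frames, adjacent frames are considered similar.
scores = [0.9, 0.8, 0.2, 0.7, 0.6, 0.95]
sim = lambda i, j: 1.0 if abs(i - j) == 1 else 0.0
print(greedy_summary(scores, sim, budget=3))  # → [0, 3, 5]
```

Each pass costs O(N · |summary|), which is why greedy selection is simple but only sub-optimal compared with a DP search over all frame subsets.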
Related papers
Video summarization work originates from a viewing-time constraint: a shorter version of the original video sequence is desirable in some applications. Our work is based on visual-significance analysis, where visual significance is a function of the visual features of interest over time. Once the frames in a sequence are labeled with visual significance, one-pass or two-pass frame-selection algorithms are proposed to generate a perceptually optimal video summary according to the visual-significance function.
Communications in Computer and Information Science, 2011
Video summarization is a procedure for reducing the size of the original video without affecting the vital information it presents. This paper presents an innovative video summarization technique based on inter-frame information variation. Similar groups of frames are identified based on inter-frame information similarity. The key frames of a group are selected using the disturbance ratio (DR), which is derived by measuring the ratio of information changes between consecutive frames of a group. The frames in the summarized video are selected so as to preserve continuity in understanding the message carried by the video. Higher priority is given to frames with higher information changes, and repetition is avoided to reduce redundancy in the summarized video. The higher information changes in the video frames are detected from the DR measure of the group, which makes the algorithm adaptive to the information content of the source video. The results show the effectiveness of the proposed technique compared to related research works.
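A minimal sketch of key-frame selection by inter-frame change, in the spirit of the disturbance-ratio idea above. The exact DR formula is not given in this abstract, so the sketch assumes a frame's DR is its change relative to the next frame, normalised by the group's mean change; `frame_diff`, `key_frame`, and the toy frames are all invented for illustration.

```python
# Assumed DR-style selection: within a group, pick the frame whose change to
# its successor, normalised by the group's mean change, is largest.

def frame_diff(a, b):
    """Mean absolute difference between two flattened intensity vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def key_frame(group):
    """Index (within the group) of the frame with the highest normalised
    change to its successor."""
    diffs = [frame_diff(group[i], group[i + 1]) for i in range(len(group) - 1)]
    mean = sum(diffs) / len(diffs)
    dr = [d / mean if mean else 0.0 for d in diffs]
    return max(range(len(dr)), key=dr.__getitem__)

# Toy "frames" of 4 pixels each; the big jump happens between frames 1 and 2.
frames = [[10, 10, 10, 10], [11, 10, 10, 10], [50, 60, 55, 52], [51, 60, 55, 52]]
print(key_frame(frames))  # → 1
```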
2004
The need for video summarization originates primarily from a viewing-time constraint. A shorter version of the original video sequence is desirable in a number of applications. Clearly, a shorter version is also necessary in applications where storage, communication bandwidth and/or power are limited. Our work is based on a temporal rate-distortion optimization formulation for optimal summary generation. New metrics for video summary distortion are introduced.
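A temporal rate-distortion formulation of this kind can be sketched in generic notation (the symbols below are illustrative, not the paper's exact metrics):

```latex
% Illustrative temporal rate-distortion summary formulation:
% choose a summary S of at most K frames minimising total distortion.
\min_{S \subseteq \{1,\dots,N\},\; |S| \le K} D(V, S),
\qquad
D(V, S) = \sum_{t=1}^{N} \min_{s \in S} d(f_t, f_s)
```

where $f_t$ denotes frame $t$ of the source video $V$, $d(\cdot,\cdot)$ is a frame-distortion measure, and the budget $K$ plays the role of the "rate" imposed by the viewing-time constraint.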
Proceedings of the international workshop on TRECVID video summarization - TVS '07, 2007
This paper describes a system for selecting excerpts from unedited video and presenting them in a short summary video for efficient understanding of the video content. Color and motion features are used to divide the video into segments where the color distribution and camera motion are similar. Segments with and without camera motion are clustered separately to identify redundant video. Audio features are used to identify clapboard appearances for exclusion. Representative segments from each cluster are selected for presentation. To increase the amount of original material contained within the summary and reduce the time required to view it, selected segments are played back at a higher rate based on the amount of camera motion detected in the segment. Pitch-preserving audio processing is used to better capture the sense of the original audio. Metadata about each segment is overlaid on the summary to help the viewer understand the context of the summary segments in the original video.
Proceedings of the 15th ACM international conference on Multimedia, 2007
International Journal of Image, Graphics and Signal Processing, 2016
To select among long-running videos in online archives and other collections, users would like to browse or skim through them quickly to get a hint of their semantic content. Video summarization addresses this problem by providing a short summary of a full-length video. An ideal video summary includes all the important segments of the video while remaining short. The problem of summarization is extremely challenging and has been a widely pursued subject of recent research. Many algorithms presented in the literature represent the visual information of a video in concise form. Dynamic summaries are constructed from a collection of key frames or short segments extracted from the video and are presented in the form of a small video clip. This paper describes an algorithm for constructing the dynamic summary of a video by modeling every 40 consecutive frames as a bipartite graph: the first 20 consecutive frames form one node set and the next 20 the other, with frames as nodes, edges connecting nodes denoting relations between frames, and edge weights representing the mutual information between frames. The minimum-edge-weight maximal matching in every bipartite graph (a set of pairwise non-adjacent edges) is then found using the Hungarian method. The frames from the matchings whose nodes are connected by edges with weight below an empirically defined threshold, together with their two neighbor frames, are taken as representative frames to construct the summary. Experiments conducted on a dataset containing sports videos taken from YouTube and videos from the TRECVID MED 2011 dataset demonstrated satisfactory average values of the performance parameters, namely an Informativeness value of 94% and a Satisfaction value of 92%.
These values and the duration (MSD) of the summaries reveal that the constructed summaries are significantly concise and highly informative, providing a highly acceptable dynamic summary of the videos.
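The matching step above can be sketched as follows. The Hungarian method is replaced here by a brute-force minimum-weight perfect matching, which gives the same result on tiny inputs; the weight matrix and threshold are illustrative assumptions, not values from the paper.

```python
# Sketch of the bipartite-matching step: left nodes are one window of frames,
# right nodes the next window, weight[i][j] an (assumed) dissimilarity between
# frame i and frame j. Brute force stands in for the Hungarian method.
from itertools import permutations

def min_weight_matching(weight):
    """weight[i][j]: cost of matching left node i to right node j."""
    n = len(weight)
    best = min(permutations(range(n)),
               key=lambda p: sum(weight[i][p[i]] for i in range(n)))
    return [(i, best[i]) for i in range(n)]

def representative_frames(weight, threshold):
    """Left-side frames whose matched edge weight falls below the threshold
    are kept as representatives, per the selection rule sketched above."""
    return [i for i, j in min_weight_matching(weight) if weight[i][j] < threshold]

w = [[0.9, 0.2, 0.8],
     [0.1, 0.7, 0.6],
     [0.5, 0.9, 0.3]]
print(min_weight_matching(w))          # → [(0, 1), (1, 0), (2, 2)]
print(representative_frames(w, 0.25))  # → [0, 1]
```

The Hungarian method computes the same optimum in O(n³) rather than O(n!), which matters for the 20×20 graphs the paper describes.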
International Journal of Innovative Computing
Video summarization has arisen as a method that can help with efficient storage, rapid browsing, indexing, fast retrieval, and quick sharing of video material. The amount of video data created has grown exponentially over time: huge amounts of video are produced continuously by a large number of cameras, and processing them requires a lot of time, labor, and storage hardware. In this situation, video summarization is crucial. The architecture of video summarization demonstrates how a lengthy video may be broken down into shorter, story-like segments. Numerous studies have been conducted in the past and continue now; as a result, several approaches and methods, from traditional computer vision to more modern deep learning, have been offered by researchers. However, several issues make video summarization difficult, including computational hardware, complexity, and a lack of datasets. Many researchers have recently concentrated their research efforts on ...
International Journal of Scientific Research in Science and Technology, 2021
In the modern era, a massive amount of multimedia data is analysed, browsed, and retrieved, slowing down delivery and increasing computation costs. Video summarization has emerged as a way to process all of a video's information in the shortest amount of time, allowing users to browse large amounts of data quickly. It is the method of extracting key frames or video skims to create a summarized or abstract view of an entire video while removing duplicate or redundant content. This paper focuses on the two categories of approaches for producing such a summary: static and dynamic. With the rapid advancement of digital video technology, it is now possible to upload large videos to YouTube and other websites, as well as to record massive amounts of data such as news, sports, lecture, and surveillance videos. Video storage, transfer, and processing take a significant amount of time. The user may not have enough time to watch a video before downloading it, or may require a quick and precise video search result. In such cases, a highlight or summary of the video speeds up search and indexing operations, and the user can view it before downloading the full video.
International journal of engineering research and technology, 2020
Video summarization, which gives a short and precise representation of the original video by showing its most representative synopsis, is gaining increasing attention. Its main objective is to provide a clear analysis of the video by removing redundant content and extracting key-frame content from the video. The architecture of video summarization shows how a large video is skimmed into short, story-like content. Much research has been done in the past and is ongoing; therefore, multiple methods and techniques have been proposed, from classical computer vision to recent deep learning approaches. Most of the literature shows that video generation and summarization approaches are shifting toward deep generative models and variational autoencoders. These techniques may fall into supervised, unsupervised, and deep reinforcement learning approaches. Video representation is categorized into static and dynamic summarization. But video summarization remains challenging: computational devices, complexity, and lack of datasets are some of the problems. Effective implementations of video summarization are applied in different real-world scenarios, such as movie trailers in the film industry, highlights in football (soccer), and anomaly detection in video surveillance systems.
Multimedia Systems, 2004
In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide the users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since video shots with high similarities do not necessarily belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from the supergroup into each video group. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed, from the clustered scenes and video scenes through video groups down to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.
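The shot-merging step above can be sketched as a threshold merge over an affinity matrix. The cosine affinity, the threshold value, and the merge-into-first-matching-cluster rule are illustrative assumptions; the paper's actual clustering is more elaborate.

```python
# Assumed sketch of affinity-based shot clustering: compute pairwise cosine
# affinity over shot feature vectors and greedily merge each shot into the
# first cluster whose representative is similar enough.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cluster_shots(features, threshold=0.95):
    clusters = []
    for i, f in enumerate(features):
        for cluster in clusters:
            # Merge into the first cluster whose first member is similar enough.
            if cosine(features[cluster[0]], f) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy shot features: shots 0-1 are near-identical, as are shots 2-3.
shots = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]]
print(cluster_shots(shots))  # → [[0, 1], [2, 3]]
```

The resulting clusters correspond to the "supergroups" that the temporal-merging and scene-detection stages then refine.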
