MediaBlender: Interactive Multimedia Segmentation
2011
Abstract
This project aims to provide segmentation tools that assist the browsing of multimedia content. We implemented three segmentation algorithms: two based on the Bayesian Information Criterion (BIC), one operating frame by frame and the other using a "divide-and-conquer" approach, and a third based on the self-similarity matrix. We integrated the Torch3 library to make its classification algorithms available. We also introduced optimizations, notably parallelized image/video feature extraction distributed over the GPUs and CPUs of a single computer using StarPU. All components developed in this project have been incorporated into the MediaCycle framework.
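As a rough illustration of BIC-based change detection, the sketch below scores every candidate boundary in a one-dimensional feature sequence and keeps the best one if its ΔBIC is positive. It is a minimal example under stated assumptions, not the project's implementation: the features are assumed univariate and Gaussian, and the names `delta_bic`, `best_change_point`, `lam`, and `margin` are hypothetical. Scanning every candidate index is only loosely analogous to the frame-by-frame variant mentioned above.

```python
import math


def variance(xs):
    """Maximum-likelihood variance, floored to avoid log(0)."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), 1e-12)


def delta_bic(xs, t, lam=1.0):
    """Delta BIC for splitting xs at index t (univariate Gaussian models).

    A positive value favours two segments over one. For one-dimensional
    features, the two-model hypothesis adds one mean and one variance,
    so the penalty is lam * log(N).
    """
    n = len(xs)
    left, right = xs[:t], xs[t:]
    gain = 0.5 * (
        n * math.log(variance(xs))
        - len(left) * math.log(variance(left))
        - len(right) * math.log(variance(right))
    )
    penalty = lam * math.log(n)
    return gain - penalty


def best_change_point(xs, margin=2, lam=1.0):
    """Return (index, delta_bic) of the best split, or None if no split helps.

    margin (an assumed parameter) keeps at least that many samples on each side.
    """
    candidates = [
        (delta_bic(xs, t, lam), t) for t in range(margin, len(xs) - margin + 1)
    ]
    if not candidates:
        return None
    score, t = max(candidates)
    return (t, score) if score > 0 else None
```

For a sequence whose mean jumps halfway through, `best_change_point` should return the index of the jump; for a constant sequence the penalty dominates and it returns `None`. A larger `lam` makes the detector more conservative.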