A Survey on Multimodal Video Representation for Semantic Retrieval
2005
https://doi.org/10.1109/EURCON.2005.1629877
Abstract
This paper surveys approaches to video representation, focusing on semantic analysis for content-based indexing and retrieval. The problem of adaptive representation of digital multimedia is critically assessed and some novel ideas are presented. Furthermore, the concept of video multimodality is reevaluated and redefined in order to introduce modalities such as editing technique and the affect induced in the audience.