Abstract
Static multimedia on the Web can already hardly be structured manually. Although unavoidable and necessary, manual annotation of dynamic multimedia becomes even less feasible when the multimedia changes quickly in complexity, i.e. in volume, modality, and usage context. The usage context may be set by learning or by other purposes of the multimedia material. These multimedia dynamics call for categorisation systems that index, query and retrieve multimedia objects on the fly in much the same way as a human expert would. We present and demonstrate such a supervised dynamic multimedia object categorisation system. Our categorisation system is obtained by continuously gauging it against a group of human experts who annotate raw multimedia with respect to a certain domain ontology, given a usage context. In effect, our system thus learns the categorisation behaviour of human experts. By inducing supervised multimodal content-dependent and context-dependent potentials, our categorisation system associates field strengths of raw dynamic multimedia object categorisations with those that human experts would assign. After a sufficiently long period of supervised machine learning we arrive at automated, robust and discriminative multimedia categorisation. We demonstrate the usefulness and effectiveness of our multimedia categorisation system in retrieving semantically meaningful soccer-video fragments, in particular by taking advantage of multimodal and domain-specific information and knowledge supplied by human experts.