Semantic ontologies for multimedia indexing (SOMI)

2014, Library Hi Tech

Abstract

Purpose – The overwhelming speed and scale of digital media production greatly outpace conventional indexing methods by humans. The management of Big Data for e-library speech resources requires an automated metadata solution. The paper aims to discuss these issues.

Design/methodology/approach – A conceptual model called semantic ontologies for multimedia indexing (SOMI) allows for assembly of the speech objects, encapsulation of semantic associations between phonic units and the definition of indexing techniques designed to invoke and maximize the semantic ontologies for indexing. A literature review and architectural overview are followed by evaluation techniques and a conclusion.

Findings – This approach is only possible because of recent innovations in automated speech recognition. The introduction of semantic keyword spotting allows for indexing models that disambiguate and prioritize meaning using probability algorithms within a word confusion network. By the use of AI error-traini...
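The idea of prioritizing keywords by probability within a word confusion network can be illustrated with a minimal sketch. It is not the SOMI implementation, which the abstract does not specify. It assumes a confusion network is a sequence of slots, each holding competing word hypotheses with posterior probabilities, and it indexes every hypothesis weighted by its posterior. The recordings, words, probabilities and the names `build_soft_index`, `search` and `min_posterior` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical confusion networks: one list of slots per recording.
# Each slot maps a competing word hypothesis to its posterior probability.
NETWORKS = {
    "lecture_01": [
        {"semantic": 0.7, "somatic": 0.3},
        {"index": 0.6, "indexed": 0.4},
    ],
    "lecture_02": [
        {"somatic": 0.8, "semantic": 0.2},
        {"cell": 0.9, "sell": 0.1},
    ],
}


def build_soft_index(networks, min_posterior=0.05):
    """Map each word to {recording: summed posterior} (an expected count)."""
    index = defaultdict(dict)
    for doc_id, slots in networks.items():
        for slot in slots:
            for word, posterior in slot.items():
                # Assumed pruning threshold: skip very unlikely hypotheses.
                if posterior >= min_posterior:
                    index[word][doc_id] = index[word].get(doc_id, 0.0) + posterior
    return index


def search(index, keyword):
    """Return (recording, score) pairs, highest expected count first."""
    hits = index.get(keyword, {})
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)


INDEX = build_soft_index(NETWORKS)
RANKED = search(INDEX, "semantic")
```

In this toy data a query for "semantic" ranks `lecture_01` ahead of `lecture_02`, because the recognizer assigned the word a higher posterior there. A recording where the word was only a low-probability alternative is still retrievable, just ranked lower.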
