An Image Retrieval System for Video

2019, Similarity Search and Applications 12th International Conference, SISAP 2019

https://doi.org/10.1007/978-3-030-32047-8_29

Abstract

Content-Based Image Indexing and Retrieval (CBIR) has been an active research area since the 1970s. More recently, the rapid growth of video data has driven many research communities to advance technologies for Content-Based Video Indexing and Retrieval (CBVIR). However, greater attention needs to be devoted to the development of effective tools for video search and browsing. In this paper, we present Visione, a system for large-scale video retrieval. The system integrates several content-based analysis and retrieval modules, including a keyword search, a spatial object-based search, and a visual similarity search. In tests where users had to find as many correct examples as possible, the similarity search proved to be the most promising option. Our implementation is based on state-of-the-art deep learning approaches for content analysis and leverages highly efficient indexing techniques to ensure scalability. Specifically, we encode all the visual and textual descriptors extracted from the videos into (surrogate) textual representations, which are then efficiently indexed and searched by an off-the-shelf text search engine using similarity functions.
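
The surrogate-text idea can be illustrated with a minimal sketch. The abstract does not specify the encoding, so everything below is an assumption made for illustration: each component of a non-negative feature vector becomes a synthetic term (`f0`, `f1`, ...) repeated in proportion to its value, and a plain term-frequency dot product stands in for the text engine's similarity function. The function names, the `scale` parameter, and the toy frame vectors are hypothetical and are not taken from Visione.

```python
from collections import Counter


def to_surrogate_text(vector, scale=10, prefix="f"):
    """Encode a non-negative feature vector as repeated synthetic terms.

    Component i with value v yields round(v * scale) copies of the term
    "<prefix><i>", so term frequency approximates the component value.
    (Assumed encoding, for illustration only.)
    """
    terms = []
    for i, value in enumerate(vector):
        repeats = int(round(max(value, 0.0) * scale))
        terms.extend([f"{prefix}{i}"] * repeats)
    return " ".join(terms)


def term_frequency_score(query_text, doc_text):
    """Dot product of term-frequency vectors; a stand-in for the
    similarity function a text search engine would apply."""
    q = Counter(query_text.split())
    d = Counter(doc_text.split())
    return sum(count * d[term] for term, count in q.items())


def search(query_vector, index, top_k=3):
    """Rank indexed surrogate texts against a query vector."""
    query_text = to_surrogate_text(query_vector)
    scored = [(term_frequency_score(query_text, text), name)
              for name, text in index.items()]
    # Highest score first; ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


# Toy "video frames" with hypothetical 3-dimensional descriptors.
frames = {
    "frame_a": [0.9, 0.1, 0.0],
    "frame_b": [0.0, 0.2, 0.8],
}
index = {name: to_surrogate_text(vec) for name, vec in frames.items()}

if __name__ == "__main__":
    print(search([0.8, 0.2, 0.0], index))
```

Because the term frequencies are quantized feature values, the term-frequency dot product approximates the inner product of the original vectors. This is what would let a standard inverted index serve similarity queries without a dedicated vector index.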
