Academia.edu

Multimedia Retrieval

591 papers
38 followers
About this topic
Multimedia Retrieval is the process of searching, accessing, and retrieving information from various types of media, including text, images, audio, and video. It involves the use of algorithms and techniques to index, query, and manage multimedia content, enabling efficient information retrieval based on user queries and preferences.

Key research themes

1. How can fusion of textual and visual information improve multimedia retrieval performance and semantic understanding?

This research theme investigates approaches that combine textual metadata, natural language queries, and visual features such as color, texture, and high-level semantic concepts to enhance multimedia retrieval accuracy and semantic understanding. It addresses the persistent semantic gap by mapping between low-level visual features and high-level textual or conceptual descriptions, enabling more effective retrieval of relevant multimedia content. This fusion leverages complementary strengths of each modality—text for semantic richness and visual features for specificity.

Key finding: This paper demonstrates that combining text-based query information with visual concept detectors via late fusion significantly improves video retrieval performance on real-world datasets. It finds that automatically mapping...
Key finding: Presents an architecture (HPQS) integrating natural language query interpretation with semantic analysis and content-based retrieval of multimedia (images, tables, text). It exploits data fusion, caching, high-speed...
Key finding: This work extends a multimodal retrieval system by enriching textual features through external query expansion and visual features via logistic regression-based concept detectors. For retrieval, sequential use of textual...
Key finding: Introduces a multimedia retrieval framework that jointly indexes multi-modal content and incorporates a credibility model (expertise, trustworthiness, quality, reliability) to re-rank results. By integrating concept-based...
Key finding: Shows that combining textual and structural features of XML documents using geometric metrics significantly improves multimedia retrieval effectiveness compared to using either modality alone. The approach represents...
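
The late fusion of text-based and visual scores described in this theme can be illustrated with a minimal sketch. It is not the method of any paper listed here: the function names, the min-max normalization, and the weight `alpha` are assumptions chosen for illustration. Each modality scores the documents independently, the scores are rescaled to a common range, and a weighted sum gives the final ranking.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def late_fusion(
    text_scores: Dict[str, float],
    visual_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Fuse per-modality scores with a weighted sum (hypothetical helper).

    alpha is the weight of the text modality; 1 - alpha goes to the
    visual modality. A document missing from one modality scores 0 there.
    """
    text_n = min_max_normalize(text_scores)
    visual_n = min_max_normalize(visual_scores)
    docs = set(text_n) | set(visual_n)
    fused = {
        d: alpha * text_n.get(d, 0.0) + (1.0 - alpha) * visual_n.get(d, 0.0)
        for d in docs
    }
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Normalizing before fusing matters because text and visual scorers typically produce values on unrelated scales. In practice `alpha` would be tuned on held-out queries rather than fixed.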

2. What advancements in feature representation and dimensionality reduction can enhance content-based multimedia retrieval efficiency and effectiveness?

This research theme focuses on novel representations of image and multimedia features, including combining local and global histograms of visual words, and dimensionality reduction techniques such as principal component analysis (PCA) and kernel PCA. Efficient feature extraction and selection improve retrieval scalability and accuracy by reducing high-dimensional data while preserving salient discriminative information. The exploration includes nonlinear dimension reduction and multilinear kernel mapping to better capture complex data structures and enhance retrieval precision.

Key finding: Proposes representing an image by combining global histograms of visual words over the entire image with local histograms computed over salient object regions (local rectangular areas). Experiments on several benchmark...
Key finding: This study applies kernel PCA, a nonlinear extension of PCA, to extract principal components in a high-dimensional feature space induced by Gaussian kernels for image retrieval. Experimental results indicate that kernel PCA...
Key finding: Introduces a multilinear kernel modeling approach to reduce the dimensionality of feature vectors derived from multimedia content. This approach accounts for the interrelation among dataset features more effectively than...
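
The idea of combining a global histogram of visual words with a local histogram over a rectangular region can be sketched as follows. This is a simplified illustration, not the representation used in the paper above. It assumes the keypoints have already been quantized to visual word ids, that a single salient rectangle is given, and that each histogram is L1-normalized before concatenation. The names `Keypoint`, `Rect`, and `global_local_descriptor` are hypothetical.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float, int]        # (x, y, visual_word_id)
Rect = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def word_histogram(words: Sequence[int], vocab_size: int) -> List[float]:
    """L1-normalized histogram of visual word ids (all zeros if empty)."""
    counts = [0.0] * vocab_size
    for w in words:
        counts[w] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def global_local_descriptor(
    keypoints: Sequence[Keypoint], region: Rect, vocab_size: int
) -> List[float]:
    """Concatenate the whole-image histogram with the in-region histogram."""
    x0, y0, x1, y1 = region
    all_words = [w for _, _, w in keypoints]
    local_words = [
        w for x, y, w in keypoints if x0 <= x <= x1 and y0 <= y <= y1
    ]
    return word_histogram(all_words, vocab_size) + word_histogram(
        local_words, vocab_size
    )
```

The resulting vector has twice the vocabulary size. Two images can then be compared with any histogram distance, and a dimensionality reduction step such as PCA could be applied afterwards.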

3. How can structural metadata and query modification techniques address semantic challenges in multimedia retrieval systems?

This theme explores methods leveraging document structure (e.g., XML hierarchies) and interactive query adaptation to improve the retrieval of multimedia content. Techniques include geometric metrics exploiting XML node kinship to calculate relevance of multimedia elements in structured documents, addressing the limited descriptive content of multimedia elements themselves. Additionally, user-centric query modification methods, such as segment-based query refinement and intra-query learning, allow efficient alignment of retrieval systems with subjective human perception, reducing the semantic gap without repeated extensive database searches.

Key finding: Proposes a novel similarity metric based on geometric distances within XML document trees that leverages kinship ties (children, siblings, ancestors) to better assess multimedia element relevance without relying on physical...
Key finding: Introduces an intra-query learning methodology where modified versions of the user query image (generated through segment-level manipulations) are used to infer user perceptual preferences without repeated database searches....
Key finding: Discusses the necessity of image-based querying in retrieval systems, especially for unknown or unfamiliar images, highlighting shortcomings of existing text or shape-based search requiring descriptive metadata. Emphasizes a...
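
One way to make the notion of kinship distance in an XML tree concrete is to count the edges between two nodes through their lowest common ancestor. The sketch below uses that plain tree distance and a simple decaying weight. Both are assumptions for illustration and should not be read as the metric proposed in the paper above. The helper names are hypothetical; the parsing relies on Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def parent_map(root: ET.Element) -> Dict[ET.Element, ET.Element]:
    """Map every element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def ancestors(node: ET.Element, parents: Dict[ET.Element, ET.Element]) -> List[ET.Element]:
    """Return the chain [node, parent, grandparent, ..., root]."""
    chain = [node]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


def tree_distance(a: ET.Element, b: ET.Element,
                  parents: Dict[ET.Element, ET.Element]) -> int:
    """Number of edges between a and b via their lowest common ancestor."""
    steps_from_a = {id(n): i for i, n in enumerate(ancestors(a, parents))}
    for j, n in enumerate(ancestors(b, parents)):
        if id(n) in steps_from_a:
            return steps_from_a[id(n)] + j
    raise ValueError("nodes do not belong to the same tree")


def kinship_weight(distance: int) -> float:
    """Assumed decay: closer relatives contribute more to relevance."""
    return 1.0 / (1.0 + distance)
```

With this sketch, a sibling title of an image element is at distance 2 and weighs more than a paragraph in another section at distance 4. The text of nearby nodes can therefore stand in for the missing description of the multimedia element itself.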

All papers in Multimedia Retrieval

The SCHEMA Reference System is a content-based image and video indexing and retrieval system that adopts a module-based, expandable architecture. Using this module-based approach, five different analysis modules, developed at different...
In this paper, the most recent version of the system developed by the SCHEMA NoE, termed SCHEMA Reference System, is presented. The Reference System adopts a module-based, expandable architecture, with well-defined interfaces between...
This paper presents a system that is designed to make possible the organization and search within the collected digitized material of intangible cultural heritage. The motivation for building the system was a vast quantity of multimedia...
With the explosive spread of multimedia (text documents, images, video, etc.) in our lives, how to annotate, search, index, browse, and relate various forms of information efficiently becomes more and more important. Combining these...
Retrieval of multimedia has become a requirement for many contemporary information systems. These systems need to provide browsing, querying, navigation, and, sometimes, composition capabilities involving various forms of media. In this...
Relevance feedback techniques are designed to automatically improve a system's representation of a query by using documents the user has marked as relevant. However, traditional relevance feedback models suffer from a number of...
In our participation in the ImageCLEF 2009 Photo Retrieval task we pursued two objectives: first, to re-evaluate MultiModal Local Context Analysis (MMLCA), our multimodal fusion technique; second, to evaluate a new subquery generation...
Overlay text brings important semantic clues in video content analysis such as video information retrieval and summarization, since the content of the scene or the editor's intention can be well represented by using inserted text. Most of...
We propose in this paper a novel multimodal approach to automatically predict the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighted Late Fusion (SWLF) scheme which, in...
Association rule mining (ARM) has been studied in the areas of content-based multimedia retrieval and semantic concept detection due to its high efficiency and accuracy. Two important processes in mining the association rules for...
This paper presents an overview and comparative analysis of our systems designed for the TRECVID 2015 [1] multimedia event detection (MED) task. We submitted 17 runs, of which 5 each for the zero-example, 10-example and 100-example...
The concept of “approximate” searching has applications in a vast number of fields. Some examples are non-traditional databases (e.g. storing images, fingerprints or audio clips, where the concept of exact search is of no use and we search...
This paper introduces the reader to the approach we are taking to develop an ontology that could be used to represent the knowledge inherent in filmed materials. Such an ontology could be used as the semantic basis for multimedia...
This paper presents an experimental framework for the Placing tasks, both estimation and verification, at MediaEval Benchmarking 2016. The proposed framework provides results for four runs: first, using metadata (such as user tags and title...
This paper demonstrates in formal terms that Galilean preinertia is a universal property of all physical objects which allows us to demonstrate, also in formal terms, the impossibility of detecting absolute motion and that motion can only...
Subspace selection is a powerful tool in data mining. An important subspace method is the Fisher-Rao linear discriminant analysis (LDA), which has been successfully applied in many fields such as biometrics, bioinformatics, and multimedia...
After recalling the Galilean origin of the concept of preinertia, and recalling the concept itself (still unknown to contemporary physics), the article explains the enormous empirical evidence of the concept, as well as the experimental...
Embedded Media Barcode Links, or simply EMBLs, are optimally blended iconic barcode marks, printed on paper documents, that signify the existence of multimedia associated with that part of the document content (Figure ). EMBLs are used...
The aim of this presentation is to review some of the standards connected with multimedia and their metadata. We start with the MPEG family and continue with Open Standards for Interactive TV. Efficient video-streaming is presented. Some...
This paper presents an I/O efficient algorithm for the graph pattern matching problem. It is based on the decision tree approach proposed by B. T. Messmer and H. Bunke. In that paper, if the time needed for preprocessing is neglected, the...
The use of the join operator in metric spaces leads to what is known as a similarity join, where objects of two datasets are paired if they are somehow similar. We propose a heuristic that solves the 1-NN self-similarity join, that is, a...
This blog post introduces a baby spin structure for 3D space.
The main goal of this paper is to present our experiments in the ImageCLEF 2011 campaign (Medical Retrieval Task). In this edition we use textual and visual information, based on the assumption that the textual module better captures the...
Video is a massive amount of data that contains complex interactions between moving objects. The extraction of knowledge from this type of information creates a demand for video analytics systems that uncover statistical relationships...
Users are generally interested in the top-ranked section of the returned search results, according to an analysis of click-through data from a very large search engine log. As a result, search engines must achieve great accuracy with...
Let K be a field of characteristic 2 and G a nonabelian locally finite 2-group. Let V(KG) be the group of units with augmentation 1 in the group algebra KG. An explicit list of groups is given, and it is proved that all involutions in V...
The indexing and retrieval of multimedia items is difficult due to the semantic gap between the user's perception of the data and the descriptions we can derive automatically from the data using computer vision, speech recognition, and...
This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall in the...
XML processing models and respective languages do not reflect the separation of concerns needed in complex XML applications maintenance. As content complexity grows, there is a need for higher abstraction levels on XML...
A fundamental problem in image retrieval is how to improve the text-based retrieval systems, which is known as "bridging the semantic gap". The reliance on visual similarity for judging semantic similarity may be problematic due to the...
The paper presents the Argos evaluation campaign of video content analysis tools supported by the French Techno-Vision program. This project aims at developing the resources of a benchmark of content analysis methods and algorithms. The...
This paper presents an overview of the ImageCLEF 2019 lab, organized as part of the Conference and Labs of the Evaluation Forum-CLEF Labs 2019. ImageCLEF is an ongoing evaluation initiative (started in 2003) that promotes the evaluation...
In this work, we outline the submissions of the Dublin City University (DCU) team, the organisers, to the NTCIR-13 Lifelog-2 Task. We submitted runs to the Lifelog Semantics Access (LSAT) and the Lifelog Insight (LIT) sub-tasks.
In today's world, e-learning is one of the popular modes of learning, and video lectures are prominent in keeping learners engaged with a course. The Internet makes it possible to keep a large number of video lectures online. To search for a required...
This work pertains to the diversified ranking of web resources and interconnected documents that rely on a network-like structure, e.g. web pages. A practical example of this would be a query for the k most relevant web pages that...
XML has become a standard format in information exchange and integration. Database support of persistent data storage and query capability is often desired for many XML...
Content-based image retrieval systems were introduced as an alternative to avoid the need for manual tagging in traditional keyword-based image retrieval systems. However, the representation of images using visual features only involves a...
The number of Internet users is increasing day by day, and the amount of data increases with it, so users expect fast responses. Most researchers are therefore working on relevant information retrieval. The proposed work focuses...
YOLUM, PINAR. Properties of Referral Networks: Emergence of Authority and Trust (Under the direction of Munindar P. Singh). Developing, maintaining, and disseminating trust in open environments is crucial. We develop a decentralized...
Since 3D models are becoming more popular, the need for effective methods capable of retrieving 3D models is becoming crucial. Current methods require an example 3D model as query. However, in many cases, such a query is not easy to get....
In the field of multimedia retrieval in video, text frame classification is essential for text detection, event detection, event boundary detection, etc. We propose a new text frame classification method that introduces a combination of...
Recommender systems have been systematically applied in industry and academia to help users cope with information uncertainty. However, given the multiplicity of preferences and needs, it has been shown that no approach is suitable for...
This paper tries to solve the problem of query generation from multiple media examples, e.g. images, by exploiting a media document representation called feature terms. A feature term denotes a continuous interval of a media feature. This...
Although IP-multicast has been proposed and investigated for years, there are major problems inherent in the IP-multicasting technique, e.g., difficulty in scaling up the system, difficulty in allocating a globally unique multicast address,...
Research on multimedia information retrieval (MIR) has recently witnessed a booming interest. A prominent feature of this research trend is its simultaneous but independent materialization within several fields of computer science. The...