Multimedia Annotation

35 papers

2 followers

About this topic

Multimedia Annotation is the process of adding descriptive metadata to various forms of media, such as text, images, audio, and video, to enhance their accessibility, usability, and searchability. This practice facilitates the organization, retrieval, and understanding of multimedia content in digital environments.


Key research themes

1. How can machine learning and NLP improve the quality and semantic consistency of textual annotations in multilingual multimedia archives?

This research area focuses on applying advanced machine learning (ML), natural language processing (NLP), and deep learning techniques to automatically enhance the quality, harmonization, and semantic coherence of textual annotations (e.g., keywords and tags) linked to multimedia content, especially in multilingual and heterogeneous digital libraries. Improving annotation quality aids effective search, navigation, and visualization of large multimedia repositories and addresses challenges such as language identification, spelling correction, semantic similarity, and term specialization.

Methods, Models and Tools for Improving the Quality of Textual Annotations

by Maria Teresa Artese

2023, Modelling

Key finding: This work develops an integrated pipeline combining supervised and unsupervised machine learning and deep learning techniques—including automatic language detection, spelling error identification and correction, and word... Read more
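As a rough illustration of one step named above (spelling correction of tags), the following is a minimal sketch that snaps a noisy tag to the closest entry of a known vocabulary using Python's standard `difflib`. It is not the paper's method: the function name `normalize_tag`, the sample vocabulary, and the 0.8 similarity cutoff are all assumptions chosen for illustration.

```python
from difflib import get_close_matches


def normalize_tag(tag, vocabulary, cutoff=0.8):
    """Return the vocabulary term closest to `tag`, or the cleaned tag itself.

    `vocabulary` is a hypothetical list of accepted lowercase terms;
    `cutoff` is an assumed similarity threshold between 0 and 1.
    """
    # Lowercase, trim, and collapse internal whitespace.
    cleaned = " ".join(tag.lower().split())
    if cleaned in vocabulary:
        return cleaned
    # Ask difflib for the single best candidate above the cutoff.
    matches = get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned


vocabulary = ["photography", "painting"]
corrected = normalize_tag("Photografy", vocabulary)
unchanged = normalize_tag("sculpture", vocabulary)
```

A string-similarity lookup like this only handles near-miss spellings within one language; language detection and semantic similarity would need separate components.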


Automated metadata annotation: What is and is not possible with machine learning

by Joaquim López

2023, Data Intelligence

Key finding: This paper critically examines machine learning’s (ML) capabilities and limitations for automated descriptive metadata annotation in cultural heritage and scholarly collections, highlighting the scarcity of large,... Read more


A Survey On Image and video Metadata using AI and Image Processing

by International Journal of Scientific Research in Computer Science, Engineering and Information Technology IJSRCSEIT

2023, International Journal of Scientific Research in Computer Science, Engineering and Information Technology

Key finding: This survey analyses AI and image processing methods for automatic metadata generation in the context of unstructured multimedia data such as video lectures. It experimentally evaluates three summarization algorithms... Read more


2. What are effective approaches to multimedia annotation that enhance collaborative reasoning and decision-making in distributed virtual and educational environments?

This research theme explores designing and evaluating multimedia annotation systems that enrich collaborative decision-making and reflective learning processes in virtual environments (VEs) and educational settings. It emphasizes multimodal annotations (audio, text, sketches, video-synchronized camera movements) combined with structured argumentation trees and shared tag vocabularies to capture provenance, facilitate asynchronous discussions, and promote critical thinking among practitioners and students. These approaches support geographically distributed teams or learners engaging with complex multimedia artifacts and professional contexts.

Beyond Post-It: Structured Multimedia Annotations for Collaborative VEs

by Joaquim Jorge

2024

Key finding: Introducing a rich multimedia annotation framework embedding audio, sketches, synchronized camera movements, and structured argumentation trees, this paper shows how annotations in collaborative virtual engineering... Read more


Multimedia Annotations for Practical Collaborative Reasoning

by Manuel Cebrian de la Serna

2025, Journal of New Approaches in Educational Research

Key finding: This experimental study involving 274 undergraduate students assesses how multimedia annotations combined with folksonomy tag strategies (broad vs. narrow tags) influence the critical and reflective quality of student... Read more


Investigating Perceptions of a Location-Based Annotation System

by Tân Phạm

2024, Lecture Notes in Computer Science

Key finding: Evaluating MobiTOP, a hierarchical, multimedia-rich, web-based location annotation system, this usability study finds positive user acceptance of features enabling hierarchical annotation creation, sharing and browsing of... Read more


3. How can user-centered methodologies and tools facilitate effective manual or semi-automatic multimedia annotation integrating semantic web technologies and user expertise?

Given that fully automated semantic annotation remains inadequate for complex multimedia, this theme examines user-centered frameworks, methodologies, and tools that assist annotators—including non-expert users—in manually or semi-automatically creating ontology-based, multimedia annotations. It considers approaches that lower barriers to ontology navigation and extension, synchronize structured annotations with multimedia playback, and enable rich interaction with multimedia fragments. These methods are designed to produce precise, interoperable annotations while bridging the semantic gap through collaborative user involvement.

A user centered annotation methodology for multimedia content

by Christian Ammendola

2015, Poster Proceedings of ESWC

Key finding: The paper proposes the SA (Selection and Addition) methodology that supports non-expert users in ontology-based multimedia annotation by semantically retrieving relevant ontology elements and allowing in-situ extension of... Read more


Synchronised Annotation of Multimedia

by Mike Wald

2013

Key finding: Presenting the Synote system, this work introduces a web-based platform that enables users to create fine-grained synchronized multimedia annotations—termed synmarks and synnotations—that link notes, tags, bookmarks, and... Read more


The LEMO annotation framework: weaving multimedia annotations with the web

by Bernhard Haslhofer

2011, International Journal on …

Key finding: This paper introduces the LEMO Annotation Framework, a standards-based, uniform model that supports interoperable multimedia annotations across diverse content types with support for fragment addressing and web... Read more


Multimedia information technology and the annotation of video

by Marcel Worring

2015

Key finding: This foundational survey analyzes the challenges and state-of-the-art technologies in video annotation, emphasizing the semantic gap between raw multimedia data and meaningful metadata. It advocates for hybrid man-machine... Read more


All papers in Multimedia Annotation

Multimedia database of the cultural heritage of the Balkans

by Gordana Pavlović-Lažetić

2025

This paper presents a system designed to support organizing and searching the collected digitized material of intangible cultural heritage. The motivation for building the system was a vast quantity of multimedia... more


Multimedia Annotations for Practical Collaborative Reasoning

by Manuel Cebrian de la Serna

2025, Journal of New Approaches in Educational Research

University education requires students to be trained both at university and at external internship centres. Because of Covid-19, the availability of multimedia resources and examples of practical contexts has become vital. Multimedia... more


Investigating Perceptions of a Location-Based Annotation System

by Tân Phạm

2024, Lecture Notes in Computer Science

We introduce MobiTOP, a Web-based system for organizing and retrieving hierarchical location-based annotations. Each annotation contains multimedia content (such as text, images, video) associated with a location, and users are able to... more


FilmEd - Collaborative video indexing, annotation and discussion tools over broadband networks

by Ronald Schroeter

2024

A number of research groups and software companies have developed digital annotation tools for textual documents, web pages, images, audio and video resources. By annotations we mean subjective comments, notes, explanations or external... more


Using the Semantic Grid to Build Bridges Between Museums and Indigenous Communities

by Ronald Schroeter

2024, Working Group (APPS- …

Jane Hunter, Ronald Schroeter, Bevan Koopman, and Michael Henderson. DSTC, University of Queensland, Brisbane, Australia 4072. {jane, ronalds, bevank,... more


A robust graph-based semi-supervised sparse feature selection method

by Razieh Sheikhpour

2024, Information Sciences

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will... more


Investigating Perceptions of a Location-Based Annotation System

by Thị Thêm Kim

2024, Lecture Notes in Computer Science


Tangible needle, digital haystack

by F. Rocca

2023, Proceedings of the 8th International Conference on Tangible, Embedded and Embodied Interaction

This paper presents the design process of a desk-set tangible user interface for the navigation and manipulation of media content organized by content-based similarity with off-the-shelf/flea market devices. For intra-media navigation, a... more


SensoryX ’21 Workshop on Multisensory Experiences at ACM IMX ’21

by George Ghinea

2023

Multisensory experiences have been increasingly undertaken in the digital world. With the emerging interest in immersive applications (i.e. 360 videos and virtual reality), more and more researchers and practitioners are in pursuit of... more


SensoryX ’21 Workshop on Multisensory Experiences at ACM IMX ’21

by Estêvão B Saleme

2023


Multitasking with Play Write, a Mobile Microproductivity Writing Tool

by Shamsi Iqbal

2023

Mobile devices offer people the opportunity to get useful tasks done during time previously thought to be unusable. Because mobile devices have small screens and are often used in divided attention scenarios, people are limited to using... more


Integrating Crowdsourcing and Human Computation for Complex Video Annotation Tasks

by marcello amorim

2023

Video annotation is an activity that aims to supplement this type of multimedia object with additional content or information about its context, nature, content, quality and other aspects. These annotations are the basis for building a... more


Crowdsourcing authoring of sensory effects on videos

by marcello amorim

2023, Multimedia Tools and Applications

Human perception is inherently multi-sensorial involving five traditional senses: sight, hearing, touch, taste, and smell. In contrast to traditional multimedia, based on audio and visual stimuli, mulsemedia seek to stimulate all the... more


Calendar.Help: Designing a Workflow-Based Scheduling Agent with Humans in the Loop

by Rafal Kocielnik

2023, Social Science Research Network

Although information workers may complain about meetings, they are an essential part of their work life. Consequently, busy people spend a significant amount of time scheduling meetings. We present Calendar.help, a system that provides... more


Crowdsourcing authoring of sensory effects on videos

by Estêvão B Saleme

2023, Multimedia Tools and Applications


Serbia Forum - Digital Cultural Heritage Portal

by Milan Todorović

2023, Lecture Notes in Computer Science

Serbia-Forum is a web application portal designed and implemented by the Mathematical Institute of the Serbian Academy of Sciences and Arts (MISANU) whose goal is to make digitally available many units of cultural heritage belonging to... more


Investigating Perceptions of a Location-Based Annotation System

by Thi Lại Kim

2023, Lecture Notes in Computer Science


Investigating Perceptions of a Location-Based Annotation System

by Thi Bùi Kim

2023, Lecture Notes in Computer Science


Multimedia database of the cultural heritage of the Balkans

by biljana sikimic

2023


Integrating Crowdsourcing and Human Computation for Complex Video Annotation Tasks

by Marcello de Amorim

2023


Sensorial Information Extraction and Mapping to Generate Temperature Sensory Effects

by Yong Soo Joo

2022, ETRI Journal

In this paper, a method to extract temperature effect information using the color temperatures of video scenes with mapping to temperature effects is proposed to author temperature effects of multiple sensorial media content... more


Strengthening Students’ Academic Achievement in English at Secondary School Level in Kohat Division

by Sadia Ambreen

2022

The purpose of this research study was to explore the effectiveness of educational technology in strengthening students’ academic achievement in English at secondary school level. All the students at secondary school level in Kohat... more


Crowdsourcing authoring of sensory effects on videos

by George Ghinea

2022, Multimedia Tools and Applications


Crowdsourcing authoring of sensory effects on videos

by Marcello de Amorim

2022, Multimedia Tools and Applications


Social Documentary: An interactive and evolutive installation to explore crowd-sourced media content

by Ceren Kayalar

2022

This paper aims to present a project in progress, an interactive installation for collaborative manipulation of multimedia content. The proposed setup consists in a vertical main screen and a horizontal second screen, which is used as... more


Discriminating Joint Feature Analysis for Multimedia Data Understanding

by Alexander Hauptmann

2022, IEEE Transactions on Multimedia

In this paper, we propose a novel semi-supervised feature analyzing framework for multimedia data understanding and apply it to three different applications: image annotation, video concept detection and 3D motion data analysis. Our... more


A Comparative Study of Using Multimedia Annotation and Printed Textual Glossary in Learning Vocabulary

by Mohamad firgi afif

2022, International Journal of Learning and Development

This study intends to evaluate the effectiveness of Electronic Glossary and Non-electronic Glossary in L2 vocabulary learning among a group of low proficiency learners of English. It also seeks to determine which glossary mode is... more


A Comparative Study of Using Multimedia Annotation and Printed Textual Glossary in Learning Vocabulary

by Mohamad Jafre

2022, International Journal of Learning and Development


Arabic Natural Language Processing for Qur’anic Research: A Systematic Review

by Ala Al-Fuqaha

2022

The Qur’an, the holy divine book of Muslims, was revealed over fourteen centuries ago, in Arabic. With the rise of Islam, the Arabic language gained popularity and became the lingua franca for large swaths of the old world. Devout Muslims... more


Crowdsourcing authoring of sensory effects on videos

by Celso Alberto Saibel Santos

2022, Multimedia Tools and Applications



Multimedia database of the cultural heritage of the Balkans

by Gordana Pavlovic-Lazetic

2022


Arabic Natural Language Processing for Qur’anic Research: A Systematic Review

by Muhammad Huzaifa Bashir

2021



Towards integration of end-user tags with professional annotations

by Maarten Brinkerink

2021

The goal of the paper is assessing the quality of end-user tags from a video labeling game as a first step in the process of integrating them with the annotations made by professionals. Tags lack precise meaning, whereas the terms and... more


Crowdsourcing-based multimedia subjective evaluations

by Filippo Mazza

2021, Proceedings of the 2nd ACM international workshop on Crowdsourcing for multimedia - CrowdMM '13

Research on Quality of Experience (QoE) heavily relies on subjective evaluations of media. An important aspect of QoE concerns modeling and quantifying the subjective notions of 'beauty' (aesthetic appeal) and 'something well-known'... more


Unifying linguistic annotations and ontologies for the Arabic Quran

by Majdi S Sawalha

2021


Automatic Image Annotation Using Modified Multi-label Dictionary Learning

by International Research Group - IJET JOURNAL

2018

Automatic image annotation has attracted a lot of research interest. Effectively finding the correlation among labels and images is a critical task for multi-label learning. Most of the existing... more


Using Ontology for Associating Web Multimedia Resources With the Holy Quran

by Tarek El-Sakka

2016

There is a huge wealth of multimedia web resources related to the sciences of the Holy Quran, including "Tafseer" of the Holy Quran, teaching the provisions of recitation, the stories of the Holy Quran, and many other categories of... more


Real-Time Interactive Verification of Quran Words in The Web Contents

by Tarek El-Sakka and

2016

Many Arabic websites contain phrases from the Quran. Regrettably, the Quran text that appears on a majority of these websites suffers from many mistakes and typos. Hence, finding the correct form of Quran verses has become... more

Figure 1: Definition of the Quranic Chapter Table.

The Quran text is divided into 114 chapters. Some chapters are of type Makki, while the others are of type Maddani. In addition, each chapter has further information such as verse count, word count, and starting page in the Mushaf. We collected this information for each chapter and stored it in a database table, as illustrated in the following figure. The chapters contain a total of 6236 verses. We developed two tables: one for simple verse text (text without diacritics) and one for Uthmanic verse text (text with diacritics). Each verse contains one or more words and is stored as a record in the verse table, as illustrated in the following figure:

Figure 5: Definition of the Accelerator XML File.

The web visitor clicks the link on the page to add the Accelerator. A window is then displayed asking to confirm adding the Accelerator, as shown in figure 6.

Figure 7: Menu List of Accelerators.

Figure 8: The Architecture of the Verification System.

We developed a website for verifying Quran text, named “Quran Verifier”, which is currently published at the web address www.quranexplorer.info. The real-time verification system comprises several parts, as shown in figure 8. The first component finds occurrences of the highlighted text in the Holy Quran. Its first process prepares the input data (the highlighted text) using preprocessing functions that trim the text, remove extra spaces, and check for the presence of diacritical marks. Next, if diacritical marks are found, the diacritic search process is used; if there are no diacritics, the simple search process is used. Either search tries to find a match by looking up the input data in the Quran database. Finally, the results of either

Figure 9: The Search Algorithm for the Verification System.

The second component is responsible for displaying the search results. It contains two processes that display results according to the user's choice when clicking the accelerator icon. If the user chooses to preview results, the first match is shown in a small pop-up window on the same web page. If the user chooses to display all results, the list of results is displayed on a separate web page within our website. All displayed results (for both processes) are in the Uthmanic script of the Madinah Mushaf.
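The preprocessing and search-selection steps of the first component can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the in-memory `VERSES` list standing in for the Quran database, and the chosen diacritic range (U+064B to U+0652 plus U+0670) are all hypothetical, and the sample text is a placeholder Arabic phrase rather than Quranic text.

```python
import re

# Assumed set of Arabic diacritical marks: U+064B..U+0652 and U+0670.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def preprocess(text):
    """Trim the text and collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def has_diacritics(text):
    """Report whether the text contains any diacritical mark."""
    return DIACRITICS.search(text) is not None


def strip_diacritics(text):
    """Derive the simple (undiacritized) form of a text."""
    return DIACRITICS.sub("", text)


def find_matches(query, verses):
    """Return references of verses containing the query.

    `verses` is a hypothetical list of (reference, diacritized_text) pairs.
    A query with diacritics is matched against the diacritized text;
    a query without diacritics is matched against the stripped text.
    """
    q = preprocess(query)
    if not q:
        return []
    if has_diacritics(q):
        return [ref for ref, text in verses if q in text]
    return [ref for ref, text in verses if q in strip_diacritics(text)]


# Placeholder data: a non-Quranic two-word phrase written with diacritics.
VERSES = [
    ("sample:1",
     "\u0643\u0650\u062A\u064E\u0627\u0628\u064C "
     "\u062C\u064E\u062F\u0650\u064A\u062F\u064C"),
]

plain_hits = find_matches("  \u0643\u062A\u0627\u0628 ", VERSES)
exact_hits = find_matches("\u0643\u0650\u062A\u064E\u0627\u0628\u064C", VERSES)
wrong_hits = find_matches("\u0643\u064E\u062A\u0627\u0628", VERSES)
```

In this sketch a query with incorrect diacritics returns no match, because it is routed to the diacritic search. The paper instead describes two separate database tables (simple and Uthmanic text); deriving the simple text by stripping marks at query time is a simplification.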

We are using the well-known reference recital of Hafs from Asim for the Madinah Mushaf in the Uthmanic script as the reference for our search database. We applied this work to verify Holy Quran content on the web, on websites that display the Holy Quran or some of its contents. These websites fall into two categories: websites that display the Holy Quran as a Mushaf, and websites that cite some verses or phrases of the Holy Quran. When verifying the first category, we found that most match the syntax of the Quran verses, but some differ in diacritics because they use a different recital for their Mushaf.

Figure 10: Result of the Preview Action.

We developed an IE Accelerator that runs the verification task from our website. The Accelerator definition contains two actions. The first is the preview action, which shows the verification result when the web visitor hovers the mouse over our accelerator. The second is the execute action, which opens a web page on our website with the verification results. The Accelerator must first be installed by visiting the verification system website at the following URL: www.quranexplorer.info/InstallOuranVerifier.aspx. The XML document used to install the Accelerator is described in figure 5.

Figure 11: Result of the Execute Action.

An example of running our system to verify a phrase from the Holy Quran using the IE Accelerator is shown in figure 10. The results of the verification are displayed in figure 11. Web visitors can click on a result to explore it in the Madinah Mushaf. The verification system has been built with only one recital, that of Hafs from the Madinah Mushaf. Verification succeeds for 100% of sites that use the Uthmanic script. This ratio decreases to 70% for sites that use different Mushaf scripts or Quran text without diacritics.


Multimedia and semantic technologies for future computing environments

by Marco Bertini

2016, Multimedia Tools and Applications

Research progresses in multimedia computing and systems using semantic technologies have been recently and widely explored. This special issue on multimedia and semantic technologies for future computing environments provides high quality... more


Multi-task support vector machines for feature selection with shared knowledge discovery

by Oscar Chang

2015

Feature selection is an effective way to reduce computational cost and improve feature quality for the large-scale multimedia analysis system. In this paper, we propose a novel feature selection method in which the hinge loss function... more


Unifying linguistic annotations and ontologies for the Arabic Quran

by Eric S Atwell and

2014


A Convex Formulation for Semi-Supervised Multi-Label Feature Selection

by Oscar Chang

2014

Explosive growth of multimedia data has brought challenge of how to efficiently browse, retrieve and organize these data. Under this circumstance, different approaches have been proposed to facilitate multimedia analysis. Several... more


Semi-supervised Feature Analysis for Multimedia Annotation by Mining Label Correlation

by Oscar Chang

2014

In multimedia annotation, labeling a large amount of training data by human is both time-consuming and tedious. Therefore, to automate this process, a number of methods that leverage unlabeled training data have been proposed. Normally, a... more

3.5 Convergence Study

In this section, an experiment is conducted to validate that the proposed iterative algorithm monotonically decreases the objective function until convergence. The 10 × c labeled training data in the MIML dataset are tested in this experiment, with the three parameters fixed at 1, the median value of their tuned range. The convergence curve of the objective function value on the MIML dataset shows that the objective function values converge within 4 iterations.

Fig. 3. The MAP variations of different parameter settings using the MIML dataset

Table 2. Performance Comparison (± Standard Deviation (%)) when 5 × c data are labeled


Unlabeled data improves word prediction

by ali farhadi

2014

Labeling image collections is a tedious task, especially when multiple labels have to be chosen for each image. In this paper we introduce a new framework that extends state of the art models in word prediction to incorporate information... more


Using a Quran Spoken Corpus 1 Magdy Salwa Hamdy6

by salwa hamada

2013


Towards integration of end-user tags with professional annotations

by Lora Aroyo

2012


Linking user-generated video annotations to the web of data

by Michiel Hildebrand

2012, few.vu.nl

In the audiovisual domain tagging games are explored as a method to collect user-generated metadata. For example, the Netherlands Institute for Sound and Vision deployed the video labelling game Waisda? to collect user tags for videos... more
