Academia.edu

Video and Audio Processing

14 papers
11 followers
About this topic
Video and audio processing is the manipulation and analysis of digital video and audio signals using algorithms and software. This field encompasses techniques for encoding, decoding, compressing, enhancing, and transforming multimedia content to improve quality, facilitate transmission, and enable various applications in entertainment, communication, and data analysis.

Key research themes

1. How can integration and synchronization techniques improve real-time joint audio-video processing in multimedia applications?

This research area focuses on methodologies and frameworks to tightly integrate and synchronize audio and video streams in real-time multimedia applications, addressing challenges posed by independent capture, transmission delays, buffering, and processing constraints. Synchronization enables coherent audiovisual experiences crucial for activities such as video conferencing, live streaming, and interactive media.

Key finding: The authors developed an online system achieving synchronization of independently captured and separately processed audio and video streams via time-stamping techniques. This approach correlates audio packets with video...
Key finding: Presented Vosaic, a WWW browser extension integrating real-time video and audio into hypertext pages with no retrieval latency by extending HTTP servers to utilize a novel Video Datagram Protocol (VDP). This allowed a 44-fold...
Key finding: Comprehensively reviews methods for joint audio-video signal processing emphasizing the importance of cross-modal fusion for speech recognition and person authentication. It highlights integration challenges, current...
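
The time-stamping idea in the first key finding above can be illustrated with a minimal sketch. It is not the authors' system: it only assumes that every video frame and audio packet carries a capture timestamp in seconds, and it pairs each frame with the nearest audio packet. The function name `match_audio_to_video` and the 40 ms `tolerance` default are hypothetical choices made for this illustration.

```python
from bisect import bisect_left


def match_audio_to_video(video_ts, audio_ts, tolerance=0.040):
    """Pair each video timestamp with the index of the nearest audio timestamp.

    Both lists hold capture times in seconds and must be sorted ascending.
    A frame with no audio packet within `tolerance` seconds is paired with None.
    (Hypothetical helper for illustration, not taken from the cited paper.)
    """
    matches = []
    for t in video_ts:
        # Position of the first audio timestamp that is >= t.
        i = bisect_left(audio_ts, t)
        # The nearest packet is either just before or at that position.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(audio_ts)]
        best = min(candidates, key=lambda j: abs(audio_ts[j] - t), default=None)
        if best is not None and abs(audio_ts[best] - t) > tolerance:
            best = None  # too far apart to be treated as synchronous
        matches.append(best)
    return matches


if __name__ == "__main__":
    video = [0.0, 0.04, 0.08, 1.0]
    audio = [0.0, 0.02, 0.04, 0.06, 0.08]
    print(match_audio_to_video(video, audio))
```

A real-time system would also have to handle clock drift between capture devices and buffering delays, which this sketch leaves out.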

2. What advances in video compression standards and computational frameworks drive efficient multimedia processing and adaptation?

This theme explores developments in video compression standards like H.264 and predictions on future video coding paradigms, alongside computational frameworks that allow adaptable video editing and delivery. Research emphasizes improving coding efficiency, reducing bitrates, adapting video for multiple platforms, and leveraging machine learning for content-aware processing, which are fundamental for scalable, high-quality multimedia distribution.

Key finding: Analyzes the H.264/AVC standard's technical innovations yielding at least a twofold compression efficiency improvement over predecessors through enhanced motion estimation with small block sizes, intra prediction, a DCT-like...
Key finding: Synthesizes expert panel insights on emerging video coding research, including the dual-track approach integrating conventional and deep learning-based coding techniques, the ongoing importance of scalable, immersive...

3. How do audio signal processing and structured audio standards enhance multimedia audio representation, synthesis, and user interaction?

This theme covers advances in digital audio processing including perception-informed signal modeling, synthetic audio representation within multimedia standards, and novel audio coding methods. It examines computational frameworks for representing, coding, and synthesizing both natural and structured audio, facilitating flexible, high-fidelity soundtracks and enabling enriched multimedia experiences and assistive technologies.

Key finding: Details human auditory system characteristics underlying audio processing, emphasizing the cochlea's frequency analysis via place and volley principles. This physiological understanding guides design of digital audio...
Key finding: Describes MPEG-4's extension for 'Structured Audio,' enabling algorithmic descriptions of synthetic sounds, musical scores, and audio effects integrated with natural audio streams. This facilitates highly compressed, flexible...
Key finding: Reviews audio signal characteristics and classification methodologies, highlighting time-frequency representations and feature extraction techniques imperative for audio segmentation, retrieval, and compression. Emphasizes...
Key finding: Presents a novel approach mapping image color components to musical instrument sounds for aiding visually impaired users in constructing mental spatial images via auditory cues. Experimental results show that learned...

All papers in Video and Audio Processing

This paper provides an overarching framework that combines conceptual and technical frameworks for improving the online communication skills of lifelong learners. This overarching framework is called FILTWAM (Framework for Improving Learning...
The term “near-duplicate” refers to an object that is fully or partly similar to another object. There are natural and artificial near-duplicates. Natural near-duplicates are similar objects within a similar environment, while artificial...
There is a wide range of tasks that require the analysis of audio-visual models of reality. In particular, many military and civilian applications require near-duplicate video search. For peaceful uses, this is...
XI All-Russian Conference “Neurocomputers and their application”, Moscow: MSUPE, 19.03.2013. The term “near-duplicate” denotes an incomplete or partial match between the current document (image) and another document of a similar...
The paper focuses on event detection algorithms in content-based video retrieval. Video has a complex structure and can express the same idea in different ways. This makes the task of searching for video more complicated. Video...
There is a wide range of tasks that require the analysis of audio-visual models of reality. In particular, many military and civilian applications require near-duplicate video search. For peaceful uses, this is...
The paper considers an approach to near-duplicate video search. The search is based on comparing the relative lengths of scenes in L2 space. The comparison takes the Gale-Church hypothesis into account. The notion of a “scene descriptor” is introduced. To speed up...
Two video files or streams are given. The task is to determine whether they are duplicates of each other. Here the word “duplicate” refers to a condition that cannot be formalized: “Do these files depict the same thing?” Another formulation of this problem is also possible....
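
One abstract in this list describes near-duplicate video search that compares relative scene lengths in L2 space. The sketch below illustrates only that comparison step, under simplifying assumptions: scene cut times are already known, and both videos have the same number of scenes. It does not implement the Gale-Church-based alignment or the scene descriptor mentioned in the abstract, and the function names are hypothetical.

```python
import math


def relative_scene_lengths(cut_times, duration):
    """Turn scene cut times (seconds) into scene lengths as fractions of the duration.

    Hypothetical helper: assumes cuts lie strictly inside (0, duration).
    """
    bounds = [0.0] + sorted(cut_times) + [float(duration)]
    return [(end - start) / duration for start, end in zip(bounds, bounds[1:])]


def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of scenes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


if __name__ == "__main__":
    original = relative_scene_lengths([10, 30], 40)    # cuts at 10 s and 30 s
    sped_up = relative_scene_lengths([5, 15], 20)      # same video played twice as fast
    other = relative_scene_lengths([20, 30], 40)       # different cut structure
    print(l2_distance(original, sped_up))
    print(l2_distance(original, other))
```

Normalizing by total duration makes the vector insensitive to a uniform change in playback speed, which is one plausible reason to compare relative rather than absolute scene lengths.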