Academia.edu

Multimedia signal processing

1,412 papers
139 followers
About this topic
Multimedia signal processing is the study and application of techniques for the analysis, manipulation, and synthesis of multimedia data, including audio, video, and images. It encompasses algorithms and methods for encoding, decoding, compression, enhancement, and transmission of multimedia content to improve quality and efficiency in various applications.
This study investigated the influence of stereoscopic visual depth and body vibration on the high-level affective perception that concerns the senses of presence and verisimilitude. The multisensory content used in our experiment... more
Lipreading is receiving increasing attention from the scientific community. However, many aspects related to lipreading are still unknown or poorly understood. In the current paper we present the entire process used for engineering the data for... more
Multiview video coding is an emerging application where, in addition to classical temporal prediction, an efficient disparity prediction should be performed in order to achieve the best compression performance. A popular coder is the... more
This paper addresses the problem of dense disparity estimation from a pair of color stereo images. Based on a convex set theoretic formulation, the stereo matching problem is cast as a convex programming problem in which a color-based... more
The literature has introduced many steganalysis methods intended to combat specific steganography techniques and to detect particular image formats. This paper proposes a detection system based on extracting histogram features. The... more
Implementation of game-based learning has been perceived by educators as a means to enhance effective classroom learning. Aspects of games have been identified as motivating learners to engage actively throughout the learning process, as it provides... more
Purpose – The purpose of this study is to present the steps taken to produce a mobile learning application framework for learning Microeconomics, named “MobiEko Apps”. A mobile learning application is utilized because the framework... more
Most distributed source coding schemes involve the application of a channel code to the signal and transmission of the resulting syndromes. For low complexity encoding with superior compression performance, graph-based channel codes such... more
Unsupervised methods tend to discover highly speaker-specific representations of speech. We propose a method for improving the quality of posteriorgrams generated from an unsupervised model through partitioning of the latent classes. We... more
Microsatellites and drones are often equipped with digital cameras whose sensing system is based on color filter arrays (CFAs), which define a pattern of color filters overlaid on the focal plane. Recent commercial cameras have started... more
In daily life, humans can remember many faces and recognize them irrespective of illumination, aging, obstructions, and variation in views. Many researchers have worked on the problem of face recognition to develop an automatic face... more
The use of effective teaching materials, whether paper-based or computer-based, ensures that knowledge is transferred effectively and meaningfully to students. The materials can be even more effective if they are... more
A new method for high-capacity data hiding in H.264 streams is presented. The proposed method takes advantage of the different block sizes used by the H.264 encoder during the inter prediction stage in order to hide the desired data. It... more
Transmission of compressed video over wireless channels remains a challenging task due to the noisy nature of wireless channels: a single bit error in the compressed video bit-stream might cause the reconstructed video to be... more
Using solely the information retrieved by audio fingerprinting techniques, we propose methods to treat a possibly large dataset of user-generated audio content, that (1) enable the grouping of several audio files that contain a common... more
This paper describes objective video quality validation efforts conducted in the past two decades. Validation efforts to be examined include a validation test performed by the T1A1 committee in the early 1990's; five rounds of validation... more
In this paper, we first study the recognition of emotions involved in human speech. We propose an emotion recognition algorithm based on a neural network and also propose a method to collect a large speech database that contains emotions.... more
Curve evolution implementations [3][17][18] of the Mumford-Shah functional are of broad interest in image segmentation. These implementations, however, have initialization problems. A mathematical analysis of the initialization problem... more
Efficient implementation methods are proposed for Chan-Vese models [3][16]. The proposed methods do not require solutions of PDEs and are therefore fast. The advantages of level set methods, such as automatic handling of topological... more
Prof. Dr. Milojko Jevtović, Dipl. Eng.
In this paper a new watermarking method using rotation-invariant Zernike moments is introduced. The watermark signal is embedded in the Zernike moments of the input image. The watermarked image does not show any quality degradation. Tests... more
This paper proposes a significant motion vector protection (SMVP) scheme for error-resilient transmission of videos. In terms of a rate-distortion optimization model, we show how to determine the significant motion vectors (SMVs) and how... more
We are investigating the use of Body Area Networks (BANs), wearable sensors and wireless communications for measuring, processing, transmission, interpretation and display of biosignals. The goal is to provide telemonitoring and... more
A saliency-based method for generating video summaries is presented, which exploits coupled audiovisual information from both media streams. Efficient and advanced speech and image processing algorithms to detect key frames that are... more
To enhance the security of copyrighted images, we propose an imperceptible color image watermarking scheme. Nowadays, internet technology is widely used all over the world. It uses many different types of data, digital images being one of... more
We proposed a spatial coherence-based PSD estimation and source separation technique in [1] using a 32-channel spherical microphone array. While the proposed spherical microphone-based method exhibited a satisfactory performance in... more
DVDs and Blu-rays are among the most frequent victims of video content counterfeiting. In particular, illegal distribution of movies on the Internet is a growing menace to the film industry. For this reason, authentication techniques are required to... more
ABSTRACT Body weight is one of the parameters that gives an indication of body mass. In body weight measurements performed manually, that is, using a weighing device (a step-on scale), it was found... more
This work explores novel mechanisms for aerial acoustic machine-machine communications. It builds on previous work by some of the authors [1], as well as others [2]. In this paper we describe aerial acoustic communication systems that... more
Matrix factorization methods are now widely used to recover 3D structure from 2D projections [1]. In practice, the observation matrix to be factored has missing data, due to the limited field of view and the occlusions that occur in... more
This paper describes a Hidden Markov Model (HMM)-based method of automatic transcription of MIDI (Musical Instrument Digital Interface) signals of performed music. The problem is formulated as recognition of a given sequence of... more
Much of the work on visual quality assessment has been devoted to gray-level images; metrics taking into account color information and the temporal component are still relatively rare. This paper presents a quality metric for color video... more
In this paper, we present a novel system which combines depth-from-stereo and visual hull reconstruction for acquiring dynamic real-world scenes at interactive rates. First, we use the silhouettes from multiple views to construct a... more
Depth maps, characterizing per-pixel physical distance between objects in a 3D scene and a capturing camera, can now be readily acquired using inexpensive active sensors such as Microsoft Kinect. However, the acquired depth maps are often... more
A common issue in video transcoding for heterogeneous network environment is to efficiently and accurately reduce the bit-rate such that the distortion is minimized under a given rate constraint. To convert the bit-rate of an encoded... more
To speed up video coding in standards such as H.264/AVC and H.265/HEVC, we propose in this paper a parallel approach to the full search (FS) algorithm for motion estimation on a Graphics Processing Unit (GPU). We implemented the traditional... more
This paper describes a novel methodology for automated recognition of high-level activities. A key aspect of our framework relies on the concept of co-occurring visual words for describing interactions between several persons. Motivated by... more
The ability to predict motion fields at finer temporal scales from coarser ones is a very desirable property for temporal scalability. This is at best very difficult in current state-of-the-art video codecs (i.e., H.264, HEVC), where... more
Orthogonal Frequency Division Multiplexing (OFDM) is regarded as one of the most outstanding multicarrier modulation techniques in fourth generation (4G) wireless networks, which makes it possible to transfer very high bit rates despite... more
Sign language synthesis has seen a large increase in applications over the past few decades, as it represents a potential solution to communication problems for the deaf community. All that is needed is to convert a written form (books,... more
We propose a novel Layered Compressed Sensing (CS) approach for robust transmission of video signals over packet loss channels. In our proposed method, the encoder consists of a base layer and an enhancement layer. The base layer is a... more
Negative symptoms of schizophrenia significantly affect the daily functioning of patients, especially movement and expressive gestures. The diagnosis of such symptoms is often difficult and requires the expertise of a trained clinician.... more
In the future, multimedia technology will be able to provide video frame rates equal to or better than 30 frames per second (FPS). Until that time, the hearing impaired community will be using band-limited communication systems over... more
The goal of video summarization is to generate a shorter video sequence of a lengthy original sequence using only the key frames of the original sequence. We consider a video summarization scheme that generates a video summary that can be... more
The aim of this work is to introduce a primary research on Arabic audiovisual analysis. Each language has multiple phonemes and visemes and each viseme can have multiple phonemes. The first part focuses on how to classify Arabic visemes... more