Academia.edu

Audio Coding

660 papers
28 followers
About this topic
Audio coding is the process of compressing audio data to reduce file size while preserving sound quality. It involves algorithms that analyze audio signals and remove redundant or non-essential information, enabling efficient storage and transmission of audio content in various formats.

Key research themes

1. How can parametric and sinusoidal models improve low-bitrate audio coding efficiency and perceptual quality?

This research area focuses on advancing parametric audio coding techniques using sinusoidal and exponentially damped sinusoidal (EDS) models to efficiently represent audio signals at very low bitrates, while preserving perceptual quality. It addresses challenges in modeling transient signals, optimizing parameter quantization, and integrating psychoacoustic considerations to enhance sparse signal representation and subjective audio coding performance.

Key finding: Proposed an efficient analysis/synthesis framework combining EDS model with dynamic time-segmentation on transients and psychoacoustic modeling, coupled with an asymptotically optimal entropy-constrained joint quantization of... Read more
Key finding: Introduced a scalable wideband audio coder utilizing sorted sinusoidal parameters based on perceptual significance, transmitting the most perceptually important components first to achieve fixed-rate scalability with low... Read more
Key finding: Presented a hybrid audio compression method converting audio files into text representations followed by conventional text compression algorithms. The approach combining 6-bit coding achieved compression ratios between... Read more

2. What are the advancements in hybrid and scalable audio coding techniques for universal speech and music compression?

This theme investigates methods integrating multiple coding paradigms—such as linear predictive coding (LPC), algebraic code-excited linear prediction (ACELP), transform coding (TCX), and scalable coding frameworks—to achieve efficient, universal audio coding that supports both speech and music signals. It includes mode switching, variable frame lengths, entropy coding improvements, and scalable-to-lossless transitions, aiming for robust quality across diverse content at multiple bitrates.

Key finding: Extended prior ACELP/TCX hybrid coding by increasing frame lengths to 80 ms with adaptive overlapping windows and enhanced spectral quantization using extended multi-rate algebraic vector quantization. The closed-loop mode... Read more
Key finding: Developed a scalable-to-lossless audio compression approach based on transform coding combined with Set Partitioning In Hierarchical Trees (SPIHT) algorithm. The method provides fine-grained scalability from low bitrate lossy... Read more
Key finding: By sorting sinusoidal parameters according to perceptual importance, the coder achieved scalable audio quality under fixed-rate constraints, enabling low delay and quality progression that suits wireless applications. This... Read more

3. How can psychoacoustically-informed perceptual models and lossless coding techniques enhance audio compression fidelity and objective quality assessment?

This research direction integrates auditory perception models into audio codec design and lossless coding methods to optimize compression while preserving subjective audio quality. It also explores objective metrics aligned with human perception, enabling reliable prediction of coding artifacts impact. Lossless and perceptual lossless techniques using predictive filtering and entropy coding aim to maximize coding efficiency without compromising signal integrity.

Key finding: Proposed a lossless audio compression scheme using a novel Weighted Cascaded Least Mean Squared (WCLMS) adaptive predictor optimal for a wide range of audio signals, including psychoacoustically pre-filtered (perceptual)... Read more
Key finding: Provided a comprehensive description of the MPEG-4 ALS codec employing forward-adaptive linear prediction and entropy coding with block-length switching and joint channel coding. The codec achieves remarkable compression and... Read more
by Peter Pocta and 1 more
Key finding: Compared the impact of commonly deployed lossy audio codecs in DAB and web-casting on subjective and objective audio quality. Retrained objective models (PEAQ and POLQA Music) with codec-distorted signals significantly... Read more
Key finding: Presented the Fuzzy Quality Index (FQI), a low-complexity objective audio quality metric based on fuzzy logic integrated into the PEAQ framework. Results showed slight performance improvement over original PEAQ by better... Read more
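The lossless schemes in this theme pair a predictor with an entropy coder. The following is a minimal sketch of that pairing, not the WCLMS or MPEG-4 ALS method: it assumes a fixed first-order predictor (the previous sample) and a plain Rice code with a hypothetical parameter `k`, and all function names are illustrative.

```python
def zigzag(n):
    """Map a signed residual to a non-negative integer (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...)."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Rice-code non-negative integers: unary quotient, a '0' stop bit, then k remainder bits."""
    parts = []
    for u in values:
        q, r = u >> k, u & ((1 << k) - 1)
        parts.append("1" * q + "0" + (format(r, f"0{k}b") if k else ""))
    return "".join(parts)


def rice_decode(bits, k, count):
    """Decode `count` Rice-coded integers from a bit string."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the '0' stop bit
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append((q << k) | r)
    return out


def encode(samples, k=2):
    """Predict each sample from the previous one and Rice-code the residuals."""
    prev, residuals = 0, []
    for s in samples:
        residuals.append(zigzag(s - prev))
        prev = s
    return rice_encode(residuals, k)


def decode(bits, count, k=2):
    """Rebuild the samples exactly by undoing the entropy code and the prediction."""
    prev, out = 0, []
    for u in rice_decode(bits, k, count):
        prev += unzigzag(u)
        out.append(prev)
    return out
```

A real codec would adapt the predictor and the Rice parameter per block; the sketch only shows why good prediction helps: small residuals produce short codewords, and decoding reproduces the input bit-exactly.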

All papers in Audio Coding

Tonality is a perceptual attribute related to the tonal components (spectral prominences) present in many types of noise (aircraft, air-conditioning systems, electric cars, ...). Several factors influence tonality... more
In this paper we investigate a special-purpose application of MPEG-1 layer II audio streaming. First, we discuss how two or more already coded MPEG-audio bitstreams can be manipulated and mixed within the coded subband domain by using an... more
A multichannel extension to the RVQGAN neural coding method is proposed, and realized for data-driven compression of third-order Ambisonics audio. The input and output layers of the generator and discriminator models are modified to... more
Recently lifting-based integer transforms have received much attention, especially in the area of lossless audio and image coding. The usual approach is to apply the lifting scheme to each Givens rotation. Especially in the case of long... more
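As background for the lifting approach mentioned in the entry above, here is a minimal sketch of one Givens rotation realized as three lifting (shear) steps with rounding, which makes the mapping integer-to-integer and exactly invertible. It is the textbook decomposition, not the method of the paper; the function names are illustrative, and the sketch assumes sin(theta) is not zero.

```python
import math


def lift_rotate(x, y, theta):
    """Integer approximation of the rotation [c -s; s c] applied to (x, y) using three lifting steps."""
    c, s = math.cos(theta), math.sin(theta)
    p = (c - 1.0) / s  # shear coefficient; assumes sin(theta) != 0
    x = x + round(p * y)
    y = y + round(s * x)
    x = x + round(p * y)
    return x, y


def lift_rotate_inverse(x, y, theta):
    """Undo lift_rotate exactly by subtracting the same rounded terms in reverse order."""
    c, s = math.cos(theta), math.sin(theta)
    p = (c - 1.0) / s
    x = x - round(p * y)
    y = y - round(s * x)
    x = x - round(p * y)
    return x, y
```

Each step adds a rounded function of the other coordinate, so the inverse can recompute and subtract exactly the same value. The price is a small rounding error relative to the true rotation, which accumulates when many rotations are chained in a long transform.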
Proper coding and transmission of medical and physiological data is a crucial issue for the effective deployment and performance of telemedicine services. This chapter presents a platform for performing proper medical content adaptation... more
Substantial progress has been made recently in finding acoustic features that describe perceptually relevant aspects of sound. This paper presents a general framework for synthesizing audio manifesting arbitrary sets of quantifiable... more
We present a general framework for synthesizing audio manifesting arbitrary sets of perceptually motivated, quantifiable acoustic features. Much work has been done recently on finding acoustic features that describe perceptually relevant... more
Low-bitrate perceptual audio coding by Empirical Mode Decomposition (EMD). Kais Khaldi (1,3), Abdel-Ouahab Boudraa (2), Monia Turki (1), Thierry Chonavel (3). (1) Unité Signaux et Systèmes, ENIT, BP 37, Le Belvédère, 1002 Tunis, Tunisia; (2) IRENav,... more
This paper describes the use of sorted sinusoidal parameters to produce a fixed rate, scalable, wideband audio coder. The sorting technique relies on the perceptual significance of the sinusoidal parameters. Sinusoidal coding permits the... more
This article addresses the development and optimization of a quality metric for compressed images. We use an optimization method based on the joint use of a scale... more
Bark-scale warped linear prediction (WLP) is a very promising core for a monophonic perceptual audio codec [2]. In the current paper the WLP scheme is extended for processing complex-valued signals (CWLP). Three different methods of... more
An inherent property of many DSP algorithms is that they tend to exhibit uniform frequency resolution from zero to Nyquist frequency. This is a direct consequence of using unit delays as building blocks; a frequency independent delay... more
Frequency-warped signal processing techniques are attractive to many wideband speech and audio applications since they have a clear connection to the frequency resolution of human hearing. A warped version of linear predictive coding... more
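The three entries above concern frequency warping. A common way to obtain it, assumed here and not taken from these abstracts, is to replace each unit delay with a first-order allpass section D(z) = (z^-1 - lam) / (1 - lam z^-1). The sketch below computes the resulting frequency mapping, which is the negative of the allpass phase; the function name and the value of `lam` are illustrative.

```python
import math


def warp_frequency(omega, lam):
    """Map a normalized angular frequency omega (rad/sample, 0..pi) through a first-order allpass.

    lam is the allpass coefficient (-1 < lam < 1); lam = 0 leaves the frequency axis unchanged.
    """
    return omega + 2.0 * math.atan2(lam * math.sin(omega), 1.0 - lam * math.cos(omega))
```

For positive `lam` the mapping stretches low frequencies and compresses high ones while keeping 0 and pi fixed, which is what gives warped filters finer resolution at low frequencies, in the direction of the resolution of human hearing.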
MPEG-1 Layer 3 (MP3) is one of the most popular compression formats used for sound and especially for music. However, during the coding process, the MP3 algorithm negatively affects the spectral and dynamic characteristics of the audio... more
The papers at this Convention have been selected on the basis of a submitted abstract and extended precis that have been peer reviewed by at least two qualified anonymous reviewers. This convention paper has been reproduced from the... more
Switching certain digital audio circuits on and off, such as those integrated in multimedia mobile phones, generates impulsive noises ("pops") that can be annoying for the user,... more
This paper deals with pre-echo reduction in low bit-rate audio compression. [1] proposed an attack restoration method based on the correction of the temporal envelope of the decoded signal. A small set of coefficients was then transmitted... more
This paper studies the quality of multimedia content at very low bitrates. We carried out subjective experiments for assessing audiovisual, audio-only, and video-only quality. We selected content and encoding parameters that are typical... more
We carried out a number of subjective experiments for audiovisual, audio-only, and video-only quality assessment. We selected content and encoding parameters at very low bitrates that are typical of mobile applications. Using these data,... more
New approaches to hybrid in-band on-channel (HIBOC) FM systems for digital audio broadcasting based on multistream transmission methodology and multidescriptive audio coding techniques are introduced in this paper. These ideas combined... more
Emerging digital audio applications for broadcast radio and multimedia systems are presenting new challenges such as the need to code mixed audio content, error robustness, higher audio bandwidth and quality at low bit rates; demanding a... more
In this paper we describe the components of a novel audio coding algorithm capable of delivering high-fidelity CD-like stereo audio at bit rates of 40-48 kbps and natural-sounding FM-grade mono at bit rates of 18-22 kbps. Bandwidth... more
In the application of conventional audio compression algorithms to low bit rate audio coding one is faced with the unsatisfactory tradeoff between coarser quantization and audio bandwidth reduction. Frequency Extension has therefore... more
Abstract: The OLA (Overlap-and-Add) algorithm is conventionally used for adaptive filtering in the frequency domain. However, this method introduces a delay, which can hinder its use in certain real-time applications.... more
This paper outlines an adaptive wavelet-based perceptual audio coding scheme attending to various entropy-type criteria. Its performance using some different wavelet families and various filter lengths and decomposition depths has also... more
In previous work we presented a low-delay audio coder [1]. That coder used a hybrid subband-ADPCM structure and required 2.5 bits/sample. In this paper we present a new low-delay audio coder that requires only 2 bits/sample and... more
This paper deals with the application of adaptive signal models for parametric audio coding. The matching pursuit algorithm is used for extracting sinusoidal components and transients in audio signals. The resulting residue is... more
Currently, no audio coding standard achieves nearly transparent quality with low delay. The ISO-MPEG standard is widely used in audio to obtain high quality [1]. It uses a perceptual model that requires high... more
ITU-R BS.1387 specifies a method for objective assessment of perceived audio quality. This Recommendation, also known as PEAQ (Perceptual Evaluation of Audio Quality), is based on a psychoacoustic model of the human ear and was standardized... more
The very great volume of information and data in a digital image can cause practical problems. Transmitting an image from one computer to another and/or archiving are very expensive due to the abundance of data representing the image in... more
Computational methods for the perceptual evaluation of lossy audio processing systems are required, if subjective listening tests are either too expensive or not applicable (e.g. for real-time monitoring). One approach to assess the... more
This paper illustrates the suitability of ADSP 21160 for implementation of high performance DSP applications requiring 32 bit floating point precision. We present a real-time implementation of a multi-channel MPEG-2 AAC-LC encoder. The... more
Audio Lossless Coding (ALS) is a new addition to the suite of MPEG-4 audio coding standards. The ALS codec is based on forward-adaptive linear prediction, which offers remarkable compression even with low predictor orders. Nevertheless,... more
The spectrum interpolation synthesis model has recently been applied in the high quality synthesis of harmonic musical sounds. In this work we investigate the performance of the model in the compression of music signals. Efficient methods... more
This paper proposes a modification of the ADPCM algorithm to make it suitable for hardware implementation in portable audio devices. This is accomplished by a simplification of the predictor coefficient algorithm, which normally requires a high... more
This paper describes a novel technique for audio coding, a lossy compression algorithm, that considers perceptual and rate-distortion criteria. It is based on matched finite impulse response (FIR) wavelet-packet-like filter banks, the... more
With the proliferation of broadband access and the continuous decline of storage price per gigabyte, there has been an increasing demand for audio solutions that provide high sampling rates and high resolution. Lossless audio is undoubtedly the... more
Scalability over a wide bitrate range is now the latest trend in audio coding. Much effort has been devoted to developing algorithms for more efficient scalable audio coders that scale from very low bitrates. Scalable audio coding... more
A simple and flexible bit-plane coding method is developed for scalable audio coding. It is different from the traditional bit-plane coding in that the optimal bit-plane scanning order is adapted to the scale between the energy of the... more
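For readers unfamiliar with bit-plane coding, the sketch below shows the traditional fixed scanning order that the entry above contrasts itself with: coefficient magnitudes are sent most significant plane first, so a truncated stream still yields a coarse reconstruction. It is not the adaptive method of the paper; sign handling and entropy coding are omitted, and the function names are illustrative.

```python
def to_bit_planes(coeffs, num_planes):
    """Split non-negative integer magnitudes into bit-planes, most significant plane first."""
    return [[(c >> p) & 1 for c in coeffs] for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild magnitudes from however many leading planes were received."""
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i  # weight of this plane's bits
        values = [v | (b << shift) for v, b in zip(values, plane)]
    return values
```

Decoding all planes restores the magnitudes exactly, while decoding only the first few gives each value truncated to its leading bits, which is the basis of fine-grained scalability.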
This paper presents two improvements that reduce the complexity of pitch prediction in a Code-Excited Linear Prediction (CELP) speech coder. First, the pitch analysis structure is modified. A new fast Pitch Modelling by... more
We propose an approach to digital audio effects using recombinant spatialization for signal processing. This technique, which we call Spatio-Operational Spectral Synthesis (SOS), relies on recent theories of auditory perception. The... more
Speech coding is a very mature research area, and many coding schemes are available that provide speech quality ranging from highly intelligible synthetic speech at about 2 kbit/s to natural wideband speech at about 16 kbit/s.... more
In this paper, we propose an improved sinusoidal audio modeling method for perceptual matching pursuits driven by a perceptual distortion measure. A linear model derived from the effective signal processing in the ear is used for... more
A new audio transform coding technique is proposed that reduces the bitrate requirements of perceptual transform audio coders by exploiting the stationarity characteristics of audio signals. The method detects the frames which... more