IJERT-Residual Excited Linear Predictive Coding
2015, International Journal of Engineering Research and Technology (IJERT)
Abstract
https://www.ijert.org/residual-excited-linear-predictive-coding https://www.ijert.org/research/residual-excited-linear-predictive-coding-IJERTV4IS050982.pdf In this paper we present a low bit rate voice coding technique called residual-excited linear prediction (RELP) coding. It uses the 10th-order Levinson-Durbin recursive algorithm, which provides accurate estimates of the speech parameters and is relatively efficient to compute. In the RELP system, the vocal tract is modeled by the LPC technique, and the LPC residual signal is used as the excitation signal. The transmission rate is reduced to 9.6 kbits/s, and the synthetic speech at this rate is quite good. As the transmission rate is lowered, the synthetic speech quality degrades very gradually. Since no pitch extraction is required, the coder is robust in any operating environment. Speech signals from male and female speakers were coded, and the results showed that the coding technique gives good speech quality with low complexity.
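The analysis and synthesis steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the autocorrelation method, and the use of NumPy are assumptions made for illustration, and the residual is passed through unquantized, whereas a real RELP coder would band-limit and quantize it to reach 9.6 kbits/s.

```python
import numpy as np


def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one (ideally windowed) frame."""
    n = len(frame)
    return np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])


def levinson_durbin(r, order):
    """Levinson-Durbin recursion on autocorrelation lags r.

    Returns the prediction-error filter a = [1, a1, ..., ap] and the
    final prediction-error energy.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Update the coefficients from the previous stage.
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_residual(frame, a):
    """Inverse (analysis) filter: e[n] = s[n] + sum_k a[k] * s[n-k]."""
    return np.convolve(frame, a)[: len(frame)]


def lpc_synthesize(residual, a):
    """All-pole synthesis filter driven by the residual as excitation."""
    order = len(a) - 1
    out = np.zeros(len(residual))
    for n in range(len(residual)):
        acc = residual[n]
        for k in range(1, min(order, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out
```

With `order=10`, calling `autocorr`, `levinson_durbin`, `lpc_residual`, and `lpc_synthesize` in sequence on one frame reproduces the frame from its own residual. Any loss in an actual coder would come from how the residual and the coefficients are quantized, which this sketch leaves out.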
Related papers
The International Conference on Electrical Engineering
Speech coding is a very important area that finds civilian and military applications. It can be considered one of the important stages in speech processing. It is used to compress speech, because the speech signal is highly redundant. Speech coding has many applications; it is used in digital telephony, in multimedia, and in the security of digital communications. In this paper, we focused on developing algorithms and methods for a waveform speech coder operating at low bit rate with a good-quality reconstructed speech signal. Moreover, a new model for linear predictive coding of speech that can be used to produce high-quality speech at low data rate is introduced. In this model, we divided the residual (excitation signal) into subframes and made energy and voiced/unvoiced classifications to choose the best pulses in the residual that give us low bit rate and good quality for the reconstructed speech. Hence, this vocoder forms an excitation sequence which consists of groups of uniformly spaced pulses. During analysis the amplitude and LP coefficients of the pulses are determined. In addition, a new technique for quantizing the amplitude of each pulse as well as the linear prediction parameters is proposed.
Speech coding is an important application of speech processing. Linear predictive coding (LPC) is a powerful speech coding technique used for encoding speech signals at a low bit rate. This method provides accurate estimation of parameters with low complexity. In this paper we discuss the implementation of a plain linear predictive coding (LPC) voice coder and a voice excited linear predictive (VELP) voice coder. Both of these voice coders are based on the principle of linear prediction, where the current sample is predicted by a linear function of past values. VELP is an improved version of the plain LPC voice coder. It is implemented by using DCT for coefficients to improve the quality of speech. Simulation results of plain LPC and VELP are compared, and we find that VELP produces a better quality signal than LPC.
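The linear prediction principle mentioned above can be written out as a short worked formula. The symbols are assumptions introduced here, not notation from the paper: $s[n]$ is the speech sample, $p$ the predictor order, $\alpha_k$ the predictor coefficients, and $e[n]$ the prediction residual. Some texts absorb the minus sign into the coefficients.

```latex
\hat{s}[n] = \sum_{k=1}^{p} \alpha_k \, s[n-k],
\qquad
e[n] = s[n] - \hat{s}[n] = s[n] - \sum_{k=1}^{p} \alpha_k \, s[n-k]
```

The coefficients are typically chosen to minimize the residual energy $\sum_n e[n]^2$ over a frame, and the decoder rebuilds the speech by driving the all-pole filter $1 / \bigl(1 - \sum_{k=1}^{p} \alpha_k z^{-k}\bigr)$ with an excitation signal.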
2003
This article is focused on speech coding methods for achieving communication quality speech at bit rates of 4 kbit/s and lower. The speech coding techniques are based on an all-pole model of the vocal tract which may be implemented in the time domain with appropriately selected excitation functions or else may be fit to a spectral analysis of the speech signal. Three main types of coders are described below. Code-excited linear prediction (CELP) coders select their excitation from waveform codebooks using analysis-by-synthesis closed-loop techniques, which need to be supplemented by speech classification and open-loop parametric techniques for keeping up with quality at lower rates. The prototypical sinusoidal coder (SC) has a bank of oscillators for signal synthesis, driven by a model of the magnitude spectrum. However, phase regeneration is important in enhancing speech reconstruction at low rates. Waveform interpolation (WI) coders afford a wider time-frequency footprint for the representation of the excitation, showing a good potential for achieving toll quality at bit rates below 4 kbit/s.
IEEE Transactions on Communications, 1982
Abstract: Predictive coding is a promising approach for speech coding. In this paper, we review the recent work on adaptive predictive coding of speech signals, with particular emphasis on achieving high speech quality at low bit rates (less than 10 kbits/s). Efficient prediction of the redundant structure in speech signals is obviously important for proper functioning of a predictive coder. It is equally important to ensure that the distortion in the coded speech signal be perceptually small. The subjective loudness of quantization noise depends both on the short-time spectrum of the noise and its relation to the short-time spectrum of the speech signal. The noise in the formant regions is partially masked by the speech signal itself. This masking of quantization noise by speech signal allows one to use low bit rates while maintaining high speech quality. This paper will present generalizations of predictive coding for minimizing subjective distortion in the reconstructed speech signal at the receiver. The quantizer in predictive coders quantizes its input on a sample-by-sample basis. Such sample-by-sample (instantaneous) quantization creates difficulty in realizing an arbitrary noise spectrum, particularly at low bit rates. We will describe a new class of speech coders in this paper which could be considered to be a generalization of the predictive coder. These new coders not only allow one to realize the precise optimum noise spectrum which is crucial to achieving very low bit rates, but also represent the important first step in bridging the gap between waveform coders and vocoders without suffering from their limitations.
Arabian Journal for Science and Engineering, 2019
In this paper, we propose a variable-bit-rate speech codec based on mixed excitation linear prediction enhanced (MELPe) with an average bit rate of 2 kbps and with a better representation of the excitation signal. The order of the prediction filter in the MELPe coding architecture is reduced from 10 to 7 without affecting the perceptual quality of the decoded speech by using the psychoacoustic Mel scale. An efficient two-split vector quantization is developed with a weighted Euclidean distance measure for Mel scale-based linear predictive coding (Mel-LPC), and it requires only 18 bits/frame. The instantaneous pitch or epoch that is vital for many speech processing applications is preserved in this codec by including it in the excitation signal used for reconstructing the voiced speech. The quantization scheme developed for glottal closure instants (GCIs) causes an increase in the bit requirement for voiced frames by 4-25 bits depending on the position of GCIs. To compensate for that, the Mel-LPC order for both silence and unvoiced frames has been brought down to 4 without compromising the perceptual quality of reconstructed speech. The lowered bit budget for an unvoiced frame is 41 bits/frame, and for silence, it is 31 bits/frame. A further reduction of 10 bits for the silence frame is obtained by reducing the number of transmitted parameters and by tuning the quantization bit requirement for each. For categorizing the speech frames at the entry of the encoder, a neural network-based voiced/unvoiced/silence classification algorithm using a five-dimensional feature set is created. The experimental results show that the proposed coding scheme operates at an average bit rate of 2 kbps, which is less than the bit rate of MELPe (2.4 kbps), but with a better perceptual score. In addition, the incorporation of Mel-LPC gives better performance in the estimation of formants and GCIs.
The Journal of the Acoustical Society of America, 1988
Problem statement: Speech enhancement plays an important role in speech processing systems such as speech recognition, speech coding, mobile communication, and hearing aids. Approach: In this work, the performance of the speech coding method is enhanced by using speech enhancement as the preprocessing technique. The purpose of the proposed method is to reduce the bit rate of the speech signal to be transmitted, so that the bandwidth can be utilized efficiently. In a noisy environment, speech coding is done both for the desired speech and the unwanted noise signal. If the noise is reduced before coding the speech signal, the bit rate required will also be reduced. In this work, a simple adaptive speech enhancement technique, using an adaptive sigmoid type function to determine the weighting factor of the TSDD algorithm, is employed based on a subband approach for speech enhancement, and the voice excited linear predictive coding (VELP) method is used for coding the speech signal. Results:...
There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high quality speech, accurate representation of the voiced portions of speech is essential. For bit rates of 4 kb/s and below, conventional code excited linear prediction (CELP) may likely not provide the appropriate degree of periodicity. It has been shown that good quality low bit rate speech coding can be obtained by frequency domain techniques such as sinusoidal transform coding (STC), multi-band excitation (MBE), mixed excitation linear prediction (MELP), and multi-band LPC (MB-LPC) vocoders. In this paper, a speech coding algorithm based on an improved version of MB-LPC is presented. Main features of this algorithm include a multi-stage time/frequency pitch estimation and an improved mixed voicing representation. An efficient quantization scheme for the spectral amplitudes of the excitation, called formant weighted ...
International Conference on Acoustics, Speech, and Signal Processing, 1989
In this paper the efficient representation of the excitation signal to an LPC synthesis filter by means of a vector expansion of the residual signal is examined. According to this approach the excitation signal is represented as a linear combination of a small number of vectors taken from a given vector set, known at both ends of the transmission channel. It is demonstrated that this approach provides a unified framework for describing and analyzing a wide range of residual speech coders, from Multipulse LPC and CELP to Residual Transform Coders, and leads to generalization of some of these schemes. Optimality conditions based on the singular value decomposition (SVD) of the impulse response matrix of the perceptually weighted LPC synthesis filter are given. A resulting simplified Predictive Transform Coder is proposed and examined by computer simulations.
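The vector expansion described above can be sketched in a generic analysis-by-synthesis form. This is an illustrative formulation and not taken from the cited paper; every symbol is an assumption: $\mathbf{e}$ is the excitation for one frame, $\{\mathbf{v}_i\}$ the vector set shared by encoder and decoder, $K$ the small number of vectors used, $g_k$ their gains, $\mathbf{H}$ the impulse response matrix of the perceptually weighted synthesis filter, and $\mathbf{x}$ the weighted target speech.

```latex
\mathbf{e} \approx \sum_{k=1}^{K} g_k \, \mathbf{v}_{i_k},
\qquad
\min_{\{i_k\},\,\{g_k\}} \;
\Bigl\| \mathbf{x} - \mathbf{H} \sum_{k=1}^{K} g_k \, \mathbf{v}_{i_k} \Bigr\|^2
```

Under this reading, choosing unit impulses for the $\mathbf{v}_i$ would correspond to a multipulse coder, and choosing entries of a stored codebook would correspond to a CELP-style coder.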
Principles of Speech Coding, 2010
Introduction to LTI Systems
2.1.1 Linearity
2.1.2 Time Invariance
2.1.3 Representation Using Impulse Response
2.1.4 Representation of Any Continuous-Time (CT) Signal
2.1.5 Convolution
2.1.6 Differential Equation Models
2.2 Review of Digital Signal Processing
2.2.1 Sampling
2.2.2 Shifted Unit Pulse: δ(n−k)
2.2.3 Representation of Any DT Signal
2.2.4 Introduction to Z Transforms
2.2.5 Fourier Transform, Discrete Fourier Transform
2.2.6 Digital Filter Structures
2.3 Review of Stochastic Signal Processing
2.3.1 Power Spectral Density
2.4 Response of a Linear System to a Stochastic Process Input
2.5 Windowing
2.6 AR Models for Speech Signals, Yule-Walker Equations
2.7 Short-Term Frequency (or Fourier) Transform and Cepstrum
2.7.1 Short-Term Frequency Transform (STFT)
2.7.2 The Cepstrum
2.8 Periodograms
2.9 Spectral Envelope Determination for Speech Signals
2.10 Voiced/Unvoiced Classification of Speech Signals
2.10.1 Time-Domain Methods
2.10.1.1 Periodic Similarity
2.10.1.2 Frame Energy
2.10.1.3 Pre-Emphasized Energy Ratio
2.10.1.4 Low- to Full-Band Energy Ratio
2.10.1.5 Zero Crossing
2.10.1.6 Prediction Gain
2.10.1.7 Peakiness of Speech
2.10.1.8 Spectrum Tilt
2.10.2 Frequency-Domain Methods
2.10.3 Voiced/Unvoiced Decision Making
2.11 Pitch Period Estimation Methods
2.12 Summary
Exercise Problems
References
Bibliography
3. Sampling Theory
3.1
4.10 ITU G.711 μ-Law and A-Law PCM Standards
4.10.1 Conversion between Linear and Companded Codes
4.10.1.1 Linear to μ-Law Conversion
4.10.1.2 μ-Law to Linear Code Conversion
4.10.1.3 Linear to A-Law Conversion
4.10.1.4 A-Law to Linear Conversion
4.11 Optimum Quantization
4.11.1 Closed Form Solution for the Optimum Companding Characteristics
4.11.2 Lloyd-Max Quantizer
4.12 Adaptive Quantization

International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, www.ijert.org, IJERTV4IS050982, Vol. 4 Issue 05, May-2015. This work is licensed under a Creative Commons Attribution 4.0 International License.