E9261 Linear Prediction Coding - Line Spectral Frequencies_RamAG
2024, E9261:Speech Information Processing
Abstract
Details of linear prediction analysis and line spectral frequencies, both used in speech coding.
Related papers
Lecture Notes in Computer Science, 2006
In this paper we present the first experimental results with a novel audio coding technique based on approximating Hilbert envelopes of relatively long segments of audio signal in critical-band-sized subbands by an autoregressive model. We exploit the generalized autocorrelation linear predictive technique, which allows better control over fitting the peaks and troughs of the envelope in the sub-band. Despite introducing a longer algorithmic delay, improved coding efficiency is achieved. Since the described technique does not directly model short-term spectral envelopes of the signal, it is suitable not only for coding speech but also for coding other audio signals.
International Journal of Engineering Research and Technology (IJERT), 2015
https://www.ijert.org/residual-excited-linear-predictive-coding https://www.ijert.org/research/residual-excited-linear-predictive-coding-IJERTV4IS050982.pdf In this paper we present a low-bit-rate voice coding technique called residual-excited linear prediction (RELP) coding. It uses a 10th-order Levinson-Durbin recursion, which provides accurate estimates of the speech parameters and is relatively efficient to compute. In the RELP system, vocal tract modeling is done by the LPC technique, and the LPC residual signal is used as the excitation signal. The transmission rate is reduced to 9.6 kbit/s, and the synthetic speech at this rate is quite good. As the transmission rate is lowered further, the synthetic speech quality degrades very gradually. Since no pitch extraction is required, the coder is robust in any operating environment. Speech signals from male and female speakers were coded, and the results showed that the coding technique gives good speech quality with low complexity.
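The Levinson-Durbin recursion and the residual-as-excitation idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: the function names (`levinson_durbin`, `lpc_residual`, `synthesize`), the sign convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, and the toy inputs are all assumptions made for the example.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err), where a[0] = 1 and a[1..order] are the coefficients of
    the prediction-error filter A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p
    (sign convention assumed here), and err is the final prediction error.
    """
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        # Correlation of the current predictor with the next lag.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_residual(x, a):
    """Filter x through A(z) to obtain the LPC residual (the RELP excitation)."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
            for n in range(len(x))]


def synthesize(e, a):
    """Filter the excitation e through 1/A(z) to rebuild the signal."""
    p = len(a) - 1
    y = []
    for n in range(len(e)):
        y.append(e[n] - sum(a[k] * y[n - k] for k in range(1, p + 1) if n - k >= 0))
    return y


# Toy check: the autocorrelation r[k] = 0.5**k belongs to a first-order
# autoregressive process, so a second-order fit needs only one nonzero tap.
r = [1.0, 0.5, 0.25]
a, err = levinson_durbin(r, order=2)

# With an unquantized residual, synthesis undoes analysis exactly.
x = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
y = synthesize(lpc_residual(x, a), a)
```

A real coder would use order 10 as the abstract states, compute `r` from windowed speech frames, and quantize the residual before transmission; the round trip is then only approximate.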
IEEE Transactions on Speech and Audio Processing, 1995
An important problem in speech coding is the quantization of linear predictive coefficients (LPC) with the smallest possible number of bits while maintaining robustness to a large variety of speech material and transmission media. Since direct quantization of LPC's is known to be unsatisfactory, we consider this problem for an equivalent representation, namely, the line spectral frequencies (LSF). To achieve an acceptable level of distortion a scalar quantizer for LSF's requires a 36 bit codebook. We derive a 30 bit two-quantizer scheme which achieves a performance equivalent to this scalar quantizer. This equivalence is verified by tests on data taken from various types of filtered speech, speech corrupted by noise and by a set of randomly generated LSF's. The two-quantizer format consists of both a vector and a scalar quantizer such that for each input, the better quantizer is used. The vector quantizer is designed from a training set that reflects the joint density (for coding efficiency) and which ensures coverage (for robustness). The scalar quantizer plays a pivotal role in dealing better with regions of the space that are sparsely covered by its vector quantizer counterpart. A further reduction of 1 bit is obtained by formulating a new adaptation algorithm for the vector quantizer and doing a dynamic programming search for both quantizers. The method of adaptation takes advantage of the ordering of the LSF's and imposes no overhead in memory requirements. The dynamic programming search is feasible due to the ordering property. Subjective tests in a speech coder reveal that the 29 bit scheme produces equivalent perceptual quality to that when the parameters are unquantized.
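The ordering property of the LSFs that this abstract relies on can be illustrated with a small sketch. It assumes NumPy, an even prediction order, and the common construction P(z) = A(z) + z^-(p+1) A(z^-1), Q(z) = A(z) - z^-(p+1) A(z^-1); the function names and the example filter are hypothetical and are not taken from the cited paper.

```python
import numpy as np


def lpc_to_lsf(a):
    """Convert A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p into LSFs in radians.

    The LSFs are the angles in (0, pi) of the unit-circle roots of the
    symmetric polynomial P(z) and the antisymmetric polynomial Q(z).
    """
    a_ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    p_poly = a_ext + a_ext[::-1]
    q_poly = a_ext - a_ext[::-1]
    angles = np.concatenate([np.angle(np.roots(p_poly)),
                             np.angle(np.roots(q_poly))])
    eps = 1e-6  # drops the trivial roots at z = +1 and z = -1
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])


def lsf_to_lpc(lsf):
    """Inverse mapping, assuming an even prediction order."""
    p_poly = np.array([1.0, 1.0])    # trivial root of P(z) at z = -1
    q_poly = np.array([1.0, -1.0])   # trivial root of Q(z) at z = +1
    for w in lsf[0::2]:
        p_poly = np.convolve(p_poly, [1.0, -2.0 * np.cos(w), 1.0])
    for w in lsf[1::2]:
        q_poly = np.convolve(q_poly, [1.0, -2.0 * np.cos(w), 1.0])
    return (0.5 * (p_poly + q_poly))[:-1]


def is_ordered(lsf):
    """True when 0 < w_1 < w_2 < ... < w_p < pi."""
    lsf = np.asarray(lsf, dtype=float)
    return bool(np.all(lsf > 0) and np.all(lsf < np.pi) and np.all(np.diff(lsf) > 0))


# Hypothetical stable second-order filter used only for illustration.
a = [1.0, -1.2, 0.5]
lsf = lpc_to_lsf(a)
a_back = lsf_to_lpc(lsf)
```

For a minimum-phase A(z) the roots of P(z) and Q(z) interlace on the unit circle, which is why a quantizer only needs to preserve the ascending order of the LSFs to keep the synthesis filter stable.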
ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1978
A method for LPC analysis in a transformed domain (LPCTD) has been developed theoretically and studied experimentally in the Walsh-Hadamard domain (LPCWHD) for low-bit-rate coding of speech signals. Speech signals in the Walsh-Hadamard domain have been modelled by their largest variance coefficients and a few prediction coefficients which represent the remaining coefficients. Determination of the prediction coefficients has been based on the correlation between the spectral coefficients. Intelligible speech at bit-rates of 8 kb/s and 4 kb/s was achieved when 16 and 64 point Walsh-Hadamard transforms were used, respectively. At the latter bit-rate the quality was significantly improved when unvoiced sounds were coded separately by their largest variance coefficients. The main advantage of the LPCWHD system is its simplicity, which can lead to a far less complex implementation than that of vocoder systems.
ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1978
Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 1994
This paper proposes a new speech coding method, the pitch synchronous innovation code excited linear predictor (PSI-CELP). The method is based on CELP but adds pitch synchronous innovation, so that even random codevectors are adaptively converted to have pitch periodicity for voiced frames. This scheme can improve the synthesized speech quality of voiced frames in low-bit-rate CELP without increasing either computational complexity or bit rate.
1985
We describe in this paper a code-excited linear predictive coder in which the optimum innovation sequence is selected from a code book of stored sequences to optimize a given fidelity criterion. Each sample of the innovation sequence is filtered sequentially through two time-varying linear recursive filters, one with a long-delay (related to pitch period) predictor in the feedback loop and the other with a short-delay predictor (related to spectral envelope) in the feedback loop. We code speech, sampled at 8 kHz, in blocks of 5-msec duration. Each block consisting of 40 samples is produced from one of 1024 possible innovation sequences. The bit rate for the innovation sequence is thus 1/4 bit per sample. We compare in this paper several different random and deterministic code books for their effectiveness in providing the optimum innovation sequence in each block. Our results indicate that a random code book has a slight speech quality advantage at low bit rates. Examples of speech produced by the above method will be played at the conference.
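The bit-rate figure quoted in this abstract follows from simple arithmetic, sketched below. The variable names are illustrative; the resulting 2 kb/s covers only the innovation-sequence index and is derived here from the stated numbers rather than quoted from the paper.

```python
import math

sample_rate_hz = 8000   # sampling rate stated in the abstract
block_ms = 5            # block duration stated in the abstract
codebook_size = 1024    # number of candidate innovation sequences

samples_per_block = sample_rate_hz * block_ms // 1000    # 40 samples
bits_per_block = math.log2(codebook_size)                # 10 bits to index one sequence
bits_per_sample = bits_per_block / samples_per_block     # 0.25 bit per sample
innovation_bit_rate = bits_per_sample * sample_rate_hz   # 2000 bit/s for the innovation

print(samples_per_block, bits_per_block, bits_per_sample, innovation_bit_rate)
```

The filter parameters for the long-delay and short-delay predictors need additional bits, so the total coder rate is higher than this figure.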
Speech coding is the process of converting a voice signal into digital form using as few bits as possible. The recently developed Code Excited Linear Prediction (CELP) coders are one major type of parametric coder, combining low data rates with good speech quality. Most of these coders were initially built for 7 languages that do not include Arabic or its dialects. It is known that speech quality is directly proportional to the data rate, but what is the effect of changing the spoken language or accent? This paper sets out to answer three main questions. First, what is the effect of language or accent on CELP coders? Second, what happens if the speech is compressed further by a lower-data-rate coder while the language is other than English? Finally, if there is an effect, which part of the coder is at fault? Extensive testing is done on 3 coders ITU G.711, ITU G.723.1 and 3GPP AMR with changing the spoken l...
1980
There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high-quality speech, accurate representation of the voiced portions of speech is essential. At bit rates of 4 kb/s and below, conventional code excited linear prediction (CELP) may not provide the appropriate degree of periodicity. It has been shown that good-quality low-bit-rate speech coding can be obtained by frequency domain techniques such as sinusoidal transform coding (STC), multi-band excitation (MBE), mixed excitation linear prediction (MELP), and multi-band LPC (MB-LPC) vocoders. In this paper, a speech coding algorithm based on an improved version of MB-LPC is presented. The main features of this algorithm include a multi-stage time/frequency pitch estimation and an improved mixed voicing representation. An efficient quantization scheme for the spectral amplitudes of the excitation, called formant weighted ...
