E9261 Linear Prediction Coding - Line Spectral Frequencies_RamAG

Ramakrishnan Angarai Ganesan


2024, E9261:Speech Information Processing


Abstract

Details of linear prediction analysis and line spectral frequencies, both of which are used in speech coding.
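As an illustration of the two techniques named in the abstract, the following Python sketch estimates linear prediction coefficients with the Levinson-Durbin recursion and converts them to line spectral frequencies by locating the unit-circle roots of the sum and difference polynomials P(z) and Q(z). The function names, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the use of `numpy.roots` for root finding are assumptions made for this sketch; they are not taken from the lecture notes.

```python
import numpy as np

def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one (already windowed) frame."""
    x = np.asarray(frame, dtype=float)
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err) with a[0] = 1 and A(z) = sum_k a[k] z^-k,
    so the prediction error is e[n] = sum_k a[k] s[n-k].
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = float(r[0])
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_lsf(a, tol=1e-6):
    """Line spectral frequencies (radians, ascending) of a minimum-phase A(z)."""
    a_ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    p_poly = a_ext + a_ext[::-1]   # P(z) = A(z) + z^-(p+1) A(1/z)
    q_poly = a_ext - a_ext[::-1]   # Q(z) = A(z) - z^-(p+1) A(1/z)
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair; drop the trivial roots at z = +1 and z = -1.
    return np.sort(angles[(angles > tol) & (angles < np.pi - tol)])
```

A typical call would be `a, err = levinson_durbin(autocorr(frame, 10), 10)` followed by `lpc_to_lsf(a)`, where the order of 10 is only an example value. For a minimum-phase A(z) the result contains p frequencies in ascending order between 0 and π.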

Hynek Hermansky

Lecture Notes in Computer Science, 2006

In this paper we present first experimental results with a novel audio coding technique based on approximating Hilbert envelopes of relatively long segments of audio signal in critical-band-sized subbands by an autoregressive model. We exploit the generalized autocorrelation linear predictive technique, which allows better control over fitting the peaks and troughs of the envelope in the sub-band. Despite introducing longer algorithmic delay, improved coding efficiency is achieved. Since the described technique does not directly model short-term spectral envelopes of the signal, it is suitable not only for coding speech but also for coding other audio signals.


IJERT-Residual Excited Linear Predictive Coding

IJERT Journal

International Journal of Engineering Research and Technology (IJERT), 2015

https://www.ijert.org/residual-excited-linear-predictive-coding
https://www.ijert.org/research/residual-excited-linear-predictive-coding-IJERTV4IS050982.pdf

In this paper we present a low bit rate voice coding technique called residual-excited linear prediction (RELP) coding. It uses the 10th-order Levinson-Durbin recursive algorithm. It provides accurate estimates of speech parameters and is relatively efficient to compute. In the RELP system, vocal tract modeling is done by the LPC technique, and the LPC residual signal is used as the excitation signal. The transmission rate is reduced to 9.6 kbit/s, and the synthetic speech at this rate is quite good. As the transmission rate is lowered, the synthetic speech quality degrades very gradually. Since no pitch extraction is required, the coder is robust in any operating environment. Speech signals of male and female speakers were coded, and the results showed that the coding technique gives good speech quality with low complexity.


A two codebook format for robust quantization of line spectral frequencies

Ravindran Ramachandran

IEEE Transactions on Speech and Audio Processing, 1995

An important problem in speech coding is the quantization of linear predictive coefficients (LPC) with the smallest possible number of bits while maintaining robustness to a large variety of speech material and transmission media. Since direct quantization of LPC's is known to be unsatisfactory, we consider this problem for an equivalent representation, namely, the line spectral frequencies (LSF). To achieve an acceptable level of distortion a scalar quantizer for LSF's requires a 36 bit codebook. We derive a 30 bit two-quantizer scheme which achieves a performance equivalent to this scalar quantizer. This equivalence is verified by tests on data taken from various types of filtered speech, speech corrupted by noise and by a set of randomly generated LSF's. The two-quantizer format consists of both a vector and a scalar quantizer such that for each input, the better quantizer is used. The vector quantizer is designed from a training set that reflects the joint density (for coding efficiency) and which ensures coverage (for robustness). The scalar quantizer plays a pivotal role in dealing better with regions of the space that are sparsely covered by its vector quantizer counterpart. A further reduction of 1 bit is obtained by formulating a new adaptation algorithm for the vector quantizer and doing a dynamic programming search for both quantizers. The method of adaptation takes advantage of the ordering of the LSF's and imposes no overhead in memory requirements. The dynamic programming search is feasible due to the ordering property. Subjective tests in a speech coder reveal that the 29 bit scheme produces equivalent perceptual quality to that when the parameters are unquantized.


Linear prediction techniques in the Walsh spectral domain for speech analysis and synthesis

Anthony Constantinides

ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1978

A method for LPC analysis in a transformed domain (LPCTD) has been developed theoretically and studied experimentally in the Walsh-Hadamard domain (LPCWHD) for low-bit-rate coding of speech signals. Speech signals in the Walsh-Hadamard domain have been modelled by their largest variance coefficients and a few prediction coefficients which represent the remaining coefficients. Determination of the prediction coefficients has been based on the correlation between the spectral coefficients. Intelligible speech at bit-rates of 8 kb/s and 2+ kb/s was achieved when 16 and 62+ point Walsh-Hadamard transforms were used, respectively. At the latter bit-rate the quality was significantly improved when unvoiced sounds were coded separately by their largest variance coefficients. The main advantage of the LPCWHD system is its simplicity, which can lead to a far less complex implementation than that of vocoder systems.


Predictive coding of speech signals and subjective error criteria

Bishnu Atal

ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1978


Pitch synchronous innovation code excited linear prediction (PSI-CELP)

Takehiro Moriya

Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 1994

This paper proposes a new speech coding method, pitch synchronous innovation code excited linear prediction (PSI-CELP). This method is based on CELP but adds pitch synchronous innovation. As a result, even random codevectors are adaptively converted to have pitch periodicity for voiced frames. This scheme can improve the synthesized speech quality of voiced frames in low bit-rate CELP without increasing either computational complexity or bit rate.


Code-excited linear prediction (CELP): High-quality speech at very low bit rates

Arun Raj

1985

We describe in this paper a code-excited linear predictive coder in which the optimum innovation sequence is selected from a code book of stored sequences to optimize a given fidelity criterion. Each sample of the innovation sequence is filtered sequentially through two time-varying linear recursive filters, one with a long-delay (related to pitch period) predictor in the feedback loop and the other with a short-delay predictor (related to spectral envelope) in the feedback loop. We code speech, sampled at 8 kHz, in blocks of 5-msec duration. Each block consisting of 40 samples is produced from one of 1024 possible innovation sequences. The bit rate for the innovation sequence is thus 1/4 bit per sample. We compare in this paper several different random and deterministic code books for their effectiveness in providing the optimum innovation sequence in each block. Our results indicate that a random code book has a slight speech quality advantage at low bit rates. Examples of speech produced by the above method will be played at the conference.
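The bit-rate figure quoted in this abstract follows directly from the block size and the codebook size. The short Python sketch below reproduces that arithmetic; the variable names are chosen here for illustration only, and the calculation covers the innovation-sequence index alone, not any other coder parameters.

```python
import math

sampling_rate_hz = 8000      # speech sampled at 8 kHz (from the abstract)
block_duration_ms = 5        # blocks of 5-msec duration (from the abstract)
codebook_size = 1024         # possible innovation sequences (from the abstract)

# Number of samples in one block.
samples_per_block = sampling_rate_hz * block_duration_ms // 1000

# Bits needed to index one codebook entry.
bits_per_block = math.log2(codebook_size)

# Bits spent on the innovation sequence per speech sample.
bits_per_sample = bits_per_block / samples_per_block

# Resulting rate for the innovation index alone, in bits per second.
innovation_rate_bps = bits_per_sample * sampling_rate_hz

print(samples_per_block, bits_per_block, bits_per_sample, innovation_rate_bps)
```

With these values, each 40-sample block costs 10 bits, which gives the 1/4 bit per sample stated in the abstract.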


The Effect of the Spoken Language on the Linear Prediction Vector Quantization Distortion for Linear Prediction Coders

michael nasief

Speech coding is the process of converting a voice signal into digital form using as few bits as possible. The newly developed Code Excited Linear Prediction (CELP) coders are one major type of parametric coder, combining low data rates with good speech quality. Most of these coders were initially built for 7 languages, not including Arabic or its dialects. It is known that speech quality is directly proportional to the data rate, but what is the effect of changing the spoken language or accent? This paper sets out to answer three main questions. First, what is the effect of the language or accent on CELP coders? Second, what happens if the speech is compressed further by a lower data rate coder while the language is other than English? Finally, if there is an effect, which part of the coder is at fault? Extensive testing is done on 3 coders, ITU G.711, ITU G.723.1 and 3GPP AMR, with changing the spoken l...


Linear prediction analysis/synthesis & noise cancellation techniques in speech signals

Surinder Dhanjal

1980


A mixed sinusoidally excited linear prediction coder at 4 kb/s and below

Vishu Viswanathan

There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high quality speech, accurate representation of the voiced portions of speech is essential. For bit rates of 4 kb/s and below, conventional code excited linear prediction (CELP) may likely not provide the appropriate degree of periodicity. It has been shown that good quality low bit rate speech coding can be obtained by frequency domain techniques such as sinusoidal transform coding (STC), multi-band excitation (MBE), mixed excitation linear prediction (MELP), and multi-band LPC (MB-LPC) vocoders. In this paper, a speech coding algorithm based on an improved version of MB-LPC is presented. Main features of this algorithm include a multi-stage time/frequency pitch estimation and an improved mixed voicing representation. An efficient quantization scheme for the spectral amplitudes of the excitation, called formant weighted ...


Related papers

Technology Linear Predictive Coding

Sanjay gulhane

2015

Linear predictive coding (LPC) is defined as a digital method for encoding an analog signal in which a particular value is predicted by a linear function of the past values of the signal. It was first proposed as a method for encoding human speech by the United States Department of Defense in Federal Standard 1015, published in 1984. Human speech is produced in the vocal tract, which can be approximated as a variable-diameter tube. The LPC model is based on a mathematical approximation of the vocal tract represented by this tube of varying diameter. At a particular time, t, the speech sample s(t) is represented as a linear sum of the p previous samples. The most important aspect of LPC is the linear predictive filter, which allows the value of the next sample to be determined by a linear combination of previous samples. Under normal circumstances, speech is sampled at 8000 samples/second with 8 bits used to represent each sample. This provides a rate of 64...
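The idea of predicting a sample as a linear sum of the p previous samples can be sketched in a few lines of Python. The function name `predict` and the example coefficients are hypothetical and chosen only for illustration; here `coeffs[k-1]` multiplies s[n-k], so the coefficients are those of the predictor itself rather than of the error filter.

```python
import numpy as np

def predict(s, coeffs):
    """Predict each sample from the p previous ones: s_hat[n] = sum_k coeffs[k-1] * s[n-k].

    The first p outputs are left at zero because no full history exists yet.
    """
    s = np.asarray(s, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    p = len(coeffs)
    s_hat = np.zeros(len(s))
    for n in range(p, len(s)):
        past = s[n - p:n][::-1]          # s[n-1], s[n-2], ..., s[n-p]
        s_hat[n] = np.dot(coeffs, past)
    return s_hat

# Example: a signal that obeys an exact second-order recursion is predicted
# without error by the same two (hypothetical) coefficients.
coeffs = np.array([1.5, -0.7])
s = np.zeros(50)
s[0], s[1] = 1.0, 1.0
for n in range(2, len(s)):
    s[n] = coeffs[0] * s[n - 1] + coeffs[1] * s[n - 2]

residual = s - predict(s, coeffs)
```

For real speech the residual is not zero; it is the part of the signal the predictor cannot explain, and coders differ mainly in how they represent it.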


CODE EXCITED LINEAR PREDICTION FOR SPEECH COMPRESSION

Rahulkrishnan Chandrasekharan

Codebook Excited Linear Prediction (CELP) is one of the most widely used classes of speech coders and is based on the concept of LPC. The enhancement is that a codebook of different excitation signals is maintained at both the encoder and the decoder. The encoder finds the most suitable excitation signal and sends its index to the decoder, which then uses it to reproduce the signal. Hence the name Codebook Excited is given to this coder. Many variants of CELP are in use in various applications. Low Delay CELP (LD-CELP) and Algebraic CELP (ACELP) are generally used in internet voice calls and cell phones.
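The codebook search described above can be sketched as follows, assuming a shared random codebook, an all-pole synthesis filter 1/A(z), and a plain squared-error criterion. The function names, the filter coefficients, and the codebook dimensions are hypothetical; practical CELP coders also apply gain terms, a pitch predictor, and perceptual weighting, all of which this sketch omits.

```python
import numpy as np

def synthesize(excitation, a):
    """Run an excitation through 1/A(z), with A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p."""
    y = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def search_codebook(target, codebook, a):
    """Return the index of the codebook entry whose synthesized output is closest to target."""
    errors = [np.sum((target - synthesize(entry, a)) ** 2) for entry in codebook]
    return int(np.argmin(errors))

# Encoder and decoder hold the same codebook (seeded here so both sides agree).
rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 40))   # 16 entries of 40 samples (example sizes)
a = np.array([1.0, -0.9])                  # hypothetical first-order A(z)

# Encoder side: pick the best entry and transmit only its index.
target = synthesize(codebook[7], a)
index = search_codebook(target, codebook, a)

# Decoder side: look up the same entry and resynthesize.
decoded = synthesize(codebook[index], a)
```

Only `index` needs to be sent for the excitation, which is why the bit cost depends on the codebook size rather than on the block length.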


Principles of Speech Coding

SIm NARASIMHA

Principles of Speech Coding, 2010

2.1 Introduction to LTI Systems
2.1.1 Linearity
2.1.2 Time Invariance
2.1.3 Representation Using Impulse Response
2.1.4 Representation of Any Continuous-Time (CT) Signal
2.1.5 Convolution
2.1.6 Differential Equation Models
2.2 Review of Digital Signal Processing
2.2.1 Sampling
2.2.2 Shifted Unit Pulse: δ(n-k)
2.2.3 Representation of Any DT Signal
2.2.4 Introduction to Z Transforms
2.2.5 Fourier Transform, Discrete Fourier Transform
2.2.6 Digital Filter Structures
2.3 Review of Stochastic Signal Processing
2.3.1 Power Spectral Density
2.4 Response of a Linear System to a Stochastic Process Input
2.5 Windowing
2.6 AR Models for Speech Signals, Yule-Walker Equations
2.7 Short-Term Frequency (or Fourier) Transform and Cepstrum
2.7.1 Short-Term Frequency Transform (STFT)
2.7.2 The Cepstrum
2.8 Periodograms
2.9 Spectral Envelope Determination for Speech Signals
2.10 Voiced/Unvoiced Classification of Speech Signals
2.10.1 Time-Domain Methods
2.10.1.1 Periodic Similarity
2.10.1.2 Frame Energy
2.10.1.3 Pre-Emphasized Energy Ratio
2.10.1.4 Low- to Full-Band Energy Ratio
2.10.1.5 Zero Crossing
2.10.1.6 Prediction Gain
2.10.1.7 Peakiness of Speech
2.10.1.8 Spectrum Tilt
2.10.2 Frequency-Domain Methods
2.10.3 Voiced/Unvoiced Decision Making
2.11 Pitch Period Estimation Methods
2.12 Summary
Exercise Problems
References
Bibliography
3. Sampling Theory
3.1
4.10 ITU G.711 μ-Law and A-Law PCM Standards
4.10.1 Conversion between Linear and Companded Codes
4.10.1.1 Linear to μ-Law Conversion
4.10.1.2 μ-Law to Linear Code Conversion
4.10.1.3 Linear to A-Law Conversion
4.10.1.4 A-Law to Linear Conversion
4.11 Optimum Quantization
4.11.1 Closed Form Solution for the Optimum Companding Characteristics
4.11.2 Lloyd-Max Quantizer
4.12 Adaptive Quantization


Springer Handbook of Speech Processing

Jacob Benesty

The Journal of the Acoustical Society of America, 2009


LINEAR PREDICTIVE CODING

IJESRT Journal


IRJET- ANALYSIS OF CEPSTRAL AND LINEAR PREDICTION IN VARIOUS SPEECH SIGNAL

IRJET Journal

IRJET, 2021

This project analyzes the voiced, unvoiced, and silence regions of speech from their time-domain and frequency-domain representations. The work makes use of the redundancy within the speech signal.


Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain

Hynek Hermansky

2008

Audio coding based on Frequency Domain Linear Prediction (FDLP) uses autoregressive models to approximate Hilbert envelopes in frequency sub-bands. Although the basic technique achieves good coding efficiency, there is a need to improve the reconstructed signal quality for tonal signals with impulsive spectral content. For such signals, the quantization noise in the FDLP codec appears as frequency components not present in the input signal. In this paper, we propose a technique of Spectral Noise Shaping (SNS) for improving the quality of tonal signals by applying a Time Domain Linear Prediction (TDLP) filter prior to the FDLP processing. The inverse TDLP filter at the decoder shapes the quantization noise to reduce the artifacts. Application of the SNS technique to the FDLP codec improves the quality of the tonal signals without affecting the bit-rate. Performance evaluation is done with Perceptual Evaluation of Audio Quality (PEAQ) scores and with subjective listening tests.


LSP calculation methods for application to speech coding

Fausto Pellandini

1999

Line spectrum pair (LSP) parameters are commonly used in speech coding for quantization of the speech spectral envelope. Unfortunately, the high computational complexity in the calculation of the LSP is a drawback for both real-time implementation and application in low-power portable devices. In this paper, some techniques for reducing computational complexity of the LSP calculation are given. The use of these techniques results in three novel LSP calculation algorithms which are explained and evaluated from the point of view of accuracy and computational complexity.


Intraframe quantization of speech line spectrum pairs for code-excited linear prediction based coders in packet networks

Fatiha Merazka

Transactions on Emerging Telecommunications Technologies, 2012

This paper proposes intraframe quantization schemes for line spectrum pair parameters to improve robustness to frame erasures in the ITU-T G.723.1 coder. The standard ITU-T G.723.1 coder uses interframe quantization of the line spectrum pair parameters, which causes errors to propagate to subsequent frames. Simulation results show that our intraframe quantization schemes are much more robust to frame erasures than the embedded method in ITU-T G.723.1, and a typical improvement of 0.4 dB in average spectral distortion can be obtained with 40% packet loss. Enhanced modified bark spectral distortion tests, under various packet loss conditions, confirm that our proposed method is superior to the interframe algorithm embedded in the ITU-T G.723.1 coder of over 1.6.


Speech Analysis and Synthesis by Linear Prediction of the Speech Wave

Bishnu Atal

The Journal of the Acoustical Society of America, 1970

We describe a procedure for efficient encoding of the speech wave by representing it in terms of time-varying parameters related to the transfer function of the vocal tract and the characteristics of the excitation. The speech wave, sampled at 10 kHz, is analyzed by predicting the present speech sample as a linear combination of the 12 previous samples. The 12 predictor coefficients are determined by minimizing the mean-squared error between the actual and the predicted values of the speech samples. Fifteen parameters, namely, the 12 predictor coefficients, the pitch period, a binary parameter indicating whether the speech is voiced or unvoiced, and the rms value of the speech samples, are derived by analysis of the speech wave, encoded and transmitted to the synthesizer. The speech wave is synthesized as the output of a linear recursive filter excited by either a sequence of quasiperiodic pulses or a white-noise source. Application of this method for efficient transmission and storage of speech signals as well as procedures for determining other speech characteristics, such as formant frequencies and bandwidths, the spectral envelope, and the autocorrelation function, are discussed.
