Academia.edu

Speech Compression

37 papers
58 followers
About this topic
Speech compression is a process that reduces the data rate of audio signals representing human speech, aiming to minimize bandwidth usage while preserving intelligibility and quality. It employs various algorithms and techniques to eliminate redundancy and irrelevant information in speech signals, facilitating efficient storage and transmission.

Key research themes

1. How can subspace and spectral subtraction methods improve low-bit-rate speech compression in noisy environments?

This research theme explores advanced preprocessing techniques aimed at enhancing the signal-to-noise ratio (SNR) of speech signals before compression under low-bit-rate conditions, particularly for applications like cellular communication. The focus lies on comparing and combining signal-subspace-based speech enhancement with spectral subtraction algorithms to mitigate additive noise effects and improve quality in bandwidth-constrained speech coding frameworks.

Key finding: This paper demonstrates that a signal-subspace-based speech enhancement algorithm outperforms conventional spectral-subtraction-based noise reduction methods in improving the perceptual quality of speech coded by the CELP...
Key finding: The study proposes a sinusoidal speech coder with a noise-resilient design that classifies frames into voiced/unvoiced segments to optimize parameter selection. The codec extracts spectral peaks in the frequency domain using...
Key finding: This work introduces a packet loss concealment method working directly on quantized sinusoidal speech parameters at 8 kbit/s, employing time-scaling of adjacent packets to compensate lost data in VoIP scenarios. The coded...

2. What are the benefits and challenges of wavelet transform-based speech compression methods?

Wavelet transform methods, particularly Discrete Wavelet Transform (DWT), have been widely studied for speech compression due to their ability to efficiently represent non-stationary signals by capturing both temporal and spectral properties. This research theme examines how DWT-based methods exploit multi-resolution analysis to achieve high compression ratios while preserving signal quality and how these approaches compare to traditional coding standards and other transforms, including practical implementation aspects and trade-offs.

Key finding: This paper highlights the effectiveness of DWT in achieving promising compression ratios for speech (2.31) and images (2.67) while maintaining high signal energy retention (~99.99%) and yielding high SNR and PSNR values. It...
Key finding: The research integrates discrete wavelet transform to compress stego-speech signals while preserving perceptual integrity. Optimization of wavelet selection, decomposition depth, and coefficient thresholding enables a balance...
Key finding: The paper implements single-level DWT-based speech compression on FPGA using VHDL, separating high and low frequency components and retaining approximation coefficients to reduce bit rate. It addresses practical hardware...
Key finding: This review compares multiple speech coders including traditional ITU standard codecs and wavelet-based coders, analyzing compression ratio, SNR, and mean opinion scores for English and Hindi speech. It highlights that...
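
The idea behind single-level DWT compression (split the signal into low- and high-frequency halves, keep only the approximation coefficients) can be illustrated with a minimal sketch. It assumes the Haar wavelet and plain Python lists; the papers listed here do not state which wavelet they use, and the function names `haar_dwt`, `haar_idwt`, `compress` and `decompress` are hypothetical, chosen only for this illustration.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: return (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal.append(signal[-1])
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the (possibly padded) signal."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


def compress(signal):
    """Keep only the approximation (low-frequency) coefficients."""
    approx, _ = haar_dwt(signal)
    return approx


def decompress(approx):
    """Reconstruct with all detail coefficients assumed to be zero."""
    return haar_idwt(approx, [0.0] * len(approx))
```

Discarding the detail coefficients halves the number of stored values; the reconstruction is exact only when neighbouring sample pairs are equal, so real speech loses some high-frequency content. A practical coder would typically threshold and quantize coefficients over several decomposition levels rather than drop one band outright.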

3. Can integration of speech recognition features into low-bit-rate compression enhance recognition accuracy and system efficiency?

This theme investigates methods to reconcile low-bit-rate speech compression with speech recognition performance by incorporating recognition-relevant features (e.g., MFCC) into the compression pipeline. The aim is to minimize recognition degradation commonly caused by traditional waveform compression, enabling direct recognition from compressed representations and reducing retraining needs, facilitating distributed speech recognition, and improving playback on devices with storage constraints.

Key finding: The RECOVC algorithm compresses speech by encoding Mel-Frequency Cepstral Coefficients (MFCC) and pitch period, enabling lossless recognition over low-bandwidth channels without degrading recognition accuracy in large...
Key finding: This work focuses on compressing transformer models used in end-to-end speech recognition by pruning and quantizing weights, significantly reducing model size (up to 84%) with minimal impact on accuracy. By optimizing model...
Key finding: Comparative evaluation of time-domain voice activity detection (VAD) algorithms for VoIP shows that efficient silence detection significantly reduces bandwidth by pruning non-speech frames without compromising toll-grade...
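
As a rough illustration of how time-domain VAD prunes non-speech frames, the sketch below drops frames whose short-time energy falls below a fixed threshold. It is a generic energy-threshold detector, not one of the algorithms evaluated in the paper above. The names `frame_energy` and `drop_silent_frames`, the 160-sample frame length (20 ms at an assumed 8 kHz sampling rate) and the threshold value are all assumptions made for this example.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def drop_silent_frames(samples, frame_len=160, threshold=1e-4):
    """Return (start_index, frame) pairs for frames judged to contain speech.

    Assumptions: samples are floats in [-1, 1]; frame_len=160 corresponds to
    20 ms at 8 kHz; threshold is an illustrative fixed value, whereas real
    detectors usually adapt it to the background noise level.
    """
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame and frame_energy(frame) >= threshold:
            kept.append((start, frame))
    return kept
```

Only the kept frames would be encoded and transmitted; the start indices let a receiver reinsert silence or comfort noise at the right positions.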

All papers in Speech Compression

In this paper we present a technique for improving the efficacy of a speech signal compression algorithm without losing the individual features of speech production. Compression in this case means deleting, from the digital signal, those... more
ED035315 - Time-Compressed Speech as an Educational Medium: Studies of Stimulus Characteristics and Individual Differences. Final Report.
The paper details speech compression using the discrete wavelet transform on an FPGA. In today's world multimedia files are widely used, the storage space required for these files is large, and sound files have no option so ultimate... more
Details of linear prediction analysis and line spectral frequencies, both used in speech coding.
Voice over Internet Protocol (VoIP) is a revolutionary technology which is acting as a platform for the development of latest trends in modern communication world. The speech signal quality in VoIP is governed by the speech coding... more
Speech processing is the fastest growing technology due to its applications in various fields such as research, forensics and aids for blind people. This paper describes speech processing techniques which involve improving the signal to... more
Database security comprises the mechanisms that secure the database against deliberate or accidental threats, unauthorized users, hackers and IP snoopers. In this paper we propose two mixed techniques to secure the database, i.e. one is... more
In the applications of Internet and wireless communication network, information security is one of the most challenging aspects. Cryptography is the best solution that offers the requisite protection from unintended persons. By using... more
Image restoration forms the foundation of various applications in the areas of medicine, astronomy etc. Historically an image is reproduced utilizing numerous techniques of which Fourier and wavelet transform systems developed from the... more
There is a fast-growing need to support audio voice communication over a wireless channel between a voice source and a personal computer (PC). In this paper we develop an algorithm using MATLAB software to indicate how a PC can receive audio... more
This article introduces a new technique for voice signals encryption & decryption to ameliorate the information security during transferring over unsecure network. The presented mechanism is based on a particular type of asymmetric key... more
The International Conference on Electronics and Communication Systems, 2 April 2014, University of Sofia, Bulgaria
In this paper, we studied the effects of voice codecs on remote speaker recognition system, considering three types of speech codec: PCM, DPCM and ADPCM conforming to International Telecommunications Union Telecoms (ITU-T) recommendation... more
Large-scale mobile communication systems tend to contain legacy transmission channels with narrowband bottlenecks, resulting in characteristic 'telephone-quality' audio. While higher quality codecs exist, due to the scale and... more
Today, in the age of technology, the use of digital visual systems is increasing at a tremendous rate for information, entertainment and education purposes; it has therefore become essential to reduce the cost of image transmission and storage as... more
The advancement of systems with the capacity to compress audio signals and simultaneously secure them is a highly attractive research subject. This is because of the need to enhance storage usage and speed up the transmission of data, as well... more
In this paper an audio coding scheme based on the Empirical Mode Decomposition (EMD) in association with the Hilbert transform is presented. The audio signal is decomposed adaptively into intrinsic oscillatory components by EMD called... more
Global System for Mobile Communications (GSM) is one of the most commonly used cellular technologies in the world. One of the objectives in mobile communication systems is the security of the exchanged data. GSM employs many cryptographic... more
The performance of audio steganography compression system using discrete wavelet transform (DWT) is investigated. Audio steganography coding is the technology of transforming stegospeech into efficiently encoded version that can be... more
One of the biggest problems in cryptography is the distribution of keys. Suppose you live in the United States and want to pass information secretly to your friend in Europe. If you truly want to keep the information secret, you need to... more
The aim of the paper is to provide security during transmission of the data. Commonly used technologies are cryptography, compression and decompression. This can be used for secure and fast sending for security purpose we... more
In this paper, a system for the purpose of signals encryption using the technique of independent component analysis has been proposed. The proposed system mixes the original signal with arbitrary number of random signals in order to... more
This paper presents a technique for image compression and hiding in image of a high secret has been applied to wavelet transform and wavelet transform packet first apply two dimensional wavelet transform packet on the cover image was... more
In this work, we present results on the effect of well-known mixed excitation linear prediction (MELP) and code-excited linear prediction (CELP) codecs (coder/decoder) on voicing and vocal tract parameters of Arabic sounds. The study... more
Voice over Internet Protocol (VoIP) is a popular and important internet protocol for real-time voice calling. It is used in several software applications such as Skype, WhatsApp, and Google Talk. However, communications over the internet... more
The growth of cellular technology and wireless networks all over the world has increased the demand for digital information manifold. This massive demand poses difficulties for handling huge amounts of data that need to be stored... more
Voice over Internet Protocol (VoIP) is a technology to carry voice calls over the internet. It is a technology to replace the traditional Public Switched Telephone Network (PSTN). VoIP is a growing technology. Its functions, facilities... more
Large-scale numerical simulations of high-intensity focused ultrasound (HIFU), important for model-based treatment planning, generate large amounts of data. Typically, it is necessary to save hundreds of gigabytes during simulation. We... more
This paper gives an idea about the importance of speech enhancement and the performance analysis of DWT and Kalman filter based speech enhancement techniques. The objectives of speech enhancement vary widely: reduction of noise level,... more
Speech compression is a mature technology with many applications. Over the past decade, huge advances have been made in the area of speech coding for reduced bit-rate transmission. With perceptual audio coding, the signal is coded... more
Speech coding deals with the problem of reducing the bit rate required for representing speech signals while preserving the quality of the speech reconstructed from that representation. In this paper, we propose a novel speech coding... more
Cryptography, the scheme of information stashing and verification, entirely deals with protocols, algorithms and strategies to ensure the precise security facility of the signal consistently by hindering unauthorized access to the... more
The purpose of this study is to describe a text-to-speech system for the Tigrigna language, using dialog fusion architecture and developing a prototype text-to-speech synthesizer for the Tigrigna language. Methods: The direct observation and... more
TI warrants performance of its semiconductor products and related software to the specifications applicable at the time of sale in accordance with TI's standard warranty. Testing and other quality control techniques are utilized to... more
Speech generation is one of the most important areas of research in speech signal processing and is now gaining serious attention. Speech is a natural form of communication in all living things. Computers with the ability to... more
Speech is an important biological signal, the primary mode of communication among human beings and also the most natural and efficient form of exchanging information among humans. Speech processing is the most important aspect in... more
As technology rapidly advances day by day, sharing of information over the internet is experiencing explosive growth, which in turn is also posing new threats and vulnerabilities in the existing systems. The quest for more... more
In this paper, we studied the effects of voice codecs on remote speaker recognition system, considering three types of speech codec: PCM, DPCM and ADPCM conforming to International Telecommunications Union-Telecoms (ITU-T) recommendation... more
The tremendous growth of digital data has led to a high necessity for compression applications, either to minimize memory usage or to increase transmission speed. Despite the fact that many techniques already exist, there is still space and need... more