Speech Signals Classification
2015
Abstract
Speech signals are the primary medium of direct transmitter-to-receiver human communication and fall in the category of acoustic signals. They are mechanical waves, represented as analog signals, that propagate as vibrations in the channel. The one property that distinguishes speech signals among acoustic signals is their origination by humans. These signals possess highly diverse features reflecting human development and culture. Hence, the applications of speech signals range from music and medicine to security, authentication, and beyond. The transmission of speech signals begins with generation, continues through propagation, and concludes with reception (figure 1.1). Unwanted signals (figure 1.2) from different sources mix with these signals and generate noise. Noise in digital communication can arise for various reasons, for example the finite precision of equipment, the use of local components, faulty lines, or improper coding. This type of noise is a different subj...
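The abstract above describes unwanted signals mixing with speech during transmission. As a minimal, hedged sketch (not from the paper), one common way to model that corruption is additive white Gaussian noise injected at a chosen signal-to-noise ratio; the sampling rate, the 10 dB SNR, and the Gaussian noise model below are illustrative assumptions.

```python
# Illustrative sketch: corrupt a clean signal with additive white Gaussian
# noise at a target SNR, mimicking the noise mixing described in the abstract.
import numpy as np

def add_noise(speech: np.ndarray, snr_db: float, rng=None) -> np.ndarray:
    """Return `speech` plus white Gaussian noise scaled to the given SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    speech_power = np.mean(speech ** 2)
    noise_power = speech_power / (10 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=speech.shape)
    return speech + noise

# Example with a synthetic 200 Hz tone standing in for a 50 ms speech frame.
fs = 8000                        # assumed sampling rate in Hz
t = np.arange(0, 0.05, 1 / fs)
clean = np.sin(2 * np.pi * 200 * t)
noisy = add_noise(clean, snr_db=10.0)
```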
Related papers
Towards Hybrid and Adaptive Computing, 2010
Principles of Speech Coding, 2010
2.1 Introduction to LTI Systems: 2.1.1 Linearity; 2.1.2 Time Invariance; 2.1.3 Representation Using Impulse Response; 2.1.4 Representation of Any Continuous-Time (CT) Signal; 2.1.5 Convolution; 2.1.6 Differential Equation Models
2.2 Review of Digital Signal Processing: 2.2.1 Sampling; 2.2.2 Shifted Unit Pulse δ(n − k); 2.2.3 Representation of Any DT Signal; 2.2.4 Introduction to Z Transforms; 2.2.5 Fourier Transform, Discrete Fourier Transform; 2.2.6 Digital Filter Structures
2.3 Review of Stochastic Signal Processing: 2.3.1 Power Spectral Density
2.4 Response of a Linear System to a Stochastic Process Input
2.5 Windowing
2.6 AR Models for Speech Signals, Yule-Walker Equations
2.7 Short-Term Frequency (or Fourier) Transform and Cepstrum: 2.7.1 Short-Term Frequency Transform (STFT); 2.7.2 The Cepstrum
2.8 Periodograms
2.9 Spectral Envelope Determination for Speech Signals
2.10 Voiced/Unvoiced Classification of Speech Signals: 2.10.1 Time-Domain Methods (2.10.1.1 Periodic Similarity; 2.10.1.2 Frame Energy; 2.10.1.3 Pre-Emphasized Energy Ratio; 2.10.1.4 Low- to Full-Band Energy Ratio; 2.10.1.5 Zero Crossing; 2.10.1.6 Prediction Gain; 2.10.1.7 Peakiness of Speech; 2.10.1.8 Spectrum Tilt); 2.10.2 Frequency-Domain Methods; 2.10.3 Voiced/Unvoiced Decision Making
2.11 Pitch Period Estimation Methods
2.12 Summary; Exercise Problems; References; Bibliography
3. Sampling Theory: 3.1 ...
4.10 ITU G.711 μ-Law and A-Law PCM Standards: 4.10.1 Conversion between Linear and Companded Codes (4.10.1.1 Linear to μ-Law Conversion; 4.10.1.2 μ-Law to Linear Code Conversion; 4.10.1.3 Linear to A-Law Conversion; 4.10.1.4 A-Law to Linear Conversion)
4.11 Optimum Quantization: 4.11.1 Closed-Form Solution for the Optimum Companding Characteristics; 4.11.2 Lloyd-Max Quantizer
4.12 Adaptive Quantization
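Section 4.10 of the contents above names the ITU-T G.711 μ-law and A-law companding standards. As a hedged illustration only, the sketch below implements the continuous μ-law compression and expansion characteristic; the actual G.711 codec uses a segmented 8-bit approximation of this curve, so this is a simplification rather than the standard's conversion tables.

```python
# Continuous mu-law companding curve (illustrative simplification of G.711).
import numpy as np

MU = 255.0  # mu value used by North American / Japanese PCM

def mu_law_compress(x: np.ndarray) -> np.ndarray:
    """Compress samples normalized to [-1, 1] with the mu-law characteristic."""
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mu_law_expand(y: np.ndarray) -> np.ndarray:
    """Invert mu_law_compress."""
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

x = np.linspace(-1.0, 1.0, 5)
assert np.allclose(mu_law_expand(mu_law_compress(x)), x)
```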
Speech recognition has lately become a practical technology. It is used in real-world human language applications such as information retrieval. It is among the most common means of communication because information plays a fundamental role in conversation. Speech recognition converts an acoustic signal, captured by a microphone or a telephone, into a set of words. That set of words can either be the final result or be rendered as text, which is what speech-to-text implies. In other words, speech recognition can serve as the input to further linguistic processing to achieve speech understanding. This paper analyses the types and algorithms of speech recognition.
1990
This thesis comprises a study of various types of signal processing techniques applied to the tasks of extracting information from speech, cough, and dolphin sounds. Established approaches to analysing speech sounds for the purposes of low-data-rate speech encoding, and more generally to determine the characteristics of the speech signal, are reviewed. Two new speech processing techniques, shift-and-add and CLEAN (which have previously been applied in the field of astronomical image processing), are developed and described in detail. Shift-and-add is shown to produce a representation of the long-term "average" characteristics of the speech signal. Under certain simplifying assumptions, this can be equated to the average glottal excitation. The iterative deconvolution technique called CLEAN is employed to deconvolve the shift-and-add signal from the speech signal. Because the resulting "CLEAN" signal has relatively few non-zero samples, it can be directly encoded at a low data rate. The performance of a low-data-rate speech encoding scheme that takes advantage of this attribute of CLEAN is examined in detail. Comparison with the multi-pulse LPC approach to speech coding shows that the new method provides similar levels of performance at medium data rates of about 16 kbit/s. The changes that occur in the character of a person's cough sounds when that person is afflicted with asthma are outlined. The development and implementation of a microcomputer-based cough sound analysis system, designed to facilitate the ongoing study of these sounds, is described. The system performs spectrographic analysis on the cough sounds. A graphical user interface allows the sound waveforms and spectra to be displayed and examined in detail. Preliminary results are presented which indicate that the spectral content of cough sounds is changed by asthma. An automated digital approach to studying the characteristics of Hector's dolphin vocalisations is described. This scheme characterises the sounds by extracting descriptive parameters from their time- and frequency-domain envelopes. The set of parameters so obtained from a sample of click sequences collected from free-ranging dolphins is analysed by principal component analysis. Results are presented which indicate that Hector's dolphins produce only a small number of different vocal sounds. In addition to the statistical analysis, several of the clicks, which are assumed to be used for echo-location, are analysed in terms of their range-velocity ambiguity functions. The results suggest that Hector's dolphins can distinguish targets separated in range by about 2 cm, but are unable to separate targets that differ only in their velocity.
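The shift-and-add technique summarised above aligns repeated waveform segments and averages them to expose a long-term "average" pulse. The sketch below is a minimal illustration of that idea, not the thesis's actual procedure: it assumes fixed-length frames and aligns each frame on its largest absolute peak rather than on pitch marks.

```python
# Illustrative shift-and-add: align frames on their dominant peaks and average.
import numpy as np

def shift_and_add(signal: np.ndarray, frame_len: int) -> np.ndarray:
    """Average fixed-length frames of `signal` after aligning each on its
    absolute maximum (a crude stand-in for pitch-synchronous alignment)."""
    n_frames = len(signal) // frame_len
    accum = np.zeros(frame_len)
    centre = frame_len // 2
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        peak = int(np.argmax(np.abs(frame)))
        # Circular shift so the peak sits at the frame centre; a zero-padded
        # linear shift would avoid wrap-around artefacts.
        accum += np.roll(frame, centre - peak)
    return accum / max(n_frames, 1)
```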
2014
Mapping information using pattern classifiers has become popular nowadays, even though there is no clear agreement on which classifiers should be used or how the results should be evaluated. This paper shows, through comparative analyses, how information maps in a multi-class situation, which provides information about the neural representation concerned. Speech signals generated by wireless devices may contain noise, and this noise must be separated from the signal. Linear and quadratic discriminant analysis can be used to separate noise from the speech signal. Logistic regression can also be used to obtain an accurate signal at the receiver end, since it estimates the probability of each class.
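To make the above concrete, here is a hedged sketch of the three classifiers mentioned (linear discriminant analysis, quadratic discriminant analysis, and logistic regression) applied to a toy speech-versus-noise frame classification task using scikit-learn. The two features (log frame energy and zero-crossing rate) and the synthetic data are illustrative assumptions, not the paper's experimental setup.

```python
# Toy speech-vs-noise frame classification with LDA, QDA, and logistic regression.
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def frame_features(frames: np.ndarray) -> np.ndarray:
    """Per-frame log energy and zero-crossing rate."""
    energy = np.log(np.mean(frames ** 2, axis=1) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.column_stack([energy, zcr])

# Synthetic data: "speech" frames are low-frequency tones, "noise" frames are
# white noise; 200 frames of 160 samples (20 ms at 8 kHz) per class.
t = np.arange(160) / 8000.0
speech = np.sin(2 * np.pi * 200 * t) * rng.uniform(0.5, 1.0, (200, 1))
noise = rng.normal(0.0, 0.3, (200, 160))
X = frame_features(np.vstack([speech, noise]))
y = np.array([1] * 200 + [0] * 200)  # 1 = speech, 0 = noise

for clf in (LinearDiscriminantAnalysis(),
            QuadraticDiscriminantAnalysis(),
            LogisticRegression()):
    clf.fit(X, y)
    print(type(clf).__name__, clf.score(X, y))
```

Logistic regression additionally exposes per-frame class probabilities via `predict_proba`, which matches the probabilistic decision the abstract alludes to.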
Proceedings of the IEEE, 2013
Annales Des Télécommunications, 1995
Abstract: In recent years, speech processing has undergone tremendous development, driven by technological advances in digital signal processing components and by the growing digitization of networks. This article provides an analysis of the main techniques that have recently become established in the fields of speech coding, speech recognition, and speech synthesis. In ...
The aim of this thesis work is to investigate algorithms for speech recognition. The author programmed and simulated the designed speech recognition systems in MATLAB. Two systems are designed in this thesis: one is based on the shape information of the cross-correlation plot, and the other uses the Wiener filter to perform the recognition. The MATLAB simulations of the programmed systems use a microphone to record spoken words. After the program is run, MATLAB asks the user to record words three times. The first and second recordings are different words, which serve as the reference signals in the designed systems; the third recording is the same word as one of the first two. After recording, the words become signal data that is sampled and stored in MATLAB, and MATLAB then judges which of the two reference words was spoken in the third recording, according to the programmed algorithms. The author invited people from different countries to test the designed systems. The simulation results show that both designed systems work well when the two reference recordings and the third recording come from the same person, but both systems have defects when the reference recordings and the third recording come from different people. However, if the testing environment is quiet enough and the same person makes all three recordings, the success probability of the speech recognition approaches 100%. Thus, the designed systems work well for basic speech recognition.
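A minimal sketch of the cross-correlation idea described above (this is illustrative, not the thesis's MATLAB code): the third recording is compared against the two reference recordings, and the reference giving the larger normalized correlation peak is selected.

```python
# Illustrative cross-correlation matching between a test word and two references.
import numpy as np

def peak_correlation(test: np.ndarray, ref: np.ndarray) -> float:
    """Largest normalized cross-correlation value between test and reference."""
    t = test - test.mean()
    r = ref - ref.mean()
    t /= np.linalg.norm(t) + 1e-12
    r /= np.linalg.norm(r) + 1e-12
    return float(np.max(np.correlate(t, r, mode="full")))

def recognize(test: np.ndarray, ref1: np.ndarray, ref2: np.ndarray) -> int:
    """Return 1 if the test word matches ref1 better, otherwise 2."""
    return 1 if peak_correlation(test, ref1) >= peak_correlation(test, ref2) else 2
```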
