Speaker Independent Urdu Speech Recognition Using HMM

Ashraf, Javed; Iqbal, Naveed; Khattak, Naveed Sarfraz; Zaidi, Ather Mohsin

doi:10.1007/978-3-642-13881-2_14

Natural Language Processing

Naveed Khattak

2010, Lecture Notes in Computer Science

https://doi.org/10.1007/978-3-642-13881-2_14

Abstract

Automatic Speech Recognition (ASR) is one of the advanced fields of Natural Language Processing (NLP). The recent past has witnessed valuable research activity in ASR for English, European and East Asian languages. Unfortunately, South Asian languages in general, and Urdu in particular, have received very little attention. In this paper we present an approach to developing an ASR system for the Urdu language. The proposed system is based on an open source speech recognition framework called Sphinx4, which uses a statistical approach (HMM: Hidden Markov Model) for developing ASR systems. We present a speaker independent ASR system for a small vocabulary, i.e. the fifty-two most frequently spoken isolated Urdu words, and suggest that this research work will form the basis for developing medium and large vocabulary Urdu speech recognition systems.

Agha Ali Raza

2009 Oriental COCOSDA International Conference on Speech Database and Assessments, 2009

Center for Research in Urdu Language Processing (CRULP; www.crulp.org) at NUCES is currently working on a project entitled Telephone-based Speech Interfaces for Access to Information by Non-literate Users in collaboration with Carnegie Mellon University. The goal of this project is to investigate the use of speech interfaces for users to access online health related information in Pakistan. This will be achieved by developing a telephone based dialogue system consisting of an Urdu Speech Recognition system and a Text to Speech system that can interact with health workers to answer their queries. One key component of this system is a Large Vocabulary Automatic Speech Recognition (LVASR) system for Urdu. This system requires the construction of a phonetically rich and balanced corpus for recognition of continuous and spontaneous speech in Urdu. Once the training corpus is recorded, it has to be labeled. The system will be based on Hidden Markov Models, using the Sphinx 3 [3] trainer and the Sphinx 4 ([4], [5]) decoder. This paper describes the process employed in the design and development of the phonetically rich Urdu speech corpus, the initial step in the development of the Urdu LVASR. The next section briefly reviews similar work done for other languages and the phonetic characteristics of Urdu. Sections 3 and 4 and 6 describe

A Speech Recognition System for Urdu Language

Azam Beg

Communications in Computer and Information Science, 2009

This paper presents a speech processing and recognition system for individually spoken Urdu words. The speech feature extraction was based on a dataset of 150 different samples collected from 15 different speakers. The data was pre-processed using normalization and transformed into the frequency domain using the discrete Fourier transform. The feed-forward neural models for speech recognition were developed in MATLAB. The models exhibited reasonably high training and testing accuracies. Details of the MATLAB implementation are included in the paper for use by other researchers in this field. Our ongoing work involves the use of linear predictive coding and cepstrum analysis for alternative neural models. Potential applications of the proposed system include telecommunications, multi-media, and voice-activated tele-customer services.

Speaker Dependent and Independent Isolated Hindi Word Recognizer using Hidden Markov Model (HMM)

Ishan Bhardwaj

Hindi is a very complex language with a large number of phonemes, and it is spoken with various accents in different regions of India. In this manuscript, speaker dependent and speaker independent isolated Hindi word recognizers using the Hidden Markov Model (HMM) are implemented under a noisy environment. For this study, a set of 10 Hindi names has been chosen as a test set on which training and testing are performed. The scheme uses Mel Frequency Cepstral Coefficients (MFCC) to compute the acoustic features of the speech signal. The K-means algorithm is then used for codebook generation by clustering the obtained feature space. The Baum-Welch algorithm is used for re-estimating the parameters, and finally the Viterbi algorithm is applied to each HMM so that the Hindi word whose model likelihood is highest is chosen as the recognized word. This work resulted in a 98.6% recognition rate for speaker dependent recognition, for a total of 10 speakers (6 male, 4 female), and 97.5% for the speaker independent isolated word recognizer for 10 speakers (male).
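
The final decision step described above (score the utterance against one HMM per word and pick the word whose model likelihood is highest) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two toy word models, their probabilities, and the names `viterbi_log_score`, `recognize` and `MODELS` are all hypothetical, and the observation sequence is assumed to be codebook indices already produced by MFCC extraction and K-means quantization.

```python
import math


def _log(p):
    # log(0) is treated as minus infinity so impossible paths never win.
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path through a discrete HMM.

    obs   : list of codebook indices (one per frame)
    start : start[s]     = P(first state is s)
    trans : trans[p][s]  = P(next state is s | current state is p)
    emit  : emit[s][o]   = P(codebook symbol o | state s)
    """
    n = len(start)
    scores = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        scores = [
            max(scores[p] + _log(trans[p][s]) for p in range(n)) + _log(emit[s][o])
            for s in range(n)
        ]
    return max(scores)


def recognize(obs, models):
    """Return the word whose HMM gives the highest Viterbi score for obs."""
    return max(models, key=lambda word: viterbi_log_score(obs, *models[word]))


# Hypothetical two-state left-to-right models over a two-symbol codebook.
# In a real system these parameters would come from Baum-Welch training.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.2, 0.8]]),
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([1, 1, 0, 0], MODELS))
```

Working in log-probabilities avoids numerical underflow on long utterances, which is why the sketch sums logs instead of multiplying probabilities.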

Hidden Markov Model based isolated Hindi word recognition

Ishan Bhardwaj

2012 2nd International Conference on Power Control and Embedded Systems, 2012

In this paper three schemes based on the Hidden Markov Model for recognition of isolated words in Hindi speech are discussed, namely speaker dependent, multi speaker and speaker independent. For the study a set of 10 Hindi words is chosen, on which training followed by testing is performed. The recogniser is built from three basic building blocks, namely Feature extraction, Training and Recognition (Testing). The scheme proposed here uses Mel Frequency Cepstral Coefficients (MFCC) to compute the spectral features of the speech signal. The K-means algorithm is then used to form the codebook by clustering the obtained feature vectors. Recognition of a spoken Hindi word is carried out by first deriving its features, and then deciding in favour of the Hindi word whose model likelihood is highest, by applying the Viterbi algorithm to the given HMM. The recognition rate for the speaker dependent isolated word recogniser for a total of 10 speakers (7 male, 3 female) is 99%, whereas for multi speaker it is 98% (10 male) and for speaker independent (10 male) it is 97.5%. Experiments are carried out to develop an approach towards advancement in this field, specifically for Hindi.

HMM-BASED SPEECH SYNTHESISER FOR THE URDU LANGUAGE

Joao Cabral

This work presents Hidden Markov Model (HMM) based speech synthesis for the Urdu language. This is a widely spoken language across different regions in Asia. For example, Urdu is the official language of Pakistan and one of the national languages of India. Unfortunately, there is no corpus of Urdu currently publicly available that to our knowledge is appropriate for HMM-based speech synthesis purpose. We overcame this problem by recording an Urdu speech database with word and phone labels obtained using manual and semi-automatic annotation approaches. In summary, the objective of this work is to develop an HMM-based Urdu speech synthesiser from scratch by trying to use publicly available text processing tools for this language and by developing the necessary processing components.

HMM-based Automatic Speech Recognition Systems: A survey

Tuba Qureshi

2017

Natural language processing enables computers and machines to understand and speak human languages. Speech recognition is a process in which a computer understands human language and processes further instructions according to what it has recognized. Human languages differ in various aspects, such as sounds, phonemes, words, meanings and much more, so the machine or computer needs entirely different algorithms for different languages. Understanding human language is a challenging job, and Hidden Markov Models are commonly used for this purpose because they show promising results. A survey of various studies employing Hidden Markov Models is presented to highlight the importance of HMMs in the process of speech recognition.

Automatic Speech Recognition (ASR) System for Isolated Marathi Words: Using HTK

Nita Patil

International Journal of Innovative Technology and Exploring Engineering, 2019

The present manuscript focuses on building an automatic speech recognition (ASR) system for the Marathi language (M-ASR) using the Hidden Markov Model Toolkit (HTK). It details the experimentation and implementation carried out with HTK. In this work a total of 106 isolated Marathi words were recognized in a speaker independent setting. These unique Marathi words are used to train and evaluate the M-ASR system. The speech corpus (database) was created by the authors from isolated Marathi words uttered by speakers of mixed gender. The system uses Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, together with a Gaussian mixture model (GMM). A Viterbi algorithm based on token passing is used for decoding to recognize unknown utterances. The proposed M-ASR system is speaker independent and has reported 96.23% word level recognition accuracy.

A Study of Sindhi Related and Arabic Speech Recognition System

Tuba Qureshi

2017

Speech recognition is the understanding by a computer of words spoken by a human. These words may be in any human language, and changing the language poses different challenges, which means that algorithms designed for English speech recognition cannot be employed to recognize another language such as Sindhi. Entirely new and separate algorithms are required to understand spoken words in the Sindhi language. In this regard, every language and script poses different challenges. This paper introduces a study of the speech recognition systems available in various languages, especially those related to the Sindhi language. Emphasis is given to the architecture of automatic speech recognition systems and to the various challenges posed by the scripts, with special attention to Sindhi and its related languages.

Optimal parameters selected for automatic recognition of spoken Amazigh digits and letters using Hidden Markov Model Toolkit

Mohamed BELLOUKI

Int. J. Speech Technol., 2020

In this paper, we present our Amazigh automatic speech recognition system. It is built with context-independent phonetic Hidden Markov Models. Many choices are made in this system, such as the number of states of the models, the type of emission probability densities associated with the states, and the representation of the signal by cepstral coefficients. The recognition results of our system place it at a level of high performance comparable to that achieved by Markovian automatic speech recognition systems. Our system is designed to recognize 43 distinct isolated Amazigh words (33 letters and 10 digits). The recognition rate is then calculated for each digit and letter. The overall accuracy and word recognition rate for the whole database reached 91.31% after extensive testing and tuning of the recognition parameters. The results obtained in this work are improved in association with our previous work concerning Amazigh spoken digits and letters automatic...

The Development of Isolated Words Pashto Automatic Speech Recognition System

Irfan Engineering, Hazrat Ali, Nasir Professor

The availability of a standard speech database is of paramount importance in automatic speech recognition (ASR) research, as it provides a baseline for comparing the performance of automatic speech recognition approaches. This paper presents the development of a Medium-Vocabulary Speech Corpus for the Pashto language and the development of a Pashto ASR system using the corpus. The vocabulary encompasses 161 isolated words of the Pashto language, consisting of the most frequently used words, the names of the days of the week and digits from 0 to 25. The words were uttered by 50 speakers of different ages and genders, including both native and non-native speakers of Pashto. Recording of the corpus was performed in a noise free office environment. The corpus is then used for the development of an automatic speech recognition system for the Pashto language.

Hazrat Ali

In this paper, we present an approach to develop an automatic speech recognition (ASR) system of Urdu isolated words. Our experimentation is based on a medium vocabulary speech corpus of Urdu, consisting of 250 words. We develop our approach using the open source Sphinx toolkit. Using this platform, we extract the Mel Frequency Cepstral Coefficients (MFCC) features and build a Hidden Markov Model to perform recognition task. We report percentage accuracy for two different experiments based on 100 and 250 words respectively. Experimental results suggest that better recognition accuracy has been achieved with this approach, as compared to the previous results reported on this corpus.

Development for a Speaker Independent Spontaneous Urdu Speech Recognition System

Asad Mustafa

2010

This paper reports the design and development of an 82 speaker Urdu speech corpus for speaker independent spontaneous speech recognition using the CMU Sphinx Open Source Toolkit for Speech Recognition. The corpus consists of 45 hours of spontaneous and read speech data from 82 speakers (42 male and 40 female), recorded over a microphone and a telephone line. The speech was collected from speakers ranging from 20 to 55 years of age. Recording sessions were conducted in office and home environments.

Urdu Speech Corpus and Preliminary Results on Speech Recognition

Hazrat Ali, Nasir Ahmad

Language resources for the Urdu language are not well developed. In this work, we summarize our work on the development of an Urdu speech corpus for isolated words. The corpus comprises 250 isolated words of Urdu recorded by ten individuals. The speakers include both native and non-native, male and female individuals. The corpus can be used for both speech and speaker recognition tasks. We also report our results on the automatic speech recognition task for this corpus. The framework extracts Mel Frequency Cepstral Coefficients along with the velocity and acceleration coefficients, which are then fed to different classifiers to perform the recognition task. The classifiers used are Support Vector Machines, Random Forest and Linear Discriminant Analysis. Experimental results show that the best results are provided by the Support Vector Machines, with a test set accuracy of 73%. The results reported in this work may provide a useful baseline for future research on automatic speech recognition of Urdu.
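
The velocity and acceleration coefficients mentioned above are commonly computed as first and second order regression over neighbouring frames. The sketch below shows one such formulation; it is an assumption for illustration, not the method confirmed by the paper. The function names `delta` and `add_dynamic_features`, the window width of 2, and the choice to repeat edge frames as padding are all hypothetical, and the input is assumed to be a list of per-frame MFCC vectors computed elsewhere.

```python
def delta(frames, width=2):
    """Regression-based delta features.

    frames : list of per-frame coefficient lists (all the same length)
    width  : number of frames looked at on each side (assumed value)
    Edge frames are repeated so every frame gets a delta vector.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, width=2):
    """Append velocity (delta) and acceleration (delta of delta) to each frame."""
    velocity = delta(frames, width)
    acceleration = delta(velocity, width)
    return [c + v + a for c, v, a in zip(frames, velocity, acceleration)]


if __name__ == "__main__":
    # Toy input: coefficient 0 rises by 1 per frame, coefficient 1 is constant.
    ramp = [[float(t), 2.0] for t in range(7)]
    for row in add_dynamic_features(ramp):
        print(row)
```

Each output frame is three times as long as the input frame, so a classifier such as an SVM sees the static coefficients together with how they change over time.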

Speech Corpus Development for a Speaker Independent Spontaneous Urdu Speech Recognition System

Asad Mustafa

2010

Automatic Speech Recognition System for Isolated & Connected Words of Hindi Language By Using Hidden Markov Model Toolkit (HTK)

Gia An

Speech recognition is the process of converting an acoustic waveform into text that matches the information conveyed by the speaker. This paper discusses the implementation of isolated word and connected word Automatic Speech Recognition (ASR) systems for Hindi. HTK (the Hidden Markov Model Toolkit), which is based on the Hidden Markov Model (HMM), a statistical approach, is used to develop the system. Initially the system is trained on 100 distinct Hindi words. This paper also describes the working of the HTK tool, which is used in various phases of the ASR system, by presenting a detailed architecture of an ASR system developed using various HTK library modules and tools. The recognition results show that the overall system accuracy is 95% for isolated words and 90% for connected words. Index Terms: HMM, HTK, Mel Frequency Cepstral Coefficient (MFCC), Automatic Speech Recognition (ASR), Hindi, Isolated word ASR, connected word ASR.

Speech Recognition System Architecture for Gujarati Language

Jinal Tailor

International Journal of Computer Applications, 2016

Speech recognition is an area of Natural Language Processing and Artificial Intelligence. Achieving good accuracy and efficiency in an Automatic Speech Recognition (ASR) system for the Indian Gujarati language is a challenging task due to its morphology, language barriers, different dialects, and the unavailability of resources. This paper presents a proposed ASR architecture for the Gujarati language. Raw input data were collected from 4 male and 2 female speakers aged between 18 and 36 years to prepare the training dataset. The goal of a speech recognition system is to make machines capable of operating in natural languages. ASR is a system that converts the vocalized form to a visualized form using different computational devices. This approach is useful to people with disabilities such as deafness or an inability to use an input device. In this paper we have used the Hidden Markov Model Toolkit (HTK) to measure performance and error parameters. In a lab environment the system achieved a Word Recognition Rate (WR) of 95.9% and a Word Error Rate (WER) of 5.85%. In an open noisy environment the WR was 95.1% and the WER was 7.40%.
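
The reported WR and WER figures do not sum to 100%. One common reason, assumed here and not confirmed by the paper, is that WER is defined as (substitutions + deletions + insertions) / reference words, so inserted words raise the error rate without lowering the count of correctly recognized reference words. The sketch below computes WER under that standard definition with a word-level edit distance; the function name `word_error_rate` and the sample strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Both arguments are whitespace-separated transcripts. The minimum number of
    word edits is found with a standard Levenshtein dynamic programme.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edits needed to turn the reference prefix seen so far into hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
    # Two insertions against two reference words: every reference word is
    # still recognized, yet the error rate is not zero.
    print(word_error_rate("a b", "a x b y"))
```

Because insertions are counted against the reference length, WER under this definition can exceed 100% when the recognizer outputs many extra words.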

Automatic speech recognition of Urdu words using linear discriminant analysis

Nasir Professor

Journal of Intelligent & Fuzzy Systems, 2015

Urdu is amongst the five largest languages of the world and plays a very important role, as it shares its vocabulary with Arabic, Persian, Hindi and several other languages of the Indo-Pak region. The Automatic Speech Recognition task for Urdu has not been addressed significantly. This paper presents a statistical classification technique for Automatic Speech Recognition of isolated words in Urdu. The proposed approach is based on the calculation of 52 Mel Frequency Cepstral Coefficients for each isolated word. Classification is performed with Linear Discriminant Analysis. The successful and incorrect matches are presented in a confusion matrix. As a prototype, the framework has been trained with audio samples from seven speakers, including male and female, native and non-native speakers, and speakers of different ages. The test set comprises audio data from three speakers. For each isolated word, the percentage error has been calculated. It was found that the majority of the words are recognized with a percentage error of less than 33%. Some words suffer 100% error and are referred to as bad words. This work may provide a baseline for further research on Urdu Automatic Speech Recognition.

Automatic Speech Recognition of Urdu Digits with Optimal Classification Approach

Hazrat Ali

International Journal of Computer Applications, 2015

Speech Recognition for Urdu language is an interesting and less developed task. This is primarily due to the fact that linguistic resources such as rich corpus are not available for Urdu. Yet, few attempts have been made for developing Urdu speech recognition frameworks using the traditional approaches such as Hidden Markov Models and Neural Networks. In this work, we investigate the use of three classification methods for Urdu speech recognition task. We extract the Mel Frequency Cepstral Coefficients, the delta and delta-delta features from the speech data and train the classifiers to perform Urdu speech recognition. We present the performance achieved by training a Support Vector Machine (SVM) classifier, a random forest (RF) classifier and a linear discriminant analysis classifier (LDA) for comparison with SVM. Consequently, the experimental results show that SVM gives better performance than RF and LDA classifiers on this particular task.

Performance analysis of isolated Bangla speech recognition system using Hidden Markov Model.

Abdullah - al - mamun

Here we present a model of an isolated speech recognition (ISR) system for the Bangla character set and analyze the performance of that recognizer model. Isolated Bangla speech recognition is implemented by combining MFCC feature extraction for the input audio file with a Hidden Markov Model (HMM) for training and recognition, since HMMs offer an uncomplicated and effective framework for modeling time-varying sequences of spectral feature vectors. A series of experiments have been performed with 10 talkers (5 male and 5 female) and 56 Bangla characters (include, Bangla vowel, Bangla consonant, Bangla

A Medium Vocabulary Urdu Isolated Words Balanced Corpus for Automatic Speech Recognition

Hazrat Ali, Omar Farooq, Nasir Professor

2012

The role of a standard database in conducting and evaluating speech recognition research is two-fold. Firstly, it provides a standard platform for research by providing a balance amongst various aspects of speech recognition such as gender, dialect, and age. Secondly, it provides a common platform for comparing the performance of various speech recognition approaches. This paper presents the development of a Medium Vocabulary Speech Corpus for the Urdu language. The corpus comprises 250 isolated words, including digits and the most frequently spoken words of the Urdu language. The words have been selected from the 5000 most frequent words amongst the 19.3 million words of Urdu. The selected words have been uttered by 50 speakers in a noise-free, acoustically balanced studio. The speakers comprise both native and non-native, male and female, young and aged persons. The corpus has been built for Automatic Speech Recognition of isolated words in the Urdu language.

Cited by

Urdu Speech Corpus and Preliminary Results on Speech Recognition

Abdul Hafeez

Communications in Computer and Information Science, 2016
