Speech to text
2016
Abstract
Speech is the most basic and most convenient means of communication between people. Communication between humans and computers takes place through a human-computer interface. This project gives an overview of the major technological perspectives and the fundamental progress in speech-to-text conversion, and presents a complete speech-to-text conversion system based on the Raspberry Pi. The project also covers a language translator, which is important in daily life. A comparative study of the different techniques is carried out stage by stage. The paper discusses the techniques used in each step of the speech recognition process, attempts to analyze an approach for designing an efficient speech recognition system, and concludes with a decision on the future direction for developing human-computer interface techniques in different mother tongues. With modern processes, algorithms, and methods, speech signals can be processed easily and the text recognized. In this system, we develop an online speech-to-text engine. Transferring speech into written language in real time requires special techniques, because it must be very fast and almost 100% correct to be understandable. The objective of this paper is to summarize and compare different speech recognition systems and approaches for speech-to-text conversion based on Raspberry Pi technology, and to identify research topics and applications at the forefront of this exciting and challenging field.
Related papers
This paper demonstrates how to convert audio signals into text and use them to perform tasks. Speech recognition is one of the fastest-growing technologies today. We aim to develop a speech recognition system as an assistive tool for differently abled people. The system converts speech into English text using a speech recognizer, and it can be used in many settings. Around 20% of people live with some form of disability. The system could be very helpful for people who are blind, people who cannot use their hands effectively, and people who are illiterate. It will also help enterprises where most of the work is typing. The system recognizes audio signals, converts them into text, and can perform operations such as opening the calculator or Google Chrome; it also lets a user save, open, or exit a file by voice input. At this initial stage, the effort is to support the basic operations described above; the software can be updated and enhanced to perform more operations. This paper presents a method for designing a system that converts speech to text and then performs a task accordingly, using the .NET Framework and Visual Studio.
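The step from recognized text to an operation can be sketched as a simple phrase lookup. This is a minimal illustration in Python, not the paper's .NET implementation; the command table, the action names, and the function `match_command` are all hypothetical.

```python
from typing import Optional

# Hypothetical command table: the phrases come from the abstract above,
# the action names are illustrative placeholders only.
COMMANDS = {
    "open calculator": "launch_calculator",
    "open google chrome": "launch_chrome",
    "save": "save_file",
    "open": "open_file",
    "exit": "exit_file",
}


def match_command(transcript: str) -> Optional[str]:
    """Return the action name for a recognized transcript, or None."""
    # Normalize case and collapse repeated whitespace.
    text = " ".join(transcript.lower().split())
    # Try longer phrases first so "open calculator" wins over plain "open".
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        if text == phrase or text.startswith(phrase + " "):
            return COMMANDS[phrase]
    return None
```

Matching the longest phrase first keeps a short command such as "open" from shadowing a more specific one; a real system would also need to handle recognition errors, which this sketch ignores.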
This project is based on text-to-speech conversion using the concept of the Internet of Things (IoT), in which a message or text can be transmitted efficiently. We aim to achieve efficient, distortion-free communication using the internet as the medium, so that distance is no restriction. The system can convert text data into speech from anywhere. We use a Raspberry Pi module to decode the data and convert it into a speech signal. The Raspberry Pi 2B is a recent embedded module with a 64-bit ARM processor, which makes the operation faster. This text-to-speech conversion system makes information and data transmission easier. The transmitter can be any electronic device, such as a computer, laptop, or Android phone. The message or text is transmitted by e-mail to the Raspberry Pi module at the receiver side. This project is applicable to various organizations as well as highly restricted areas. It can be implemented using IoT with a GSM module.
2018
Smart Reader allows the user to hear text that is given as input. It extracts text from an image and converts the text to speech. This is done with a Raspberry Pi and a camera module using optical character recognition (OCR). The system consists of a webcam interfaced with the Raspberry Pi. The Raspberry Pi has an audio port through which the output can be heard on headphones or a speaker. The target conversion time is a few milliseconds. This device can help visually impaired people hear the text in images read aloud.
International Journal of Computer Applications, 2015
A speech recognition system is a natural way for humans to interact with machines. Automatic speech recognition is an advanced way to operate a computer with little effort, through speech alone. This paper surveys the use of Indo-Aryan languages for communicating directly with machines. The survey covers various techniques and experimental results. Speech recognition systems have been implemented for English, French, Spanish, German, Japanese, and Chinese. Only a little work has been done for Indian languages such as Gujarati, Marathi, Hindi, and Tamil. Speech to text is an emerging research area because of the complexity and varied frameworks of Indo-Aryan languages.
2013
Human-computer interaction through natural language conversational interfaces plays a very important role in making computers more usable for the common man. It is the need of the hour to bring human-computer interaction as close to human-human interaction as possible. Two main challenges must be faced in implementing such an interface: speech-to-text conversion, i.e., speech recognition, and text-to-speech (TTS) conversion. This paper presents an implementation of one of these, speech recognition for Indian languages.
International Journal of Futuristic Innovation in Engineering, Science and Technology (IJFIEST)
This research introduces a novel, efficient, and less expensive way for users to hear, rather than read, the content of text images in real time. It combines optical character recognition (OCR) and a text-to-speech (TTS) synthesizer on a Raspberry Pi. This type of technology uses a visual connection to allow visually impaired people to communicate with computers successfully. Extracting text from colored images is a serious problem in computer vision. Converting the text into speech is a process that scans English letters and numbers in pictures using optical character recognition (OCR) and converts them into spoken words.
This paper explores speech-enabled operating systems, software, and applications. It begins with a description of how such systems work and the level of accuracy that can be expected. It explains the applications of speech recognition technology in different areas: education, medicine, mobile computing, railway reservation, dictation, and web browsing. It gives a brief comparison of the operating systems supported by voice and speech recognition software or tools, briefly introduces the potential of such software, and explains the features of different speech-enabled operating systems and speech recognition software. Windows Speech Recognition has many innovative features for the Windows operating system and efficiently assists the user in controlling the computer, dictating, navigating, selecting words, sending emails, and correcting words or sentences. The paper also explains the benefits and issues related to speech technology. In recent years, speech recognition technology has grown tremendously. A large number of companies are working in this area and developing software for people who are unable to control the system with a keyboard or mouse, such as physically impaired people and senior citizens. In summary, the paper gives a brief introduction to speech-enabled operating systems and speech recognition software.
Automatic speech recognition was invented a few decades ago and continues to improve day by day. At first it worked with only 8 words, but today it has a huge database of almost 230 million words. With such a system we can interact with a device through voice commands and carry out the desired work easily. Speech-to-text conversion is one part of this system; models and techniques such as the hidden Markov model (HMM) and Mel-frequency cepstral coefficients (MFCC) are used, and the conversion follows a step-by-step working procedure. The main objective is to show the whole process of speech-to-text conversion as performed by automatic speech recognition, to help those who want to understand the whole process and those who have an interest in this field.
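MFCC features are built on the mel scale, which spaces frequencies closer to the way pitch is perceived. The sketch below shows only that one ingredient, assuming the widely used conversion m = 2595 log10(1 + f/700); the abstract does not say which variant the paper uses, and the function names here are hypothetical.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed 2595/700 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(low_hz: float, high_hz: float, n_filters: int) -> List[float]:
    """Center frequencies (Hz) of n_filters bands spaced evenly in mels."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]
```

Spacing the filter centers evenly in mels places more filters at low frequencies and fewer at high ones. A full MFCC pipeline adds framing, a Fourier transform, log filter-bank energies, and a discrete cosine transform, none of which is shown here.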
The internet has evolved over time, revolutionized many fields, and impacted many lives. The internet is a boon to mankind. The main field it has revolutionized is communication, which it has made faster and easier. In this paper we study the different methodologies for speech-to-text (STT) and text-to-speech (TTS) conversion to be used in a voice-based email system. The system is based on interactive voice response. The aim is to study and compare the various methods used for STT and TTS conversion and to identify the most efficient technique that can be adopted for both conversion processes. Based on the review, the HMM, being a statistical model, is found to be the most suitable for both STT and TTS conversion. Finally, a model using HMM and ANN methods for STT and HMM for TTS conversion is proposed.
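To illustrate why a statistical model such as the HMM suits recognition, the sketch below runs Viterbi decoding, the standard way to find the most likely hidden-state sequence for a series of observations. It is a toy example, not the model proposed in the paper: the two states, the two observation symbols, and all probabilities are invented for illustration.

```python
from typing import Dict, List, Tuple


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> Tuple[List[str], float]:
    """Return the most likely state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prob, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (prob * emit_p[s][obs], back)
        best.append(column)
    # Backtrack from the most likely final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Toy model (all values are assumptions): frames with "low" or "high"
# energy are labeled as "silence" or "speech".
STATES = ["silence", "speech"]
START = {"silence": 0.8, "speech": 0.2}
TRANS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}
```

For the observation sequence low, high, high, this toy model decodes to silence, speech, speech. Real recognizers use many more states (for example, sub-phoneme units) and work with log probabilities to avoid numerical underflow on long inputs.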
This paper presents a brief survey of speech recognition and discusses major themes and advances. Automatic speech recognition is the process, and the related technology, of converting speech signals into a sequence of words or other linguistic units by means of an algorithm implemented as a computer program. After years of research and development, the accuracy of automatic speech recognition remains one of the important research challenges. Speech understanding systems are presently capable of understanding speech input for vocabularies of thousands of words in operational environments. Speech recognition offers greater freedom to employ physically handicapped people in several applications, such as manufacturing processes, medicine, and telephone networks. The objective of this review paper is to summarize and compare some of the well-known methods used in the various stages of a speech recognition system.
