A Survey Paper on Automatic Speech Recognition by Machine
2015
Abstract
Speech is the expression of, or the ability to express, thoughts and feelings by articulate sounds. It is the primary means of communication between humans, and thousands of languages are spoken around the world. Speech recognition is the process by which a computer recognizes human speech and outputs the spoken sentence as a written text string. Speech recognition offers many advantages, and many methods have been proposed, yet none has achieved 100% recognition accuracy. This paper surveys the development of speech recognition from 1952 to 2014. Finally, we conclude which approach to speech recognition is best and most likely to benefit future work in the field.
Keywords: Automatic Speech Recognition (ASR), Acoustic-Phonetic approach, pattern-comparison technique, Artificial Intelligence approach, Hidden Markov Model (HMM).