IMAGE TEXT TO SPEECH CONVERSION IN DESIRED LANGUAGE
2023, International Journal of Creative Research Thoughts (IJCRT)
Abstract
The goal of this proposed work is to create an Android-based image text-to-speech (ITTS) application that enables users to convert text in photographs into spoken information in the language of their choice. A standout feature of the application is that users can customize the language in which the synthesized voice is produced. Its user-friendly interface makes the Android application accessible to a wide audience. Performance is evaluated in terms of text-extraction accuracy, responsiveness, and language customizability. The proposed work can serve a variety of user needs, such as language learners, visually impaired people, and people looking for a portable, effective tool for information consumption.
FAQs
How does the proposed ITTS system handle multilingual text-to-speech conversion?
The research indicates that the system is designed to enable multilingual support, allowing users to select desired languages for text-to-speech conversion dynamically. This customization aims to enhance accessibility and improve comprehension across various user demographics.
What methodology is employed for text extraction in images?
The study employs Optical Character Recognition (OCR) technology for high-accuracy text extraction from images. This method systematically converts captured images into readable text formats, facilitating further processing such as translation and speech synthesis.
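The OCR, translation, and speech-synthesis stages described above form a simple sequential pipeline. As a minimal sketch (not code from the paper), the stages can be modeled as interchangeable functions; the names `ocr`, `translate`, and `synthesize` are illustrative placeholders, and a real application would plug in an OCR engine (e.g. Tesseract), a translation service, and an Android TTS engine:

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is injected as a plain function so the underlying engines
# can be swapped; these stage names are illustrative, not from the paper.
@dataclass
class ITTSPipeline:
    ocr: Callable[[bytes], str]              # image bytes -> extracted text
    translate: Callable[[str, str], str]     # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]  # (text, language) -> audio bytes

    def run(self, image: bytes, target_lang: str) -> bytes:
        text = self.ocr(image)                           # 1. extract text from the image
        translated = self.translate(text, target_lang)   # 2. translate to the chosen language
        return self.synthesize(translated, target_lang)  # 3. synthesize speech


# Usage with stub stages, to show the data flow:
pipeline = ITTSPipeline(
    ocr=lambda img: "hello world",
    translate=lambda text, lang: f"[{lang}] {text}",
    synthesize=lambda text, lang: text.encode("utf-8"),
)
audio = pipeline.run(b"<image bytes>", "es")
print(audio)  # b'[es] hello world'
```

Keeping the stages decoupled in this way is what allows the language to be selected dynamically at run time, since only the translation and synthesis stages depend on the user's choice.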
What were the primary assessments conducted to evaluate the system's performance?
Performance metrics such as accuracy in text extraction, responsiveness of speech synthesis, and user feedback were systematically assessed. These evaluations aim to refine the application for real-world usability, addressing varied user needs.
What prior work forms the basis for the developed system?
Key advancements in OCR and TTS systems were discussed in various studies, notably by Muhammad Ajmal et al. in 2018 and by Cong Ma et al. in 2022, highlighting the evolution of image text-to-speech technologies.
What challenges exist in extending TTS capabilities to low-resource languages?
The research identifies limited linguistic data availability as a significant barrier in extending TTS capabilities to low-resource languages. This issue necessitates innovative approaches to improve multilingual support and accuracy.
References (13)
- Muhammad Ajmal, Farooq Ahmad, A. M. Martinez-Enriquez, Mudasser Naseer, Aslam Muhammad, Mohsin Ashraf. "Image to Multilingual Text Conversion for Literacy Education." 17-20 December 2018.
- H. Waruna H. Premachandra (Information Communication Technology Center, Wayamba University of Sri Lanka, Makandura, Sri Lanka), Anuradha Jayakody, Hiroharu Kawanaka. "Converting High Resolution Multi-lingual Printed Document Images into Editable Text Using Image Processing and Artificial Intelligence." 12-13 March 2022.
- Nikolaos Bourbakis. "Image Understanding for Converting Images into Natural Language Text Sentences." 21-23 August 2010.
- Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou. "Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task." 21-25 August 2022.
- Fai Wong, Sam Chao, Wai Kit Chan, Yi Ping Li. "Recognition of Chinese Character in Snapshot Translation System." 23-25 November 2010.
- Victor Fragoso, Steffen Gauglitz, Shane Zamora, Jim Kleban, Matthew Turk. "TranslatAR: A Mobile Augmented Reality Translator." IEEE, 2010.
- Ariffin Abdul Muthalib, Anas Abdelsatar, Mohammad Salameh, Juhriyansyah Dalle. "Making Learning Ubiquitous with Mobile Translator Using Optical Character Recognition (OCR)." ICACSIS, 2011.
- Shalin A. Chopra, Amit A. Ghadge, Onkar A. Padwal, Karan S. Punjabi, Gandhali S. Gurjar. "Optical Character Recognition." International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, Issue 1, January 2014.
- Hideharu Nakajima, Yoshihiro Matsuo, Masaaki Nagata, Kuniko Saito. "Portable Translator Capable of Recognizing Characters on Signboard and Menu Captured by Built-in Camera." Proceedings of the ACL Interactive Poster and Demonstration Sessions, pages 61-64, Ann Arbor, June 2005.
- Nag, S., Ganguly, P. K., Roy, S., Jha, S., Bose, K., Jha, A., & Dasgupta, K. (2018). "Offline Extraction of Indic Regional Language from Natural Scene Image Using Text Segmentation and Deep Convolutional Sequence." arXiv preprint arXiv:1806.06208.
- Yang, C. S., & Yang, Y. H. (2017). "Improved Local Binary Pattern for Real Scene Optical Character Recognition." Pattern Recognition Letters, 100, 14-21.
- Phangtriastu, M. R., Harefa, J., & Tanoto, D. F. (2017). "Comparison Between Neural Network and Support Vector Machine in Optical Character Recognition." Procedia Computer Science, 116, 351-357.
- Naz, S., Hayat, K., Razzak, M. I., Anwar, M. W., Madani, S. A., & Khan, S. U. (2014). "The Optical Character Recognition of Urdu-like Cursive Scripts." Pattern Recognition, 47(3), 1229-1248.