Parallel hardware for faster morphological analysis
Journal of King Saud University - Computer and Information Sciences
https://doi.org/10.1016/J.JKSUCI.2017.07.003
Abstract
Morphological analysis of the Arabic language is computationally intensive, involves numerous forms and rules, and is intrinsically parallel. The investigation presented in this paper confirms that the effective development of parallel algorithms and the derivation of corresponding hardware processors enable implementations with appealing performance characteristics. The presented developments of parallel hardware comprise the application of a variety of algorithm modelling techniques, strategies for concurrent processing, and the creation of pioneering hardware implementations that target modern programmable devices. The investigation includes the creation of a linguistic-based stemmer for Arabic verb root extraction with extended infix processing to attain high levels of accuracy. The implementations comprise three versions: software, a non-pipelined processor, and a high-throughput pipelined processor. The targeted systems are high-performance multi-core processors for the software implementations and high-end Field Programmable Gate Array systems for the hardware implementations. The investigation includes a thorough evaluation of the methodology, and performance and accuracy analyses of the developed software and hardware implementations. The developed processors achieved significant speedups over the software implementation. The developed stemmer for verb root extraction with infix processing attained accuracies of 87% and 90.7% for analyzing the texts of the Holy Quran and its Chapter