Arabic text classification using linear discriminant analysis
2017, 2017 International Conference on Engineering & MIS (ICEMIS)
https://doi.org/10.1109/ICEMIS.2017.8272958
Abstract
Linear discriminant analysis (LDA) is a dimensionality reduction technique that is widely used in pattern recognition applications. LDA aims to generate effective feature vectors by projecting the original data (e.g. a bag-of-words textual representation) into a lower-dimensional space. Hence, LDA is a convenient method for text classification, a task characterized by very high-dimensional feature vectors. In this paper, we empirically investigate two LDA-based methods for Arabic text classification. The first method computes the generalized eigenvectors of the ratio of the between-class scatter to the within-class scatter. The second method uses linear classification functions that assume equal population covariance matrices (i.e. a pooled sample covariance matrix). We used a textual data collection that contains 1,750 documents belonging to five categories. The testing set contains 250 documents belonging to the five categories (50 documents for each category). The experiment...
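The two LDA variants named in the abstract can be sketched with NumPy as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, the small ridge term `reg`, the equal-prior assumption in the classification functions, and the toy data set are all assumptions introduced here. A real bag-of-words matrix would be far larger and sparser, and would probably need stronger regularization or a prior dimensionality reduction step.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return the within-class (Sw) and between-class (Sb) scatter matrices."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += Xc.shape[0] * (diff @ diff.T)
    return Sw, Sb

def lda_projection(X, y, n_components, reg=1e-6):
    """Method 1: leading generalized eigenvectors of Sb w = lambda * Sw w."""
    Sw, Sb = scatter_matrices(X, y)
    Sw = Sw + reg * np.eye(Sw.shape[0])  # assumed ridge term to keep Sw invertible
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:n_components]
    return evecs[:, order].real

def linear_classification_functions(X, y, reg=1e-6):
    """Method 2: L_k(x) = m_k' S^-1 x - 0.5 * m_k' S^-1 m_k, with pooled S.

    Equal class priors are assumed, so no log-prior term is added.
    """
    classes = np.unique(y)
    Sw, _ = scatter_matrices(X, y)
    S_pooled = Sw / (X.shape[0] - len(classes)) + reg * np.eye(X.shape[1])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    coef = np.linalg.solve(S_pooled, means.T).T        # one row per class
    intercept = -0.5 * np.sum(coef * means, axis=1)
    return classes, coef, intercept

def predict(X, classes, coef, intercept):
    """Assign each row of X to the class with the largest linear score."""
    scores = X @ coef.T + intercept
    return classes[np.argmax(scores, axis=1)]

# Hypothetical toy data: 3 well-separated classes, 4 features, 30 samples each.
rng = np.random.default_rng(0)
class_means = np.array([[0, 0, 0, 0], [5, 5, 0, 0], [0, 5, 5, 0]], dtype=float)
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(30, 4)) for m in class_means])
y = np.repeat([0, 1, 2], 30)

W = lda_projection(X, y, n_components=2)   # at most (number of classes - 1) useful axes
Z = X @ W                                  # reduced feature vectors
classes, coef, intercept = linear_classification_functions(X, y)
pred = predict(X, classes, coef, intercept)
accuracy = (pred == y).mean()
```

With five categories, as in the data set described above, the first method yields at most four discriminant directions, because the between-class scatter matrix has rank at most one less than the number of classes.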