Sentiment Mining of Malay Newspaper (SAMNews) Using Artificial Immune System
Abstract
There is a sheer volume of rich web resources, such as digital newspapers, e-forums, blogs, Facebook and Twitter. Mining these digital text resources may reveal knowledge of interest to individuals and organizations. Text mining and sentiment mining (or sentiment analysis) are part of a new area of sentiment research. Sentiment Mining for Malay Newspaper (SAMNews) is built on an artificial immune system technique, the negative selection algorithm, which classifies the sentiment of newspaper sentences by polarity (positive, negative or neutral). The sentiment analysis in this project used 1000 newspaper sentences to evaluate average accuracy: 900 sentences as training data and the remaining 100 as testing data. The model achieved an accuracy of 88.5%. In future work, a comparative study of Artificial Immune Systems and other techniques or algorithms could be carried out to improve the performance of the sentiment classification model.
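The abstract does not describe how SAMNews encodes sentences or generates detectors, so the following is only a minimal sketch of how a negative selection classifier for three polarities might look. Everything in it is an assumption made for illustration: the four-word vocabulary, the made-up training sentences, the binary bag-of-words encoding, the Hamming-distance matching rule with radius 1, and the exhaustive enumeration of candidate detectors (feasible only because the toy space has 16 points; negative selection usually samples candidates at random).

```python
from itertools import product

# Hypothetical toy vocabulary (Malay: good, profit, bad, loss); not from the paper.
VOCAB = ["baik", "untung", "buruk", "rugi"]
RADIUS = 1  # assumed matching threshold (Hamming distance)

# Made-up training sentences; the paper's 900-sentence corpus is not reproduced here.
TRAIN = [
    ("ekonomi baik", "positive"),
    ("syarikat untung baik", "positive"),
    ("cuaca buruk", "negative"),
    ("syarikat rugi buruk", "negative"),
    ("mesyuarat hari ini", "neutral"),
]

def encode(sentence):
    """Binary bag-of-words vector over VOCAB."""
    words = set(sentence.lower().split())
    return tuple(int(w in words) for w in VOCAB)

def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def generate_detectors(self_vectors):
    """Negative selection: discard every candidate that matches a self vector."""
    candidates = product((0, 1), repeat=len(VOCAB))  # exhaustive, toy-sized space
    return [c for c in candidates
            if all(hamming(c, s) > RADIUS for s in self_vectors)]

def fit(samples):
    """Build one detector set per polarity, treating that polarity as 'self'."""
    by_label = {}
    for sentence, label in samples:
        by_label.setdefault(label, []).append(encode(sentence))
    return {label: generate_detectors(vecs) for label, vecs in by_label.items()}

def alarm_rate(vector, detectors):
    """Fraction of a class's detectors that fire on the vector."""
    fired = sum(hamming(d, vector) <= RADIUS for d in detectors)
    return fired / len(detectors) if detectors else 0.0

def classify(sentence, detectors_by_label):
    """Pick the polarity whose detectors raise the least alarm."""
    vector = encode(sentence)
    return min(detectors_by_label,
               key=lambda label: alarm_rate(vector, detectors_by_label[label]))

def accuracy(samples, detectors_by_label):
    """Share of (sentence, label) pairs that are classified correctly."""
    hits = sum(classify(s, detectors_by_label) == y for s, y in samples)
    return hits / len(samples)

DETECTORS = fit(TRAIN)
```

Under these assumptions, `classify("bank catat untung", DETECTORS)` selects the polarity whose detector set is least activated by the sentence, and calling `accuracy` on a held-out list of labelled sentences would mirror the 900/100 train/test evaluation described above.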
Proceedings of the World Congress on Engineering 2013 Vol III, WCE 2013, July 3-5, 2013, London, U.K. ISBN: 978-988-19252-9-9; ISSN: 2078-0958 (Print); ISSN: 2078-0966 (Online)