A Survey on Preprocessing of Web-Log Data in Web Usage Mining

2017, IJMTST

Abstract

Web mining aims to discover and extract useful information from web data. In the internet age, web applications are multiplying rapidly and the number of web users is growing exponentially. As the number of users grows, website publishers keep expanding their content to attract and satisfy those users. Users' behavior and interactions with web applications can be traced through the web server log file, which is stored as a plain-text (.txt) file. The data in a web log file include a large amount of corrupted, incomplete, and unnecessary information. Because so much of this data is irrelevant, the original log file cannot be used directly in web usage mining. Preprocessing is therefore applied to improve the quality of the log data and the efficiency of mining it. Preprocessing involves several techniques, including data cleaning, data fusion, and data integration. In this paper, we survey different preprocessing techniques in order to identify the issues in web log files and to improve web usage mining preprocessing for pattern mining and analysis.
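As a rough illustration of the data cleaning step, the following Python sketch filters a handful of log lines. It is not taken from the survey: it assumes the log follows the NCSA Common Log Format, and the function name `clean_log`, the suffix list, and the rule of keeping only 2xx responses are hypothetical choices made for this example.

```python
import re

# Assumption: entries follow the NCSA Common Log Format; the abstract does not name a format.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)(?: \S+)?" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: an illustrative list of resource suffixes treated as irrelevant to usage mining.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean_log(lines):
    """Parse log lines and drop malformed, unsuccessful, and static-resource requests."""
    cleaned = []
    for line in lines:
        match = CLF_PATTERN.match(line.strip())
        if match is None:
            continue  # incomplete or corrupted entry
        entry = match.groupdict()
        if not entry["status"].startswith("2"):
            continue  # keep only successful (2xx) requests
        path = entry["url"].split("?", 1)[0].lower()
        if path.endswith(IRRELEVANT_SUFFIXES):
            continue  # embedded images, stylesheets, scripts
        cleaned.append(entry)
    return cleaned


# Made-up sample lines, for illustration only.
sample = [
    '192.168.1.5 - - [10/Oct/2016:13:55:36 +0530] "GET /index.html HTTP/1.0" 200 2326',
    '192.168.1.5 - - [10/Oct/2016:13:55:37 +0530] "GET /images/logo.gif HTTP/1.0" 200 512',
    '192.168.1.9 - - [10/Oct/2016:13:56:01 +0530] "GET /missing.html HTTP/1.0" 404 0',
    'corrupted line with no structure',
]
result = clean_log(sample)
print([entry["url"] for entry in result])
```

Of the four sample lines, only the request for `/index.html` survives. The image request, the failed request, and the unparseable line are discarded.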
