A Survey on Web Mining Techniques and Applications

International Journal on Advanced Science, Engineering and Information Technology

https://doi.org/10.18517/IJASEIT.7.4.2803

Abstract

Information on the Internet, and especially on Web sites, is increasing rapidly day by day. Web sites play an important role in this growth: many Web users constantly upload, download, and browse content according to their needs. A Web server provides a way to browse content scattered across the many Web sites it hosts by assigning an Internet Protocol (IP) address or a Domain Name System (DNS) name through which they can be accessed from around the world. The server logs the requests that users make to access content of interest and stores this information in a log file. A log file can grow from a few kilobytes to several megabytes within a few days, depending on data traffic and the popularity of the Web sites. The fast growth of data and information in the Web environment makes it necessary to use sophisticated techniques that have not been used in other domains to extract knowledge and significant Web patterns. Web Mining is an extension of Data Mining that integrates technologies from various research fields, including Artificial Intelligence (AI), statistics, informatics, knowledge discovery, and computational linguistics. The aim of Web Mining is to provide algorithms or techniques that make data access more efficient and convenient. Web Mining techniques are categorized into three classes according to which part of the Web is mined: Web Content Mining (WCM), Web Structure Mining (WSM), and Web Usage Mining (WUM). This paper briefly surveys Web Mining techniques and applications.
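As a rough illustration of the server log that Web Usage Mining starts from, the sketch below parses a few access-log lines and counts successful requests per client and per page. The abstract does not name a log format, so the NCSA Common Log Format is assumed here; the names `CLF_PATTERN`, `parse_line`, `summarize`, and `SAMPLE_LOG`, as well as the sample entries, are hypothetical and not taken from the paper.

```python
import re
from collections import Counter

# Assumption: lines follow the NCSA Common Log Format, e.g.
# host ident user [timestamp] "METHOD path protocol" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize(lines):
    """Count successful requests per client host and per requested path."""
    hits_by_host = Counter()
    hits_by_path = Counter()
    for line in lines:
        record = parse_line(line)
        # Skip malformed lines and failed requests (status 400 and above).
        if record is None or record["status"] >= 400:
            continue
        hits_by_host[record["host"]] += 1
        hits_by_path[record["path"]] += 1
    return hits_by_host, hits_by_path


# Hypothetical sample entries, using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2017:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2017:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1204',
    '192.0.2.7 - - [10/Oct/2017:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2017:13:57:40 +0000] "GET /missing.html HTTP/1.1" 404 -',
]

if __name__ == "__main__":
    hosts, paths = summarize(SAMPLE_LOG)
    print(hosts.most_common())
    print(paths.most_common())
```

Dropping failed requests before counting is one simple form of the data cleaning that usage mining typically needs; a fuller pipeline would also group requests into user sessions before looking for patterns.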
