Ontology Based Data Mining Approach on Web Documents
2014, Computer and Information Science
https://doi.org/10.5539/CIS.V7N4P123
Abstract
The Internet, which contains a vast number of large data sources, is growing rapidly in all domains. These sources are valuable if their data can be processed into information. Data mining techniques are widely applied to web documents to extract such information. This paper proposes an ontology-based data mining approach for classifying web documents, in order to support applications that rely on classified text documents, such as search engines. The proposed approach is implemented and applied to several web documents. The experimental results show considerable progress.
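The abstract does not describe the classification procedure itself, so the following is only an illustrative sketch of what ontology-based document classification can look like in its simplest form, not the authors' method. It assumes a toy ontology that maps each class to a set of concept terms and scores a document by how often those terms occur. The names `TOY_ONTOLOGY`, `tokenize`, and `classify`, along with the classes and terms, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy ontology (not from the paper): class label -> concept terms.
TOY_ONTOLOGY = {
    "sports": {"football", "match", "team", "league"},
    "finance": {"bank", "stock", "market", "loan"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, ontology=TOY_ONTOLOGY):
    """Return the class whose concept terms occur most often, or None."""
    counts = Counter(tokenize(text))
    # Score each class by the total frequency of its concept terms.
    scores = {
        label: sum(counts[term] for term in terms)
        for label, terms in ontology.items()
    }
    best = max(scores, key=scores.get)
    # A document that matches no concept term stays unclassified.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(classify("The team won the football match"))
```

A real ontology would also carry relations between concepts (for example, subclass or synonym links), which this flat term-set stand-in deliberately leaves out.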