Automatically inducing ontologies from corpora

2002, Corpus

Abstract

The emergence of vast quantities of on-line information has raised the importance of methods for automatically cataloguing information in a variety of domains, including electronic commerce and bioinformatics. Ontologies can play a critical role in such cataloguing. In this paper, we describe a system that automatically induces an ontology from any large on-line text collection in a specific domain. The induced ontology consists of domain concepts related by kind-of and part-of links. To achieve domain-independence, ...
