Using perception in managing unstructured documents

2003, Crossroads

https://doi.org/10.1145/1027328.1027333

Abstract

The paper discusses the growing challenges posed by the large volume of unstructured documents available digitally. It defines unstructured documents and explains their relevance to information retrieval. It then surveys existing information management techniques, with particular attention to contributions from natural language processing (NLP) and cognitive science that support the extraction and categorization of information. The future directions it highlights include integrating user context into information retrieval and increasing automation in extracting meaning from unstructured documents.

Biographies

Ching Kang Cheng (ckcheng@calpoly.edu) is a graduate student at California Polytechnic State University, San Luis Obispo, working towards his MS in Computer Science. His research interests include Knowledge Management, Knowledge Representation, and Multi-agent Systems.

Xiaoshan Pan (xpan@stanford.edu) is a graduate student pursuing a PhD from the Department of Civil and Environmental Engineering at Stanford University. His research interests include Machine Learning, Natural Language Processing, Complex Adaptive Systems, and Multi-agent Systems.