Scalable Multilingual Information Access

2003, Lecture Notes in Computer Science

https://doi.org/10.1007/978-3-540-45237-9_17

Abstract

The third Cross-Language Evaluation Forum workshop (CLEF-2002) provides an unprecedented opportunity to evaluate retrieval in eight different languages using a uniform set of topics and a uniform assessment methodology. This year the Johns Hopkins University Applied Physics Laboratory participated in the monolingual, bilingual, and multilingual retrieval tasks. We contend that information access across many languages requires approaches that are inexpensive in both developer and run-time costs. In this paper we describe a simplified approach that seems suitable for retrieval in many languages; we also show that good retrieval is possible over many languages, even when translation resources are scarce or when query-time translation is infeasible. In particular, we investigate three techniques: character n-grams for monolingual retrieval; pre-translation expansion to mitigate errors due to limited translation resources; and translation of document representations to an interlingua for computationally efficient retrieval against multiple languages.
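As an illustration of the character n-gram idea mentioned in the abstract, the following is a minimal sketch of language-neutral tokenization. The function name `char_ngrams`, the default length `n=4`, and the lowercasing and whitespace handling are all assumptions made for this example; the abstract does not specify the n-gram length or the normalization used.

```python
def char_ngrams(text, n=4):
    """Return overlapping character n-grams of `text`.

    Hypothetical helper for illustration only; n=4 and the
    normalization steps below are assumptions, not details
    taken from the paper.
    """
    # Lowercase and collapse runs of whitespace into single spaces,
    # then pad with one space on each side so that word-initial and
    # word-final n-grams are also produced.
    normalized = " " + " ".join(text.lower().split()) + " "
    if len(normalized) < n:
        return []
    # Slide a window of width n over the string, including spaces,
    # so that n-grams may span word boundaries.
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


print(char_ngrams("ab cd", n=3))
# prints [' ab', 'ab ', 'b c', ' cd', 'cd ']
```

Because the sketch needs no stemmer, stopword list, or word segmenter, the same function can be applied unchanged to text in any of the languages, which is the kind of low developer cost the abstract argues for.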
