CLIR Evaluation at TREC
2000
https://doi.org/10.1007/3-540-44645-1_2…
16 pages
Abstract
Starting in 1997, the National Institute of Standards and Technology conducted three years of evaluation of cross-language information retrieval systems in the Text REtrieval Conference (TREC). Twenty-two participating systems used topics (test questions) in one language to retrieve documents written in English, French, German, and Italian. A large-scale multilingual test collection was built, and a new technique for building such a collection in a distributed manner was devised.
Related papers
2000
This year the Eurospider team, with help from Columbia, focused on trying different combinations of translation approaches. We investigated the use and integration of pseudo-relevance feedback, multilingual similarity thesauri, and machine translation. We also looked at different ways of merging individual cross-language retrieval runs to produce multilingual result lists. We participated in both the CLIR main task and the GIRT subtask.
Lecture Notes in Computer Science, 2003
This paper describes the official runs of our team for CLEF 2002. We took part in the monolingual tasks for each of the seven non-English languages for which CLEF provides document collections (Dutch, Finnish, French, German, Italian, Spanish, and Swedish). We also conducted our first experiments for the bilingual task (English to Dutch, and English to German), and took part in the GIRT and Amaryllis tasks. Finally, we experimented with the combination of runs.
Proceedings of TWLT14, …, 1998
TWLT is an acronym of Twente Workshop(s) on Language Technology. These workshops on natural language theory and technology are organised by the Parlevink Project, a language theory and technology project of the . For each workshop, proceedings are published containing the papers that were presented. TWLT 14 was organised together with the German Research Center for Artificial Intelligence, DFKI Saarbrücken, Germany. The idea for this workshop grew out of a longstanding cooperation between the University of Twente, TNO-TPD in Delft, and DFKI. This cooperation manifested itself for the first time in the Twenty-One project, which inspired a whole series of other projects, such as Pop-Eye and Olive, and also led to close contact and exchange with independently established projects such as Mulinex and MIETTA, for which DFKI was responsible. All of these projects were funded by the Telematics Application Programme of the European Commission; all except Twenty-One were funded by the Language Engineering Sector.
Transactions on Engineering, Computing …, 2005
Classical Information Retrieval (IR) is the sifting out of the documents most relevant to a user's information requirement (expressed as a query), from a large electronic store of documents. A search engine performs IR by retrieving relevant web pages from the ...
2021
Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved documents will suffice because the searcher will be able to recognize those which they wished to find. When the documents to be searched are in a language not known by the searcher, neither assumption is true. In such cases, Cross-Language Information Retrieval (CLIR) is needed. This chapter reviews the state of the art for cross-language information retrieval and outlines some open research questions.
Proceedings of the sixth …, 2007
Proceedings of the Fifth Workshop on Important Unresolved Matters, 2005
The performance of cross-language information retrieval (CLIR) systems has been improved to the level of practical use. The next step is to inform potential users that CLIR technologies are ready to be used. A good way of doing this is to present attractive scenarios of using multilingual information sources. For this purpose, we need to obtain more knowledge on the occasions when CLIR is more beneficial as compared with monolingual information retrieval from the utility perspective. The difficulty lies in the ...
QUILT (Query User Interface with Light Translations) is a prototype implementation of a complete cross-language text retrieval system that takes English queries and produces English gloss translations of Spanish documents. The system indexes the Spanish documents in Spanish, but converts the English query into a Spanish equivalent set through a novel combination of lexical methods and parallel-corpus disambiguation. Similar methods are applied to the returned document to produce a simple translation that can be examined by non-Spanish speakers to gauge the relevance of the document to the original English query. The system integrates traditional, glossary-based machine translation technology with information retrieval approaches and demonstrates that relatively simple term substitution and disambiguation approaches can be viable for cross-language text retrieval. Components of QUILT have been used to build a CLTR interface to WWW-based search services.
Information processing & …, 2000
In this paper, we present the system MULINEX, a fully implemented system which supports cross-lingual search of the WWW. Users can formulate, expand and disambiguate queries, filter the search results and read the retrieved documents by using only their native language. This multilingual functionality is achieved by the use of dictionary-based query translation, multilingual document categorisation and automatic translation of summaries and documents. The system supports French, German and English and has been installed and tested in the online services of two European internet content and service provider companies. This paper focuses on the techniques and algorithms used in the MULINEX system, explaining how each component works and how it contributes to the overall functionality of the integrated system. The primary system functionalities are outlined from the user perspective, followed by a description of the document database used in the system. The technologies and linguistic resources used in the various system components are then described in detail.
Indonesian Journal of Electrical Engineering and Computer Science
Cross-language information retrieval (CLIR) is a retrieval process in which the user issues queries in one language to retrieve information written in another language. The diversity of information and language barriers are serious obstacles to communication and cultural exchange across the world. To overcome such barriers, cross-language information retrieval systems are nowadays in strong demand. CLIR is a subfield of information retrieval (IR). Information retrieval deals with finding useful information in a large collection of unstructured, structured, and semi-structured data in response to a user query, where the query is a set of keywords. Information retrieval can be classified into different classes such as monolingual information retrieval, bilingual information retrieval, multilingual information retrieval, and cross-language information retrieval. This paper focuses on the various IR variants and techniques used in CLIR systems. Further, based on available literature, a numb...
