Academia.edu

Query Reformulation

400 papers
12 followers
About this topic
Query reformulation is the process of modifying a user's search query to improve the relevance and accuracy of search results. This involves techniques such as synonym replacement, phrase restructuring, and the addition of context-specific terms to enhance information retrieval in databases and search engines.
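
As a rough illustration of the synonym-replacement idea described above, the following minimal Python sketch expands each query term with alternatives from a lookup table. The `SYNONYMS` table, the `expand_query` name, and the `OR` syntax are assumptions made for this example; a real system would draw candidates from a thesaurus, an ontology, or an embedding model.

```python
from typing import Dict, List

# Hypothetical, hand-written synonym table used only for illustration.
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "cheap": ["affordable", "inexpensive"],
}


def expand_query(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> str:
    """Return the query with each known term OR-ed with its synonyms."""
    parts = []
    for term in query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) > 1:
            # Group the original term with its synonyms.
            parts.append("(" + " OR ".join(alternatives) + ")")
        else:
            # Terms without known synonyms pass through unchanged.
            parts.append(term)
    return " ".join(parts)


print(expand_query("cheap car rental"))
# (cheap OR affordable OR inexpensive) (car OR automobile OR vehicle) rental
```

Keeping the original term first in each group means the reformulated query can still match everything the original query matched.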

Key research themes

1. How can query expansion and term weighting techniques improve the effectiveness of query reformulation in information retrieval systems?

This research theme investigates algorithmic and automated methods for expanding or reformulating an initial query by adding semantically related or relevant terms and determining their importance to improve recall and precision in information retrieval. These approaches address challenges such as lexical gaps, short query lengths, and user naivety in query formulation by leveraging semantic similarity metrics, lexical relations, or graph-based syntactic dependencies to generate richer queries and better represent user intent.

Key finding: The paper presents Xu, an automated query expansion technique that integrates multiple semantic similarity sources including Datamuse API and a Wikipedia-trained Word2Vec model to generate expanded queries. Xu demonstrates...
Key finding: This study proposes a hybrid algorithmic refinement model that classifies web queries and refines them by generating candidate terms using both ontology and thesaurus sources. The classification-based approach reduces query...
Key finding: The STRICT technique automatically identifies suitable search terms for software change tasks using graph-based term weighting algorithms TextRank and POSRank, which analyze term co-occurrences and linguistic syntactic...
Key finding: This work develops automated semantic resource generation methods exploiting taxonomies with synonymy, antonymy, and other semantic relations to reformulate user queries as intents. It constructs semantic expansion corpora...
Key finding: The paper introduces a novel exploratory search system enabling both positive and negative feedback directly on keyword features of a probabilistic user intent model via an interactive visual interface. Negative relevance...
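
Graph-based term weighting of the kind mentioned in the findings above can be sketched as a TextRank-style computation over a term co-occurrence graph. This is a generic sketch, not the STRICT implementation: the function names, the window size, the damping factor, and the iteration count are all assumptions chosen for the example, and no stop-word removal or part-of-speech analysis is performed.

```python
from collections import defaultdict
from typing import Dict, List, Set


def cooccurrence_graph(tokens: List[str], window: int = 2) -> Dict[str, Set[str]]:
    """Link every pair of distinct terms that appear within `window` positions."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for i, term in enumerate(tokens):
        graph[term]  # make sure isolated terms still get a node
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def textrank(tokens: List[str], window: int = 2,
             damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Score terms with a PageRank-style iteration over the undirected graph."""
    graph = cooccurrence_graph(tokens, window)
    scores = {term: 1.0 for term in graph}
    for _ in range(iterations):
        updated = {}
        for term in graph:
            # Each neighbour shares its score equally among its own neighbours.
            received = sum(scores[n] / len(graph[n]) for n in graph[term])
            updated[term] = (1 - damping) + damping * received
        scores = updated
    return scores


def top_terms(tokens: List[str], k: int = 3) -> List[str]:
    """Return the k highest-scoring terms as candidate query keywords."""
    scores = textrank(tokens)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


tokens = "null pointer exception in parser when parser reads null token".split()
print(top_terms(tokens, 2))
```

Terms that co-occur with many different neighbours accumulate higher scores, so they are natural candidates to keep or promote when a long text is reduced to a short query.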

2. How does adapting and optimizing query reformulation benefit specialized domains like bug localization and software maintenance?

This research theme explores query reformulation techniques tailored to domain-specific applications such as bug localization and software maintenance, where the input queries (e.g., bug reports) often lack explicit structured information or contain noisy elements like stack traces. These approaches focus on contextual query reformulation, quality-aware preprocessing, and dynamic term selection to improve localization accuracy and reduce developer effort by automatically refining suboptimal queries inherent in domain texts.

Key finding: Through an empirical study on 2,320 bug reports and multiple query construction methodologies, this paper shows that many natural language-only bug reports inherently contain high-quality keywords for bug localization even...
Key finding: BLIZZARD, the proposed technique, classifies bug reports into noisy, rich, or poor based on their structured information content and applies suitable query reformulations accordingly. Evaluations conducted on 5,139 bug...
Key finding: This empirical study replicates three existing IR-based bug localization techniques using a dataset of 5,500 bug reports clustered by structured information quality (stack traces, program entities, plain text). Results reveal...
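
The idea of sorting bug reports by their structured content before choosing a reformulation strategy can be sketched as follows. This is a simplified illustration, not the BLIZZARD implementation: the regular expressions, the `classify_report` name, and the mapping of stack traces to "noisy", program entities to "rich", and plain text to "poor" are assumptions made for this example.

```python
import re

# Assumed pattern for a Java-style stack frame, e.g. "at pkg.Class.method(File.java:42)".
STACK_FRAME = re.compile(r"^\s*at\s+[\w.$]+\([^)]*\)", re.MULTILINE)

# Assumed heuristics for program entities: camelCase names, call-like
# tokens such as run(), and CamelCase class names.
PROGRAM_ENTITY = re.compile(
    r"\b[a-z]+[A-Z]\w*\b|\b\w+\(\)|\b[A-Z][a-z]+[A-Z]\w*\b"
)


def classify_report(text: str) -> str:
    """Label a bug report by the kind of structured information it contains."""
    if STACK_FRAME.search(text):
        return "noisy"   # stack traces add many loosely related terms
    if PROGRAM_ENTITY.search(text):
        return "rich"    # explicit program entities, no stack trace
    return "poor"        # plain natural language only


report = "Exception in thread main\n    at org.example.Parser.parse(Parser.java:42)"
print(classify_report(report))
```

Once a report is labeled, a different query construction step can be applied to each class, for example trimming trace terms from noisy reports or expanding poor ones.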

3. What roles do cognitive abilities and alternative query formulation strategies play in users' interaction with query reformulation processes?

This theme investigates the interplay between individual cognitive differences and query reformulation behaviors during information seeking. It focuses on understanding how cognitive abilities such as visualization and memory impact users' usage of query modification moves, and explores alternative query formulation interfaces and languages that go beyond traditional Boolean-based querying, aiming to reduce user difficulty and broaden effective query construction across diverse user profiles.

Key finding: Secondary analysis of user study data reveals that higher visualization ability correlates with significantly more frequent use of query reformulation moves, including term manipulations, compared to users with lower...
Key finding: This experimental study compares a conventional Boolean-based query language (SQL) with a Truth-table Exemplar-Based Interface (TEBI) that allows users to construct queries by selecting exemplar tuples, bypassing the need for...

All papers in Query Reformulation

Organizations today are not only increasing their data volume but also have to work with a large variety of data sources with different types of data. The central problem of integrating information sources resides in their...
In this article we present software that provides users with personalized assistance during document searches on the Web. The software's architecture is based on the integration of digital tools for processing...
We present an approach to increasing the effectiveness of ranked-output retrieval systems that relies on graphical display and user manipulation of "views" of retrieval results, where a view is the subset of retrieved documents that...
In this paper we consider the implications for belief revision of weakening the logic under which belief sets are taken to be closed. A widely held view is that the usual belief revision functions are highly classical, especially in being...
SPARQL query rewriting is a fundamental mechanism for uniformly querying heterogeneous ontologies in the Linked Data Web. However, the complexity of ontology alignments, particularly rich correspondences (c : c), makes this process...
GridVine is a Peer Data Management System based on a decentralized access structure. Built following the principle of data independence, it separates a logical layer--where data, schemas and mappings are managed--from a physical layer...
As a simple XML query language with sufficient expressive power, XPath has become very popular. To expedite evaluation of XPath queries, we consider the problem of rewriting XPath queries using materialized XPath views. This problem is...
We consider the problem of finding equivalent minimal-size reformulations of SQL queries in the presence of embedded dependencies. Our focus is on select-project-join (SPJ) queries with equality comparisons, also known as safe conjunctive...
The Hagsgate town episode which was omitted in the animated film. There are very interesting parts concerning the prophecy and curse set on the prosperous town. The existence of Prince Lír aggregates “antifantasy” elements, through...
Interoperability plays an important role for a variety of applications. One of them is Peer Data Management Systems, where autonomous data sources (peers) interact with each other based on semantic mappings between their schemas. The...
Users of online retrieval systems experience many difficulties, particularly with search tactics. User studies have indicated that searchers use vocabulary incorrectly and do not take full advantage of iteration to improve their queries....
In order to search across factual knowledge and content explicated using different data formats, this paper leverages a generic data model (schema) that transforms keyword-based retrieval models and queries to knowledge-oriented models and...
The Web of Data offers an environment for sharing and disseminating data, within a particular framework that allows the data to be used by both humans and machines. To this end, the RDF framework proposes formatting the...
With the exponential growth in web users, search history is also growing exponentially. To manage web search, search engines use different techniques. They make it easy for users to search for their interests by providing page ranking,...
Executive Summary: Digital libraries can be viewed as an infrastructure for supporting both the creation of information sources and the movement of information across global networks, and moreover the effective and efficient interaction...
Data warehouses are now an important component of every competitive system; they are one of the main components on which business intelligence is based. We can even say that many companies are climbing to the next level and use a set of...
We support flexible query processing with autonomous networked information sources. Flexibility allows a query to be accepted in a dynamic environment with unavailable sources. Flexibility provides the ability to identify equivalent...
We propose an intelligent and efficient query processing approach for semantic mediation of information systems. We also propose a generic multi-agent architecture that supports our approach. Our approach focuses on the exploitation of...
With the evolving Web, short-length parallel corpora are becoming very common, and some of these include user queries, web snippets, etc. This paper concerns situations where short-length parallel corpora have to be analyzed in order to find...
In a number of domains, particularly in bioinformatics, there is a need for complex data analysis. For that purpose, elementary data analysis operations called tasks are composed as workflows. The composition of tasks is however difficult...
The FORUM project aims at extending existing data integration techniques in order to facilitate the development of mediation systems in large and dynamic environments. It is well known from the literature that a crucial point that hampers...
,johannes.heinecke}@orange-ftgroup.com ABSTRACT. This article describes the use of a natural language processing platform to develop a question-answering function within an engine...
The fuzzy information retrieval model was proposed some years ago to solve several limitations of the Boolean model without requiring a complete redesign of the information retrieval system. However, the complexity of the fuzzy query...
Automatic query expansion (AQE) is an effective measure to improve information retrieval performance by including additional terms in a user query. The pseudo relevance feedback (PRF) method employed for AQE so far has suffered from a...
When computational researchers from several domains cooperate, one recurrent problem is finding tools, methods and approaches that can be used across disciplines, to enhance collaboration through reuse. The paper presents our ongoing work...
Defining a measure of similarity between queries is an interesting and difficult problem. A reliable query-similarity measure can be used in a variety of applications such as query recommendation, query expansion, and advertising. In this...
We introduce a new multimodal retrieval technique which combines query reformulation and visual image reranking in order to deal with result sparsity and imprecision, respectively. Textual queries are reformulated using Wikipedia...
Personal agents have been developed that assist a user with information processing needs by generating, filtering, collecting, or transforming information. On the other hand, Internet stores are providing services customized to the needs...
We describe a visualization tool for XPath expressions called XViz. Starting from a workload of XQueries, the tool extracts the set of all XPath expressions, and displays them together with some relationships. XViz is intended to be used...
Web searching is becoming more and more complex due to the increasing amount of information on the web. Users face many problems in specifying their needs in the form of a query. Query reformulation techniques are required in order to...
This report describes the activities of the SLP Spoken Document Processing Working Group (SDPWG). The SDPWG was organized in 2006. The working group was reorganized in 2009. This report mainly describes the activities of the second period of...
This is a demonstration of data coordination in a peer data management system through the employment of distributed triggers. The latter express in a declarative manner individual security and consistency requirements of peers, that...
Knowledge workers such as healthcare information professionals, legal researchers, and librarians need to create and execute search strategies that are comprehensive, transparent, and reproducible. The traditional solution is to use...
Searching for information in the web environment is habitually a tedious and challenging task. The rapid growth of web information infrastructure has led to the rapid publication of information in the web environment. Too much information published on the web...
N-ary conjunctive queries, i.e., queries with any number of answer variables, are the formal core of many Web query languages including XSLT, XQuery, SPARQL, and Xcerpt. Despite a considerable body of research on the optimization of such...
While a major share of prior work has considered search sessions as the focal unit of analysis for seeking behavioral insights, search tasks are emerging as a competing perspective in this space. In the current work, we quantify user...
The paper deals with assisting end-users in accessing a European Reliability Data System: the design of an intelligent interface aimed at a large and complex technical database (namely, the ERDS) in a friendly, correct, and effective...
In distributed data environments, peers (data sources) are connected with each other through a set of semantic correspondences in such a way that peers directly connected are called semantic neighbours. Queries are submitted considering...
Dynamic environments are decentralized systems that provide users with querying capabilities over a set of heterogeneous, distributed and autonomous data sources. Data Integration Systems, Peer Data Management Systems (PDMS) and...
Query formulation and reformulation is recognized as one of the most difficult tasks that users in information retrieval systems are asked to perform. This study investigated the use of two different techniques for supporting query...
Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research...
Success of query reformulation and relevant information retrieval depends on many factors, such as users' prior knowledge, age, gender, and cognitive styles. One of the important factors that affect a user's query reformulation behaviour...
InOrder is a query refinement tool that works on top of Google and helps individual users to collaboratively participate in best Web query formulations. The incremental refinement works via an indirect communication process facilitated by...
Abstract. The development of the Semantic Web has led to the elaboration of standards for representing knowledge on the Web. RDF, as one of these standards, has become a W3C recommendation. Even though it was designed...