Academia.edu

Query Reformulation

400 papers
12 followers
About this topic
Query reformulation is the process of modifying a user's search query to improve the relevance and accuracy of search results. This involves techniques such as synonym replacement, phrase restructuring, and the addition of context-specific terms to enhance information retrieval in databases and search engines.
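
As a rough illustration of the synonym-replacement technique mentioned above, the following minimal sketch expands each query term into an OR-group of the term and its synonyms. The `SYNONYMS` table and the `expand_query` function are hypothetical names invented for this example; a real system would more likely draw synonyms from a thesaurus, an ontology, or word embeddings.

```python
from typing import Dict, List

# Hypothetical hand-written synonym table (an assumption for illustration only).
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "cheap": ["affordable", "inexpensive"],
}


def expand_query(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> str:
    """Replace each known term with an OR-group of the term and its synonyms."""
    parts = []
    for term in query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) > 1:
            # Known term: keep the original and add its synonyms as alternatives.
            parts.append("(" + " OR ".join(alternatives) + ")")
        else:
            # Unknown term: pass it through unchanged.
            parts.append(term)
    return " ".join(parts)


if __name__ == "__main__":
    print(expand_query("cheap car rental"))
```

The OR-group syntax is only one possible output convention; the target search engine determines how expanded terms should actually be combined.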

Key research themes

1. How can query expansion and term weighting techniques improve the effectiveness of query reformulation in information retrieval systems?

This research theme investigates algorithmic and automated methods for expanding or reformulating an initial query by adding semantically related or relevant terms and determining their importance to improve recall and precision in information retrieval. These approaches address challenges such as lexical gaps, short query lengths, and user naivety in query formulation by leveraging semantic similarity metrics, lexical relations, or graph-based syntactic dependencies to generate richer queries and better represent user intent.

Key finding: The paper presents Xu, an automated query expansion technique that integrates multiple semantic similarity sources including Datamuse API and a Wikipedia-trained Word2Vec model to generate expanded queries. Xu demonstrates... Read more
Key finding: This study proposes a hybrid algorithmic refinement model that classifies web queries and refines them by generating candidate terms using both ontology and thesaurus sources. The classification-based approach reduces query... Read more
Key finding: The STRICT technique automatically identifies suitable search terms for software change tasks using graph-based term weighting algorithms TextRank and POSRank, which analyze term co-occurrences and linguistic syntactic... Read more
Key finding: This work develops automated semantic resource generation methods exploiting taxonomies with synonymy, antonymy, and other semantic relations to reformulate user queries as intents. It constructs semantic expansion corpora... Read more
Key finding: The paper introduces a novel exploratory search system enabling both positive and negative feedback directly on keyword features of a probabilistic user intent model via an interactive visual interface. Negative relevance... Read more
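
Graph-based term weighting of the kind referred to in this theme can be sketched roughly as follows. This is a simplified TextRank-style scoring over a term co-occurrence graph, not the implementation used by any of the papers listed; the function name `textrank_terms` and its parameters (`window`, `damping`, `iterations`) are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def textrank_terms(
    tokens: List[str],
    window: int = 2,
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Score terms by iterating a PageRank-like update on a co-occurrence graph."""
    # Link every pair of distinct terms that appear within `window` positions.
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)

    # Start every term in the graph at the same score, then refine iteratively.
    scores = {term: 1.0 for term in neighbors}
    for _ in range(iterations):
        updated = {}
        for term in neighbors:
            # Each neighbor shares its score equally among its own neighbors.
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[term])
            updated[term] = (1 - damping) + damping * rank
        scores = updated
    return scores


if __name__ == "__main__":
    text = "parser throws exception when parser reads empty input".split()
    ranked = sorted(textrank_terms(text).items(), key=lambda kv: kv[1], reverse=True)
    for term, score in ranked[:3]:
        print(f"{term}: {score:.3f}")
```

The highest-scoring terms would then be candidates for inclusion in a reformulated query; stop-word removal and part-of-speech filtering are omitted here for brevity.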

2. How does adapting and optimizing query reformulation benefit specialized domains like bug localization and software maintenance?

This research theme explores query reformulation techniques tailored to domain-specific applications such as bug localization and software maintenance, where the input queries (e.g., bug reports) often lack explicit structured information or contain noisy elements like stack traces. These approaches focus on contextual query reformulation, quality-aware preprocessing, and dynamic term selection to improve localization accuracy and reduce developer effort by automatically refining suboptimal queries inherent in domain texts.

Key finding: Through an empirical study on 2,320 bug reports and multiple query construction methodologies, this paper exposes that many natural language-only bug reports inherently contain high-quality keywords for bug localization even... Read more
Key finding: BLIZZARD, the proposed technique, classifies bug reports into noisy, rich, or poor based on their structured information content and applies suitable query reformulations accordingly. Evaluations conducted on 5,139 bug... Read more
Key finding: This empirical study replicates three existing IR-based bug localization techniques using a dataset of 5,500 bug reports clustered by structured information quality (stack traces, program entities, plain text). Results reveal... Read more

3. What roles do cognitive abilities and alternative query formulation strategies play in users' interaction with query reformulation processes?

This theme investigates the interplay between individual cognitive differences and query reformulation behaviors during information seeking. It focuses on understanding how cognitive abilities such as visualization and memory impact users' usage of query modification moves, and explores alternative query formulation interfaces and languages that go beyond traditional Boolean-based querying, aiming to reduce user difficulty and broaden effective query construction across diverse user profiles.

Key finding: Secondary analysis of user study data reveals that higher visualization ability correlates with significantly more frequent use of query reformulation moves, including term manipulations, compared to users with lower... Read more
Key finding: This experimental study compares a conventional Boolean-based query language (SQL) with a Truth-table Exemplar-Based Interface (TEBI) that allows users to construct queries by selecting exemplar tuples, bypassing the need for... Read more

All papers in Query Reformulation

In this article, we propose exploiting semantic links between concepts to improve information retrieval. A general-language electronic thesaurus is used to reformulate user queries into... more
Understanding query reformulation patterns is a key task towards next generation web search engines. If we can do that, then we can build systems able to understand and possibly predict user intent, providing the needed assistance at the... more
With the rapid growth of information on the Web, the study of information searching has attracted increased interest. Information behaviour (IB) researchers and information systems (IS) developers are continuously exploring user-Web... more
The Web of Data is an open environment consisting of a great number of large inter-linked RDF datasets from various domains. In this environment, organizations and companies adopt the Linked Data practices utilizing Semantic Web (SW)... more
SPARQL is today the standard access language for Semantic Web data. In the recent years XML databases have also acquired industrial importance due to the widespread applicability of XML in the Web. In this paper we present a framework... more
In the context of the emergent Web of Data, a large number of organizations, institutes and companies (e.g., DBpedia, Geonames, PubMed, ACM, IEEE, NASA, BBC) adopt the Linked Data practices and publish their data utilizing Semantic Web... more
Purpose: To investigate and identify the patterns of interaction between searchers and search engine during Web searching. Design: We examined 2,465,145 interactions from 534,507 users of Dogpile.com submitted on May 6, 2005. We compared... more
Intentional services have been proposed to bridge the gap between low level, technical software-service descriptions and high level, strategic expressions of business needs for services. This proposal leverages on the SOA to an... more
Web searcher is an important area of research for designing more helpful searching systems and targeting content to particular users. Methods explored by other researchers include both qualitative (i.e., the use of human judges to... more
Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research... more
The purpose of this analysis was to evaluate an existing set of search tasks in terms of their effectiveness as part of a "shared infrastructure" for conducting interactive IR research. Twenty search tasks that varied in their cognitive... more
Designing good user interfaces to information retrieval systems is a complex activity. The design space is large and evaluation methodologies that go beyond the classical precision and recall figures are not well established. In this... more
Web-search queries are known to be short, but little else is known about their structure. In this paper we investigate the applicability of part-of-speech tagging to typical English-language web search-engine queries and the potential... more
The Web of Data encourages organizations and companies to publish their data according to the Linked Data practices and offer SPARQL endpoints. On the other hand, the dominant standard for information exchange is XML. The SPARQL2XQuery... more
We propose a framework that supports a federated environment based on a Mediator Architecture in the Semantic Web. The Mediator supports mappings between the OWL Ontology of the Mediator and the other ontologies in the federated... more
This dissertation is dedicated to my family: my wife, Lhamo, my son, Tenzi Jurmin Lhendup, and my daughter, Kinley Om, for their love and support. Lastly, a special thank you to my wife, Lhamo and kids, Tenzi Lhendup and Kinley... more
Recent findings suggest that Information Retrieval (IR)-based bug localization techniques do not perform well if the bug report lacks rich structured information (e.g., relevant program entity names). Conversely, excessive structured... more
Problem Statement: The huge amount of information on the web as well as the growth of new inexperienced users creates new challenges for information retrieval. It has become increasingly difficult for these users to find relevant... more
People have different mental strengths and weaknesses, which can be measured according to cognitive ability. Learning about strengths and preferences in terms of search behavior, and looking for patterns between behaviors and cognitive... more
Improving the retrieval accuracy of MEDLINE documents is still a challenging issue due to low retrieval precision. Focusing on a query expansion technique based on pseudo-relevance feedback (PRF), this paper addresses the... more
We propose a framework that supports a federated environment based on a Mediator Architecture in the Semantic Web. The Mediator supports mappings between the OWL Ontology of the Mediator and the other ontologies in the federated sites.... more
A web search engine log is a very rich source of semantic knowledge. In this paper we focus on the extraction of hyponymy relations from individual user sessions by examining search behavior. The results obtained allow us to identify... more
Mining user web search activity potentially has a broad range of applications including web result pre-fetching, automatic search query reformulation, click spam detection, estimation of document relevance and prediction of user... more
Defining a measure of similarity between queries is an interesting and difficult problem. A reliable query-similarity measure can be used in a variety of applications such as query recommendation, query expansion, and advertising.
This report firstly summarises the work thus far on the XMAP data integration algorithm and on middleware with regard to Grid query processing services, secondly, proposes an architecture for data integration-enabled query processing on... more
Data Grids rely on the coordinated sharing of and interaction across multiple autonomous database management systems. They provide transparent access to heterogeneous and autonomous data resources stored on Grid nodes. Data sharing tools... more
This paper presents SHIRI-Querying, an approach for semantic search in semi-structured documents. We propose a solution to tackle incompleteness and imprecision of annotations at querying time. This solution relies... more
Understanding query reformulation patterns is a key step towards next generation web search engines: it can help improve users' web-search experience by predicting their intent, and thus helping them to locate information more... more
This paper proposes an approach for query reformulation based on the generation of appropriate query-biased concepts. Query-biased concepts are generated from retrieved documents using their content and structure. In this paper, we focus... more
We present and analyze an algorithm for equivalent rewriting of XQuery queries using XQuery views, which is complete for a large class of XQueries featuring nested FLWR blocks, XML construction and join equalities by value and identity.... more
We state and solve the query reformulation problem for XML publishing in a general setting that allows mixed (XML and relational) storage for the proprietary data and exploits redundancies (materialized views, indexes and caches) to... more
This paper examines the reliability of implicit feedback generated from clickthrough data and query reformulations in WWW search. Analyzing the users' decision process using eyetracking and comparing implicit feedback against manual... more
Peers in a peer-to-peer data management system often have heterogeneous schemas and no mediated global schema. To translate queries across peers, we assume each peer provides correspondences between its schema and a small number of other... more
Learning to rank plays an important role in information retrieval. In most of the existing solutions for learning to rank, all the queries with their returned search results are learnt and ranked with a single model. In this paper, we... more
We consider Cooperative Information Systems (CIS) that are multidatabase systems (MDBMS), with a common object-oriented model, based on the ODMG standard, together with local databases that may be relational, object-oriented, or dedicated... more
Examines the use of query reformulation, and particularly the use of relevance feedback by users of the Excite Web search engine. A total of 985 user search sessions from a data set of 18,113 user search sessions containing 51,473 queries... more
Inter-business collaborative contexts prefigure a distributed scenario where companies organize and coordinate themselves to develop common and shared opportunities. Traditional business intelligence systems do not provide support to this... more
Knowledge-intensive applications pose new challenges to metadata management, including distribution, access control, uniformity of access, and evolution in time. The authors identify general requirements for metadata management and... more
We present some of the new results of the RAP project (Recherche, Analyse et Propose), which falls within the scope of targeted search for and extraction of information on the Web. RAP lets its users organize... more
Relevance feedback (RFB) involves requesting some user judgments for an initial set of search results and then using these judgments to improve search results. Typical queries may have multiple possible interpretations or facets, but... more
As Internet resources become accessible to more and more countries, there is a need to develop efficient methods for information retrieval across languages. In the present paper, we focus on query expansion techniques to improve the... more
Query recommendations are an integral part of modern search engines. Their goal is to facilitate users' search tasks, as well as help them discover and explore concepts related to their information needs. In this paper, we present a... more
Long queries form a difficult, but increasingly important segment for web search engines. Query reduction, a technique for dropping unnecessary query terms from long queries, improves performance of ad-hoc retrieval on TREC collections.... more
Querying Structured Data in an Unstructured P2P System. Verena Kantere, School of Electr. and Comp. Engineering, National Technical University of Athens, vkante@dbnet.ece.ntua.gr; Dimitrios Tsoumakos, Department ...