Academia.edu

Query Generation

7 papers
0 followers
About this topic
Query Generation is the process of automatically creating search queries or questions from a given set of data or user input, aimed at retrieving relevant information from databases or search engines. It involves natural language processing and understanding to enhance information retrieval efficiency and accuracy.

Key research themes

1. How can automated query expansion and refinement improve information retrieval accuracy and query relevance?

This research area focuses on enhancing initial user queries by automatically adding, modifying, or selecting candidate terms to improve the relevance and coverage of retrieved documents. It matters because many users formulate brief or poorly constructed queries that cause low recall or precision in information retrieval. Automated expansion and refinement techniques harness linguistic resources, semantic similarity models, and query classification to systematically augment queries, balancing recall and precision.

Key finding: Xu demonstrates that automated query expansion using semantic similarity measures from Word2Vec and lexical APIs can improve recall by more than ten percent while managing precision using Boolean operators. The paper... Read more
Key finding: This work develops and experimentally validates a hybrid refinement method combining ontology and thesaurus resources, guided by an initial classification of query types. Tested on TREC 2014 queries with real-time search... Read more
Key finding: This seminal survey outlines logic-based query transformations and cost-based optimization strategies that can be viewed as a foundational parallel to query expansion/refinement in databases. It emphasizes that heuristic... Read more
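The expansion idea described in this theme can be illustrated with a minimal sketch. It assumes a small hand-written similarity table in place of a trained Word2Vec model; the table contents, the `expand_term` and `expand_query` names, and the threshold values are all hypothetical and are not taken from any of the papers listed here. Similar terms are joined with OR to widen recall, and the resulting groups are joined with AND to keep precision in check.

```python
from typing import Dict, List

# Hypothetical similarity table standing in for Word2Vec neighbours;
# the terms and scores are made up for illustration only.
NEIGHBOURS: Dict[str, Dict[str, float]] = {
    "car": {"automobile": 0.82, "vehicle": 0.74, "truck": 0.41},
    "repair": {"fix": 0.79, "maintenance": 0.66},
}


def expand_term(term: str, threshold: float = 0.6, limit: int = 2) -> List[str]:
    """Return the term plus up to `limit` neighbours scoring at least `threshold`."""
    scored = NEIGHBOURS.get(term, {})
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    extra = [word for word, score in ranked if score >= threshold][:limit]
    return [term] + extra


def expand_query(query: str, threshold: float = 0.6, limit: int = 2) -> str:
    """OR together each term's variants, then AND the groups together."""
    groups = []
    for term in query.lower().split():
        variants = expand_term(term, threshold, limit)
        group = " OR ".join(variants)
        # Parenthesise only when a term actually gained variants.
        groups.append(f"({group})" if len(variants) > 1 else group)
    return " AND ".join(groups)


print(expand_query("car repair cost"))
# prints: (car OR automobile OR vehicle) AND (repair OR fix OR maintenance) AND cost
```

Raising `threshold` or lowering `limit` trades recall for precision, since fewer loosely related terms are admitted into each OR group.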

2. What are the challenges of, and methods for, translating natural language queries into formal SQL queries for relational databases?

This research theme addresses the problem of bridging the gap between natural user language and structured query languages like SQL to enable non-expert users to retrieve data accurately from relational databases. It involves semantic parsing, syntactic and semantic analysis, and the use of grammars and machine learning methods to generate executable SQL commands from free-text inputs. Accurate SQL generation facilitates enhanced accessibility and user-friendly database querying.

Key finding: The survey presents an organized analysis of deep learning architectures (e.g., CNNs, RNNs, pointer networks, reinforcement learning) that have been applied to map natural language questions to SQL queries. It stresses the... Read more
by saparja dey and 1 more
Key finding: This paper proposes a semantic grammar-based architecture for converting English queries into SQL commands, targeting users without SQL proficiency. It details the process including morphological, syntactic and semantic... Read more
Key finding: SQGNL is designed as a database-independent system that utilizes linguistic dependencies and metadata to build sets of possible SELECT and WHERE clauses, generating multiple candidate SQL queries for a given natural language... Read more
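A rough sketch of candidate generation from schema metadata is given here. It is not the SQGNL algorithm or any other listed system: the `SCHEMA` dictionary, the `candidate_queries` function, and the "column is value" pattern are assumptions made for illustration. The sketch pairs every possible SELECT clause with every possible WHERE clause and returns parameterised SQL strings, so that values from the question are never spliced into the SQL text.

```python
import re
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical schema metadata: table name -> column names.
SCHEMA: Dict[str, List[str]] = {"employees": ["name", "salary", "city"]}

# A candidate is an SQL string plus the parameters bound to its "?" placeholders.
Candidate = Tuple[str, Tuple[str, ...]]


def candidate_queries(question: str, table: str = "employees") -> List[Candidate]:
    """Pair every possible SELECT clause with every possible WHERE clause."""
    text = question.lower()  # note: this also lowercases the extracted values
    columns = SCHEMA[table]
    tokens = re.findall(r"\w+", text)

    # Possible SELECT targets: columns named in the question, plus a catch-all.
    selects = [c for c in columns if c in tokens] + ["*"]

    # Possible WHERE clauses: none, or one per "<column> is <value>" phrase.
    wheres: List[Candidate] = [("", ())]
    for column, value in re.findall(r"(\w+) is (\w+)", text):
        if column in columns:
            wheres.append((f" WHERE {column} = ?", (value,)))

    return [
        (f"SELECT {sel} FROM {table}{clause}", params)
        for sel, (clause, params) in product(selects, wheres)
    ]


for sql, params in candidate_queries("Show name of employees where city is Paris"):
    print(sql, params)
```

A real system would then rank or filter these candidates; this sketch only enumerates them.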

3. How can query languages and interfaces be improved to facilitate intuitive, flexible, and efficient database querying for diverse user types?

This theme explores the human factors, linguistic, and logical foundations of query languages and interfaces, focusing on usability for novices and experts alike. It includes research into flexible query languages employing fuzzy logic, exemplar-based interfaces, and hierarchical taxonomies of query languages, aiming to reduce complexity and improve the expressiveness and accessibility of database querying.

Key finding: This experimental study compares traditional SQL query languages with a Truth-table Exemplar-Based Interface (TEBI), finding that users of TEBI performed better and with greater resilience to cognitive skill variability. The... Read more
Key finding: The paper proposes taxonomies for flexible query languages based on fuzzy set theory, separating approaches for crisp and fuzzy relational databases. It demonstrates that fuzzy linguistic terms can better represent user... Read more
Key finding: This work presents a hierarchical taxonomy categorizing query languages by user interaction senses (e.g., visual, verbal) and method-level conceptual models and methodologies (e.g., declarative, imperative, programmatic). The... Read more
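To make the idea of fuzzy linguistic terms in a query concrete, the following sketch evaluates a term such as "young" against crisp data using a trapezoidal membership function. The function names, the breakpoints chosen for "young", and the sample rows are hypothetical; none of them comes from the papers summarised in this theme.

```python
from typing import List, Tuple


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree in [0, 1] for a trapezoidal fuzzy set, a <= b <= c <= d."""
    if b <= x <= c:
        return 1.0  # plateau: full membership
    if x <= a or x >= d:
        return 0.0  # outside the support
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)  # falling edge


def young(age: float) -> float:
    """Assumed definition: fully 'young' up to 25, not at all from 35 onwards."""
    return trapezoid(age, 0, 0, 25, 35)


def fuzzy_select(rows: List[Tuple[str, float]], threshold: float) -> List[Tuple[str, float]]:
    """Return (name, degree) pairs whose degree reaches the threshold, best first."""
    scored = [(name, young(age)) for name, age in rows]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


people = [("Ana", 22), ("Ben", 30), ("Cy", 40)]
print(fuzzy_select(people, threshold=0.3))
# prints: [('Ana', 1.0), ('Ben', 0.5)]
```

Unlike a crisp condition such as `age < 30`, the result is graded, so borderline rows are ranked lower instead of being dropped outright.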

All papers in Query Generation

Min-based qualitative possibilistic networks are one of the effective tools for a compact representation of decision problems under uncertainty. The exact approaches for computing decision based on possibilistic networks are limited by... more
We are developing a web-based plagiarism detection system to detect plagiarism in written Arabic documents. This paper describes the proposed framework of our plagiarism detection system. The proposed plagiarism detection framework... more
Considerable interest has been given to Multiword Expression (MWE) identification and treatment. The identification of MWEs affects the quality of results of different tasks heavily used in natural language processing (NLP) such as... more
Discharge summaries serve a variety of aims, ranging from clinical care to legal purposes. They are also important tools in patient empowerment, but a patient's comprehension of the information is often suboptimal. Continuing in the... more
Online information is growing enormously day by day with the blessing of World Wide Web. Search engines often provide users with abundant collection of articles; in particular, news articles which are retrieved from different news sources... more
In this research work, we examine one of the most widely used networking websites, namely Facebook, for conducting courses as a replacement for valuable classical electronic learning platforms. At the initial stage of the Internet... more
In this paper, we study the social networking website, Facebook, for conducting courses as a replacement for high-cost classical electronic learning platforms. At the early stage of the Internet community, users of the Internet used email... more
This article describes ongoing research that intends to develop a plagiarism detection system for Arabic documents. We developed different heuristics to generate effective queries for document retrieval from the Web. The performance... more
Energy efficient building design and construction calls for extensive collaboration between different subfields of the Architecture, Engineering and Construction (AEC) community. Performing building design and construction engineering... more
Document keywords are associated to documents as summarized versions of the documents' content. Considering that the number of documents is quickly growing every day, the availability of these keywords is very important. Although,... more
In this paper we describe our participation in the CLEF 2018 Consumer Health Search Task, subtask IRTask1. This track aims to evaluate and advance search technologies aimed at supporting consumers to find health advice online. Our... more
Nowadays, social media are very popular among their users. One of the most well-known social networks is Twitter. It is a micro-blog that enables its users to send short messages called tweets. A tweet is a 140-character-long message... more
Multi Sentence Compression (MSC) is of great value to many real world applications, such as guided microblog summarization, opinion summarization and newswire summarization. Recently, word graph-based approaches have been proposed and... more
To evaluate and improve medical information retrieval, benchmarking data sets need to be created. Few benchmarks have been focusing on patients’ information needs. There is a need for additional benchmarks to enable research into... more
Plagiarism is becoming more of a problem in academics. It's made worse by the ease with which a wide range of resources can be found on the internet, as well as the ease with which they can be copied and pasted. It is academic theft since... more
This paper describes the participation of MRIM team in Task 3: Patient-Centered Information Retrieval-IRTask 1: Ad-hoc search of CLEF eHealth Evaluation lab 2016. The aim of this task is to evaluate the effectiveness of... more
An Automatic Summary generation process creates a shortened version of the text using a Digital programming Technology, with the aim of holding the most advanced important points of the original text. In a Common Law system, previous... more
Plagiarism in students' source code constitutes an important drawback for the educational process. In addition, plagiarism detection in source code is a time-consuming and tiresome task. Therefore, many approaches for plagiarism detection... more
This paper details the collection, systems and evaluation methods used in the IR Task of the CLEF 2016 eHealth Evaluation Lab. This task investigates the effectiveness of web search engines in providing access to medical information for... more
Plagiarism detection is gaining importance due to requirements for integrity in Research works especially when it comes to Cross-lingual plagiarism. In this paper, we have researched a new approach for Cross-Lingual sentence level... more
Many applications require the affiliation of sentences (which includes text summarization, answering questions, producing natural language, analyzing natural language, and text clustering). The similarity of terms may be improved using... more
In this paper we introduce the MatchDetectReveal(MDR) system, which is capable of identifying overlapping and plagiarised documents. Each component of the system is briefly described. The matching-engine component uses a modified suffix... more
The recruitment of new personnel is one of the most essential business processes which affect the quality of human capital within any company. It is highly essential for the companies to ensure the recruitment of right talent to maintain... more
Plagiarism has become an infamous problem in the global academic community. Detecting plagiarism in Arabic documents is particularly a challenging task due to the complexity of the structure of the language. This paper introduces a... more
by Wei Liu and 1 more
This paper discusses a new metric that has been applied to verify the quality in translation between sentence pairs in parallel corpora of Arabic-English. This metric combines two techniques, one based on sentence length and the other... more
The shift in human computer interaction from desktop computing to mobile interaction highly influences the needs for new designed interfaces. In this paper, we address the issue of searching for information on mobile devices, an area also... more
Plagiarism is described as the reuse of someone else’s previous ideas, work or even words without sufficient attribution to the source. This paper presents a method to detect external plagiarism using the integration of semantic relations between words and their syntactic composit... more
Motivated by the problem of computing investment portfolio weightings we investigate various methods of clustering as alternatives to traditional mean-variance approaches. Such methods can have significant benefits from a practical point... more
This paper stresses the need for using Knowledge Management (KM) in the higher education institutions of Saudi Arabia. The paper is based on the literature review and personal experience of the author in the education sector. The paper... more
In this paper, implementations of three Hough Transform based fingerprint alignment algorithms are analyzed with respect to time complexity on Java Card environment. Three algorithms are: Local Match Based Approach (LMBA), Discretized... more
Clustering is an extensively studied data mining problem in the text domains. The difficulty finds numerous applications in customer segmentation, classification, collaborative filtering, visualization, document organization, and... more
The nature of Arabic language structure exposes the need for fuzzy or vague concept to reveal dishonest practices in Arabic documents. In this paper, we present a statement-based plagiarism detection approach in Arabic scripts... more
Research in Information Retrieval has traditionally focused on serving the best results for a single query. Real users, however, often begin an interaction with a search engine with a sufficiently under-specified information need... more