Using Semantic Query Expansion and Relevance Feedback for an Effective Tweet Contextualization Process
Bound to 240 characters, tweets are short and often ambiguous because they are written without regard for formal grammar or proper spelling. These constraints make a tweet more likely to be misunderstood, so it is essential to know the original context in which it was written. This paper addresses the tweet contextualization task, which aims to produce an informative and coherent paragraph, called a context, from a set of documents in response to the topics treated by the tweet, allowing a reader to better understand it. We propose to combine a semantic query expansion approach with a Relevance Feedback technique in order to enhance both queries (tweets) and documents (returned by the Information Retrieval System), and so produce more informative contexts. The effectiveness of our method is demonstrated through an experimental study conducted on the INEX 2014 collection.
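As a rough illustration of the relevance feedback idea mentioned in the abstract above, the following minimal sketch expands a tweet with the most frequent new terms found in the top-ranked documents. It is an assumption-laden toy: the paper's semantic expansion method is not described here, and the names `tokenize`, `expand_query`, and `STOPWORDS`, the frequency-based term selection, and the sample texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "is", "a", "an", "on", "in", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def expand_query(tweet, feedback_docs, k=3):
    """Append the k most frequent terms from the feedback documents
    that do not already occur in the tweet (pseudo-relevance feedback)."""
    query_terms = tokenize(tweet)
    seen = set(query_terms)
    counts = Counter()
    for doc in feedback_docs:
        counts.update(t for t in tokenize(doc) if t not in seen)
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    docs = [
        "Curiosity is a NASA rover exploring Mars",
        "The NASA rover landed in Gale crater",
        "NASA launched the rover in 2011",
    ]
    print(expand_query("Curiosity landed on Mars", docs, k=2))
```

In this toy setting the feedback documents are assumed to be the top results returned for the original tweet; a semantic approach would select expansion terms by meaning rather than by raw frequency.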
Nowadays, social media are very popular. One of the best-known social networks is Twitter, a microblogging service that enables its users to send short messages called tweets. A tweet is a 140-character message that is rarely self-contained; hence additional information is necessary to make the tweet more readable. The task of tweet contextualization has attracted a great deal of attention recently. Given a tweet, its aim is to produce an informative and coherent paragraph, called a context, from a set of documents in response to the topics treated by the tweet. In this paper, we propose a new approach to tweet contextualization based on combining automatic summarization techniques and sentence aggregation. The main idea of our proposed method is to select relevant, informative, and semantically related sentences that best describe the themes expressed by the tweet, and then build a concise context.
Proceedings of the 11th International Conference on Agents and Artificial Intelligence, 2019
Tweet contextualization (TC) is a new task that aims to answer questions of the form "What is this tweet about?" The task was conceived as an extension of an earlier area called multi-document summarization (MDS), which consists in generating a summary from many sources. In both TC and MDS, the summary should ideally contain the most relevant information on the topic discussed in the source texts (for MDS) or related to the query (for TC). Besides being informative, a summary should be coherent, i.e. well written, readable, and grammatically compact. Coherence is therefore an essential characteristic of comprehensible texts. In this paper, we propose a new approach, based on bipartite graphs, to improve readability and coherence in tweet contextualization. The main idea of our proposed method is to reorder the sentences of a given paragraph by combining the detection of the most expressive words with the HITS (Hyperlink-Induced Topic Search) algorithm to build a coherent context.
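To make the HITS step named in the abstract above more concrete, here is a minimal sketch of HITS run on a bipartite graph that links sentences to the words they contain. It is not the authors' method: treating sentences as hubs and words as authorities, ranking sentences by hub score, and the name `hits_bipartite` are all assumptions made for illustration, and the detection of expressive words is left out.

```python
import math


def hits_bipartite(sentences, iterations=50):
    """Run HITS on a sentence-word bipartite graph.

    sentences: list of sets of words (one set per sentence).
    Returns (hub, auth): hub[i] is the score of sentence i,
    auth[w] is the score of word w.
    """
    words = sorted(set().union(*sentences))
    hub = [1.0] * len(sentences)
    auth = {w: 1.0 for w in words}
    for _ in range(iterations):
        # Authority update: a word collects the hub scores of its sentences.
        auth = {
            w: sum(hub[i] for i, s in enumerate(sentences) if w in s)
            for w in words
        }
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {w: v / norm for w, v in auth.items()}
        # Hub update: a sentence collects the authority scores of its words.
        hub = [sum(auth[w] for w in s) for s in sentences]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]
    return hub, auth


if __name__ == "__main__":
    sents = [{"mars", "rover", "nasa"}, {"rover", "nasa"}, {"weather"}]
    hub, auth = hits_bipartite(sents)
    # Sentence indices from highest to lowest hub score.
    order = sorted(range(len(sents)), key=lambda i: -hub[i])
    print(order)
```

Sentences that share many words with other sentences reinforce each other, so an isolated sentence sinks in the ranking; an actual reordering scheme would combine such scores with further criteria.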
Papers by Amira Dhokar