Jérémy Barbay

Papers by Jérémy Barbay

09171 Executive Summary – Adaptive, Output Sensitive, Online and Parameterized Algorithms

Traditionally, the analysis of algorithms measures the complexity of a problem or algorithm in terms of the worst-case behavior over all inputs of a given size. However, in certain cases an improved algorithm can be obtained by considering a finer partition of the input space. As this idea has been independently rediscovered in many areas, the workshop gathered participants from different fields to explore the impact and the limits of this technique, in the hope of sparking new collaborations and seeding the unification of the technique.

09171 Abstracts Collection – Adaptive, Output Sensitive, Online and Parameterized Algorithms

From 19.04. to 24.04.2009, the Dagstuhl Seminar 09171 "Adaptive, Output Sensitive, Online and Parameterized Algorithms" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Deterministic Algorithm for the t-Threshold Set Problem

Algorithms and Computation, 2003

Given k sorted arrays, the t-Threshold problem, which is motivated by indexed search engines, consists of finding the elements which are present in at least t of the arrays. We present a new deterministic algorithm for it and prove that, asymptotically in the sizes of the arrays, it is optimal in the alternation model used to study adaptive algorithms. We define the Opt-Threshold problem as finding the smallest nonempty t-threshold set, which is equivalent to finding the largest t such that the t-threshold set is nonempty, and propose a naive algorithm to solve it.
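To make the problem statement concrete, here is a minimal Python sketch of the t-Threshold and Opt-Threshold problems. It is a plain merge-and-count baseline, not the deterministic adaptive algorithm of the paper; the function names `t_threshold` and `opt_threshold` are hypothetical, and the arrays are assumed to be sorted and free of duplicates.

```python
import heapq
from typing import List, Sequence, Tuple


def t_threshold(arrays: Sequence[Sequence[int]], t: int) -> List[int]:
    """Return, in increasing order, the values present in at least t arrays.

    Assumption: every array is sorted and contains no duplicates.
    """
    result: List[int] = []
    current, count = None, 0
    # heapq.merge yields the values of all arrays as one sorted stream,
    # so equal values arrive consecutively and can be counted as a run.
    for value in heapq.merge(*arrays):
        if value == current:
            count += 1
        else:
            if current is not None and count >= t:
                result.append(current)
            current, count = value, 1
    if current is not None and count >= t:
        result.append(current)
    return result


def opt_threshold(arrays: Sequence[Sequence[int]]) -> Tuple[int, List[int]]:
    """Largest t whose t-threshold set is nonempty, with that set.

    Simple illustration: try t = k, k-1, ..., 1 in turn.
    """
    for t in range(len(arrays), 0, -1):
        found = t_threshold(arrays, t)
        if found:
            return t, found
    return 0, []
```

For example, `t_threshold([[1, 3, 5, 7], [3, 4, 5], [5, 7, 9]], 2)` keeps the values that occur in at least two of the three arrays. This baseline reads every element, whereas an adaptive algorithm aims to skip most of the input on easy instances.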

Adaptive intersection and t-threshold problems

ACM-SIAM Symposium on Discrete Algorithms, 2002

Consider the problem of computing the intersection of k sorted sets. In the comparison model, we prove a new lower bound which depends on the non-deterministic complexity of the instance, and implies that the algorithm of Demaine, López-Ortiz and Munro [2] is usually optimal in this "adaptive" sense. We extend the lower bound and the algorithm to the t-Threshold Problem,
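As an illustration of what "adaptive" can mean for this problem, the following Python sketch intersects k sorted arrays by cycling through them and advancing each one with a doubling (galloping) search. It is written in the spirit of adaptive intersection algorithms but is not claimed to reproduce the algorithm of Demaine, López-Ortiz and Munro or the one analyzed in the paper; the names `gallop` and `intersect` are hypothetical, and the arrays are assumed sorted without duplicates.

```python
from bisect import bisect_left
from typing import List, Sequence


def gallop(arr: Sequence[int], target: int, lo: int) -> int:
    """Smallest index i >= lo with arr[i] >= target (len(arr) if none)."""
    step, hi = 1, lo
    # Double the step until the target is bracketed or the array ends.
    while hi < len(arr) and arr[hi] < target:
        lo = hi + 1
        hi += step
        step *= 2
    # Finish with a binary search inside the bracket [lo, hi].
    return bisect_left(arr, target, lo, min(hi, len(arr)))


def intersect(arrays: Sequence[Sequence[int]]) -> List[int]:
    """Intersection of k sorted, duplicate-free arrays."""
    k = len(arrays)
    if k == 0 or any(len(a) == 0 for a in arrays):
        return []
    if k == 1:
        return list(arrays[0])
    pos = [0] * k            # current position in each array
    result: List[int] = []
    candidate = arrays[0][0]
    matches = 1              # arrays known to contain the candidate
    i = 0                    # array that last supplied or confirmed it
    while True:
        if matches == k:
            result.append(candidate)
            pos[i] += 1      # move past the reported value
            if pos[i] == len(arrays[i]):
                return result
            candidate, matches = arrays[i][pos[i]], 1
        i = (i + 1) % k
        pos[i] = gallop(arrays[i], candidate, pos[i])
        if pos[i] == len(arrays[i]):
            return result    # one array is exhausted: nothing more to find
        if arrays[i][pos[i]] == candidate:
            matches += 1
        else:
            # The candidate is missing here; the next larger value takes over.
            candidate, matches = arrays[i][pos[i]], 1
```

The cost of each galloping step grows with the logarithm of the distance skipped, so instances where the arrays interleave rarely are resolved with far fewer comparisons than a full merge.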

On the discrete Bak-Sneppen model of self-organized criticality

ACM-SIAM Symposium on Discrete Algorithms, 2001

We propose a discrete variant of the Bak-Sneppen model for self-organized criticality. In this process, a configuration is an n-bit word, and at each step one chooses a random bit of minimum value (usually a zero) and replaces it and its two neighbors by independent Bernoulli variables with parameter p. We prove bounds on the average number of ones in
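The update rule described above can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the bits are arranged on a ring so that every bit has two neighbors, the process starts from the all-zero word, and n is at least 3. The names `step` and `average_ones` are hypothetical and not taken from the paper.

```python
import random
from typing import List


def step(config: List[int], p: float, rng: random.Random) -> None:
    """Apply one update in place to a circular n-bit configuration."""
    n = len(config)
    lowest = min(config)
    # Choose uniformly among the positions holding the minimum value.
    candidates = [i for i, bit in enumerate(config) if bit == lowest]
    i = rng.choice(candidates)
    # Resample the chosen bit and its two neighbors as Bernoulli(p) bits.
    for j in (i - 1, i, i + 1):
        config[j % n] = 1 if rng.random() < p else 0


def average_ones(n: int, p: float, steps: int, seed: int = 0) -> float:
    """Empirical average fraction of ones over `steps` updates.

    Assumptions: ring topology, all-zero start, n >= 3.
    """
    rng = random.Random(seed)
    config = [0] * n
    total = 0
    for _ in range(steps):
        step(config, p, rng)
        total += sum(config)
    return total / (steps * n)
```

Running `average_ones` for a range of values of p gives an empirical curve that can be compared against proven bounds; the seed parameter makes runs reproducible.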

Efficient Algorithms for Context Query Evaluation over a Tagged Corpus

2009 International Conference of the Chilean Computer Science Society, 2009

We present an optimal adaptive algorithm for context queries in tagged content. The queries consist of locating instances of a tag within a context specified by the query using patterns with preorder, ancestor-descendant and proximity operators in the document tree implied by the tagged content. The time taken to resolve a query Q on a document tree T is logarithmic in the size of T, proportional to the size of Q, and to the difficulty of the combination of Q with T, as measured by the minimal size of a certificate of the answer. The performance of the algorithm is no worse than the classical worst-case optimal, while provably better on simpler queries and corpora. More formally, the algorithm runs in time O(δk lg(n/δk)) in the standard RAM model and in time O(δk lg lg min(n, σ)) in the Θ(lg(n))-word RAM model, where k is the number of edges in the query, δ is the minimum number of operations required to certify the answer to the query, n is the number of nodes in the tree, and σ is the number of labels indexed.

Faster Adaptive Set Intersections for Text Searching

Lecture Notes in Computer Science, 2006

The intersection of large ordered sets is a common problem in the context of the evaluation of boolean queries to a search engine. In this paper we engineer a better algorithm for this task, which improves over those proposed by Demaine, Munro and López-Ortiz [SODA 2000/ALENEX 2001], by using a variant of interpolation search. More specifically, our contributions are threefold. First, we corroborate and complete the practical study from Demaine et al. on comparison-based intersection algorithms. Second, we show that in practice replacing binary search and galloping (one-sided binary) search [4] by interpolation search improves the performance of each of the main intersection algorithms. Third, we introduce and test variants of interpolation search: this results in an even better intersection algorithm. Topics. Evaluation of Algorithms for Realistic Environments, Implementation, Testing, Evaluation and Fine-tuning of Algorithms, Information Retrieval.
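For readers unfamiliar with the search primitive mentioned above, here is a minimal Python sketch of textbook interpolation search over a sorted integer array. It is not one of the variants engineered in the paper; the function name `interpolation_search` is hypothetical, and the function returns an insertion point so that it could stand in for a binary search inside an intersection loop.

```python
from typing import Sequence


def interpolation_search(arr: Sequence[int], target: int) -> int:
    """Smallest index i with arr[i] >= target (len(arr) if none).

    Assumption: arr is sorted in non-decreasing order.
    """
    lo, hi = 0, len(arr) - 1
    # Invariant: arr[j] < target for j < lo, arr[j] >= target for j > hi.
    while lo <= hi:
        if target <= arr[lo]:
            return lo
        if target > arr[hi]:
            return hi + 1
        # Now arr[lo] < target <= arr[hi], so the denominator is positive.
        # Guess the position as if values grew linearly with the index.
        mid = lo + (target - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return lo
```

On values that are close to uniformly distributed, the position guess tends to land near the answer, which is why interpolation can beat binary search in practice; on skewed data it can degrade, which motivates testing variants experimentally.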

An experimental investigation of set intersection algorithms for text searching

ACM Journal of Experimental Algorithmics, 2009

The intersection of large ordered sets is a common problem in the context of the evaluation of boolean queries to a search engine. In this article, we propose several improved algorithms for computing the intersection of sorted arrays, and in particular for searching sorted arrays in the intersection context. We perform an experimental comparison with the algorithms from the previous studies by Demaine, López-Ortiz, and Munro [ALENEX 2001] and by Baeza-Yates and Salinger [SPIRE 2005]; in addition, we implement and test the intersection algorithm from Barbay and Kenyon [SODA 2002] and its randomized variant [SAGA 2003]. We consider the random data set from Baeza-Yates and Salinger, the Google queries used by Demaine et al., a corpus provided by Google, and a larger corpus from the TREC Terabyte 2006 efficiency query stream, along with its own query log. We measure the performance both in terms of the number of comparisons and searches performed, and in terms of the CPU time ...

Alternation and redundancy analysis of the intersection problem

ACM Transactions on Algorithms, 2008

The intersection of sorted arrays problem has applications in search engines such as Google. Previous work has proposed and compared deterministic algorithms for this problem, in an adaptive analysis based on the encoding size of a certificate of the result (cost analysis). We define the alternation analysis, based on the nondeterministic complexity of an instance. In this analysis we prove that there is a deterministic algorithm asymptotically performing as well as any randomized algorithm in the comparison model. We define the redundancy analysis, based on a measure of the internal redundancy of the instance. In this analysis we prove that any algorithm optimal in the redundancy analysis is optimal in the alternation analysis, but that there is a randomized algorithm which performs strictly better than any deterministic algorithm in the comparison model. Finally, we describe how these results can be extended beyond the comparison model.

Smaller and Faster: Succinct Data Structures

Faster Set Intersection Algorithms for Text Searching

ACM Journal of Experimental Algorithmics, Sep 1, 2006

The intersection of large ordered sets is a common problem in the context of the evaluation of boolean queries to a search engine. In this paper we propose several improved algorithms for computing the intersection of sorted arrays, and in particular for searching sorted arrays in the intersection context. We perform an experimental comparison with the algorithms from the previous studies by Demaine, López-Ortiz and Munro [ALENEX 2001], and by Baeza-Yates and Salinger [SPIRE 2005]; in addition, we ...

Convex hull of the union of convex objects in the plane: an adaptive analysis

Proc. 20th CCCG, 2008

We prove a tight asymptotic bound of Θ(δ log(n/δ)) on the worst case computational complexity of the convex hull of the union of two convex objects of sizes summing to n requiring δ orientation tests to certify the answer. For more convex objects, we prove a (non-optimal) asymptotic bound of O(δ ∑_{i=1}^{k} log(n_i/δ)) on the worst case computational complexity of the convex hull of the union of k convex objects of respective sizes (n_1, ..., n_k) requiring δ orientation tests to certify the answer. Our algorithms are ...
