Papers by Ohad Ben-Baruch

Zenodo (CERN European Organization for Nuclear Research), Oct 7, 2022
This paper presents a generic approach for deriving detectably recoverable implementations of many widely-used concurrent data structures. Such implementations are appealing for emerging systems featuring byte-addressable non-volatile main memory (NVMM), whose persistence allows failed threads to be resurrected efficiently after crashes. Detectable recovery ensures that after a crash, every executed operation is able to recover and return a correct response, and that the state of the data structure is not corrupted. Our approach, called Tracking, amends descriptor objects used in existing lock-free helping schemes with additional fields that track an operation's progress towards completion, and persists these fields in order to ensure detectable recovery. Tracking avoids full-fledged logging and tracks the progress of concurrent operations in a per-thread manner, thus reducing the cost of ensuring detectable recovery. We have applied Tracking to derive detectably recoverable implementations of a linked list, a binary search tree, and an exchanger. Our experimental analysis introduces a new way of analyzing the cost of persistence instructions, not by simply counting them but by separating them into categories based on their impact on performance. The analysis reveals that understanding the actual persistence cost of an algorithm on machines with real NVMM is more complicated than previously thought and requires a thorough evaluation, since the impact of different persistence instructions on performance may vary greatly. We consider this analysis to be one of the major contributions of the paper.
arXiv (Cornell University), May 25, 2021
Non-Volatile Random Access Memory (NVRAM) is a novel type of hardware that combines the benefits of traditional persistent memory (persistency of data over hardware failures) and DRAM (fast random access). In this work, we describe an algorithm that can be used to execute NVRAM programs and recover the system after a hardware failure while taking the architecture of real-world NVRAM systems into account. Moreover, the algorithm can be used to execute NVRAM-destined programs on commodity persistent hardware, such as hard drives. This allows us to test NVRAM algorithms using only cheap hardware, without access to NVRAM. We report on the use of our algorithm to implement and test an NVRAM CAS algorithm.

arXiv (Cornell University), Dec 7, 2020
Linearizability, the traditional correctness condition for concurrent data structures, is considered insufficient for the non-volatile shared memory model where processes recover following a crash. For this crash-recovery shared memory model, strict-linearizability is considered appropriate since, unlike linearizability, it ensures that operations that crash take effect prior to the crash or not at all. This work formalizes and answers the question of whether an implementation of a data type derived for the crash-stop shared memory model is also strict-linearizable in the crash-recovery model. This work presents a rigorous study to prove that helping mechanisms, typically employed by non-blocking implementations, are the algorithmic abstraction that delineates linearizability from strict-linearizability. Our first contribution formalizes the crash-recovery model and how explicit process crashes and recovery introduce further dimensionalities over the standard crash-stop shared memory model. We make the following technical contributions that answer the question of whether a help-free linearizable implementation is strict-linearizable in the crash-recovery model: (i) we prove, surprisingly, that there exist linearizable implementations of object types that are help-free, yet not strict-linearizable; (ii) we then present a natural definition of help-freedom to prove that any obstruction-free, linearizable and help-free implementation of a total object type is also strict-linearizable. The next technical contribution addresses the question of whether a strict-linearizable implementation in the crash-recovery model is also help-free linearizable in the crash-stop model. To that end, we prove that for a large class of object types, a non-blocking strict-linearizable implementation cannot have helping.
Viewed holistically, this work provides the first precise characterization of the intricacies in applying a concurrent implementation designed for the crash-stop (resp. crash-recovery) model to the crash-recovery (resp. crash-stop) model.

arXiv (Cornell University), Dec 23, 2020
Flat combining (FC) is a synchronization paradigm in which a single thread, holding a global lock, collects requests by multiple threads for accessing a concurrent data structure and applies their combined requests to it. Although FC is sequential, it significantly reduces synchronization overheads and cache invalidations and thus often provides better performance than that of lock-free implementations. The recent emergence of non-volatile memory (NVM) technologies increases the interest in the development of persistent concurrent objects. These are objects that are able to recover from system failures and ensure consistency by retaining their state in NVM and fixing it, if required, upon recovery. Of particular interest are detectable objects that, in addition to ensuring consistency, allow recovery code to infer if a failed operation took effect before the crash and, if it did, obtain its response. In this work, we present the first FC-based persistent object implementations. Specifically, we introduce a detectable FC-based implementation of a concurrent LIFO stack, a concurrent FIFO queue, and a double-ended queue. Our empirical evaluation establishes that due to flat combining, the novel implementations require a much smaller number of costly persistence instructions than competing algorithms and are therefore able to significantly outperform them.
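As a rough illustration of the flat combining paradigm described in this abstract, the following Python sketch shows a volatile FC stack: threads publish requests, and whichever thread holds the global lock applies the whole batch to a sequential stack. This is not the paper's algorithm. It has no persistence and no detectability, and the class name `FlatCombiningStack`, its fields, and the shared publication list guarded by a small lock (in place of per-thread publication records) are all assumptions made for brevity.

```python
import threading


class FlatCombiningStack:
    """Volatile flat-combining LIFO stack (illustrative; no persistence)."""

    def __init__(self):
        self._items = []                   # sequential stack, touched only by the combiner
        self._lock = threading.Lock()      # global combiner lock
        self._pub_lock = threading.Lock()  # guards the publication list
        self._pending = []                 # published requests not yet applied

    def _submit(self, op, arg=None):
        # Publish the request so that whichever thread becomes combiner can serve it.
        req = {"op": op, "arg": arg, "result": None, "done": threading.Event()}
        with self._pub_lock:
            self._pending.append(req)
        while not req["done"].is_set():
            if self._lock.acquire(blocking=False):
                try:
                    self._combine()        # this thread is the combiner now
                finally:
                    self._lock.release()
            else:
                # Another thread is combining; wait briefly for it to serve us.
                req["done"].wait(timeout=0.001)
        return req["result"]

    def _combine(self):
        # Take the whole batch, then apply it sequentially to the stack.
        with self._pub_lock:
            batch, self._pending = self._pending, []
        for req in batch:
            if req["op"] == "push":
                self._items.append(req["arg"])
            else:  # "pop": returns None on an empty stack
                req["result"] = self._items.pop() if self._items else None
            req["done"].set()

    def push(self, value):
        self._submit("push", value)

    def pop(self):
        return self._submit("pop")
```

Because only the combiner touches `_items`, the stack itself needs no fine-grained synchronization. A persistent variant would additionally have to decide which stores the combiner flushes to NVM and when, which this sketch does not attempt.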

arXiv (Cornell University), Feb 26, 2020
The emergence of systems with non-volatile main memory (NVM) increases the interest in the design of recoverable concurrent objects that are robust to crash-failures, since their operations are able to recover from such failures by using state retained in NVM. Of particular interest are recoverable algorithms that, in addition to ensuring object consistency, also provide detectability, a correctness condition requiring that the recovery code can infer if the failed operation was linearized or not and, in the former case, obtain its response. In this work, we investigate the space complexity of detectable algorithms and the external support they require. We make the following three contributions. First, we present the first wait-free bounded-space detectable read/write and CAS object implementations. Second, we prove that the bit complexity of every N-process obstruction-free detectable CAS implementation, assuming values from a domain of size at least N, is Ω(N). Finally, we prove that the following holds for obstruction-free detectable implementations of a large class of objects: their recoverable operations must be provided with auxiliary state (state that is not required by the non-recoverable counterpart implementation) whose value must be provided from outside the operation, either by the system or by the caller of the operation. In contrast, this external support is, in general, not required if the recoverable algorithm is not detectable.

Lecture Notes in Computer Science, 2021
Flat combining (FC) is a synchronization paradigm in which a single thread, holding a global lock, collects requests by multiple threads for accessing a concurrent data structure and applies their combined requests to it. Although FC is sequential, it significantly reduces synchronization overheads and cache invalidations and thus often provides better performance than that of lock-free implementations. The recent emergence of non-volatile memory (NVM) technologies increases the interest in the development of persistent concurrent objects. These are objects that are able to recover from system failures and ensure consistency by retaining their state in NVM and fixing it, if required, upon recovery. Of particular interest are detectable objects that, in addition to ensuring consistency, allow recovery code to infer if a failed operation took effect before the crash and, if it did, obtain its response. In this work, we present the first FC-based persistent object implementations. Specifically, we introduce a detectable FC-based implementation of a concurrent LIFO stack, a concurrent FIFO queue, and a double-ended queue. Our empirical evaluation establishes that due to flat combining, the novel implementations require a much smaller number of costly persistence instructions than competing algorithms and are therefore able to significantly outperform them.

arXiv (Cornell University), May 31, 2019
This paper presents the tracking approach for deriving detectably recoverable (and thus also durable) implementations of many widely-used concurrent data structures. Such data structures, satisfying detectable recovery, are appealing for emerging systems featuring byte-addressable non-volatile main memory (NVRAM), whose persistence allows failed processes to be resurrected efficiently after crashes. Detectable recovery ensures that after a crash, every executed operation is able to recover and return a correct response, and that the state of the data structure is not corrupted. Info-Structure Based (ISB)-tracking amends descriptor objects used in existing lock-free helping schemes with additional fields that track an operation's progress towards completion, and persists these fields to memory in order to ensure detectable recovery. ISB-tracking avoids full-fledged logging and tracks the progress of concurrent operations in a per-process manner, thus reducing the cost of ensuring detectable recovery. We have applied ISB-tracking to derive detectably recoverable implementations of a queue, a linked list, a binary search tree, and an exchanger. Experimental results show the feasibility of the technique.
The Limits of Helping in Non-volatile Memory Data Structures
Springer eBooks, 2022

The emergence of systems with non-volatile main memory (NVRAM) increases the need for persistent concurrent objects. Of specific interest are recoverable implementations that, in addition to being robust to crash-failures, are also detectable. Detectability ensures that upon recovery, it is possible to infer whether the failed operation took effect or not and, in the former case, obtain its response. This work presents two recoverable detectable Fetch&Add (FAA) algorithms that are self-implementations, i.e., use only a fetch&add base object, in addition to read/write registers. The algorithms target two different models for recovery: the global-crash model and the individual-crash model. In both algorithms, operations are wait-free when there are no crashes, but the recovery code may block if there are repeated failures. We also prove that in the individual-crash model, there is no implementation of recoverable and detectable FAA using only read, write and fetch&add primitives in whic...

Tracking: PPoPP 22 - Detectable Recovery of Lock-Free Data Structures
In the PPoPP 2022 paper [1] entitled "Detectable Recovery of Lock-Free Data Structures", we present a generic approach called Tracking for deriving detectably recoverable implementations of many widely-used concurrent data structures. Such implementations are appealing for emerging systems featuring byte-addressable non-volatile main memory (NVMM), whose persistence allows failed processes to be resurrected efficiently after crashes. Detectable recovery ensures that after a crash, every executed operation is able to recover and return a correct response, and that the state of the data structure is not corrupted. We have applied Tracking to derive detectably recoverable implementations of a linked list, a binary search tree, and an exchanger. Our experimental analysis introduces a new way of analyzing the cost of persistence instructions, not by simply counting them but by separating them into categories based on their impact on performance. The analysis reveals that ...

Detectable recovery of lock-free data structures
Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2022
This paper presents a generic approach for deriving detectably recoverable implementations of many widely-used concurrent data structures. Such implementations are appealing for emerging systems featuring byte-addressable non-volatile main memory (NVMM), whose persistence allows failed threads to be resurrected efficiently after crashes. Detectable recovery ensures that after a crash, every executed operation is able to recover and return a correct response, and that the state of the data structure is not corrupted. Our approach, called Tracking, amends descriptor objects used in existing lock-free helping schemes with additional fields that track an operation's progress towards completion, and persists these fields in order to ensure detectable recovery. Tracking avoids full-fledged logging and tracks the progress of concurrent operations in a per-thread manner, thus reducing the cost of ensuring detectable recovery. We have applied Tracking to derive detectably recoverable implementations of a linked list, a binary search tree, and an exchanger. Our experimental analysis introduces a new way of analyzing the cost of persistence instructions, not by simply counting them but by separating them into categories based on their impact on performance. The analysis reveals that understanding the actual persistence cost of an algorithm on machines with real NVMM is more complicated than previously thought and requires a thorough evaluation, since the impact of different persistence instructions on performance may vary greatly. We consider this analysis to be one of the major contributions of the paper.
Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures, 2020

ArXiv, 2020
Linearizability, the traditional correctness condition for concurrent data structures, is considered insufficient for the non-volatile shared memory model where processes recover following a crash. For this crash-recovery shared memory model, strict-linearizability is considered appropriate since, unlike linearizability, it ensures that operations that crash take effect prior to the crash or not at all. This work formalizes and answers the question of whether an implementation of a data type derived for the crash-stop shared memory model is also strict-linearizable in the crash-recovery model. This work presents a rigorous study to prove that helping mechanisms, typically employed by non-blocking implementations, are the algorithmic abstraction that delineates linearizability from strict-linearizability. Our first contribution formalizes the crash-recovery model and how explicit process crashes and recovery introduce further dimensionalities over the standard crash-stop shared memory model...

Lecture Notes in Computer Science, 2021
Non-Volatile Random Access Memory (NVRAM) is a novel type of hardware that combines the benefits of traditional persistent memory (persistency of data over hardware failures) and DRAM (fast random access). In this work, we describe an algorithm that can be used to execute NVRAM programs and recover the system after a hardware failure while taking the architecture of real-world NVRAM systems into account. Moreover, the algorithm can be used to execute NVRAM-destined programs on commodity persistent hardware, such as hard drives. This allows us to test NVRAM algorithms using only cheap hardware, without access to NVRAM. We report on the use of our algorithm to implement and test an NVRAM CAS algorithm.

Proceedings of the 39th Symposium on Principles of Distributed Computing, 2020
The emergence of systems with non-volatile main memory (NVM) increases the interest in the design of recoverable concurrent objects that are robust to crash-failures, since their operations are able to recover from such failures by using state retained in NVM. Of particular interest are recoverable algorithms that, in addition to ensuring object consistency, also provide detectability, a correctness condition requiring that the recovery code can infer if the failed operation was linearized or not and, in the former case, obtain its response. In this work, we investigate the space complexity of detectable algorithms and the external support they require. We make the following three contributions. First, we present the first wait-free bounded-space detectable read/write and CAS object implementations. Second, we prove that the bit complexity of every N-process obstruction-free detectable CAS implementation, assuming values from a domain of size at least N, is Ω(N). Finally, we prove that the following holds for obstruction-free detectable implementations of a large class of objects: their recoverable operations must be provided with auxiliary state (state that is not required by the non-recoverable counterpart implementation) whose value must be provided from outside the operation, either by the system or by the caller of the operation. In contrast, this external support is, in general, not required if the recoverable algorithm is not detectable.

Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, 2018
We present a novel abstract individual-process crash-recovery model for non-volatile memory, which enables modularity, so that complex recoverable objects can be constructed in a modular manner from simpler recoverable base objects. Within the framework of this model, we define nesting-safe recoverable linearizability (NRL), a novel correctness condition that captures the requirements for nesting recoverable objects. Informally, NRL allows the recovery code to extend the interval of the failed operation until the recovery code succeeds in completing (possibly after multiple failures and recovery attempts). Unlike previous correctness definitions, the NRL condition implies that, following recovery, an implemented (higher-level) recoverable operation is able to complete its invocation of a base-object operation and obtain its response. We present algorithms for nesting-safe recoverable primitives, namely, recoverable versions of widely-used primitive shared-memory operations such as read, write, test-and-set and compare-and-swap, which can be used to implement higher-level recoverable objects. We then exemplify how these recoverable base objects can be used for constructing a recoverable counter object. Finally, we prove an impossibility result on wait-free implementations of recoverable test-and-set (TAS) objects from read, write and TAS operations, thus demonstrating that our model also facilitates rigorous analysis of the limitations of recoverable concurrent objects.

Lecture Notes in Computer Science, 2016
Obstruction-free consensus, ensuring that a process running solo will eventually terminate, is at the core of practical ways to solve consensus, e.g., by using randomization or failure detectors. An obstruction-free consensus algorithm may not terminate in many executions, but it must terminate whenever a process runs solo. Such an algorithm can be evaluated by its solo step complexity, which bounds the worst-case number of steps taken by a process running alone, from any configuration, until it decides. This paper presents a lower bound of Ω(log n) on the solo step complexity of obstruction-free binary anonymous consensus. The proof constructs a sequence of executions in which more and more distinct variables are about to be written to, and then uses the backtracking covering technique to obtain a single execution in which many variables are accessed.

Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, 2015
Mutual exclusion is a fundamental distributed coordination problem. Shared-memory mutual exclusion research focuses on local-spin algorithms and uses the remote memory references (RMRs) metric. To ensure the correctness of concurrent algorithms in general, and mutual exclusion algorithms in particular, it is often required to prohibit certain re-orderings of memory instructions that may compromise correctness, by inserting memory fence (a.k.a. memory barrier) instructions. Memory fences incur non-negligible overhead and may significantly increase time complexity. A mutual exclusion algorithm is adaptive to total contention (or simply adaptive) if the time complexity of every passage (an entry to the critical section and the corresponding exit) is a function of total contention, that is, the number of processes, k, that participate in the execution in which that passage is performed. We say that an algorithm A is f-adaptive (and that f is an adaptivity function of A) if the time complexity of every passage in A is O(f(k)). Adaptive implementations are desirable when contention is much smaller than the total number of processes, n, sharing the implementation. Recent work [5] presented the first read/write mutual exclusion algorithm with asymptotically optimal complexity under both the RMRs and fences metrics: each passage through the critical section incurs O(log n) RMRs and a constant number of fences. The algorithm works in the popular Total Store Ordering (TSO) model. The algorithm of [5] is non-adaptive, however, and its authors posed the question of whether there exists an adaptive mutual exclusion algorithm with the same complexities. We provide a negative answer to this question, thus capturing an inherent cost of adaptivity. In fact, we prove a stronger result: adaptive read/write mutual exclusion algo...
* Partially supported by the Israel Science Foundation (grants 1227/10, 1749/14) and by the Lynne and William Frankel Center for Computing Science at Ben-Gurion University.

ArXiv, 2020
Flat combining (FC) is a synchronization paradigm in which a single thread, holding a global lock, collects requests by multiple threads for accessing a concurrent data structure and applies their combined requests to it. Although FC is sequential, it significantly reduces synchronization overheads and cache invalidations and thus often provides better performance than that of lock-free implementations. The recent emergence of non-volatile memory (NVM) technologies increases the interest in the development of persistent (a.k.a. durable or recoverable) objects. These are objects that are able to recover from system failures and ensure consistency by retaining their state in NVM and fixing it, if required, upon recovery. Of particular interest are detectable objects that, in addition to ensuring consistency, allow recovery code to infer if a failed operation took effect before the crash and, if it did, obtain its response. In this work, we present the first FC-based persistent object...