Academia.edu

Data Races

397 papers
16 followers
About this topic
A data race occurs in a concurrent program when two or more threads access the same shared data at the same time, at least one of the accesses is a write, and the accesses are not ordered by proper synchronization. This can lead to unpredictable behavior and inconsistent results, making data races a critical concern in the design and implementation of multithreaded applications.
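
The definition above can be illustrated with a minimal Python sketch. The names `Counter`, `unsafe_increment`, `safe_increment`, and `run` are hypothetical and chosen only for illustration. The unsynchronized increment is a read-modify-write on shared state, so updates may be lost; the locked version orders the accesses.

```python
import threading

class Counter:
    """Shared mutable state touched by several threads (illustrative only)."""
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_increment(self):
        # Read-modify-write with no synchronization: two threads can read
        # the same old value, and one of the two updates may then be lost.
        self.value += 1

    def safe_increment(self):
        # The lock orders the accesses, so no two increments overlap.
        with self.lock:
            self.value += 1

def run(increment, n_threads=4, n_iters=10_000):
    """Call `increment` n_iters times from each of n_threads threads."""
    def worker():
        for _ in range(n_iters):
            increment()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

safe = Counter()
run(safe.safe_increment)
print(safe.value)  # always 40000

racy = Counter()
run(racy.unsafe_increment)
print(racy.value)  # at most 40000; lower if an update was lost
```

Whether the unsynchronized run actually loses updates depends on the interpreter version and thread scheduling, which is what makes such bugs hard to reproduce.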

Key research themes

1. How can benchmark suites and algorithmic innovations improve the precision and efficiency of data race detection?

This research theme focuses on the creation and enhancement of benchmark suites designed to systematically evaluate data race detection tools and on the development of algorithms that improve the accuracy and performance of these detection methods. Accurate detection is crucial to ensuring correctness and reliability in multi-threaded programs, while efficient algorithms make real-time or on-the-fly detection feasible, reducing overhead during program execution.

Key finding: This paper presents the significant expansion of the DataRaceBench suite, adding 222 benchmarks including Fortran versions and new OpenMP 5.0 features, and introduces a distance-based code similarity analysis to reduce...
Key finding: This work advances DataRaceBench by integrating 20 new data race cases and developing DataRaceBench-ML, a dataset tailored for machine learning and large language model (LLM) applications. The dataset includes detailed labels...
Key finding: The authors propose iFT, an epoch-based algorithm that eliminates the need for vector clock switching in race detection, requiring only O(1) operations to maintain access histories and detect data races. Compared to...
Key finding: This paper introduces VcTrace, a practical, efficient dynamic monitoring tool based on vector clock analysis to detect data races in multithreaded C/C++ programs. VcTrace uses dynamic binary instrumentation with minimal...
Key finding: The study presents IDRC, an Eclipse plugin providing interactive, incremental static analysis for early detection of data races in Java projects during development. By integrating data race warnings directly in the IDE and...
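
Several of the findings above rely on vector clocks to decide whether two accesses are ordered. The following is a minimal, generic sketch of that idea, not the implementation of iFT, VcTrace, or any other tool named here. The function names `happens_before`, `concurrent`, `join`, and `is_race` are hypothetical, and both accesses are assumed to target the same memory location.

```python
def happens_before(a, b):
    """True if vector clock a is ordered strictly before vector clock b."""
    return a != b and all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither clock is ordered before the other."""
    return a != b and not happens_before(a, b) and not happens_before(b, a)

def join(a, b):
    """Componentwise maximum, applied when a thread synchronizes with another."""
    return tuple(max(x, y) for x, y in zip(a, b))

def is_race(access1, access2):
    """Each access is (clock, is_write); a race needs a write and no ordering."""
    (c1, w1), (c2, w2) = access1, access2
    return (w1 or w2) and concurrent(c1, c2)

# Thread 0 writes at clock (1, 0); thread 1 reads at (0, 1) with no synchronization.
write_t0 = ((1, 0), True)
read_t1 = ((0, 1), False)
print(is_race(write_t0, read_t1))  # True

# If thread 1 first synchronizes with thread 0 (e.g. acquires a lock that
# thread 0 released), its clock absorbs thread 0's clock before the read.
synced_clock = join((0, 1), (1, 0))  # (1, 1)
read_t1_synced = (synced_clock, False)
print(is_race(write_t0, read_t1_synced))  # False
```

Two concurrent reads are never reported, because `is_race` requires at least one of the accesses to be a write.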

2. What hardware and programming model innovations can reduce the complexity and nondeterminism caused by data races in parallel systems?

This theme addresses how disciplined parallel programming models and novel hardware architectures can mitigate data race complexities in shared-memory systems. It investigates programming language abstractions ensuring data-race-freedom and deterministic behaviors, alongside hardware designs leveraging these guarantees for simpler, scalable, and energy-efficient cache coherence and memory systems. This alignment potentially reduces nondeterministic bugs and aids maintainability in multicore architectures.

Key finding: This paper argues that disciplined parallel programming models enforcing data-race-freedom and structured parallel control allow a radical redesign of shared-memory hardware, eliminating complex directory-based coherence and...
Key finding: The authors propose four novel language abstractions ('domains') enabling safe shared mutable state within the pure actor model by categorizing state as immutable, isolated, observable, or shared, each with operational...

3. How do socio-technical perspectives and engagement with data influence contentious data practices and the politics surrounding datafication?

This research focus explores the role of social movements, activism, and civil society in shaping data politics through engagements that contest dominant datafication processes. It examines bottom-up transformative practices—termed 'contentious politics of data'—that challenge or reappropriate data infrastructures, emphasizing data both as a tool and object in political struggle. Understanding these dynamics is essential to comprehending how data acts as a site of power, resistance, and care in contemporary digital societies.

Key finding: This article conceptualizes 'contentious politics of data' as civil society's bottom-up initiatives that interfere with dominant datafication, mapping data activism along two analytical dimensions: 'data as stakes'...
Key finding: Through an ethnographic study of the Housing Justice League's Tenant Power Hotline, this work highlights how grassroots organizations engage in 'careful tinkering' with data practices to negotiate between care and efficiency....
Key finding: This paper analyzes the spectacle of large-scale data visualization within tech industry contexts, framing it as 'below the line' labor involving rhetorical work to produce and sustain myths of technological progress and...

All papers in Data Races

Detecting data races in parallel programs is important for both software development and production-run diagnosis. Recently, there have been several proposals for hardware-assisted data race detection. Such proposals typically modify the...
This paper describes the methods used in Empire, a tool to detect concurrency-related bugs, namely atomic-set serializability violations in Java programs. The correctness criterion is based on atomic sets of memory locations, which share...
While multi-GPU (MGPU) systems are extremely popular for compute-intensive workloads, several inefficiencies in the memory hierarchy and data movement result in a waste of GPU resources and difficulties in programming MGPU systems. First,...
Graphics Processing Units (GPUs) are popular hardware accelerators for data-parallel applications, enabling the execution of thousands of threads in a Single Instruction Multiple Thread (SIMT) fashion. However, the SIMT execution model is...
This paper describes alternative memory semantics for Java programs using an enriched version of the Commit/Reconcile/Fence (CRF) memory model [16]. It outlines a set of reasonable practices for safe multithreaded programming in Java. Our...
We present a novel framework for defining memory models in terms of two properties: thread-local Instruction Reordering axioms and Store Atomicity, which describes inter-thread communication via memory. Most memory models have the store...
We present a solution to the reaching definitions problem for programs with explicit lexically specified parallel constructs, such as cobegin/coend or parallel sections, both with and without explicit synchronization operations, such as...
With the omnipresent usage of APIs in software development, it has become important to analyse how the routines and functionalities of APIs are actually used. This information is in particular useful for API developers, to make decisions...
Happens-before detectors are precise but can be too conservative to detect certain data races in repeated test runs as they are sensitive to thread interleaving. By making the opposite tradeoffs, lockset detectors can detect more races...
Modern operating systems are monolithic. Today, however, lack of isolation is one of the main factors undermining security of the kernel. Inherent complexity of the kernel code and rapid development pace combined with the use of unsafe,...
Efficient management of concurrent access to shared resources is crucial in modern multi-threaded systems to avoid race conditions and performance bottlenecks. Traditional locking mechanisms, such as standard read-write locks, often...
We present and solve a path optimization problem on programs. Given a set of program nodes, called critical nodes, we find a shortest path through the program's control flow graph that touches the maximum number of these nodes....
Pushdown Systems (PDSs) are an important formalism for modeling programs. Reachability analysis on PDSs has been used extensively for program verification. A key result, which made PDSs popular in the model-checking community, was that the...
We propose a novel approach for runtime verification on computers with a large number of computation cores, without any hardware extension to the mainstream PC environment. The goal of the approach is to make use of all hardware resources to...
We present an extension of Astree to concurrent C software. Astree is a sound static analyzer for run-time errors previously limited to sequential C software. Our extension employs a scalable abstraction which covers all possible thread...
Deterministic execution offers many benefits for debugging, fault tolerance, and security. Current methods of executing parallel programs deterministically, however, often incur high costs, allow misbehaved software to defeat...
With the widespread deployment of multi-core hardware, writing concurrent programs has become inescapable. This has made fixing concurrency bugs (or crugs) critical in modern software systems. Static analysis techniques to find crugs such...
SCOOP (Simple Concurrent Object-Oriented Programming) [18] is a model and practical framework for building concurrent applications. It comes as a refinement of the Eiffel [15] programming language and is in the process of being integrated...
Multiprocessors are now dominant, but real multiprocessors do not provide the sequentially consistent memory that is assumed by most work on semantics and verification. Instead, they have subtle relaxed (or weak) memory models, usually...
This paper describes JSpy, a system for high-level instrumentation of Java bytecode and its use with JPaX, our system for runtime analysis of Java programs. JPaX monitors the execution of temporal logic formulas and performs predicative...
Object-oriented programs involve many unique features that are not present in their conventional counterparts. Examples are message passing, synchronization, dynamic binding, object instantiation, persistence, encapsulation, inheritance,...
Heterogeneous systems that integrate a multicore CPU and a GPU on the same die are ubiquitous. On these systems, both the CPU and GPU share the same physical memory as opposed to using separate memory dies. Although integration eliminates...
Runtime monitoring, where some part of a program's behavior and/or data is observed during execution, is a very useful technique that software developers use for understanding, analyzing, debugging, and improving their...
With the rapid expansion of process mining implementation in global enterprises distributed across numerous branches, there is a critical requirement to develop an application qualified for real-time operation with fast and precise data...
The virtues of deterministic parallelism have been argued for decades and many forms of deterministic parallelism have been described and analyzed. Here we are concerned with one of the strongest forms, requiring that for any input there...
We present a logical tool which allows understanding the rationality of the translation underlying some interactions in Nature. In an abstract, formal way, we can demonstrate the epistemological link between a sequence and a...
This paper describes a performance evaluation technique of parallel programs based on software tracing. The interest of the proposed method is to enable post-mortem correction of the intrusion of software tracing of nondeterministic...
This paper describes multithreaded execution and data race detectors, which are commonly viewed as debugging tools. The C++ Standard defines single-threaded program execution. Basically, multithreaded execution requires a much more...
Quantum Computing lies at the frontier of computing, offering a radically different and unconventional model of computation. In the absence of practical quantum computers today, we must simulate their execution. This creates a performance...
Today's designs contain several hundreds to thousands of registers and memory elements. Starting from documentation to design implementation to verification of each single register, each bit and its property involves a lot of time and...
Dynamic program analysis tools based on code instrumentation serve many important software engineering tasks such as profiling, debugging, testing, program comprehension, and reverse engineering. Unfortunately, constructing new analysis...
The collection of dynamic metrics is an important part of performance analysis and workload characterization. We demonstrate JP2, a new tool for collecting dynamic bytecode metrics for standard Java Virtual Machines (JVMs). The...
We study conflict detection for programs with procedures, dynamic thread creation and a fixed finite set of (reentrant) monitors. We show that deciding the existence of a conflict is NP-complete for our model (that abstracts guarded...
In the single-instruction multiple-threads (SIMT) execution model, small groups of scalar threads operate in lockstep. Within each group, current SIMT hardware implementations serialize the execution of threads that follow different...
For C programs, flow-sensitivity is important to enable pointer analysis to achieve highly usable precision. Despite significant recent advances in scaling flow-sensitive pointer analysis sparsely for sequential C programs, relatively...
The C programming language continues to play an essential role in the development of system software. May-Happen-in-Parallel (MHP) analysis is the basis of many other analyses and optimisations for concurrent programs. Existing MHP...
Robust modules guarantee to do only what they are supposed to do, even in the presence of untrusted, malicious clients, and considering not just the direct behaviour of individual methods, but also the emergent behaviour from calls to more...
The last decade has witnessed the blooming emergence of many-core platforms, especially graphics processing units (GPUs). With the exponential growth of cores in GPUs, utilizing them efficiently becomes a challenge. The data-parallel...
Concurrency has been a perpetual problem in Android apps, mainly due to event-based races. Several event-based race detectors have been proposed, but they produce false positives, cannot reproduce races, and cannot distinguish between...
The Software Defect Prediction (SDP) method forecasts the occurrence of defects at the beginning of the software development process. Early fault detection will decrease the overall cost of software and improve its dependability. However,...
The increasing demand for lower power forces designers to use sophisticated power management strategies such as multivoltage and power gating, which are often accompanied by many design bugs. Correcting such bugs can be a time-consuming...
MPI is commonly used to write parallel programs for distributed memory parallel computers. MPI-CHECK is a tool developed to aid in the debugging of MPI programs that are written in free or fixed format Fortran 90 and Fortran 77. MPI-CHECK...
Memory consistency models, or memory models, allow both programmers and programming language implementers to reason about concurrent accesses to one or more memory locations. Memory model specifications balance the often conflicting needs for...