Gang Scheduling

223 papers

2 followers

About this topic

Gang scheduling is a scheduling algorithm used in operating systems and parallel computing in which a group of related processes or threads is scheduled to run simultaneously on a set of processors. Because all members of a gang run at the same time, a thread that needs to communicate or synchronize with a peer does not stall waiting for that peer to be scheduled. The approach aims to improve performance by minimizing context switching and enhancing data locality among processes that share resources.
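The simultaneous-execution idea can be illustrated with a small sketch of matrix-based gang scheduling: each row of a matrix is one time slot, each column is a processor, and every thread of a job is placed in the same row. The `Job` class, `build_matrix`, `run_slots`, and the first-fit packing rule are illustrative assumptions, not taken from any paper listed on this page.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    width: int  # number of processors the gang needs at the same time


def build_matrix(jobs, num_cpus):
    """Pack gangs into time-slot rows using first fit (an assumed policy)."""
    rows = []  # each row has num_cpus entries: a job name or None
    for job in jobs:
        if job.width > num_cpus:
            raise ValueError(f"{job.name} needs more CPUs than exist")
        for row in rows:
            free = [i for i, slot in enumerate(row) if slot is None]
            if len(free) >= job.width:
                # Place the whole gang in this row so it runs together.
                for i in free[:job.width]:
                    row[i] = job.name
                break
        else:
            # No existing row has room: open a new time slot.
            rows.append([job.name] * job.width + [None] * (num_cpus - job.width))
    return rows


def run_slots(rows, num_slots):
    """Round-robin over rows; all CPUs switch context at the same instant."""
    if not rows:
        return []
    return [rows[t % len(rows)] for t in range(num_slots)]


if __name__ == "__main__":
    jobs = [Job("A", 3), Job("B", 2), Job("C", 1), Job("D", 2)]
    for t, row in enumerate(run_slots(build_matrix(jobs, 4), 4)):
        print(f"slot {t}: {row}")
```

First fit is only one possible packing rule; it is used here because it keeps the sketch short, not because it gives the best processor utilization.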

Key research themes

1. How can fairness and equity be integrated into scheduling algorithms to improve multi-day and multi-client resource allocation?

This research theme focuses on extending traditional scheduling frameworks by embedding equity and fairness considerations, particularly when resources need to be allocated over multiple time periods or among multiple clients. This issue is critical in real-world contexts where consistent service or job completion guarantees across different users or days improve satisfaction and fairness indicators, presenting algorithmic and complexity challenges for offline scheduling with fairness constraints.

Equitable Scheduling on a Single Machine

by Danny Hermelin

2025, Proceedings of the AAAI Conference on Artificial Intelligence

Key finding: Introduces the Equitable Scheduling (ES) problem, which generalizes minimizing tardy jobs on a single machine by ensuring that each client’s job meets its deadline at least k times over m days, thereby guaranteeing fairness....
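As a rough illustration of the fairness requirement stated in the key finding above (each client on time at least k times over m days), the sketch below verifies a proposed solution. It assumes, as a simplification, that every client submits one job per day described by a (processing time, deadline) pair; the function names and data layout are hypothetical and the code only checks a candidate solution, it does not solve the ES problem.

```python
def on_time_feasible(jobs):
    """Return True if all (processing_time, deadline) jobs can meet their
    deadlines on one machine. Sequencing by earliest due date is enough to
    decide this for a single machine with a common start time of 0."""
    elapsed = 0
    for processing_time, deadline in sorted(jobs, key=lambda j: j[1]):
        elapsed += processing_time
        if elapsed > deadline:
            return False
    return True


def is_equitable(days, chosen, k):
    """days[i] maps client -> (processing_time, deadline) for day i.
    chosen[i] is the set of clients served on time on day i.
    Returns True if each day's set is feasible and every client is
    served on time on at least k days."""
    counts = {}
    for day, served in zip(days, chosen):
        if not on_time_feasible([day[c] for c in served]):
            return False
        for client in served:
            counts[client] = counts.get(client, 0) + 1
    clients = set().union(*(day.keys() for day in days))
    return all(counts.get(c, 0) >= k for c in clients)


if __name__ == "__main__":
    # Two clients whose jobs cannot both finish by time 2 on the same day.
    days = [{"a": (2, 2), "b": (2, 2)}, {"a": (2, 2), "b": (2, 2)}]
    print(is_equitable(days, [{"a"}, {"b"}], k=1))  # alternate the clients
```

Alternating which client is served on time is the intuition behind the fairness guarantee: no single day can satisfy both clients, but each can be satisfied k = 1 times across the two days.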

2. What are the algorithmic and theoretical frameworks for scheduling multiprocessor tasks involving simultaneous execution and precedence constraints?

This theme addresses scheduling problems where tasks require multiple processors simultaneously (gang scheduling) and are subject to precedence or incompatibility constraints. Such problems are modeled through graph-theoretic frameworks, notably mixed graph coloring, which unifies scheduling constraints with vertex colorings under precedence and conflict edges. These models facilitate leveraging complexity results and approximation strategies from graph coloring to design efficient scheduling algorithms for parallel tasks with synchronization requirements.

Scheduling Multiprocessor Tasks with Equal Processing Times as a Mixed Graph Coloring Problem

by Yuri Sotskov

2021, Algorithms

Key finding: Demonstrates that scheduling multiprocessor tasks with unit processing times and constraints such as precedence and simultaneous execution requirements is equivalent to finding an optimal coloring of a mixed graph (combining...
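The correspondence between unit-time scheduling and mixed graph coloring can be sketched as follows, under one common convention that is assumed here rather than taken from the paper: a vertex is a task, a color is the time slot in which it runs, an arc (u, v) requires color(u) < color(v), and an undirected edge {u, v} requires color(u) != color(v). The greedy routine is a simple heuristic for illustration and is not guaranteed to use the minimum number of colors; all function names are hypothetical.

```python
from collections import deque


def is_valid_coloring(color, arcs, edges):
    """Check precedence arcs (strictly earlier slot) and conflict edges."""
    return (all(color[u] < color[v] for u, v in arcs)
            and all(color[u] != color[v] for u, v in edges))


def greedy_coloring(vertices, arcs, edges):
    """Color vertices in topological order of the arcs, giving each the
    smallest slot that respects its predecessors and colored neighbors."""
    preds = {v: [] for v in vertices}
    succs = {v: [] for v in vertices}
    conflicts = {v: set() for v in vertices}
    for u, v in arcs:
        preds[v].append(u)
        succs[u].append(v)
    for u, v in edges:
        conflicts[u].add(v)
        conflicts[v].add(u)

    indegree = {v: len(preds[v]) for v in vertices}
    ready = deque(v for v in vertices if indegree[v] == 0)
    color = {}
    while ready:
        v = ready.popleft()
        slot = 1 + max((color[p] for p in preds[v]), default=0)
        banned = {color[u] for u in conflicts[v] if u in color}
        while slot in banned:
            slot += 1
        color[v] = slot
        for w in succs[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    if len(color) != len(vertices):
        raise ValueError("arcs contain a cycle, so no schedule exists")
    return color


if __name__ == "__main__":
    coloring = greedy_coloring(["a", "b", "c"], [("a", "b")],
                               [("b", "c"), ("a", "c")])
    print(coloring, "makespan:", max(coloring.values()))
```

The largest color used is the schedule length, which is why minimizing the number of colors corresponds to minimizing the makespan in this model.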

3. How can gang scheduling be optimized and adapted in contemporary multiprocessor and parallel computing environments, including clusters and real-time systems?

This theme explores the application, improvements, and adaptations of gang scheduling strategies to efficiently manage parallel jobs across multicore processors, clusters, and real-time systems. Emphasis is on algorithmic innovations that reduce scheduling overhead, incorporate cache and multicore architectures, manage energy and fairness tradeoffs, and handle periodic and rigid parallel tasks. The studies contribute practical scheduling frameworks enhancing parallel workload throughput, response times, and resource utilization in large-scale and time-constrained computational environments.

Enhancing Capability of Gang scheduling by integration of Multi Core Processors and Cache

by International Journal of Scientific Research in Computer Science, Engineering and Information Technology IJSRCSEIT

2023, International Journal of Scientific Research in Computer Science, Engineering and Information Technology

Key finding: Develops optimal linear programming and approximation algorithms for gang scheduling parallel implicit-deadline periodic task systems on identical multiprocessors, explicitly accounting for multi-core processors and cache...

STORM: Scalable Resource Management for Large-Scale Parallel Computers

by Eitan Frachtenberg

2022

Key finding: Presents the STORM resource management system enabling fast, scalable gang scheduling on large parallel clusters, achieving job launch speeds an order of magnitude faster than prior systems. STORM reduces scheduling overhead...

Gang FTP scheduling of periodic and parallel rigid real-time tasks

by Vandy Berten

2015, Computing Research Repository

Key finding: Provides exact schedulability tests and structural characterizations for Fixed Task Priority (FTP) gang schedulers managing parallel rigid real-time tasks. The work identifies subclasses of gang scheduling that are...

A comprehensive performance and energy consumption analysis of scheduling alternatives in clusters

by Andy Yo

2022, The Journal of Supercomputing

Key finding: Through experimental evaluation on a 16-node Myrinet Linux cluster, the study compares batch scheduling, gang scheduling, and multiple coscheduling algorithms, including a newly proposed HYBRID scheme combining advantages of...

Concerning the Length of Time Slots for Efficient Gang Scheduling

by Andrzej Goscinski

2016

Key finding: Addresses an underexplored parameter in gang scheduling, the length of scheduling time slots, and proposes a strategy balancing overhead reduction against job waiting times. The paper shows that overly long time slots reduce...
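The tension between overhead and waiting time mentioned in the last key finding can be made concrete with a toy model. This is not the model from the paper: it assumes a fixed context-switch cost per slot and a round-robin cycle over a fixed number of rows of gangs, and the function and parameter names are hypothetical.

```python
def slot_tradeoff(slot_len, switch_cost, num_rows):
    """Toy model of gang-scheduling time slots.

    Returns (overhead_fraction, max_wait) where
    - overhead_fraction is the share of each cycle spent switching, and
    - max_wait is how long a gang waits between two of its own slots
      while the other rows take their turns.
    """
    cycle = slot_len + switch_cost
    overhead_fraction = switch_cost / cycle
    max_wait = (num_rows - 1) * cycle
    return overhead_fraction, max_wait


if __name__ == "__main__":
    for slot_len in (9.0, 99.0, 999.0):
        overhead, wait = slot_tradeoff(slot_len, switch_cost=1.0, num_rows=4)
        print(f"slot={slot_len:6.1f}  overhead={overhead:.3%}  max wait={wait:.0f}")
```

In this simplified view, lengthening the slot drives the switching overhead toward zero while the wait between a gang's turns grows linearly, which is the kind of balance a slot-length strategy has to strike.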

All papers in Gang Scheduling

Implications of I/O for gang scheduled workloads

by Victor Lee

2025, Lecture Notes in Computer Science

The job workloads of general-purpose multiprocessors usually include both compute-bound parallel jobs, which often require gang scheduling, as well as I/O-bound jobs, which require high CPU priority for the individual gang members of the...

Defining a Comprehensive Threat Model for High Performance Computational Clusters

by William Yurcik

2025, arXiv (Cornell University)


On the periodic behavior of real-time schedulers on identical multiprocessor platforms

by Emmanuel Grolleau

2025, HAL (Le Centre pour la Communication Scientifique Directe)

This paper proposes a general periodicity result concerning any deterministic and memoryless scheduling algorithm (including non-work-conserving algorithms), for any context, on identical multiprocessor platforms. By context we mean...

Periodicity of real-time schedules for dependent periodic tasks on identical multiprocessor platforms

by Emmanuel Grolleau

2025, Real-Time Systems

This paper gives and proves correct a simulation interval for any schedule generated by a deterministic and memoryless scheduler (i.e., one where the scheduling decision is the same and unique for any two identical system states) for...

Checkpointing Implementation for Real-time and Fault Tolerant Applications on RTAI

by Nianen Chen

2024

Checkpointing Rollback Recovery protocol is often used to provide fault tolerance for real-time applications. However, existing checkpointing implementations support only non-real-time applications as the checkpointing overhead is usually...

Scheduling of Parallel Jobs on Dynamic, Heterogenous Networks

by Jeremy Casas

2024

In using a shared network of workstations for parallel processing, it is not only important to consider heterogeneity and differences in processing power between the workstations but also the dynamics of the system as a whole. In such a...

RodosVisor - An object-oriented and customizable hypervisor: The CPU virtualization

by Mongkol Ekpanyapong

2024, Embedded Systems Computational Intelligence and Telematics in Control, 1st IFAC Conference on

RodosVisor is an object-oriented and bare-metal virtual machine monitor (VMM) or hypervisor designed for the aerospace industry, mainly to provide time and spatial separation to the NetworkCentric core avionics machine, Montenegro and...

Improving throughput and utilization in parallel machines through concurrent gang

by Fabrício Pereira Quintanilha da Silva

2024, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000

In this paper we propose a new class of scheduling policies, dubbed Concurrent Gang, that combines the advantages of gang scheduling for communication and synchronization intensive parallel jobs with the flexibility of a Unix scheduler...

Simulation-based average case analysis for parallel job scheduling

by Fabrício Pereira Quintanilha da Silva

2024, Proceedings. 34th Annual Simulation Symposium


Efficient Parallel Job Scheduling Using Gang Service

by Fabrício Pereira Quintanilha da Silva

2024, International Journal of Foundations of Computer Science

Gang scheduling has been widely used as a practical solution to the dynamic parallel job scheduling problem. To overcome some of the limitations of traditional Gang scheduling algorithms, Concurrent Gang is proposed as a class of...

Further Analysis with Linear Programming on Blocking Time Bounds for Partitioned Fixed Priority Multiprocessor Scheduling

by 亮倉地

2024, Journal of Information Processing

The recently developed FMLP+ provides significant advantages for partitioned fixed priority scheduling, since it ensures asymptotically optimal O(n) maximum priority-inversion blocking. The constraints under the FMLP+ can be exploited...

vSMT-IO: Improving I/O Performance and Efficiency on SMT Processors in Virtualized Clouds

by Tsz On LI

2023

The paper focuses on an under-studied yet fundamental issue on Simultaneous Multi-Threading (SMT) processors: how to schedule I/O workloads, so as to improve I/O performance and efficiency. The paper shows that existing techniques used...

Processor-Oblivious Parallel Stream Computations

by Daouda Traoré

2023, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008)

We study the problem of parallel stream computations on a multiprocessor architecture. Modelling the problem, we exhibit that any parallelisation introduces an arithmetic overhead related to intermediate copy operations. We provide lower...

Dynamic Load Distribution in MIST

by Dylan J McNamee

2023

This paper presents an algorithm for scheduling parallel applications in large-scale, multiuser, heterogeneous distributed systems. The approach is primarily targeted at systems that harvest idle cycles in general-purpose workstation...

Fine-grain priority scheduling on multi-channel memory systems

by Zhichun Zhu

2023, Proceedings Eighth International Symposium on High Performance Computer Architecture

Configurations of contemporary DRAM memory systems become increasingly complex. A recent study [5] shows that application performance is highly sensitive to choices of configurations, and suggests that tuning burst sizes and channel...

Figure 1. The order of transferring sub-blocks on a DRAM system with four memory channels: (a) without priority scheduling and (b) using fine-grain priority scheduling. The letters A to D represent cache blocks, each of which is split into eight sub-blocks. The boxes with bold letters represent the critical sub-blocks that contain the desired data.

Figure 2. Fractions of bursty phase in execution for SPEC2000 programs.

Figure 3. Distribution of the number of concurrent accesses.

Figure 4. Waiting time distribution of critical and non-critical load sub-blocks.

Figure 5. Probabilities of multiple critical sub-blocks mapping to the same channel.

Figure 6. IPC on 2-channel and 4-channel Direct Rambus DRAM systems.

Table 1. Key processor parameters. We use SimpleScalar 3.0b [3] to simulate an out-of-order execution processor. An event-driven simulation of a multi-channel Direct Rambus DRAM system is incorporated into the original simulator.

Table 2. Key parameters of the Direct Rambus DRAM used in the simulation. The bus cycle time is 2.5 ns (400 MHz). We configure the simulated system as 2-channel and 4-channel systems, where each channel has four devices.

An Effective Self-test Scheduling for Realtime Processor based System

by Yeswanth Reddy

2023, International Journal of Smart Home

Nowadays, jobs are scheduled on a single processor or on more than one processor, and a real-time job is scheduled or executed based on its requirements. A successful task in an embedded system ought to have constrained resource requirements: memory,...

Mitigating performance unpredictability in the IaaS using the Kyoto principle

by Bao Bui

2023, Proceedings of the 17th International Middleware Conference

Performance isolation is enforced in the cloud by setting to each virtual machine (VM) a given fraction of each resource type (physical memory, processor, and IO bandwidth). However, microarchitectural-level resources such as processor's...

Fast switching of threads between cores

by Dean Tullsen

2023, ACM SIGOPS Operating Systems Review

We address the software costs of switching threads between cores in a multicore processor. Fast core switching enables a variety of potential improvements, such as thread migration for thermal management, fine-grained load balancing, and...

Figure 3. Example of system call modified to support core-switching.

Table 3. Microbenchmark results with cross-core wakeup.

Table 4. Mechanisms used in the five different core-switching versions implemented.

Table 5. Effect of L1 cache size on microbenchmark results (based on the V5 core-switching mechanism). The cost of migration is sensitive to the sizes of the instruction and data caches. The L1 caches of the simple cores (8KB, direct-mapped, 64B block size, 1-cycle) are relatively small, so two larger versions were also simulated. Both are 16KB total size with 64B block size; one is direct-mapped, the other is two-way set associative. For the V5 mechanism, increasing the simple-core L1 cache size to 16KB improves performance by 7% (for sim_Cs, with one slow simple core) to 15% (for sim_SS, with one fast simple core). There is no line for sim_CC, since that configuration has only complex cores, which are always modeled with 64KB, 2-way L1 caches.

Table 6. Effect of power-up delay on performance.

Table 7. Simulated Web results on dual-core CPUs. Values are KB transferred during 0.133 seconds.

Table 8. Simulated Web results on quad-core CPUs. Values are KB transferred during 0.133 seconds.

Table 9. Core-switch counts for 1 Web trial, dual-core X86.

Table 10. Simulated throughput for ex_tpceb. Values are transactions/sec. rates (for 100 transactions). *: this trial used 16KB L1 caches.

Table 11. Throughput for ex_tpcb on dual-core X86.

Table 12. Core-switch counts for 1 ex_tpcb trial, dual-core X86.

Table 13. Simulated Netperf results: TCPstream. Values are KB transferred during 0.167 seconds.

Table 14. Simulated Netperf results: TCPmaerts. Values are KB transferred during 0.167 seconds.

Table 15. Core-switch counts for 1 netperf trial, dual-core X86. On the real dual-core Xeon system, with a 1 Gbit/sec NIC, multiple trials of both streaming benchmarks always transferred between 941.2 and 941.45 Mbit/sec regardless of the software configuration (core-switching or unmodified Linux), implying that the system was network-limited. (We have no 10 Gbit/sec NIC for this system.) We did measure the number of times, during one 60-second trial of each benchmark, various system calls performed core-switches. Table 15 shows that almost all of these core switches were in the socketcall system call. (Linux on x86 differs from Linux on Alpha in that its C library funnels the socket API through this one system call.)

Dynamic Bin Packing with Predictions

by Mozhengfu Liu

2023, Abstract Proceedings of the 2023 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

The MinUsageTime Dynamic Bin Packing (DBP) problem aims to minimize the accumulated bin usage time for packing a sequence of items into bins. It is often used to model job dispatching for optimizing the busy time of servers, where the...
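To make the objective in the abstract above concrete, the sketch below computes the accumulated bin usage time produced by a plain First Fit packing. First Fit is used only as a familiar baseline and is not the prediction-based method of the paper; the item format (arrival, departure, size), the rule that an emptied bin is closed and never reused, and the function name are assumptions made for this example.

```python
def first_fit_usage_time(items, capacity=1.0):
    """items: iterable of (arrival, departure, size) with departure > arrival.
    Packs items in arrival order with First Fit and returns the total time
    during which bins are open (the MinUsageTime objective)."""
    open_bins = []  # each bin: {"open": t, "close": t, "items": [(dep, size)]}
    total = 0.0
    for arrival, departure, size in sorted(items):
        if size > capacity:
            raise ValueError("item larger than bin capacity")
        # Close bins whose items have all departed by this arrival time.
        still_open = []
        for b in open_bins:
            b["items"] = [(d, s) for d, s in b["items"] if d > arrival]
            if b["items"]:
                still_open.append(b)
            else:
                total += b["close"] - b["open"]
        open_bins = still_open
        # First Fit: use the earliest-opened bin with enough free space.
        for b in open_bins:
            load = sum(s for _, s in b["items"])
            if load + size <= capacity + 1e-12:
                b["items"].append((departure, size))
                b["close"] = max(b["close"], departure)
                break
        else:
            open_bins.append({"open": arrival, "close": departure,
                              "items": [(departure, size)]})
    # Bins still open at the end stay in use until their last departure.
    total += sum(b["close"] - b["open"] for b in open_bins)
    return total


if __name__ == "__main__":
    jobs = [(0, 10, 0.6), (1, 3, 0.6), (2, 4, 0.4)]
    print(first_fit_usage_time(jobs))
```

In the job-dispatching reading of the problem, each bin is a server, each item is a job with a resource demand and a running interval, and the returned value is the total server busy time.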

System-level fault-tolerance in large-scale parallel machines with buffered coscheduling

by Jose Luis Lopez Sancho

2023, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.

As the number of processors for multi-teraflop systems grows to tens of thousands, with proposed petaflops systems likely to contain hundreds of thousands of processors, the assumption of fully reliable hardware has been abandoned....

Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers

by Jose Luis Lopez Sancho

2023, ACM/IEEE SC 2005 Conference (SC'05)

We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented as a kernel thread, specifically designed to provide fault...

Resource management for isolation enhanced cloud services

by Abhishek Singh

2023, Proceedings of the 2009 ACM workshop on Cloud computing security


RodosVisor - An object-oriented and customizable hypervisor: The CPU virtualization

by Adriano Tavares

2023, Embedded Systems Computational Intelligence and Telematics in Control, 1st IFAC Conference on


Xen and the Art of Cluster Scheduling

by Matthew Smith

2023, First International Workshop on Virtualization Technology in Distributed Computing (VTDC 2006)

In shared use clusters, scheduling systems must schedule both serial and parallel jobs in a fair manner, while at the same time optimizing overall cluster efficiency. Since serial and parallel jobs conflict considerably, scheduling both...

Extensible resource management for cluster computing

by Md Nayeem Islam

2023, Proceedings of 17th International Conference on Distributed Computing Systems

Advances in server virtualization offer new mechanisms to provide resource management for shared server infrastructures. Resource sharing requires coordination across self-interested system participants (e.g., providers from different administrative domains or third-party brokering intermediaries). Assignments of the shared infrastructure must be fluid and adaptive to meet the dynamic demands of clients. This thesis addresses the hypothesis that a new, foundational layer for virtual computing is sufficiently powerful to support a diversity of resource management needs in a general and uniform manner. Incorporating resource management at a lower virtual computing layer provides the ability to dynamically share server infrastructure between multiple hosted software environments (e.g., grid computing middleware and job execution systems). Resource assignments within the virtual layer occur through a lease abstraction, and extensible policy modules define management functions.

This research makes the following contributions:

• Defines the foundation for resource management in a virtual computing layer. Defines protocols and extensible interfaces for formulating resource contracts between system participants. Separates resource management functionalities across infrastructure providers, application controllers, and brokering intermediaries, and explores the implications and limitations of this structure.

• Demonstrates policy extensibility by implementing a virtual computing layer prototype, Shirako, and evaluating a range of resource arbitration policies for various objectives. Provides results with proportional share, priority, worst-fit, and multi-dimensional resource slivering.

• Defines a proportional share policy, Winks, that integrates a fair queuing algorithm with a calendar scheduler. Provides a comprehensive set of features

There are many people who have contributed to this work as well as supported me throughout this process. First, I want to thank my advisor, Jeff Chase, for his guidance throughout my graduate career. I am also grateful for the mentorship and suggestions provided by my committee John Wilkes, Carla Ellis, and Jun Yang. A significant portion of this work is drawn from collaborations with the Shirako team. David Irwin and Aydan Yumerefendi have been my key partners in this work over the years. This work also benefited from discussions with many other people including La

Concurrent Gang: Towards a Flexible and Scalable Gang Scheduler

by Fabricio Alves

2023, Anais do XI International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 1999)

Gang scheduling has been widely used as a practical solution to the dynamic parallel job scheduling problem. Parallel tasks of a job are scheduled for simultaneous execution on a partition of a parallel computer. Gang Scheduling has many...

Improving throughput and utilization in parallel machines through concurrent gang

by Fabricio Alves

2023, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000


Virtualization Overhead of Multithreading in X86 State-of-the-Art & Remaining Challenges

by Kris Aerts

2023, IEEE Transactions on Parallel and Distributed Systems

Despite great advancements in hardware-assisted virtualization of the x86 architecture, certain workloads still suffer significant overhead. This work dissects said overhead in the context of multi-threading. We describe the state of the...

Mutable locks: Combining the best of spin and sleep locks

by Pierangelo Di Sanzo

2023, Concurrency and Computation: Practice and Experience

In this article we present Mutable Locks, a synchronization construct with the same semantics as traditional locks (such as spin locks or sleep locks), but with a self-tuned optimized trade-off between responsiveness and CPU-time usage...

Schedulling Malleable Task with Convex Processing Speed Functions

by Jan Weglarz

2023, Computación y Sistemas

In the paper, the problem of scheduling very large applications on parallel computer systems is considered. These applications can be modeled as malleable tasks, i.e., tasks whose processing speed depends on the number of processors granted....

The Global Feasibility and Schedulability of General Task Models on Multiprocessor Platforms

by Nathan Fisher

2023, 19th Euromicro Conference on Real-Time Systems (ECRTS'07)

Feasibility analysis determines (prior to system execution-time) whether a specified collection of hard-realtime jobs executed on a processing platform can meet all deadlines. In this paper, we derive near-optimal sufficient tests for...

Scheduling jobs strategies for grid computing: A review

by Salim Amdani

2023, International Journal of Advanced Technology and Engineering Exploration

Simulation results of new scheduling method show quicker cluster system response time than FCFS and Backfilling scheduling methods.


Reducing the Checkpointing Burden of Condor: Analysis and Implementation

by John Bent

2023

Condor is a distributed system that harnesses the power of users' unused workstations to deliver large amounts of computing to CPU intensive projects. Because users can and do claim their machines at unforeseeable times, Condor...

Mitigating performance unpredictability in the IaaS using the Kyoto principle

by Bao Chau Bui

2023, Proceedings of the 17th International Middleware Conference


A Taxonomy of Adaptive Resource Management Mechanisms in Virtual Machines: Recent Progress and Challenges

by José Rofino Francisco Simão

2023, Computer Communications and Networks

Cloud infrastructures make extensive use of hypervisors (e.g. Xen, ESX), containers (e.g. LXC) and high level virtual machines (e.g. CLR, Java), broadly known as virtual machine (VM) technologies, to achieve workload isolation and...

Fig. 1.3 c. Container type VM

Fig. 1.4 d. High-level language VM

Fig. 1.6 Techniques used by System VMs in the monitoring, decision and action phases

Fig. 1.7 Techniques used by HLL-VMs in the monitoring, decision and action phases

Fig. 1.8 A step-by-step classification process

We think the three metrics are able to capture a design interval as presented in Figure 1.9. They are a proxy for time, space and complexity-related characteristics. Our conjecture is that we will see systems that are away from the minimum and the maximum of the cube, that is, neither too simple (e.g., near the base of the coordinates) nor excelling in the three metrics (e.g., near or coincident with the maximum point in the design space). The following list points the exact meaning of the three criteria, regarding each of the adaptation phases. Next, we will detail how they are mapped to a numeric scale, in each phase, which will be used to determine the RCI of systems.

Fig. 1.10 Quantitative values for the design options of the RCI framework

Table 1.1 System VMs: Sensors (monitored value, with responsiveness in the second column and intricateness in the third column). Tables 1.4-1.6 are the ones corresponding to the high-level language virtual machines and follow the same logic.

Table 1.2 System VMs: Decision techniques

Table 1.3 System VMs: Actuators used in the action phase. Looking at the techniques used in the monitor phase, Tables 1.1 and 1.4 show us that only two techniques have the minimum responsiveness. This is so because most of the sensors are near the VM execution space (either in a sub-system of the VM or in the operating system). Low intricateness also is dominant as most sensors are already available.

Table 1.4 HLL VMs: Sensors. Finally, regarding the action phase, we note that all actuators are either already available in the VM code base or are extensions to the VM code base. Contrary to sensors, no new actuators are proposed for other layers of the execution stack. This leads to not having, in practice, actuators with the maximum intricateness.

Table 1.6 HLL VMs: Actuators used in the action phase

Table 1.7 Example of the aggregations made in step 2 for system Sq. The values from the last line of Table 1.7 are the ones used to determine R and I in Table 1.8, following the equation 1.2.

Table 1.8 Example of the arithmetic operations in step 2 for system Sq

Table 1.9 Sys-VM Systems

Fig. 1.11 RCI of Sys-VMs

Table 1.10 HLL-VM Systems

Fig. 1.12 RCI of HLL-VMs

An implementation scheme for a virtual machine monitor to be realized on user - microprogrammable minicomputers

by Bruce Shriver

2023, Proceedings of the annual conference on - ACM 76

A virtual machine monitor allows several different operating systems to run concurrently on the same machine. This paper presents the description of a virtual machine monitor and its support structure which can be implemented on a...

Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers

by Jose Carlos

2023, ACM/IEEE SC 2005 Conference (SC'05)


Enhancement of the Service Delivery for the Cloud Service Providers and Consumers

by Aritra Ghosh

2023

Cloud computing is one of the most recent technologies. It is an innovative and exciting style of programming and using computers. It creates tremendous opportunities for software developers: cloud computing can provide an amazing new...

Performance, fairness and effectiveness in space-slicing multi-cluster schedulers

by John Ngubiri

2023, … of the 19th IASTED International Conference …

Authors and affiliations: John Ngubiri, Department of Computer Science, Makerere University, P.O. Box 7062, Kampala, Uganda (email: ngubiri@cit.mak.ac.ug); Mario van Vliet, Informatics and ...

The greedy multi-cluster scheduler: Performance bounds and parametric sensitivity

by John Ngubiri

2023, International Journal

Most schedulers in parallel job scheduling do not put (job) schedulability into consideration when prioritizing jobs. Performance evaluation is mostly done using average values of the measurement metric. Using the average...

Hardware Translation Coherence for Virtualized Systems

by Guilherme Cox

2023, ACM SIGARCH Computer Architecture News

To improve system performance, operating systems (OSes) often undertake activities that require modification of virtual-to-physical address translations. For example, the OS may migrate data between physical pages to manage heterogeneous...

Applying backfilling over a non-dedicated cluster

by Emilio Luque

2023

The resource utilization level in open laboratories of several universities has been shown to be very low. Our aim is to take advantage of those idle resources for parallel computation without disturbing the local load. In order to...

CISNE: A New Integral Approach for Scheduling Parallel Applications on Non-dedicated Clusters

by Emilio Luque

2023, Lecture Notes in Computer Science

Our main interest is oriented towards keeping both local and parallel jobs together in a non-dedicated cluster. In order to obtain some profits from the parallel applications, it is important to consider time and space sharing as a mean...

Evaluating job packing in warehouse-scale computing

by Madhukar R Korupolu

2023, 2014 IEEE International Conference on Cluster Computing (CLUSTER)

One of the key factors in selecting a good scheduling algorithm is using an appropriate metric for comparing schedulers. But which metric should be used when evaluating schedulers for warehouse-scale (cloud) clusters, which have machines...

Using SimICS to Evaluate the Penny System

by Peter Magnusson

2023, International Logic Programming Symposium/International Symposium on Logic Programming/North American Conference on Logic Programming/Symposium on Logic Programming

This paper assumes a 4 Kbyte first-level cache with 32-byte cache lines and a 2 Mbyte second-level cache with 64-byte cache lines, both direct-mapped.

CISNE: A New Integral Approach for Scheduling Parallel Applications on Non-dedicated Clusters

by Emilio Luque

2022, Lecture Notes in Computer Science


A Simulation Environment for Job Scheduling on Distributed Systems

by Peter M A Sloot

2022, Lecture Notes in Computer Science

In this paper we present a simulation environment for the study of hierarchical job scheduling on distributed systems. The environment provides a multi-level mechanism to simulate various types of jobs. An execution model of jobs is...

Improvements in parallel job scheduling using gang service

by Fabricio Silva

2022, Proceedings Fourth International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN'99)

Gang scheduling has been widely used as a practical solution to the dynamic parallel job scheduling problem. Parallel threads of a single job are scheduled for simultaneous execution on a parallel computer even if the job does not fully...
