Key research themes
1. How can CPU and GPU be efficiently combined to optimize heterogeneous computing performance?
This theme investigates strategies for workload partitioning, memory management, communication, and synchronization that maximize overall system performance when CPUs and GPUs operate cooperatively. Such heterogeneous computing leverages two fundamentally different architectures—the latency-optimized CPU and the throughput-oriented GPU—to accelerate data-intensive and parallel applications. Proper optimization must address load balancing, data-access latency, and communication overhead to realize the full potential of hybrid systems.
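A common starting point for the load-balancing problem above is static, throughput-proportional partitioning: split the work so that both devices finish at the same time. The sketch below is a minimal illustration of that idea (the rates are hypothetical measured throughputs, not from any cited system):

```python
# Illustrative sketch: split N independent work items between CPU and GPU
# in proportion to their measured throughputs (items per time unit), so
# that both devices finish at roughly the same moment.
def partition(n_items, cpu_rate, gpu_rate):
    """Return (cpu_share, gpu_share) such that both devices finish together."""
    gpu_share = round(n_items * gpu_rate / (cpu_rate + gpu_rate))
    return n_items - gpu_share, gpu_share

cpu_items, gpu_items = partition(10_000, cpu_rate=2.0, gpu_rate=18.0)
# CPU gets 1000 items and the GPU 9000; each side then takes ~500 time
# units, instead of the GPU idling while the CPU drains a naive 50/50 split.
```

Real schedulers refine this with online profiling and dynamic re-partitioning, since effective throughput varies with input size and data-transfer cost.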
2. What programming models and architectural features enable effective GPU task scheduling for dynamic and irregular workloads?
Traditional GPU programming predominantly targets static, data-parallel workloads, but many advanced applications exhibit dynamic, irregular, or recursive parallelism patterns that challenge existing shading and compute language models. Research in this theme focuses on developing programming abstractions and scheduling models that support multiple instruction streams, dynamic work creation, fine-grained load balancing, data-locality preservation, and varying parallelism granularities, so that irregular tasks map better onto massively parallel GPU architectures.
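One widely used mechanism for combining dynamic work creation with locality-preserving load balancing is work stealing: each worker owns a double-ended queue, executes from its own tail, and steals from another queue's head only when it runs dry. The following pure-Python toy (workers, tasks, and the spawn probability are all illustrative stand-ins, not a real GPU runtime) sketches the pattern:

```python
from collections import deque
import random

# Toy work-stealing scheduler: each worker (think: one SM or persistent
# thread block) owns a deque. It pops tasks from its own tail, which
# preserves locality, and steals from another worker's head when empty.
# Tasks may spawn children, modeling dynamic work creation.
def run(num_workers, initial_tasks, spawn_prob=0.3, seed=0):
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_workers)]
    for i, task in enumerate(initial_tasks):
        queues[i % num_workers].append(task)
    done = 0
    while any(queues):
        for w, q in enumerate(queues):
            if not q:  # idle worker: steal from the most loaded victim
                victim = max(range(num_workers), key=lambda v: len(queues[v]))
                if queues[victim]:
                    q.append(queues[victim].popleft())
            if q:
                depth = q.pop()          # LIFO pop of the newest (hottest) task
                done += 1
                if depth > 0 and rng.random() < spawn_prob:
                    q.append(depth - 1)  # dynamically created child task
    return done

total = run(4, [3] * 8)  # executes all 8 initial tasks plus any spawned children
```

On actual GPUs the same idea appears as persistent-thread runtimes with per-SM queues; the hard parts the toy omits are lock-free queue operations and warp-level execution granularity.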
3. How can GPUs be leveraged and optimized for domain-specific high-performance computing applications?
This research area explores the implementation and acceleration of computationally demanding scientific and engineering problems using GPUs. It involves algorithm redesign, architecture-tailored numerical methods, and efficient programming to exploit GPUs’ massive parallelism and memory hierarchies. Across domains such as thermal simulations, fluid dynamics, radar signal processing, and bioinformatics, GPU computing enables orders-of-magnitude speedups, real-time analysis capability, and improved simulation fidelity, which are vital for engineering design, environmental monitoring, and biomedical research.
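Many of the domain applications above (thermal simulation and fluid dynamics in particular) reduce to stencil updates, where every grid point is recomputed from its neighbors; this is exactly the regular, massively parallel arithmetic GPUs excel at. As a minimal reference for the arithmetic (pure Python here; on a GPU each grid point would map to one thread), assuming a simple 5-point Jacobi heat-diffusion stencil:

```python
# Jacobi 5-point stencil: each interior point becomes the average of its
# four neighbors. This is the core update of many GPU-accelerated
# thermal and fluid solvers; boundary rows/columns are held fixed.
def jacobi_step(grid):
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                grid[i][j - 1] + grid[i][j + 1])
    return new

# 5x5 plate: hot (1.0) top edge, cold (0.0) elsewhere; iterate toward
# the steady-state temperature distribution.
grid = [[1.0] * 5] + [[0.0] * 5 for _ in range(4)]
for _ in range(100):
    grid = jacobi_step(grid)
```

The GPU versions of such kernels get their speedups from tiling the grid into shared memory so each neighbor value is loaded once per tile rather than once per thread.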
4. What are the architectural design considerations and comparative analyses for GPU programming models on modern supercomputers?
With the increasing reliance on GPUs in high-performance computing (HPC) facilities such as pre-exascale and exascale systems, this research area examines the performance tradeoffs, ease of use, portability, and optimization techniques of various GPU programming models. It involves evaluating vendor-supported languages (e.g., CUDA, HIP), directive-based models (OpenMP, OpenACC), and other abstractions (SYCL, Kokkos), especially focusing on AMD and NVIDIA GPUs in production supercomputers. Insights gained inform best practices for efficient GPU utilization and software portability across diverse HPC architectures.
5. How can GPU memory access and data transfer mechanisms be enhanced using neural networks and advanced DMA controllers to improve multimedia and computing system performance?
GPU performance is often bottlenecked by inefficient memory access and host-device data transfer. This research direction develops intelligent direct memory access (DMA) controllers that leverage back-propagation neural networks and adaptive data placement to optimize memory-channel usage, sustain high-bandwidth transfers, and reduce power consumption. Applications include multimedia processing and heterogeneous GPU-FPGA computing, where these controllers demonstrate performance gains and latency reductions crucial for high-throughput GPU workloads.
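To make the learned-controller idea concrete, here is a deliberately minimal sketch of the back-propagation ingredient: a single sigmoid unit trained by gradient descent to predict which of two memory channels will complete a transfer sooner, given their current queue depths. The features, labels, and two-channel setup are synthetic assumptions for illustration, not the architecture of any published DMA controller:

```python
import math
import random

# Toy learned policy for a two-channel DMA controller: a single sigmoid
# neuron trained with the logistic-loss gradient (the one-layer case of
# back-propagation) learns to route a transfer to the less-loaded channel.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, epochs=200):
    w0 = w1 = b = 0.0
    for _ in range(epochs):
        for (q0, q1), y in samples:      # y = 1 means channel 1 is better
            p = sigmoid(w0 * q0 + w1 * q1 + b)
            err = p - y                  # dLoss/dz for logistic loss
            w0 -= lr * err * q0
            w1 -= lr * err * q1
            b  -= lr * err
    return w0, w1, b

rng = random.Random(1)
data = []
for _ in range(200):
    q0, q1 = rng.random(), rng.random()      # synthetic queue depths
    data.append(((q0, q1), 1 if q1 < q0 else 0))  # shorter queue wins
w0, w1, b = train(data)

# Channel 0 heavily loaded (0.9), channel 1 nearly idle (0.1):
use_channel_1 = sigmoid(w0 * 0.9 + w1 * 0.1 + b) > 0.5
```

The appeal of a learned policy over a fixed heuristic is that the same mechanism can fold in further inputs (transfer size, access pattern, power state) without hand-tuning a new rule for each.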