Proceedings Fourth International Conference/Exhibition on High Performance Computing in the Asia-Pacific Region, 2000
This paper presents an efficient communication subsystem, DP-II, for clustering standard high-volume (SHV) servers using Gigabit Ethernet. DP-II employs several lightweight messaging mechanisms to achieve low-latency and high-bandwidth communication. Tests show an 18.32 µs single-trip latency and 72.8 MB/s bandwidth on a Gigabit Ethernet network connecting two Dell PowerEdge 6300 Quad Xeon SMP servers running Linux. To improve the programmability of the DP-II communication subsystem, its development was based on a concise yet powerful abstract communication model, the Directed Point Model, which can be conveniently used to depict the inter-process communication pattern of a parallel task in the cluster environment. In addition, the API of DP-II preserves the syntax and semantics of traditional UNIX I/O operations, which makes it easy to use.
Proceedings. Tenth International Conference on Parallel and Distributed Systems, 2004. ICPADS 2004.
The introduction of mobile code in the pervasive computing environment provides a good opportunity for research into ways to enhance execution flexibility. We note that current mobile code is too heavyweight and not adaptive enough to be used in pervasive computing, where devices are resource-limited and heterogeneity is the norm. In this paper, we propose a new lightweight, component-based mobile agent system that can adapt to diverse devices and features resource saving as one of its aims. The system supports strong mobility of mobile code, which is a prerequisite for achieving system flexibility and good performance. The system discretizes the transmission of code and execution states and relies on a scheme called state-on-demand (SOD) for the execution of the mobile code. We provide performance results to demonstrate the effectiveness of the SOD scheme.
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935)
Home migration is used to tackle the home assignment problem in home-based software distributed shared memory systems. We propose an adaptive home migration protocol to optimize the single-writer pattern, which occurs frequently in distributed applications. Our approach is unique in its use of a per-object threshold that is continuously adjusted to facilitate home migration decisions. This adaptive threshold decreases monotonically with the increased likelihood that a particular object exhibits a lasting single-writer pattern. The threshold is tuned at runtime according to the feedback from previous home migration decisions. We implement this new adaptive home migration protocol in a distributed Java Virtual Machine that supports truly parallel execution of multi-threaded Java applications on clusters. The analysis and the experiments show that our new home migration protocol demonstrates both sensitivity to the lasting single-writer pattern and robustness against the transient single-writer pattern. In the latter case, the protocol inhibits home migration in order to reduce the home redirection overhead.
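The abstract above does not give the threshold update rule, so the following is only a minimal sketch of how a per-object threshold with feedback could behave. Every name, constant, and the specific "halve-or-double" style adjustment are assumptions made for illustration; they are not the protocol from the paper.

```python
from dataclasses import dataclass
from typing import Optional

# All constants and names below are hypothetical, chosen for illustration only.
MIN_THRESHOLD = 1
MAX_THRESHOLD = 16
PROBATION_WRITES = 2  # home writes needed to judge a migration as successful


@dataclass
class ObjectHomeState:
    home: int                          # node currently serving as the object's home
    threshold: int = 4                 # consecutive remote writes needed to migrate
    last_writer: Optional[int] = None  # most recent remote writer
    run_length: int = 0                # consecutive remote writes by last_writer
    probation: int = 0                 # home writes still needed to confirm last migration


def record_write(state: ObjectHomeState, writer: int) -> bool:
    """Record one write to the object; return True if the home migrates."""
    if writer == state.home:
        # A write at the home interrupts any remote writer's run.
        state.last_writer = None
        state.run_length = 0
        if state.probation > 0:
            state.probation -= 1
            if state.probation == 0:
                # Positive feedback: the new home kept writing, so the
                # single-writer pattern looks lasting. Lower the threshold.
                state.threshold = max(MIN_THRESHOLD, state.threshold - 1)
        return False

    if state.probation > 0:
        # Negative feedback: another node wrote right after a migration,
        # so the pattern was transient. Raise the threshold to inhibit
        # further migrations.
        state.threshold = min(MAX_THRESHOLD, state.threshold * 2)
        state.probation = 0

    if writer == state.last_writer:
        state.run_length += 1
    else:
        state.last_writer = writer
        state.run_length = 1

    if state.run_length >= state.threshold:
        # Enough consecutive writes from one remote node: move the home there.
        state.home = writer
        state.last_writer = None
        state.run_length = 0
        state.probation = PROBATION_WRITES
        return True
    return False
```

In this sketch, an object written repeatedly by one node sees its threshold fall toward `MIN_THRESHOLD`, while an object whose writers alternate sees it rise toward `MAX_THRESHOLD`, which mirrors the sensitivity and robustness properties the abstract describes.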
A Distributed Java Virtual Machine (DJVM) is a cluster-wide virtual machine that supports parallel execution of a multithreaded Java application on clusters, as if it were executed on a single machine but with improved computation power. The DJVM hides the physical boundaries between the cluster nodes and allows Java threads executing in parallel to access all cluster resources through a unified interface. It is a more user-friendly parallel environment than many other existing parallel languages [8], or libraries for parallel programming such as MPI [13], CORBA [16], and Java RMI [7]. The DJVM research is valuable for high-performance computing as Java has become the dominant language for building server-side applications, such as enterprise information systems, Web services, and large-scale Grid computing systems, due to its platform independence and built-in multithreading support at the language level. This chapter addresses the realization of a distributed Java virtual machine, named JESSICA2, on clusters. Section 1.1 describes Java, the Java Virtual Machine, and the main programming paradigms using Java for high-performance computing. We then focus our study on the newly emerging distributed JVM research in Section 1.2. In Section 1.3, we introduce our JESSICA2 Distributed JVM. Section 1.4 gives the performance analysis of JESSICA2. Related work is given in Section 1.5. Section 1.6 concludes this chapter. † This research is supported by Hong Kong RGC Grant HKU-7030/01E and HKU Large Equipment Grant 01021001. [Figure: app.java is compiled by the Java compiler into app.class and run by the Java Virtual Machine, whose components include the Class Loader, Method Area, Heap (objects), per-thread Stacks and PCs holding object references, Thread Scheduler, and Execution Engine.]
Proceedings. IEEE International Conference on Cluster Computing
A distributed Java Virtual Machine (DJVM) spanning multiple cluster nodes can provide a true parallel execution environment for multi-threaded Java applications. Most existing DJVMs suffer from the slow Java execution in interpretive mode and thus may not be efficient enough for solving computation-intensive problems. We present JESSICA2, a new DJVM running in JIT compilation mode that can execute multi-threaded Java applications transparently on clusters. JESSICA2 provides a single system image (SSI) illusion to Java applications via an embedded global object space (GOS) layer. It implements a cluster-aware Java execution engine that supports transparent Java thread migration for achieving dynamic load balancing. We discuss the issues of supporting transparent Java thread migration in a JIT compilation environment and propose several lightweight solutions. An adaptive migrating-home protocol used in the implementation of the GOS is introduced. The system has been implemented on x86-based Linux clusters, and significant performance improvements over the previous JESSICA system have been observed.
2003 International Conference on Parallel Processing, 2003. Proceedings., 2003
A distributed JVM on a cluster can provide a high-performance platform for running multi-threaded Java applications transparently. Efficient scheduling of Java threads among cluster nodes in a distributed JVM is desired for maintaining a balanced system workload so that the application can achieve maximum speedup. We present a transparent thread migration system that is able to support high-performance native execution of multi-threaded Java programs. To achieve migration transparency, we perform dynamic native code instrumentation inside the JIT compiler. The mechanism has been successfully implemented and integrated in JESSICA2, a JIT-enabled distributed JVM, to enable automatic thread distribution and dynamic load balancing in a cluster environment. We discuss issues related to supporting transparent Java thread migration in a JIT-enabled distributed JVM, and compare our solution with previous approaches that use static bytecode instrumentation and JVMDI. We also propose optimizations, including dynamic register patching and pseudo-inlining, that can reduce the runtime overhead incurred by a migration. We use measured experimental results to show that our system is efficient and lightweight.
IEEE International Symposium on Cluster Computing and the Grid, 2004. CCGrid 2004.
Applying a cache coherence protocol capable of adapting to memory access patterns is a viable approach to improving the performance of software distributed shared memory. In this paper, we present an approach of postmortem memory access pattern analysis and visualization, which has been applied to our design of a global object space for a distributed Java Virtual Machine. The tool not only can enhance our understanding of the access patterns inherent in an application but can also help us to evaluate the effectiveness of an adaptive protocol used in the design of the global object space.
for their patient guidance, encouragement, and advice in both research and daily life. I would like to thank Dr. Wang for teaching me how to do research and guiding me in the research of distributed Java Virtual Machines. He has been so careful and energetic in correcting every mistake in my papers. I really learned a lot from his comments. Dr. Wang is also a very generous and kind teacher and friend in daily life. He often invited us to his parties, and we always had a good time together with his lovely family. I also want to thank Dr. Lau for showing us how to do research that is serious and of great impact. His advice on research is invaluable. I also appreciate Dr. Lau's revisions of the papers, which always looked much better after his refinement. Finally, I appreciate their financial support for the research. I would also like to thank Prof. Zhi-wei Xu from the Institute of Computing Technology, Chinese Academy of Sciences, who recommended me for Ph.D. study at HKU to conduct this research. I would like to express my appreciation for working with Fang Weijian during my Ph.D. study. We cooperated well on the research project and also enjoyed our leisure time together. We often organized hikes and discussed research problems along the way. Through hiking we stayed fit for research. Here I would like to list other partners and thank them for sharing the joy of hiking. They are Gaoyan Luo, Tingting He, Lin
In this paper, we study the practical issues in the design of a new communication subsystem, named Directed Point (DP), on a server cluster with Gigabit Ethernet connection, with the goals of achieving high performance and good programmability. Our design exploits the gigabit network architecture and the operating system characteristics. We propose a realistic communication model which can be used to assess various design tradeoffs and to calibrate the performance results. Testing shows that the DP communication subsystem can achieve a 16.3 µs single-trip latency and 79.5 MB/s bandwidth. To achieve good programmability, we propose an abstraction model that allows all inter-process communication patterns to be easily coded using the provided API. The API preserves the syntax and semantics of traditional UNIX I/O operations, making the proposed communication subsystem easy to use without a long learning period.
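To put the reported latency and bandwidth figures in context, the sketch below plugs them into the generic linear cost model (transfer time = startup latency + size / bandwidth). This is a textbook approximation, not the communication model proposed in the paper. Treating the measured single-trip latency as the zero-byte startup cost, reading it as microseconds, and taking 1 MB as 10^6 bytes are all assumptions, and the function names are hypothetical.

```python
# Figures taken from the abstract; their interpretation here is an assumption.
LATENCY_US = 16.3      # single-trip latency, read as microseconds
BANDWIDTH_MB_S = 79.5  # peak bandwidth, assuming 1 MB = 10**6 bytes


def transfer_time_us(nbytes, latency_us=LATENCY_US, bandwidth_mb_s=BANDWIDTH_MB_S):
    """Estimated one-way time for an nbytes message, in microseconds."""
    # With 1 MB = 10**6 bytes, 1 MB/s is exactly 1 byte per microsecond.
    return latency_us + nbytes / bandwidth_mb_s


def effective_bandwidth_mb_s(nbytes, latency_us=LATENCY_US, bandwidth_mb_s=BANDWIDTH_MB_S):
    """Bandwidth actually achieved for an nbytes message under the linear model."""
    return nbytes / transfer_time_us(nbytes, latency_us, bandwidth_mb_s)


def half_bandwidth_size_bytes(latency_us=LATENCY_US, bandwidth_mb_s=BANDWIDTH_MB_S):
    """Message size at which effective bandwidth reaches half of the peak."""
    # Solving n / (L + n / B) = B / 2 for n gives n = L * B.
    return latency_us * bandwidth_mb_s
```

Under these assumptions, messages need to be roughly 1.3 KB before half of the peak bandwidth is reached, which illustrates why low startup latency matters for the short messages common in cluster applications.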
Thread migration supports the movement of threads across machine boundaries in a distributed computing environment. It can improve load balancing and the execution efficiency of multithreaded programs. In this paper, we introduce a new approach that employs the technique of ...
Papers by Wenzhang Zhu