Understanding the performance of DSM applications
1997, Lecture Notes in Computer Science
https://doi.org/10.1007/3-540-62573-9_15
Abstract
Carnival is a performance measurement and analysis tool that assists users in understanding the performance of DSM applications and protocols. Using traces of program executions, Carnival presents performance data as a hierarchy of execution profiles. During analysis, Carnival automates the inference process that relates performance phenomena to specific causes in the source code or DSM protocol, using techniques that focus on the two most important sources of overhead in DSM systems: waiting time analysis identifies the causes of synchronization overhead and produces an explanation for each source of waiting time in the program; communication analysis identifies the sequence of requests that result in invalidations and produces an explanation for each source of communication. We describe these techniques and their implementation in TreadMarks, and show how to use waiting time analysis and communication analysis to improve the running time of two programs from the SPLASH application suite when executed on DEC Alphas connected by a DEC Memory Channel network.
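To make the idea of relating waiting time to a cause concrete, the following is a minimal sketch in Python. It is not Carnival's algorithm or trace format: the function name `barrier_waiting_time`, the `(barrier_id, process_id, arrival_time)` event tuple, and the rule "charge the waiting to the last process to arrive" are all assumptions chosen for illustration, and the sketch handles only barrier synchronization.

```python
from collections import defaultdict


def barrier_waiting_time(arrivals):
    """Attribute barrier waiting time to the last-arriving process.

    arrivals: iterable of (barrier_id, process_id, arrival_time) tuples.
    This event layout is hypothetical, not Carnival's trace format.
    """
    # Group arrival events by the barrier they belong to.
    by_barrier = defaultdict(list)
    for barrier_id, process_id, arrival_time in arrivals:
        by_barrier[barrier_id].append((process_id, arrival_time))

    report = {}
    for barrier_id, entries in by_barrier.items():
        # Assumption: the barrier releases when the last process arrives,
        # so that process is reported as the cause of everyone else's wait.
        last_process, release_time = max(entries, key=lambda e: e[1])
        waits = {p: release_time - t for p, t in entries}
        report[barrier_id] = {"cause": last_process, "waits": waits}
    return report


# Hypothetical trace: three processes arriving at one barrier.
trace = [("b0", 0, 10.0), ("b0", 1, 14.0), ("b0", 2, 11.0)]
report = barrier_waiting_time(trace)
```

In this toy trace, process 1 arrives last, so the waiting time of processes 0 and 2 is charged to it. A real tool would go further and relate that late arrival to the code executed before the barrier.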