Multicore compilation strategies and challenges

2009

https://doi.org/10.1109/MSP.2009.934117

Abstract

To overcome challenges stemming from high power densities and thermal hot spots in microprocessors, multicore platforms have become the ubiquitous computing platform, from servers down through embedded systems. Unfortunately, providing multiple cores does not directly translate into increased performance or better energy efficiency for most applications.
