

Targeting Heterogeneous Architectures via Macro Data Flow

2012, Parallel Processing Letters

Abstract

We propose a data-flow-based runtime system as an efficient tool for executing parallel code on heterogeneous architectures that host both multicore CPUs and GPUs. We discuss how the proposed runtime system can serve as the target both of structured parallel applications developed with algorithmic skeletons or parallel design patterns and of more "domain-specific" programming models. We present experimental results that demonstrate the feasibility of the approach.
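
The general idea of a macro data flow interpreter can be illustrated with a minimal Python sketch. It is not the runtime system described in the paper: the names `MDFInstruction`, `run_graph`, and `build_example`, the `device` tag, and the use of thread pools to stand in for CPU cores and a GPU are all assumptions made for illustration. Each instruction becomes fireable once all of its input tokens have arrived, is dispatched to the worker pool matching its device tag, and forwards its result as a token to its destination instructions.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class MDFInstruction:
    """Hypothetical macro data flow instruction (illustrative only)."""
    name: str
    func: Callable[..., object]
    arity: int                      # number of input tokens required
    device: str = "cpu"             # assumed tag: "cpu" or "gpu"
    dests: List[Tuple[str, int]] = field(default_factory=list)  # (instruction, slot)
    tokens: Dict[int, object] = field(default_factory=dict)     # slot -> value

    def fireable(self) -> bool:
        # An instruction may fire only when every input slot holds a token.
        return len(self.tokens) == self.arity


def run_graph(instructions, initial_tokens):
    """Fire instructions as their inputs become available; return all results."""
    graph = {ins.name: ins for ins in instructions}
    # Thread pools merely simulate the two kinds of processing elements.
    pools = {
        "cpu": ThreadPoolExecutor(max_workers=4),
        "gpu": ThreadPoolExecutor(max_workers=1),
    }
    for (name, slot), value in initial_tokens.items():
        graph[name].tokens[slot] = value

    results = {}
    pending = set(graph)
    try:
        while pending:
            ready = [graph[n] for n in pending if graph[n].fireable()]
            if not ready:
                raise RuntimeError("no fireable instruction: graph is stuck")
            # Dispatch every fireable instruction to the pool for its device.
            futures = [
                (ins, pools[ins.device].submit(
                    ins.func, *(ins.tokens[s] for s in range(ins.arity))))
                for ins in ready
            ]
            for ins, fut in futures:
                value = fut.result()
                results[ins.name] = value
                pending.discard(ins.name)
                # Deliver the result token to each destination input slot.
                for dest, slot in ins.dests:
                    graph[dest].tokens[slot] = value
    finally:
        for pool in pools.values():
            pool.shutdown()
    return results


def build_example():
    """Graph for (a + b) * (a - b) with a = 7, b = 3."""
    instructions = [
        MDFInstruction("add", lambda x, y: x + y, 2, "cpu", [("mul", 0)]),
        MDFInstruction("sub", lambda x, y: x - y, 2, "cpu", [("mul", 1)]),
        MDFInstruction("mul", lambda x, y: x * y, 2, "gpu"),
    ]
    initial_tokens = {("add", 0): 7, ("add", 1): 3, ("sub", 0): 7, ("sub", 1): 3}
    return instructions, initial_tokens


results = run_graph(*build_example())
print(results["mul"])  # prints 40
```

In this sketch `add` and `sub` fire in the same round because their tokens are available from the start, while `mul` fires only after both results have been delivered. A real runtime would replace the simulated `gpu` pool with actual device offloading and would schedule instructions as soon as they become fireable rather than in rounds.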
