Generic programming for high-performance scientific applications

2005, Concurrency and Computation: Practice and Experience

https://doi.org/10.1002/CPE.864

Abstract

We present case studies that apply generic programming to the development of high-performance parallel code for solving two archetypal partial differential equations (PDEs). We examine the overall structure of the example scientific codes and consider their generic implementation. With a generic approach it is a straightforward matter to reuse software components from different sources; implementations with components from the Iterative Template Library (ITL), the Matrix Template Library (MTL), Blitz++, A++/P++, and Fortran BLAS are presented. Our newly developed Generic Message Passing library is used for communication. We compare the generic implementations with equivalent implementations developed with alternative libraries and languages and discuss performance as well as software engineering issues.

Copyright © 2005 John Wiley & Sons, Ltd. Concurrency Computat.: Pract. Exper. 2005; 17:941-965