HIERARCHICAL MAPPING FOR HPC APPLICATIONS
Abstract
As the high performance computing systems scale up, mapping the tasks of a parallel application onto physical processors to allow efficient communication becomes one of the critical performance issues. Existing algorithms were usually designed to map applications with regular communication patterns. Their mapping criterion usually overlooks the size of communicated messages, which is the primary factor of communication time. In addition, most of their time complexities are too high to process large scale problems.
In this paper, we present a hierarchical mapping algorithm (HMA), which is capable of mapping applications with irregular communication patterns. It first partitions tasks according to their run-time communication information. The tasks that communicate with each other more frequently are regarded as strongly connected. Based on their connectivity strength, the tasks are partitioned into supernodes based on the algorithms in spectral graph theory. The hierarchical partitioning reduces the mapping algorithm complexity to achieve scalability. Finally, the run-time communication information will be used again in fine tuning to explore better mappings. With the experiments, we show how the mapping algorithm helps to reduce the point-to-point communication time for the PDGEMM, a ScaLAPACK matrix multiplication computation kernel, up to 20% and the AMG2006, a tier 1 application of the Sequoia benchmark, up to 7%.
References
- The AMG Benchmark , https://asc.llnl.gov/sequoia/benchmarks/#amg . Google Scholar
- The Community Climate System Model (CCSM) , http://www.cesm.ucar.edu/models/ccsm4.0/ . Google Scholar
- The Sequoia Benchmark , https://asc.llnl.gov/sequoia/benchmarks/ . Google Scholar
- Space-Filling Curves. Springer-Verlag, 1994 . Google Scholar
- IEEE Trans. Parallel Distrib. Syst. 18, 1258 (2007). Crossref, ISI, Google Scholar
- IEEE Transactions on Computers 31(9), 907 (1982). ISI, Google Scholar
-
Zhaojun Bai , Templates for the Solution of Algebraic Eigenvalue Problems: a Practical Guide ( SIAM , 1987 ) . Google Scholar - Satish Balay, Kris Buschelman, William D. Gropp, Dinesh Kaushik, Matthew G. Knepley, Lois Curfman McInnes, Barry F. Smith, and Hong Zhang. PETSc Web page, 2001 . Google Scholar
- IBM Journal of Research and Development 49(2), 489 (2005). Crossref, ISI, Google Scholar
Abhinav Bhatelé , Eric Bohm and Laxmikant V. Kalé , A case study of communication optimizations on 3d mesh interconnects, Euro-Par '09: Proceedings of the 15th International Euro-Par Conference on Parallel Processing (Springer-Verlag, Berlin, Heidelberg, 2009) pp. 1015–1028. Google Scholar- Abhinav Bhatele, I-Hsin Chung, and Laxmikant V. Kale. Automated mapping of structured communication graphs onto mesh interconnects. 2010 . Google Scholar
-
L. S. Blackford , ScaLAPACK Users' Guide ( Society for Industrial and Applied Mathematics , Philadelphia, PA , 1997 ) . Crossref, Google Scholar -
Thomas H. Cormen , Introduction to Algorithms ( McGraw-Hill , 1990 ) . Google Scholar - IEEE Trans. Comput. 40(1), 46 (1991). Crossref, ISI, Google Scholar
- Czechoslovak Mathematical Journal 23(98), (1973). Google Scholar
-
S. L. Graham , M. Snir and C. A. Patterson (eds.) , Getting Up To Speed: The Future Of Supercomputing ( National Academies Press , Washington, D.C. , 2005 ) . Google Scholar -
J. Kleinberg and E. Tardos , Algorithm Design ( Addison Wesley , 2005 ) . Google Scholar - IEEE Trans. Comput. 39(12), 1446 (1990). Crossref, ISI, Google Scholar
-
Sangman Moh , Mapping strategies for switch-based cluster systems of irregular topology , 8th IEEE International Conference on Parallel and Distributed Systems . Google Scholar - Jesper Larsson Träff. Implementing the mpi process topology mechanism. In Supercomputing, pages 1–14, 2002 . Google Scholar
H. Wen , A productivity centered tools framework for application performance tuning, QEST '07: Proceedings of the Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007) (IEEE Computer Society, Washington, DC, USA, 2007) pp. 273–274. Google Scholar- IEEE Transactions on Pattern Analysis and Machine Intelligence 15(11), 1101 (1993). Crossref, ISI, Google Scholar
Hao Yu , I-Hsin Chung and Jose Moreira , Topology mapping for blue gene/l supercomputer, SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing (ACM, New York, NY, USA, 2006) p. 116. Google Scholar


