USE OF PREDICTIVE PERFORMANCE MODELING DURING LARGE-SCALE SYSTEM INSTALLATION
Abstract
In this paper we describe an important use of predictive application performance modeling: validating measured performance during the installation of a new large-scale system. Using a previously developed and validated performance model for SAGE, a multidimensional (3D), multi-material hydrodynamics code with adaptive mesh refinement, we were able to help guide the stabilization of the first phase of the Los Alamos ASCI Q supercomputer. We review the salient features of an analytical model for this code that has been applied to predict its performance on a large class of tera-scale parallel systems. We describe the methodology applied during system installation and upgrades to establish a baseline for the achievable "real" performance of the system. We also show the effect of certain key subsystems, such as PCI bus speed and multi-rail networks, on overall application performance. Finally, we show that predictive performance models are also a powerful system debugging tool.


