MODELING THE PERFORMANCE OF COMMUNICATION SCHEMES ON NETWORK TOPOLOGIES
Abstract
This paper investigates the influence of the interconnection network topology of a parallel system on the delivery time of an ensemble of messages, called the communication scheme. More specifically, we focus on the impact on the performance of structure in network topology and communication scheme. We introduce causal structure learning algorithms for the modeling of the communication time. The experimental data, from which the models are learned automatically, is retrieved from simulations. The qualitative models provide insight about which and how variables influence the communication performance. Next, a generic property is defined which characterizes the performance of individual communication schemes and network topologies. The property allows the accurate quantitative prediction of the runtime of random communication on random topologies. However, when either communication scheme or network topology exhibit regularities the prediction can become very inaccurate. The causal models can also differ qualitatively and quantitatively. Each combination of communication scheme regularity type, e.g. a one-to-all broadcast, and network topology regularity type, e.g. torus, possibly results in a different model which is based on different characteristics.
References
-
Jerry Banks , Discrete-Event Simulation - fourth edition ( Prentice Hall , 2005 ) . Google Scholar - IEEE Communication Magazine 160 (1997), DOI: 10.1109/35.587723. Google Scholar
Laura C. Carrington , How well can simple metrics represent the performance of HPC applications?, Proc. of the 2005 ACM/IEEE conference on Supercomputing (IEEE Computer Society, 2005) p. 48. Google Scholar-
Thomas M. Cover and Joy A. Thomas , Elements of Information Theory ( John Wiley & Sons, Inc , 1991 ) . Crossref, Google Scholar Thomas Fahringer and Clovis Seragiotto , Automatic search for performance problems in parallel and distributed programs by using multi-experiment analysis, HiPC2552,Lecture Notes in Computer Science , eds.Sartaj Sahni , Viktor K. Prasanna and Uday Shukla (Springer, 2002) pp. 151–162. Google Scholar- IEEE/ACM Trans. Netw. 9(4), 392 (2001), DOI: 10.1109/90.944338. Crossref, ISI, Google Scholar
Dan Geiger , Thomas Verma and Judea Pearl , d-separation: From theorems to algorithms, UAI, eds.Max Henrion (North-Holland, 1989) pp. 139–148. Google Scholar- Parallel Computing 23(2), 5 (1997), DOI: 10.1016/S0167-8191(96)00093-2. Crossref, ISI, Google Scholar
-
Vipin Kumar , Introduction to Parallel Computing ( Benjamin/Cummings , Redwood City, CA , 1994 ) . Google Scholar - Jan Lemeire. Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs. PhD thesis, Vrije Universiteit Brussel, 2007 . Google Scholar
- Scientific Programming 15(3), 121 (2007). Crossref, ISI, Google Scholar
Gabriel Marin and John Mellor-Crummey , Cross-architecture performance predictions for scientific applications using parameterized models, SIGMETRICS '04/Performance '04: Proceedings of the joint international conference on Measurement and modeling of computer systems (ACM Press, 2004) pp. 2–13. Google Scholar-
Judea Pearl , Causality. Models, Reasoning, and Inference ( Cambridge University Press , 2000 ) . Google Scholar A. Snavely , N. Wolter and L. Carrington , Modeling application performance by convolving machine signatures with application profiles, WWC '01: Proceedings of the Workload Characterization, 2001 (IEEE Computer Society, 2001) pp. 149–156. Google Scholar- Allan Snavely, Laura Carrington, Nicole Wolter, Jesus Labarta, Rosa M. Badia, and Avi Purkayastha. A framework for performance modeling and prediction. In SC, pages 1-17, 2002 . Google Scholar
-
Peter Spirtes , Clark Glymour and Richard Scheines , Causation, Prediction, and Search , 2nd edn. ( Springer Verlag , 1993 ) . Crossref, Google Scholar - Peter Spirtes, Clark Glymour, Richard Scheines, and Joseph Ramsey. The TETRAD project, http://www.phil.emu.edu/projects/tetrad/ . Google Scholar
- AAAI/IAAI 567 (2002). Google Scholar
Hong Linh Truong and Thomas Fahringer , SCALEA: A performance analysis tool for distributed and parallel programs, Euro-Par2400,Lecture Notes in Computer Science , eds.Burkhard Monien and Rainer Feldmann (Springer, 2002) pp. 75–85. Google Scholar-
M. P. Wand and M. C. Jones , Kernel Smoothing ( Chapman & Hall , London, UK , 1995 ) . Crossref, Google Scholar - IEEE/ACM Transactions on Networking 5(6), 770 (1997), DOI: 10.1109/90.650138. Crossref, ISI, Google Scholar


