World Scientific
THE EFFECT OF NETWORK NOISE ON LARGE-SCALE COLLECTIVE COMMUNICATIONS

    The effect of operating system (OS) noise on the performance of large-scale applications is a growing concern, and ameliorating its influence is a subject of active research. A related problem is network noise, which arises when parallel processes of different allocations, or other background activities, share the interconnection network. To characterize the effect of network noise on parallel applications, we conducted a series of experiments with a specially crafted benchmark, complemented by simulations. Experimental results show that the communication performance of a parallel reduction operation decreases by a factor of 2 on 246 nodes of an InfiniBand fat-tree and by several orders of magnitude on a BlueGene/P torus. Simulations show how the influence of network noise grows with the system size. Although network noise is not as well studied as OS noise, our results clearly show that it is an important factor that must be considered when running and analyzing large-scale applications.
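
    The claim that delays on individual messages can slow a whole reduction may be easier to see in a toy model. The sketch below is not the benchmark or the simulator used in this work. It assumes a binomial-tree reduction to rank 0, a fixed per-message latency, and a hypothetical noise model in which each message is independently delayed with a small probability. All names and parameter values (reduce_time, mean_slowdown, noise_prob, noise_delay) are illustrative assumptions.

```python
import random


def reduce_time(num_nodes, base_latency=1.0, noise_prob=0.01,
                noise_delay=50.0, rng=random):
    """Completion time of a binomial-tree reduction to rank 0 (toy model).

    Assumption: each message costs base_latency, plus noise_delay with
    probability noise_prob. Local computation time is ignored.
    """
    # ready[r] is the time at which rank r holds its partial result.
    ready = [0.0] * num_nodes
    step = 1
    while step < num_nodes:
        for receiver in range(0, num_nodes, 2 * step):
            sender = receiver + step
            if sender < num_nodes:
                latency = base_latency
                if rng.random() < noise_prob:
                    latency += noise_delay  # hypothetical network-noise hit
                arrival = ready[sender] + latency
                # The receiver continues only once the message has arrived.
                ready[receiver] = max(ready[receiver], arrival)
        step *= 2
    return ready[0]


def mean_slowdown(num_nodes, trials=200, seed=0, base_latency=1.0,
                  noise_prob=0.01, noise_delay=50.0):
    """Mean noisy completion time divided by the noiseless completion time."""
    if num_nodes < 2:
        raise ValueError("need at least two nodes for a reduction")
    rng = random.Random(seed)
    noiseless = reduce_time(num_nodes, base_latency, 0.0, noise_delay, rng)
    total = sum(
        reduce_time(num_nodes, base_latency, noise_prob, noise_delay, rng)
        for _ in range(trials)
    )
    return (total / trials) / noiseless


if __name__ == "__main__":
    for n in (8, 64, 512, 4096):
        print(n, round(mean_slowdown(n), 2))
```

    In this model a single delayed message anywhere in the tree holds up the root, so the chance that a reduction is affected rises with the number of messages, and hence with the node count. The model ignores topology, routing, and congestion, so it says nothing about the measured fat-tree or torus results.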
