INFINIBAND ROUTING TABLE OPTIMIZATIONS FOR SCIENTIFIC APPLICATIONS

    The achievable performance on Infiniband networks is governed by the latencies and bandwidths of communication channels as well as by contention within the network. Infiniband currently uses static routing to transfer messages and thus does not take the dynamic load on the channels into account. By interrogating the network routing tables, we quantify the contention that occurs for a number of communication patterns on a large-scale (1024-processor) system. Empirical data confirm our contention calculation almost exactly. We define custom routing tables that yield both optimum and worst-case performance for a wide range of communication patterns. The performance difference between optimum and worst case can be as large as 12×. Two large-scale applications show a run-time improvement of 10-20%, and up to a 40% improvement in their communication time alone, when using optimized routing tables. The approach is applicable to many Infiniband systems, and we expect the performance improvements to be even greater on larger-scale systems.
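
    The idea of deriving contention from static routing tables can be illustrated with a minimal sketch. This is not the authors' tool or method. It assumes a toy two-level fat tree (leaf switches with hosts attached, and spine switches above them), and the function names, the topology parameters, and the two routing rules below are all hypothetical. Contention is taken here to mean the largest number of messages in a pattern that share one directed switch-to-switch link.

```python
from collections import Counter

def link_loads(pattern, hosts_per_leaf, spine_for_dest):
    """Count messages on each directed leaf<->spine link of a toy fat tree.

    pattern        : iterable of (src_host, dst_host) pairs
    hosts_per_leaf : hosts attached to each leaf switch (assumed uniform)
    spine_for_dest : hypothetical static routing rule, destination -> spine
    """
    loads = Counter()
    for src, dst in pattern:
        src_leaf = src // hosts_per_leaf
        dst_leaf = dst // hosts_per_leaf
        if src_leaf == dst_leaf:
            continue  # message never leaves the leaf switch
        spine = spine_for_dest(dst)
        loads[("up", src_leaf, spine)] += 1    # leaf -> spine link
        loads[("down", spine, dst_leaf)] += 1  # spine -> leaf link
    return loads

def contention(pattern, hosts_per_leaf, spine_for_dest):
    """Largest number of messages sharing any one link (1 if none share)."""
    loads = link_loads(pattern, hosts_per_leaf, spine_for_dest)
    return max(loads.values(), default=1)

def shift_pattern(num_hosts, shift):
    """Each host i sends one message to host (i + shift) mod num_hosts."""
    return [(i, (i + shift) % num_hosts) for i in range(num_hosts)]

# Toy system: 16 hosts, 4 per leaf switch, 4 spine switches.
pattern = shift_pattern(16, 4)
spread = contention(pattern, 4, lambda dst: dst % 4)  # spread destinations over spines
single = contention(pattern, 4, lambda dst: 0)        # send everything via spine 0
print(spread, single)
```

    In this toy setting the same shift pattern shares no link under the first rule and puts four messages on one link under the second, which shows in miniature how the choice of static routing table alone can change contention for a fixed communication pattern.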
