BENCHMARK ANALYSIS AND APPLICATION RESULTS FOR LATTICE BOLTZMANN SIMULATIONS ON NEC SX VECTOR AND INTEL NEHALEM SYSTEMS

    Classic vector systems have all but vanished from recent TOP500 lists. Looking at the recently introduced NEC SX-9 series, we benchmark its memory subsystem using the low-level vector triad and employ the kernel of an advanced lattice Boltzmann flow solver to demonstrate that classic vector systems still combine excellent performance with a well-established optimization approach. To investigate the multi-node performance, the flow field in a real porous medium is simulated using the hybrid MPI/OpenMP parallel ILBDC lattice Boltzmann application code. Results for a commodity Intel Nehalem-based cluster are provided for comparison. Clusters can keep up with the vector systems, but they require massive parallelism and thus much more effort to provide a good domain decomposition.
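
For readers unfamiliar with the vector triad mentioned above, the following is a minimal sketch of the kernel, a[i] = b[i] + c[i] * d[i], together with the usual traffic accounting (three loads and one store per element, plus one extra load on cache-based CPUs that use write-allocate). The function names, the array sizes, and the timing harness are illustrative assumptions and are not taken from the paper. A pure-Python loop is dominated by interpreter overhead, so the printed rates illustrate the bookkeeping only and say nothing about the memory bandwidth of the hardware.

```python
import time


def vector_triad(a, b, c, d):
    """Vector triad kernel: a[i] = b[i] + c[i] * d[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + c[i] * d[i]


def triad_traffic_bytes(n, word_bytes=8, write_allocate=False):
    """Minimum memory traffic in bytes for one sweep over n elements.

    Three loads (b, c, d) and one store (a) per element. With
    write-allocate, a is additionally loaded before it is stored.
    """
    streams = 5 if write_allocate else 4
    return streams * word_bytes * n


def time_triad(n, repeats=5):
    """Return (best_seconds, mflops, gbytes_per_s) for an n-element sweep."""
    a = [0.0] * n
    b = [1.0] * n
    c = [2.0] * n
    d = [3.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        vector_triad(a, b, c, d)
        best = min(best, time.perf_counter() - start)
    # Sanity check: 1.0 + 2.0 * 3.0 == 7.0 for every element.
    assert all(x == 7.0 for x in a)
    flops = 2 * n  # one multiply and one add per element
    return best, flops / best / 1e6, triad_traffic_bytes(n) / best / 1e9


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only to keep the run short.
    for n in (1_000, 100_000):
        secs, mflops, gbs = time_triad(n)
        print(f"N={n:>8}  {secs:.6f} s  {mflops:.1f} MFlop/s  {gbs:.3f} GB/s")
```

With 2 floating-point operations against 32 bytes of traffic per element in double precision, the kernel is strongly memory bound, which is why it is a common probe of a memory subsystem. A real measurement would use a compiled language and sweep the array length across the cache hierarchy.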
