World Scientific
  • Search
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×
Our website is made possible by displaying certain online content using javascript.
In order to view the full content, please disable your ad blocker or whitelist our website www.worldscientific.com.

System Upgrade on Tue, Oct 25th, 2022 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at [email protected] for any enquiries.

Performance Evaluation of an OpenCL Implementation of the Lattice Boltzmann Method on the Intel Xeon Phi

    A portable OpenCL implementation of the lattice Boltzmann method targeting emerging many-core architectures is described. The main purpose of this work is to evaluate and compare the performance of this code on three mainstream hardware architectures available today, namely an Intel CPU, an Nvidia GPU, and the Intel Xeon Phi. Because of the similarities between OpenCL and CUDA, we chose to follow some of the strategies devised to implement efficient lattice Boltzmann solvers on Nvidia GPU, while remaining as generic as possible. Being fairly configurable, this program makes possible to ascertain the best options for each hardware platforms. The achieved performance is quite satisfactory for both the CPU and the GPU. For the Xeon Phi however, the results are below expectations. Nevertheless, comparison with data from the literature shows that on this architecture the code seems memory-bound.

    Communicated by J. Dongarra