World Scientific
OmpSs: A PROPOSAL FOR PROGRAMMING HETEROGENEOUS MULTI-CORE ARCHITECTURES

    In this paper, we present OmpSs, a programming model based on OpenMP and StarSs that can also incorporate OpenCL or CUDA kernels. We evaluate the proposal on three kinds of architectures (SMP, GPU, and hybrid SMP/GPU environments), showing the wide usefulness of the approach. The evaluation uses six benchmarks: Matrix Multiply, BlackScholes, Perlin Noise, Julia Set, PBPI, and FixedGrid. We compare the results with those obtained from the same benchmarks written in OpenCL or OpenMP and run on the same architectures. The results show that OmpSs greatly outperforms both environments. OmpSs also makes the programming environment more flexible than traditional approaches for exploiting multiple accelerators and, thanks to the simplicity of its annotations, increases programmer productivity.
