SPOC: GPGPU PROGRAMMING THROUGH STREAM PROCESSING WITH OCAML
Abstract
General-purpose computing on graphics processing units (GPGPU) consists of using GPUs to handle computations usually handled by CPUs. GPGPU programming involves developing specific programs that run on GPUs, managed by a host program running on the CPU. Achieving high performance requires explicitly organizing memory transfers between devices. Moreover, several incompatible frameworks exist, which makes productivity and portability difficult to achieve. In this paper, we describe SPOC, an OCaml library that defines specific data sets in order to manage transfers between GPU and CPU automatically. SPOC also offers a runtime library that detects multiple frameworks and makes them usable transparently. We also describe the link between SPOC and the OCaml garbage collector, which is used to optimize transfers dynamically. Benchmarks show that SPOC can offer high performance while simplifying GPGPU programming.
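The idea of data sets that manage their own transfers can be illustrated with a small simulation. The sketch below is not SPOC code and uses none of its API: the names `vector`, `ensure`, `run_kernel` and `transfers` are hypothetical, and a plain OCaml array stands in for both host and device memory. It only shows the general principle of tagging a vector with its current location and moving it when, and only when, it is needed elsewhere.

```ocaml
(* Hypothetical sketch: none of these names come from the SPOC API. *)

(* Where the up-to-date copy of a vector currently lives. *)
type location = Host | Device

type vector = {
  data : float array;      (* stands in for both host and device memory *)
  mutable loc : location;
}

(* Counts simulated transfers so that skipped transfers are observable. *)
let transfers = ref 0

let create n = { data = Array.make n 0.0; loc = Host }

(* Move the vector only if it is not already where it is needed. *)
let ensure v target =
  if v.loc <> target then begin
    incr transfers;
    v.loc <- target
  end

(* Host-side accessors pull the data back on demand. *)
let get v i =
  ensure v Host;
  v.data.(i)

let set v i x =
  ensure v Host;
  v.data.(i) <- x

(* A simulated kernel launch pushes its argument to the device first. *)
let run_kernel f v =
  ensure v Device;
  for i = 0 to Array.length v.data - 1 do
    v.data.(i) <- f v.data.(i)
  done

let () =
  let v = create 4 in
  for i = 0 to 3 do set v i (float_of_int i) done;  (* already on host *)
  run_kernel (fun x -> x *. 2.0) v;                 (* host -> device *)
  run_kernel (fun x -> x +. 1.0) v;                 (* already on device *)
  let last = get v 3 in                             (* device -> host *)
  Printf.printf "v.(3) = %.1f after %d transfers\n" last !transfers
  (* prints "v.(3) = 7.0 after 2 transfers" *)
```

In this toy model the second kernel launch costs no transfer because the vector is already on the device, which is the kind of saving that automatic transfer management aims for. A real library would additionally have to allocate and release device memory, which is where a link to the garbage collector becomes relevant.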


