ARMI: A High Level Communication Library for STAPL
Abstract
ARMI is a communication library that provides a framework for expressing fine-grain parallelism and mapping it to a particular machine using shared-memory and message-passing library calls. The library is an advanced implementation of the RMI protocol and handles low-level details such as scheduling incoming communication and aggregating outgoing communication to coarsen parallelism. These details can be tuned for different platforms, allowing user codes to achieve the highest performance possible without manual modification. ARMI is used by STAPL, our generic parallel library, to provide a portable, user-transparent communication layer. We present the basic design as well as the mechanisms used in the current Pthreads/OpenMP and MPI implementations, as well as in a combination thereof. Performance comparisons between ARMI and explicit use of Pthreads or MPI are given on a variety of machines, including an HP-V2200, an Origin 3800, an IBM Regatta, and an IBM RS/6000 SP cluster.
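To make the idea of aggregating outgoing communication concrete, the following is a minimal C++ sketch, not ARMI's actual interface. All names here (`AggregatingChannel`, `async_call`, `flush`, `deliver`, and the `threshold` parameter) are hypothetical, and the `transport` callback merely stands in for whatever shared-memory or message-passing send a real implementation would use. Fine-grain requests are buffered and shipped as one coarser message once a tunable threshold is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is an illustration, not ARMI's API.
using Request = std::function<void()>;  // one fine-grain remote call
using Message = std::vector<Request>;   // one coarse message handed to the transport

class AggregatingChannel {
public:
    // `threshold` is the number of requests batched per message (a tunable).
    // `transport` stands in for the underlying send mechanism.
    AggregatingChannel(std::size_t threshold,
                       std::function<void(Message)> transport)
        : threshold_(threshold), transport_(std::move(transport)) {
        assert(threshold_ > 0);
    }

    // Queue a request; send the batch once it reaches the threshold.
    void async_call(Request r) {
        pending_.push_back(std::move(r));
        if (pending_.size() >= threshold_) flush();
    }

    // Send whatever is pending as a single message.
    void flush() {
        if (pending_.empty()) return;
        Message batch;
        batch.swap(pending_);  // leave the buffer empty before sending
        transport_(std::move(batch));
    }

private:
    std::size_t threshold_;
    std::function<void(Message)> transport_;
    Message pending_;
};

// Receiver side: execute every request in an incoming message, in order.
inline void deliver(const Message& m) {
    for (const Request& r : m) r();
}
```

With a threshold of 4, ten calls to `async_call` would produce two full messages automatically, and a final `flush()` would send the remaining two requests as a third message. Raising the threshold trades latency for fewer, larger messages, which is the kind of platform-dependent tuning the abstract refers to.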
This research was supported in part by NSF Grants EIA-0103742, ACR-0081510, ACR-0113971, CCR-0113974, ACI-0326350, and by the DOE.


