ARMI: A High Level Communication Library for STAPL

    ARMI is a communication library that provides a framework for expressing fine-grain parallelism and mapping it to a particular machine using shared-memory and message-passing library calls. The library is an advanced implementation of the remote method invocation (RMI) protocol and handles low-level details such as scheduling incoming communication and aggregating outgoing communication to coarsen parallelism. These details can be tuned for different platforms, allowing user codes to achieve the highest performance possible without manual modification. ARMI is used by STAPL, our generic parallel library, to provide a portable, user-transparent communication layer. We present the basic design as well as the mechanisms used in the current Pthreads/OpenMP and MPI implementations and in a combination thereof. Performance comparisons between ARMI and explicit use of Pthreads or MPI are given on a variety of machines, including an HP-V2200, an Origin 3800, an IBM Regatta, and an IBM RS/6000 SP cluster.
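
    The abstract's idea of aggregating outgoing communication to coarsen parallelism can be illustrated with a minimal C++ sketch. It is not ARMI's actual interface: the class name `AggregatingChannel`, the method names `async_rmi` and `flush`, the `Transport` callback, and the fixed-size threshold policy are all assumptions made for illustration. The sketch buffers fine-grain requests and hands them to the transport as one batch once a tunable threshold is reached.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: these names are illustrative and not taken from ARMI.
class AggregatingChannel {
public:
    // One fine-grain remote request, modeled here as a deferred call.
    using Request = std::function<void()>;
    // Stand-in for the underlying transport (e.g. one message per batch).
    using Transport = std::function<void(std::vector<Request>)>;

    AggregatingChannel(std::size_t threshold, Transport send)
        : threshold_(threshold), send_(std::move(send)) {}

    // Queue a request; transmit only once the batch reaches the threshold.
    void async_rmi(Request r) {
        pending_.push_back(std::move(r));
        if (pending_.size() >= threshold_) flush();
    }

    // Send any partially filled batch, e.g. before a synchronization point.
    void flush() {
        if (pending_.empty()) return;
        std::vector<Request> batch;
        batch.swap(pending_);      // leave the buffer empty for new requests
        send_(std::move(batch));   // one transport call for the whole batch
    }

private:
    std::size_t threshold_;        // tunable per platform (assumed policy)
    Transport send_;
    std::vector<Request> pending_;
};
```

    With a threshold of 4, ten queued requests would produce two full batches, and a final `flush()` would send the remaining two. The threshold is the kind of platform-dependent parameter the abstract describes as tunable without changes to user code.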

    This research was supported in part by NSF Grants EIA-0103742, ACR-0081510, ACR-0113971, CCR-0113974, ACI-0326350, and by the DOE.
