World Scientific
  • Search
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×
Our website is made possible by displaying certain online content using javascript.
In order to view the full content, please disable your ad blocker or whitelist our website www.worldscientific.com.

System Upgrade on Tue, Oct 25th, 2022 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at [email protected] for any enquiries.

NESTED OPENMP PARALLELIZATION OF A HIERARCHICAL DATA CLUSTERING ALGORITHM

    This paper presents a high performance parallel implementation of a hierarchical data clustering algorithm. The OpenMP programming model, either enhanced with our lightweight runtime support or through its tasking model, deals with the high irregularity of the algorithm and allows for efficient exploitation of the inherent loop-level nested parallelism. Thorough experimental evaluation demonstrates the performance scalability of our parallelization and the effective utilization of computational resources, which results in a clustering approach able to provide high quality clustering of very large datasets.

    References

    • L. Amsaleg and P. Gros, Patterns Analysis and Applications 4, 108 (2001), DOI: 10.1007/s100440170011. Crossref, ISIGoogle Scholar
    • D. Arlia and M. Coppola, Experiments in Parallel Clustering using DBSCAN, Proceedings of Euro-Par (2001) pp. 326–331. Google Scholar
    • OpenMP Architecture Review Board. OpenMP specifications. Available at , http://www.openmp.org . Google Scholar
    • M.   Dash , S.   Petrutiu and P.   Scheuermann , Efficient Parallel Hierarchical Clustering , Proceedings of the European Conference on Parallel Computing (EUROPAR 2004) . Google Scholar
    • V. V.   Dimakopoulos , P. E.   Hadjidoukas and G. Ch.   Philos , A Microbenchmark Study of OpenMP Overheads Under Nested Parallelism , Proceedings of the 4th International Workshop on OpenMP (IWOMP 2003) . Google Scholar
    • V. V.   Dimakopoulos , X.   Tzoumas and E.   Leontiadis , A Portable Compiler for OpenMP v. 2.0 , Proceedings of the 5th European Workshop on OpenMP (EWOMP 2003) . Google Scholar
    • Z. Feng, B. Zhou and J. Shena, NeuroComputing 70, 809 (2007). Crossref, ISIGoogle Scholar
    • International Journal of Supercomputer Applications and High Performance Computing 6(3–4), 159 (1994). Google Scholar
    • M. Gonzalezet al., Openmp extensions for thread groups and their runtime support, In Workshop on Languages and Compilers for Parallel Computing (Springer-Verlag, 2000) pp. 317–331. Google Scholar
    • S.   Guha , R.   Rastogi and K.   Shim , CURE: An Efficient Clustering Algorithm for Large DataBases , Proceedings of the ACM SIGMOD International Conference on Management of Data . Google Scholar
    • P. E.   Hadjidoukas and V. V.   Dimakopoulos , Nested Parallelism in the OMPi OpenMP/C Compiler , Proceedings of Europar '07 . Google Scholar
    • [12]J. P. Hoeflinger. Extending OpenMP to Clusters, 2006. White Paper, Intel Corporation . Google Scholar
    • E.   Januzaj , H.-P.   Kriegel and M.   Pfeifle , Towards Effective and Efficient Distributed Clustering , Proceedings of Workshop on Clustering Large Data Sets (ICDM2003) ( 1998 ) . Google Scholar
    • D. Judd, P. K. Mckinley and A. K. Jain, IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 871 (1998), DOI: 10.1109/34.709614. Crossref, ISIGoogle Scholar
    • S. Kotsiantis and P. Pintelas, WSEAS Transactions on Information Science and Applications 1, 73 (2004), DOI: 10.1109/34.709614. Google Scholar
    • H. S.   Nagesh , S.   Goil and A.   Choudhary , A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets , Proceedings of the International Conference on Parallel Processing (ICPP'2000) ( 2000 ) . Google Scholar
    • C. F. Olson, Parallel Computing 21, 1313 (1995), DOI: 10.1016/0167-8191(95)00017-I. Crossref, ISIGoogle Scholar
    • OpenMP Architecture Review Board. Openmp application program interface, version 3.0, May 2008 . Google Scholar
    • C. Pizzuti and D. Talia, IEEE Transactions on Knowledge and Data Engineering 15(3), (2003), DOI: 10.1109/TKDE.2003.1198395. Google Scholar
    • K. Stoffel and A. Belkoniene, Parallel K-Means Clustering for Large Data Sets, Proceedings of EuroPar '99 (1999) pp. 1451–1454. Google Scholar
    • Y.   Tanaka et al. , Performance Evaluation of OpenMP Applications with Nested Parallelism , Proceedings of the Fifth Workshop on Languages, Compilers and Run-Time Systems for Scalable Computers (LCR '2000) ( 2000 ) . Google Scholar
    • S.   Thibault et al. , An Efficient OpenMP Runtime System for Hierarchical Architectures , Proceedings of the 3rd International Workshop on OpenMP (IWOMP '2008) ( 2007 ) . Google Scholar
    • R. Xu and D. Wunsch, IEEE Transactions on Neural Networks 16(3), 645 (2005), DOI: 10.1109/TNN.2005.845141. Crossref, ISIGoogle Scholar
    • B. Zang and M. Hsu. Scale Up Center-Based Data Clustering Algorithms by Parallelism. In Technical Report HPL-2000-6, Software Technology Laboratory, HP Laboratories, Palo Alto, January 2000 . Google Scholar