World Scientific
NON-UNIFORM "FAT-MESHES" FOR CHIP MULTIPROCESSORS

    This paper studies the traffic hot spots of mesh networks in the context of chip multiprocessors. To mitigate these hot spots, it describes a non-uniform fat-mesh extension to the mesh networks that are popular for chip multiprocessors. The fat-mesh is inspired by the fat-tree: it dedicates additional links to connections with heavy traffic (e.g. near the center) and fewer links to connections with lighter traffic (e.g. near the periphery). Two fat-mesh schemes are studied, based on the traffic requirements of chip multiprocessors using dimension-ordered XY routing and a randomized XY-YX routing algorithm, respectively. Analytical fat-mesh models are constructed by deriving expressions for the traffic requirements of personalized all-to-all traffic, both as raw message counts and as their normalized equivalents. We demonstrate how traffic scales for a traditional mesh compared to a non-uniform fat-mesh. Simulation results, using both synthetic traffic patterns and SPLASH-2 benchmark traces, demonstrate that with the same number of physical links the non-uniform fat-mesh can achieve better performance than a uniform fat-mesh.
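The hot-spot effect described in the abstract can be illustrated with a small sketch. The Python below is not the paper's model. It assumes a k x k mesh, exactly one message per ordered source-destination pair (personalized all-to-all), and deterministic dimension-ordered XY routing, and it counts messages per directed link. The function names (`xy_route`, `all_to_all_link_load`, `closed_form`) are hypothetical, and the closed form k(i+1)(k-i-1) is derived here from those assumptions rather than quoted from the paper.

```python
from collections import Counter


def xy_route(src, dst):
    """Return the directed links of the XY path from src to dst.

    Nodes are (x, y) tuples. The message first travels along X in the
    source row, then along Y in the destination column.
    """
    x, y = src
    dst_x, dst_y = dst
    links = []
    while x != dst_x:
        next_x = x + (1 if dst_x > x else -1)
        links.append(((x, y), (next_x, y)))
        x = next_x
    while y != dst_y:
        next_y = y + (1 if dst_y > y else -1)
        links.append(((x, y), (x, next_y)))
        y = next_y
    return links


def all_to_all_link_load(k):
    """Count messages per directed link for all-to-all traffic on a k x k mesh."""
    load = Counter()
    nodes = [(x, y) for x in range(k) for y in range(k)]
    for src in nodes:
        for dst in nodes:
            if src != dst:
                for link in xy_route(src, dst):
                    load[link] += 1
    return load


def closed_form(k, i):
    """Load on the directed link from position i to i+1 along one dimension.

    Assumption (derived for this sketch): (i + 1) sources on the near side,
    (k - i - 1) positions on the far side, times k for the other dimension.
    """
    return k * (i + 1) * (k - i - 1)


if __name__ == "__main__":
    k = 8
    load = all_to_all_link_load(k)
    for i in range(k - 1):
        # Eastward links in row 0, from the periphery toward the center and out.
        print(i, load[((i, 0), (i + 1, 0))], closed_form(k, i))
```

Under these assumptions the load peaks on the links that cross the middle of the mesh and falls off toward the periphery, which is the imbalance a non-uniform link allocation is meant to address. A randomized XY-YX scheme would change the per-link counts and is not modeled here.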

    This work is partially supported by NSF award number 0702452.
