NON-UNIFORM "FAT-MESHES" FOR CHIP MULTIPROCESSORS
Abstract
This paper studies the traffic hot spots of mesh networks in the context of chip multiprocessors. To mitigate these effects, this paper describes a non-uniform fat-mesh extension to mesh networks, which are popular for chip multiprocessors. The fat-mesh is inspired by the fat-tree and dedicates additional links for connections with heavy traffic (e.g. near the center) with fewer links for lighter traffic (e.g. near the periphery). Two fat-mesh schemes are studied based on the traffic requirements of chip multiprocessors using dimensional ordered XY routing and a randomized XY-YX routing algorithms, respectively. Analytical fat-mesh models are constructed by theoretically presenting the expressions for the traffic requirements of personalized all-to-all traffic for both the raw message numbers and their normalized equivalents. We demonstrate how traffic scales for a traditional mesh compared to a non-uniform fat mesh. Simulation results demonstrate that using same number of physical links the non-uniform fat-mesh can achieve better performance than a uniform fat-mesh mesh using both synthetic traffic patterns and splash-2 benchmark traces.
This work is partially supported by NSF award number 0702452.
References
- IEEE Transactions on Computers 34(10), 892 (1985). ISI, Google Scholar
-
J. Duato , S. Yalamanchili and L. Ni , Interconnection Networks ( Morgan Kaufmann , 2002 ) . Google Scholar -
A. Grama , Introduction to Parallel Computing , 2nd edn. ( Addison Wesley , 2003 ) . Google Scholar - IEEE Micro 23(6), 99 (2003). ISI, Google Scholar
- SPAA 126 (2007), DOI: 10.1145/1248377.1248398. Google Scholar
- J. Renau, B. Fraguela, J. Tuck, W. Liu, M. Prvulovic, L. Ceze, S. Sarangi, P. Sack, K. Strauss,, and P. Montesinos, "SESC Simulator." http://sesc.sourceforge.net, January 2005 . Google Scholar
- IEEE Transactions on Parallel and Distributed Systems (2008). Google Scholar
- IEEE Micro 22, 46 (2002), DOI: 10.1109/MM.2002.1044299. Crossref, ISI, Google Scholar
- IEEE Computer 35, 50 (2002). Crossref, ISI, Google Scholar
- IEEE Transactions on Computers 37, 1315 (1988), DOI: 10.1109/12.5998. Crossref, ISI, Google Scholar
- M. R. Samatham, "Augmented Multiprocessor Networks." US Patent 5134690, 1992 . Google Scholar
- IEEE Transactions on Parallel and Distributed Systems 5, 1210 (1994), DOI: 10.1109/71.329667. Crossref, ISI, Google Scholar
- IEEE Transactions on Parallel and Distributed Systems 10, 266 (1999), DOI: 10.1109/71.755826. Crossref, ISI, Google Scholar
- IEEE Transactions on Parallel and Distributed Systems 9, 929 (1998), DOI: 10.1109/71.730522. Crossref, ISI, Google Scholar
S. R. Ohring , On Generalized Fat Trees, Proc. of the Parallel Processing Symposium (1995) pp. 37–44. Google ScholarH. Kariniemi and J. Nurmi , Reusable XGFT Interconnect IP for Network-on-chip Implementations, Proc. of the International Symposium on System-on-Chip (SoC) (2004) pp. 95–102. Google Scholar- IEEE Transactions on Computers 43(12), 1358 (1994), DOI: 10.1109/12.338095. Crossref, ISI, Google Scholar
- Journal of Computer System Sciences 28, 300 (1984), DOI: 10.1016/0022-0000(84)90071-0. Crossref, ISI, Google Scholar
A. DeHon , Compact, multilayer layout for butterfly fat-tree, ACM Symposium on Parallel Algorithms and Architectures (2000) pp. 206–215. Google Scholar- Journal of Parallel and Distributed Computing 66, 705 (2006), DOI: 10.1016/j.jpdc.2006.01.004. Crossref, ISI, Google Scholar
- V. Leppänen, Studies on the Realization of PRAM. PhD thesis, University of Turku, Computer Science Dept., 1996 . Google Scholar
- T. J. Harris, Shared Memory with Hidden Latency on a Family of Mesh-like Networks. PhD thesis, University of Edinburgh, Department of Computer Science, 1995 . Google Scholar
-
T.-C. Huang , U. Y. Ogras and R. Marculescu , Virtual Channels Planning for Networks-on-Chip , Proc. of the International Symposium on Quality Electronic Design (ISQED) ( 2007 ) . Google Scholar C. L. Seitz , The architecture and programming of the Ametek series 2010 multicomputer, Proc. of the Hypercube Concurrent Computers and Applications Conf. (1988) pp. 33–37. Google Scholar- I. Corp., ".4 Touchstone DEL X4 System Description," 1991 . Google Scholar
T. Nesson and S. L. Johnsson , ROMM routing on mesh and torus networks, SPAA '95: Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures (ACM, New York, NY, USA, 1995) pp. 275–287. Google ScholarD. Seo , Near-Optimal Worst-Case Throughput Routing for Two-Dimensional Mesh Networks, Proc. of the International Symposium on Computer Architecture (ISCA) (2005) pp. 432–443. Google Scholar- IEEE Computer Architecture Letters 7, (2008). Google Scholar
- MASCOTS 201 (1994). Google Scholar
- Computer Architecture News 20(1), 5 (1992), DOI: 10.1145/130823.130824. Crossref, Google Scholar


