World Scientific

ON DESIGN AND APPLICATION MAPPING OF A NETWORK-ON-CHIP (NOC) ARCHITECTURE

    As the number of IP cores integrated in current System-on-Chips (SoCs) keeps increasing, the communication requirements among cores cannot be adequately satisfied by either traditional or multi-layer bus architectures, because of their poor scalability and the bandwidth limitation of a single bus. While new interconnection techniques have been explored to overcome this limitation, the notion of using Network-on-Chip (NoC) technologies for future generations of high-performance, low-power chips has gained great importance for a myriad of applications, in particular wireless communication and multimedia processing. For NoC technologies to succeed, realistic specifications such as throughput, latency, moderate design complexity, a programming model, and design tools are necessary requirements. To this end, we cover some of the key and challenging design issues specific to the NoC architecture, such as router design, network interface (NI) issues, and complete system-level modeling. In this paper, we propose a multi-processor system platform adopting NoC techniques, called NePA (Network-based Processor Array). As components of the system platform, the fundamental NoC building blocks, including the router architecture and a generic NI, are defined and implemented using low-power and clock-efficient techniques. Using high-level cycle-accurate simulation, various parameters relevant to performance and systematic modeling are extracted and analyzed. By combining the developed systematic models, we construct a tool chain for exploring the hardware/software design tradeoffs needed to better understand NoC techniques. Finally, using an implementation of parallel FFT algorithms on the homogeneous NePA, we show the feasibility and advantages of NoC techniques.
