ON THE PERFORMANCE AND TECHNOLOGICAL IMPACT OF ADDING MEMORY CONTROLLERS IN MULTI-CORE PROCESSORS
Abstract
The increasing core count of current and future processors poses critical challenges for the memory subsystem, which must efficiently handle concurrent memory requests. The current trend is to increase the number of memory channels available to the processor's memory controller. In this paper we investigate the advantages and disadvantages of this approach from both a technological and an application-performance viewpoint. In particular, we explore the trade-off between employing multiple memory channels per memory controller and using multiple memory controllers with fewer memory channels each. Experiments were conducted on two current state-of-the-art multi-core processors, a 6-core AMD Istanbul and a 4-core Intel Nehalem-EP, using the STREAM benchmark and a wide range of production applications. An analytical model of STREAM performance is used to illustrate the diminishing returns obtained when increasing the number of memory channels per memory controller, an effect that is also seen in application performance. In addition, we show that this performance degradation can be efficiently addressed by increasing the ratio of memory controllers to channels while keeping the number of memory channels constant. This scheme yields significant performance improvements: up to 28% when using two memory controllers with one channel each, compared with one controller with two memory channels.


