World Scientific
  • Search
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×
Our website is made possible by displaying certain online content using javascript.
In order to view the full content, please disable your ad blocker or whitelist our website www.worldscientific.com.

System Upgrade on Tue, Oct 25th, 2022 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at [email protected] for any enquiries.

Exploiting Long-Term Temporal Cache Access Patterns for LRU Insertion Prioritization

    In a CPU cache utilizing least recently used (LRU) replacement, cache sets manage a buffer which orders all cache lines in the set from LRU to most recently used (MRU). When a cache line is brought into cache, it is placed at the MRU and the LRU line is evicted. When re-accessed, a line is promoted to the MRU position. LRU replacement provides a simple heuristic to predict the optimal cache line to evict. However, LRU utilizes only simple, short-term access patterns. In this paper, we propose a method that uses a buffer called the history queue to record longer-term access-eviction patterns than the LRU buffer can capture. Using this information, we make a simple modification to LRU insertion policy such that recently-recalled blocks have priority over others. As lines are evicted, their addresses are recorded in a FIFO history queue. Incoming lines that have recently been evicted and now recalled (those in the history queue at recall time) remain in the MRU for an extended period of time as non-recalled lines entering the cache thereafter are placed below the MRU. We show that the proposed LRU insertion prioritization increases performance in single-threaded and multi-threaded workloads in simulations with simple adjustments to baseline LRU.

    Communicated by Andrew Adamatzky

    References

    • 1. LLC WikiChip. Skylake client microarchitectures (2020), https://en.wikichip.org/wiki/intel/microarchitectures/skylake˙(client). Google Scholar
    • 2. Norman P. Jouppi, Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, ACM SIGARCH Computer Architecture News 18(2SI) (1990) 364–373. CrossrefGoogle Scholar
    • 3. Osvaldo Navarro, Tim Leiding and Michael Hübner, Configurable cache tuning with a victim cache, in 2015 10th International Symposium on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC) (IEEE, 2015), pp. 1–6. CrossrefGoogle Scholar
    • 4. Samira M. Khan, Daniel A. Jiménez, Doug Burger and Babak Falsafi, Using dead blocks as a virtual victim cache, in Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (ACM, 2010) 489–500. CrossrefGoogle Scholar
    • 5. Neetu Jindal, Preeti Ranjan Panda and Smruti R. Sarangi, Reusing trace buffers as victim caches, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 26(9) (2018) 1699–1712. Crossref, ISIGoogle Scholar
    • 6. Samira Khan and Daniel A. Jiménez, Insertion policy selection using decision tree analysis, in 2010 IEEE International Conference on Computer Design (IEEE, 2010), pp. 106–111. CrossrefGoogle Scholar
    • 7. Daniel A. Jiménez, Insertion and promotion for tree-based pseudolru last-level caches, in Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (2013), pp. 284–296. CrossrefGoogle Scholar
    • 8. Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely and Joel Emer, Adaptive insertion policies for high performance caching, ACM SIGARCH Computer Architecture News 35(2) (2007) 381–391. CrossrefGoogle Scholar
    • 9. Samira Khan and Daniel A. Jiménez, Insertion policy selection using decision tree analysis, in 2010 IEEE International Conference on Computer Design (IEEE, 2010), pp. 106–111. CrossrefGoogle Scholar
    • 10. Andhi Janapsatya, Aleksandar Ignjatović, Jorgen Peddersen and Sri Parameswaran, Dueling clock: Adaptive cache replacement policy based on the clock algorithm, in 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010) (IEEE, 2010), pp. 920–925. CrossrefGoogle Scholar
    • 11. Aamer Jaleel, William Hasenplaugh, Moinuddin Qureshi, Julien Sebot, Simon Steely, Jr. and Joel Emer, Adaptive insertion policies for managing shared caches, in Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (2008), pp. 208–219. CrossrefGoogle Scholar
    • 12. Georgios Keramidas, Pavlos Petoumenos and Stefanos Kaxiras, Cache replacement based on reuse-distance prediction, in 2007 25th International Conference on Computer Design (IEEE, 2007), pp. 245–250. CrossrefGoogle Scholar
    • 13. Kristof Beyls and Erik D’Hollander, Reuse distance as a metric for cache behavior, in Proceedings of the IASTED Conference on Parallel and Distributed Computing and Systems, Volume 14 (Citeseer, 2001), pp. 350–360. Google Scholar
    • 14. Carole-Jean Wu, Aamer Jaleel, Will Hasenplaugh, Margaret Martonosi, Simon C. Steely, Jr. and Joel Emer, Ship: Signature-based hit predictor for high performance caching, in Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (2011), 430–441. CrossrefGoogle Scholar
    • 15. Aamer Jaleel, Kevin B. Theobald, Simon C. Steely, Jr. and Joel Emer, High performance cache replacement using re-reference interval prediction (RRIP), ACM SIGARCH Computer Architecture News 38(3) (2010) 60–71. CrossrefGoogle Scholar
    • 16. Armin Vakil-Ghahani, Sara Mahdizadeh-Shahri, Mohammad-Reza Lotfi-Namin, Mohammad Bakhshalipour, Pejman Lotfi-Kamran and Hamid Sarbazi-Azad, Cache replacement policy based on expected hit count, IEEE Computer Architecture Letters 17(1) (2017) 64–67. Crossref, ISIGoogle Scholar
    • 17. Seyed Armin Vakil Ghahani, Sara Mahdizadeh Shahri, Mohammad Bakhshalipour, Pejman Lotfi-Kamran and Hamid Sarbazi-Azad, Making Belady-inspired replacement policies more effective using expected hit count, arXiv preprint arXiv:1808.05024 (2018). Google Scholar
    • 18. Joseph Sharkey, Dmitry Ponomarev and Kanad Ghose, M-SIM: A flexible, multithreaded architectural simulation environment, Technical Report, Department of Computer Science, State University of New York at Binghamton (2005). Google Scholar
    • 19. John L. Henning, SPEC CPU2006 benchmark descriptions, ACM SIGARCH Computer Architecture News 34(4) (2006) 1–17. CrossrefGoogle Scholar