Reading: High Performance Cache Replacement Using RRIP

High Performance Cache Replacement Using Re-Reference Interval Prediction (RRIP) 

ISCA 2010

In Summary

The authors propose (broadly) two new replacement policies, primarily useful in shared LLCs:

Static RRIP (SRRIP): with this policy, each W-way set stores 2W bits: a 2-bit “re-reference prediction value” (RRPV) per cache line.  An RRPV of 0 predicts that a line will be re-referenced nearly immediately; an RRPV of 3 predicts a re-reference in the distant future (or not at all, i.e., a scan line).  Lines are inserted with RRPV=2, a long re-reference prediction.  On a hit, a line’s RRPV is either cleared to 0 (SRRIP-HP, “hit priority”) or decremented (SRRIP-FP, “frequency priority”); the authors find that HP works better.  To choose a victim, find the leftmost way with RRPV=3; if no way has RRPV=3, increment every way’s RRPV and repeat.
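The SRRIP-HP mechanics above can be sketched for a single set (this is my own illustration, not the authors’ code; class and method names are made up):

```python
# Minimal sketch of 2-bit SRRIP-HP for one W-way set.
# RRPV meanings follow the paper: 0 = near-immediate re-reference
# predicted, 3 = distant re-reference predicted.

RRPV_MAX = 3     # 2-bit RRPV saturates at 3
RRPV_INSERT = 2  # "long" re-reference prediction on insertion

class SRRIPSet:
    def __init__(self, ways):
        self.tags = [None] * ways
        self.rrpv = [RRPV_MAX] * ways  # empty ways look like easy victims

    def access(self, tag):
        """Returns True on a hit, False on a miss (after filling)."""
        if tag in self.tags:
            # Hit: SRRIP-HP promotes the line all the way to RRPV=0.
            self.rrpv[self.tags.index(tag)] = 0
            return True
        # Miss: age all ways until some way reaches RRPV_MAX, then
        # victimize the leftmost such way.
        while RRPV_MAX not in self.rrpv:
            self.rrpv = [v + 1 for v in self.rrpv]
        victim = self.rrpv.index(RRPV_MAX)
        self.tags[victim] = tag
        self.rrpv[victim] = RRPV_INSERT
        return False
```

Note that, unlike LRU, a newly inserted line (RRPV=2) is closer to eviction than a line that has hit (RRPV=0), which is what protects the working set from scans.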

Dynamic RRIP (DRRIP): this policy builds on a modification of SRRIP called Bimodal RRIP (BRRIP).  BRRIP is the RRIP analogue of BIP from the DIP proposal: most insertions use RRPV=3 (distant re-reference prediction), but with low probability a line is inserted with RRPV=2 instead.  DRRIP is a hybrid of SRRIP and BRRIP, choosing between them via set dueling.
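A hedged sketch of BRRIP insertion and the DRRIP set-dueling selector (the 1/32 probability, the 10-bit PSEL width, and all names here are my assumptions, not the paper’s exact configuration):

```python
import random

RRPV_MAX = 3  # 2-bit RRPV

def brrip_insert_rrpv(epsilon=1 / 32):
    """BRRIP insertion: usually predict a distant re-reference (RRPV=3);
    with low probability epsilon, predict a long one (RRPV=2) instead."""
    return RRPV_MAX - 1 if random.random() < epsilon else RRPV_MAX

# DRRIP picks SRRIP or BRRIP with a saturating policy-selection counter
# (PSEL) updated on misses in two small groups of dedicated sets; all
# remaining "follower" sets use whichever policy is currently winning.
PSEL_MAX = (1 << 10) - 1

class DrripSelector:
    def __init__(self):
        self.psel = PSEL_MAX // 2  # start at the midpoint

    def miss_in_srrip_sets(self):
        self.psel = min(self.psel + 1, PSEL_MAX)  # SRRIP missing: lean BRRIP

    def miss_in_brrip_sets(self):
        self.psel = max(self.psel - 1, 0)         # BRRIP missing: lean SRRIP

    def follower_policy(self):
        return "BRRIP" if self.psel > PSEL_MAX // 2 else "SRRIP"
```

The dueling hardware is tiny: one saturating counter plus the logic to identify which sets are dedicated to each policy.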

The authors further suggest a refinement to make DRRIP thread aware (TA-DRRIP), where the LLC has per-thread set dueling resources.

Takeaways

The authors’ data suggest that 2-bit DRRIP outperforms SRRIP, LRU, PLRU, NFU, NRU, and basically everything else they tried across the majority of analyzed workloads.  DRRIP is not substantially more hardware than PLRU; it requires a little over 2x the storage, but I would guess that the logic is comparable if not smaller (no more walking the PLRU tree).
