Abstract

Modern superscalar processors implement register renaming using either RAM or CAM tables. The design of these structures must address both access time and misprediction recovery penalty. Direct-mapped RAMs provide faster access, whereas CAMs are better suited to avoiding recovery penalties. Although CAMs are more complex and slower, they usually match the processor cycle time in current designs. However, they do not scale with the number of physical registers or the pipeline width. In this paper we present a new hybrid RAM-CAM register renaming scheme that combines the best of both approaches. In steady state, a RAM provides the current mappings quickly; on misspeculation, a low-complexity CAM enables immediate recovery and further register renaming. Compared to an ideal CAM in a 4-way state-of-the-art superscalar microprocessor, the proposed scheme consumes about 90% less dynamic energy at almost the same performance (1% slowdown) and area (95% of the ideal CAM size).
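The RAM/CAM trade-off described in the abstract can be illustrated with a small behavioural sketch. It models only the two textbook baseline organisations, not the hybrid scheme proposed in the paper, whose internals the abstract does not give. The class and method names (`RamRenameTable`, `CamRenameTable`, `checkpoint`, `restore`) are hypothetical, and the sketch assumes a physical register is not reallocated while a checkpoint that references it is still live.

```python
class RamRenameTable:
    """Direct-mapped table: index = logical register, value = physical register."""

    def __init__(self, num_logical):
        # Assumption: identity mapping at reset.
        self.map = list(range(num_logical))

    def lookup(self, logical):
        return self.map[logical]          # single indexed read

    def rename(self, logical, physical):
        self.map[logical] = physical

    def checkpoint(self):
        return list(self.map)             # copies every mapping

    def restore(self, snapshot):
        self.map = list(snapshot)


class CamRenameTable:
    """One entry per physical register: the logical register it holds and a valid bit."""

    def __init__(self, num_logical, num_physical):
        self.logical = [None] * num_physical
        self.valid = [False] * num_physical
        for r in range(num_logical):      # assumption: identity mapping at reset
            self.logical[r] = r
            self.valid[r] = True

    def lookup(self, logical):
        # Associative search; hardware compares all entries in parallel.
        for phys, (log, ok) in enumerate(zip(self.logical, self.valid)):
            if ok and log == logical:
                return phys
        raise KeyError(logical)

    def rename(self, logical, physical):
        self.valid[self.lookup(logical)] = False   # retire the old mapping
        self.logical[physical] = logical
        self.valid[physical] = True

    def checkpoint(self):
        return list(self.valid)           # only the valid bits are saved

    def restore(self, snapshot):
        self.valid = list(snapshot)       # old mappings become visible again


if __name__ == "__main__":
    ram, cam = RamRenameTable(4), CamRenameTable(4, 8)
    ram_snap, cam_snap = ram.checkpoint(), cam.checkpoint()
    ram.rename(2, 6)
    cam.rename(2, 6)
    # Simulate a branch misprediction: roll both tables back.
    ram.restore(ram_snap)
    cam.restore(cam_snap)
    assert ram.lookup(2) == 2 and cam.lookup(2) == 2
```

In this model the RAM lookup is a single indexed read but its checkpoint copies a full mapping per logical register, whereas the CAM lookup searches every physical-register entry but its checkpoint is one bit per entry. That mirrors the tension the abstract describes between access time and recovery cost.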


Original document

The different versions of the original document can be found in:

http://yadda.icm.edu.pl/yadda/element/bwmeta1.element.ieee-000005413160
https://dblp.uni-trier.de/db/conf/iccd/iccd2009.html#PetitUSL09
https://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5413160
https://academic.microsoft.com/#/detail/2145693068
http://dx.doi.org/10.1109/iccd.2009.5413160

Document information

Published on 01/01/2010

Volume 2010, 2010
DOI: 10.1109/iccd.2009.5413160
Licence: CC BY-NC-SA license
