Hybrid-Comp: A Criticality-Aware Compressed Last-Level Cache

Amin Jadidi1, Mohammad Arjomand2, Mahmut Kandemir1, Chita Das1
1Pennsylvania State University, 2Georgia Institute of Technology


Abstract

Cache compression is a promising technique for increasing on-chip cache capacity and reducing on-chip and off-chip bandwidth usage. While prior cache compression techniques consider the trade-off between compression ratio and decompression latency, they are oblivious to the underlying architecture, which imposes different levels of criticality on cache blocks. In multicore processors, the large last-level cache is logically shared but physically distributed among cores, which are interconnected through a packet-based Network-on-Chip (NoC) communication fabric. In this work, we demonstrate that cache blocks within such a non-uniform cache architecture exhibit different tolerance to access latency. Owing to this behavior, we propose a criticality-aware cache architecture that exploits the latency sensitivity of different cache blocks as a third design dimension, to be considered along with the traditional design parameters of a compressed cache architecture, i.e., compression ratio and decompression latency. Our proposed architecture favors lower latency over higher capacity based on the criticality of the cache line. In our experimental studies on a 16-core processor with a 4MB last-level cache, the proposed criticality-aware architecture achieves system performance comparable to that of an 8MB last-level cache.