Cache compression is a promising technique to increase on-chip cache capacity and to decrease on-chip and off-chip bandwidth usage. While prior cache compression techniques always trade off compression ratio against decompression latency, they are oblivious to the underlying architecture, which imposes different levels of criticality on cache blocks. In multicore processors, the large last-level cache is logically shared but physically distributed among the cores, which are interconnected through a packet-based Network-on-Chip (NoC) communication fabric. In this work, we demonstrate that cache blocks within such a non-uniform cache architecture exhibit different tolerance to access latency. Owing to this behavior, we propose a criticality-aware cache architecture that exploits the latency sensitivity of different cache blocks as a third design dimension, to be considered along with the traditional design parameters of a compressed cache architecture, i.e., compression ratio and decompression latency. Our proposed architecture favors lower latency over higher capacity based on the criticality of the cache line. In our experimental studies on a 16-core processor with a 4MB last-level cache, the proposed criticality-aware architecture improves system performance to a level comparable to that of an 8MB last-level cache.