Preliminary Program

Crossing the Layers and Dotting the Details: Systematic Exploration of Near-Memory Computing

Riselda Kodra¹, Rafael Medina Morillas², Marina Zapater³, Giovanni Ansaloni¹, David Atienza⁴
¹EPFL, ²ETH Zurich, ³University of Applied Sciences Western Switzerland (HES-SO), ⁴École Polytechnique Fédérale de Lausanne (EPFL)

Abstract

Near-Memory Computing (NMC) addresses the von Neumann bottleneck between memory and processing by bringing computation capabilities close to where data is stored. This approach opens a vast design space, whose exploration requires both a detailed view of the architecture of the NMC processing units (PUs) and a broad perspective on the entire system integrating them. To tackle this challenge, this paper presents a novel cross-layer methodology that relies on synthesizable hardware models of the NMC PUs, encapsulated as components for event-based full-system simulation. Our approach enables accurate assessment of the resource requirements and efficiency of near-memory units while exploring them in context, i.e., when integrated into a system executing complex workloads. Hence, the resulting framework facilitates the systematic design exploration of NMC solutions. To this end, it also includes a complete software stack that supports the implementation and benchmarking of NMC-accelerated applications. We illustrate its effectiveness by providing area, energy, and performance metrics for executing Machine Learning (ML) inference benchmarks across the host CPU and the PUs using different DRAM standards. Our cross-layer approach highlights the potential of NMC acceleration, demonstrating system-wide speedups of 17.4x on average.