Preliminary Program

Cache Register Sharing Structure for Channel-level Near-memory Processing in NAND Flash Memory

HyunWoo Kim¹, Seungwon Baek¹, Minyoung Jung², Jaehong Song¹, Hyodong Kim¹, Junhyeon Kim¹, Seongju Kim¹, Taigon Song¹, Jongbeom Kim¹, Hyundong Lee³, Yunjeong Go³
¹Kyungpook National University (KNU), ²Kyungpook National University (KNU), ³Kyungpook National University

Abstract

The vast amount of data used for artificial intelligence causes a bottleneck between the processor and memory. To tackle this issue, processing-in-memory (PIM), a technology that embeds a processing unit in the memory, has been proposed. However, SRAM/DRAM-based PIM suffers from a lack of capacity. Thus, we propose a NAND flash PIM scheme that shares the cache register. Our scheme significantly reduces the read latency and runtime by 22.8% and 43.7%, respectively, compared to the conventional memory system. The power-performance-area (PPA) was reduced by 17.2% by lowering the number of cycles. Our NAND PIM specializes in large-scale computation tasks.