Mixed-Precision Booth Factored Systolic Array Design for Accelerating Neural Networks

Sneha Dandekar1, Bhavana S2, Madhav Rao3
1International Institute of Information Technology, Bangalore, 2International Institute of Information Technology, Bangalore, 3International Institute of Information Technology, Bangalore


Abstract

Hardware accelerators for convolutional neural networks (CNNs) are expected to be power efficient and to occupy a minimal die footprint. A systolic array (SA) architecture is a group of processing elements (PEs) that are reused in the datapath flow to meet these demands. In this work, we present an SA architecture that incorporates mixed-precision radix-4 Booth-encoded partial products in each PE of the SA. The proposed architecture supports arithmetic approximations across the SA, allowing designers to explore trade-offs between hardware characteristics and computational accuracy. A meta-heuristic approach based on the NSGA-II algorithm explores the large design space of per-PE mixed-precision Booth encoding to evolve Pareto-optimal SA configurations that balance hardware efficiency with neural network model accuracy. The SA designs were individually evolved and evaluated for different CNNs, such as ResNet-18, VGG11, and LeNet-5, which were trained on various datasets. Post-synthesis results obtained with the Cadence Genus tool and 45 nm GPDK technology libraries show that the proposed approximate Booth-encoding-factored SA designs achieve up to a 12% improvement in the product of area, power, and delay (PAPD) over state-of-the-art (SOTA) designs while preserving model accuracy. These results highlight the effectiveness of fine-grained, architecture-level approximation in achieving efficient CNN acceleration without compromising output quality. All design files are made freely available for adoption and further use by the research and design community.
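
For readers unfamiliar with radix-4 Booth encoding, the following Python sketch illustrates how a signed multiplier is recoded into digits from {-2, -1, 0, 1, 2} and how the resulting partial products are summed. It is only an illustration under stated assumptions: the function names, the 8-bit default width, and the `skipped_low_digits` knob are hypothetical and are not taken from the proposed design. Dropping low-order partial products is used here merely as one simple stand-in for an arithmetic approximation; the approximation scheme actually used in the PEs may differ.

```python
from typing import List


def booth_radix4_digits(multiplier: int, bits: int = 8) -> List[int]:
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits.

    Returns bits/2 digits, least significant first, each in {-2, -1, 0, 1, 2}.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")

    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    padded = pattern << 1                     # implicit 0 below the LSB

    digits = []
    for i in range(bits // 2):
        # Overlapping triplet: bits y[2i+1], y[2i], y[2i-1]
        triplet = (padded >> (2 * i)) & 0b111
        hi = (triplet >> 2) & 1
        mid = (triplet >> 1) & 1
        lo = triplet & 1
        digits.append(-2 * hi + mid + lo)
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8,
                   skipped_low_digits: int = 0) -> int:
    """Sum the Booth partial products digit * multiplicand * 4**i.

    `skipped_low_digits` is a hypothetical approximation knob (an assumption
    of this sketch): the partial products of that many least significant
    Booth digits are dropped. With the default of 0 the result is exact.
    """
    total = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        if i < skipped_low_digits:
            continue  # approximate: ignore this low-order partial product
        total += (digit * multiplicand) << (2 * i)
    return total


if __name__ == "__main__":
    print(booth_radix4_digits(7))                       # Booth digits of 7
    print(booth_multiply(13, 7))                        # exact product
    print(booth_multiply(13, 7, skipped_low_digits=1))  # approximate product
```

In a mixed-precision setting, a knob of this kind could be chosen independently for each PE, which is what makes the design space large enough to motivate a search procedure such as NSGA-II.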