Hardware-Efficient and Precision-Aware FP16 Approximate Multipliers: A Probabilistic Approach

Bindu G Gowda and Madhav Rao
International Institute of Information Technology, Bangalore


Abstract

Floating-point (FP) multiplication is a crucial operation in many modern computing applications, but its high computational cost and energy consumption pose challenges for resource-constrained systems. The half-precision floating-point (FP16) format is a preferred choice in applications such as deep learning, signal processing, and real-time embedded systems, where energy efficiency and performance optimization are critical. FP16 is also widely supported in modern hardware, including specialized tensor processing units, general-purpose GPUs, and AI accelerators, making it a popular choice for high-performance and energy-efficient computing. This paper presents approximate FP16 multiplier designs that introduce approximation in the mantissa multiplication to achieve significant reductions in hardware complexity, power consumption, and latency compared to conventional IEEE 754-compliant FP16 multipliers, while maintaining acceptable accuracy for error-tolerant applications. Six approximate FP16 multipliers are proposed, categorized by the level of approximation introduced in the mantissa multiplication and realized through statistically analyzed and designed approximate compressors. These designs achieve area-power-delay-product (APDP) improvements ranging from 17.20% to 55.94% over the exact multiplier. To evaluate their performance and efficiency, all six designs were employed in applications including image restoration using deconvolution, convolutional neural networks (CNNs), and a mu-law encoding scheme, where they performed well, making them excellent candidates for hardware-efficient signal processing and computer vision applications.
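
For illustration only, the Python sketch below models the general structure such an approximate FP16 multiply can take: the sign and exponent paths are computed exactly, while the 11x11-bit significand product is approximated. The approximation shown here is a simple truncation of low-order significand bits (the trunc_bits parameter is illustrative), standing in for the statistically designed approximate compressors of the proposed designs; zeros, subnormals, infinities, and NaNs are not modeled.

import struct

def fp16_fields(x: float):
    # Pack to IEEE 754 half precision and extract sign, exponent, mantissa bits.
    bits = int.from_bytes(struct.pack('>e', x), 'big')
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

def assemble_fp16(sign: int, exp: int, man: int) -> float:
    bits = (sign << 15) | ((exp & 0x1F) << 10) | (man & 0x3FF)
    return struct.unpack('>e', bits.to_bytes(2, 'big'))[0]

def approx_fp16_mul(a: float, b: float, trunc_bits: int = 6) -> float:
    """Behavioral model of an approximate FP16 multiply (normals only).

    The significand product is approximated by zeroing the trunc_bits
    least-significant bits of each operand significand before multiplying,
    a crude stand-in for an approximate-compressor partial-product tree.
    """
    sa, ea, ma = fp16_fields(a)
    sb, eb, mb = fp16_fields(b)
    sign = sa ^ sb
    # Restore the hidden leading one to obtain 11-bit significands.
    siga = (1 << 10) | ma
    sigb = (1 << 10) | mb
    mask = ~((1 << trunc_bits) - 1)
    prod = (siga & mask) * (sigb & mask)   # approximate 22-bit product
    exp = ea + eb - 15                     # remove one exponent bias
    # Normalize: the product of two [1, 2) significands lies in [1, 4).
    if prod & (1 << 21):
        man = (prod >> 11) & 0x3FF
        exp += 1
    else:
        man = (prod >> 10) & 0x3FF
    return assemble_fp16(sign, exp, man)

if __name__ == "__main__":
    x, y = 1.5, 2.75
    print("exact  :", x * y)              # 4.125
    print("approx :", approx_fp16_mul(x, y))

This sketch captures only the datapath structure (sign XOR, exponent addition with bias removal, significand multiplication, and normalization); the error characteristics of the actual compressor-based designs are determined by the probabilistic analysis described in the paper, not by the truncation used here.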