OK-BMM: A Power-Performance-Efficient Overlap-Free Karatsuba Based Barrett Modular Multiplier for Secure Embedded Systems

Bhavana S1, Keerthana B2, Madhav Rao3
1International Institute of Information Technology Bangalore, 2International Institute of Information Technology Bangalore, 3International Institute of Information Technology Bangalore


Abstract

Security-critical applications such as Fully Homomorphic Encryption (FHE), Rivest–Shamir–Adleman (RSA), and Elliptic Curve Cryptography (ECC) require high-throughput, energy-efficient modular operations. The Barrett Modular Multiplier (BMM) is an efficient algorithm for computing modular products without a division operation. This paper presents an enhanced hardware implementation of the BMM that replaces traditional Karatsuba multiplication with a newer Overlap-Free Karatsuba scheme. The existing method, which decomposes the multiplicative expressions using specialized evaluation-interpolation (E, I) matrix pairs, optimizes resource utilization and reduces arithmetic complexity. Our approach refines this strategy further by eliminating overlap in the recursive multiplication branches, which reduces redundancy in partial-product accumulation and simplifies the control logic. Implemented on a Virtex-7 FPGA, the proposed design improves the critical-path delay by 33.936%, 28.277%, 26.471%, and 25.268% and reduces power by 50%, 52.632%, 74.375%, and 71.772% for the N = 32-, 64-, 128-, and 256-bit design variants, respectively, compared with other state-of-the-art (SOTA) designs. Similar improvements are observed in ASIC synthesis of the design variants, which were characterized using the OpenROAD tool with the Nangate45 standard cell library. By combining a comparable design footprint with improved timing and power profiles, the proposed BMM architecture offers a compelling alternative to conventional Karatsuba-based modular multipliers. This advancement contributes to the broader goal of designing arithmetic primitives optimized for secure, resource-constrained environments. The designs are made freely available for adoption and further use by the research and design community.
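For readers unfamiliar with the baseline, the following is a minimal software sketch of Barrett modular multiplication built on a classical (high/low split) Karatsuba product. It is not the Overlap-Free Karatsuba scheme or the hardware architecture proposed in this paper; it only illustrates how the reduction avoids division at run time by using a precomputed constant. The function names `karatsuba`, `barrett_setup`, and `barrett_modmul`, and the 16-bit base-case threshold, are assumptions chosen for illustration.

```python
def karatsuba(x: int, y: int) -> int:
    """Classical Karatsuba product of two non-negative integers."""
    bits = max(x.bit_length(), y.bit_length())
    if bits <= 16:
        return x * y  # base case: operands small enough for a direct multiply
    half = bits // 2
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half
    lo = karatsuba(x_lo, y_lo)
    hi = karatsuba(x_hi, y_hi)
    # Three recursive products instead of four: the middle term reuses lo and hi.
    mid = karatsuba(x_lo + x_hi, y_lo + y_hi) - lo - hi
    return (hi << (2 * half)) + (mid << half) + lo


def barrett_setup(modulus: int) -> tuple[int, int]:
    """Return (n, mu) with n = bit length of the modulus, mu = floor(2^(2n) / modulus)."""
    n = modulus.bit_length()
    mu = (1 << (2 * n)) // modulus  # the only division, done once per modulus
    return n, mu


def barrett_modmul(a: int, b: int, modulus: int, n: int, mu: int) -> int:
    """Compute (a * b) mod modulus for 0 <= a, b < modulus using shifts and multiplies."""
    product = karatsuba(a, b)
    # Quotient estimate: never larger than the true quotient, and at most 2 below it.
    q = karatsuba(product >> (n - 1), mu) >> (n + 1)
    r = product - karatsuba(q, modulus)
    while r >= modulus:  # a small number of corrective subtractions
        r -= modulus
    return r
```

In this sketch the three multiplications inside `barrett_modmul` are where a hardware design would instantiate its multiplier of choice, which is the component the paper replaces with the overlap-free variant.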