The growing threat that quantum computers pose to classical cryptographic systems necessitates the adoption of post-quantum cryptographic (PQC) algorithms. CRYSTALS-Dilithium, a lattice-based digital signature scheme standardized by NIST, relies heavily on efficient polynomial multiplication using the number-theoretic transform (NTT). This work presents an FPGA implementation of the Dilithium NTT that optimizes both the modular multiplication unit and the memory access scheme. The modular multiplier combines precomputed LUTs with an accelerated K-RED algorithm, achieving high performance at low resource cost (111 LUTs, 137 FFs, and 2 DSPs at 613 MHz). The proposed NTT architecture employs a dual ping-pong memory access scheme that stores intermediate data in LUTs, eliminating the need for BRAMs. Implemented on the Xilinx Zynq UltraScale+ ZCU104 and Artix-7 AC701 evaluation platforms, the design achieves a 14% improvement in area-time product over state-of-the-art solutions, along with 12% lower LUT usage and a 6% higher operating frequency. These results demonstrate a scalable and resource-efficient approach to deploying PQC primitives in constrained hardware environments.
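
For readers unfamiliar with K-RED, the following is a minimal software sketch of the generic reduction as it is commonly described in the literature, applied to the Dilithium modulus q = 8380417 = 1023 · 2^13 + 1. It is not the accelerated, LUT-assisted variant proposed in this work. The names `k_red`, `mod_mul`, and `K_INV2` are hypothetical, and the sketch assumes that the constant operand (for example, a twiddle factor) is pre-scaled by k^-2 mod q so that two reduction steps cancel the extra factor of k^2.

```python
Q = 8380417   # Dilithium modulus, q = 2^23 - 2^13 + 1
M = 13        # q = K * 2^M + 1
K = 1023      # K = 2^10 - 1

# Inverse of K^2 modulo Q via Fermat's little theorem (Q is prime).
# Assumption: constants such as twiddle factors are pre-scaled by this value.
K_INV2 = pow(K * K % Q, Q - 2, Q)


def k_red(c: int) -> int:
    """One K-RED step: returns a value congruent to K * c modulo Q.

    Writing c = c_hi * 2^M + c_lo gives
    K * c = c_hi * (Q - 1) + K * c_lo, which is congruent to K * c_lo - c_hi.
    """
    c_lo = c & ((1 << M) - 1)  # low M bits (non-negative even for negative c)
    c_hi = c >> M              # remaining high bits (arithmetic shift)
    return K * c_lo - c_hi


def mod_mul(a: int, b_scaled: int) -> int:
    """Multiply a by a constant that was pre-scaled by K^-2 mod Q.

    Two K-RED steps shrink the product and contribute a factor K^2,
    which the pre-scaling cancels. The final % stands in for the
    conditional add/subtract a hardware datapath would use.
    """
    return k_red(k_red(a * b_scaled)) % Q


if __name__ == "__main__":
    a, b = 1234567, 7654321
    print(mod_mul(a, b * K_INV2 % Q) == (a * b) % Q)
```

Each step needs only a shift, a mask, a multiplication by the small constant K (itself a shift and a subtraction, since K = 2^10 - 1), and one subtraction, which is why this style of reduction maps well onto FPGA logic.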