Semantic-Guided Test Generation using Fine-Tuned LLMs for Validation of Hardware Accelerators

Emma Andrews¹, Aruna Jayasena², Prabhat Mishra¹
¹University of Florida, ²University of Tennessee


Abstract

The increasing complexity and heterogeneity of programmable hardware accelerators, such as Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs), pose a significant challenge for automated test generation and functional validation. Traditional validation techniques often struggle to scale with architectural diversity and cannot effectively exploit the semantic relationships between instructions and data. Validation using large language models (LLMs) is a promising avenue for generating assembly programs (test vectors) for processor verification since LLMs are trained on diverse general-purpose processor designs. Unfortunately, LLMs are unsuitable for validation of programmable hardware accelerators due to the lack of training data for such implementations. In this paper, we propose an automated framework that fine-tunes LLMs to generate semantically correct test cases directed toward improved design coverage while monitoring the functional correctness of the outputs. The generated test cases are evaluated by a compiler for correctness before they are used for validation of hardware accelerators. We provide a mechanism for the LLM to observe the design coverage achieved on the implementation by previously generated test patterns. Extensive experimental evaluation demonstrates that our framework achieves a 33% improvement in design coverage compared to state-of-the-art test generation, with the added advantage of monitoring the functional correctness of the design. Our framework has identified several functional bugs in the open-source tiny-gpu implementation.
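
The abstract describes a generate-compile-cover feedback loop. The following Python sketch illustrates one plausible shape of such a loop; it is not the authors' implementation, and all names (generate, compiles, run_coverage, coverage_guided_loop) are hypothetical placeholders standing in for the fine-tuned LLM, the compiler check, and the coverage monitor.

```python
"""Minimal sketch of a coverage-guided test-generation loop (hypothetical API)."""

from typing import Callable, List, Set, Tuple


def coverage_guided_loop(
    generate: Callable[[str], str],           # fine-tuned LLM: prompt -> assembly test
    compiles: Callable[[str], bool],          # compiler check for semantic correctness
    run_coverage: Callable[[str], Set[str]],  # simulate a test, return covered points
    iterations: int = 50,
) -> Tuple[List[str], Set[str]]:
    """Generate tests, keep only compilable ones, and feed observed coverage
    back into the next prompt so the model targets uncovered design points."""
    covered: Set[str] = set()
    accepted: List[str] = []
    prompt = "Generate an assembly test program for the accelerator."
    for _ in range(iterations):
        test = generate(prompt)
        if not compiles(test):  # reject semantically invalid programs up front
            continue
        new_points = run_coverage(test) - covered
        if new_points:  # keep only tests that improve design coverage
            accepted.append(test)
            covered |= new_points
        # Report current coverage back to the model: this is the feedback
        # mechanism the abstract refers to (prompt format is an assumption).
        prompt = (
            "Generate an assembly test exercising design points not yet "
            f"covered. Points covered so far: {sorted(covered)}"
        )
    return accepted, covered
```

In this sketch the accept/reject decision is driven purely by incremental coverage; a real framework would also log compiler and simulation results to monitor functional correctness, as the paper describes.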