Due to the growing complexity of Integrated Circuits (ICs), automating Hardware Description Language (HDL) code generation is becoming increasingly important. Although large language models (LLMs) have become highly proficient at generating computer programs, they have had limited success in producing efficient VHDL code, largely because no suitable VHDL training set exists for these models. In this paper, we present a VHDL dataset built from 356 GitHub repositories comprising 39,000 VHDL files. We systematically preprocess each file to extract its key structural components (libraries, packages, entities, architectures, components, and process blocks) as well as inter-file dependencies derived from unit-under-test (UUT) references, entity declarations, and component instantiations, using VHDL-specific heuristics and regular expressions. This yields rich contextual information essential for fine-tuning and evaluating LLMs on VHDL code generation. Furthermore, to enable structured evaluation with VHDLBench, we introduce a module-masking procedure that selectively removes a key module (an entity, architecture, component, or specific code segment), creating paired samples: a masked code segment and its corresponding removed snippet. These pairs allow users to assess two key capabilities in LLMs: Code Structure Learning (CSL), which tests a model's ability to generate coherent in-file structures, and Masked Module Completion (MMC), which evaluates how well the model infers missing modules and captures inter-file dependencies.
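As a rough illustration of the extraction and masking steps described above, the following Python sketch uses simplified regular expressions to pull entity and architecture declarations from a VHDL file and to produce a (masked file, removed snippet) pair. The patterns, function names, and mask token here are our assumptions for illustration, not the actual preprocessing code.

```python
import re

# Simplified, assumed patterns for VHDL entity and architecture blocks;
# real VHDL parsing needs more robust heuristics (comments, strings, etc.).
ENTITY_RE = re.compile(
    r"entity\s+(\w+)\s+is.*?end(?:\s+entity)?(?:\s+\1)?\s*;",
    re.IGNORECASE | re.DOTALL,
)
ARCH_RE = re.compile(
    r"architecture\s+(\w+)\s+of\s+(\w+)\s+is.*?end(?:\s+architecture)?(?:\s+\1)?\s*;",
    re.IGNORECASE | re.DOTALL,
)

def extract_modules(vhdl_source: str) -> dict:
    """Return entity names and (architecture, entity) pairs found in a file."""
    entities = [m.group(1) for m in ENTITY_RE.finditer(vhdl_source)]
    architectures = [(m.group(1), m.group(2)) for m in ARCH_RE.finditer(vhdl_source)]
    return {"entities": entities, "architectures": architectures}

def mask_entity(vhdl_source: str, name: str, token: str = "-- <MASK>") -> tuple:
    """Replace one entity declaration with a mask token.

    Returns (masked_source, removed_snippet), i.e. one paired sample.
    """
    for m in ENTITY_RE.finditer(vhdl_source):
        if m.group(1).lower() == name.lower():
            snippet = m.group(0)
            masked = vhdl_source[:m.start()] + token + vhdl_source[m.end():]
            return masked, snippet
    raise ValueError(f"entity {name!r} not found")

# Minimal example file exercising both extraction and masking.
sample = """
library ieee;
use ieee.std_logic_1164.all;

entity counter is
  port (clk : in std_logic; q : out std_logic);
end entity counter;

architecture rtl of counter is
begin
  q <= clk;
end architecture rtl;
"""

info = extract_modules(sample)
masked, snippet = mask_entity(sample, "counter")
```

In this sketch, `masked` would serve as the model input and `snippet` as the reference completion for a Masked Module Completion evaluation.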
By providing a comprehensive VHDL dataset and structured evaluation metrics, this work lays the groundwork for automating VHDL code generation. The rich contextual information derived from our preprocessing approach offers a valuable resource for fine-tuning LLMs for hardware design applications. We believe this work will contribute significantly to advancing the efficiency and accuracy of VHDL code generation, paving the way for more streamlined and scalable Integrated Circuit development.