Agentic AI for Chip Design Verification: Failure Modes, Metrics, and Coverage Closure

Noah Marosok1, Marcus Halm2, Kevin Immanuel Gubbi2, Mohammadnavid Tarighat2, Neusha Javidnia3, Soheil Zibakhsh-Shabgahi1, Ke Huang4, Setareh Rafatirad5, Hossein Sayadi6, Farinaz Koushanfar7, Houman Homayoun5
1University of California, San Diego, 2University of California, Davis, 3University of California, San Diego, 4San Diego State University, 5University of California, Davis, 6California State University, Long Beach, 7University of California, San Diego


Abstract

As computing enters the post-Moore era, the demand for specialized ASIC architectures has transformed design verification from a logistical hurdle into a primary economic bottleneck that now consumes the majority of engineering resources. While LLMs demonstrate proficiency in syntactic code generation, they remain fundamentally insufficient for the "Last Mile" of verification coverage due to critical limitations in temporal reasoning, state-space management, and open-loop hallucination. This work discusses a necessary paradigm shift towards Agentic AI (autonomous systems capable of recursive reasoning and tool execution) to bridge the gap between high-level intent and silicon-accurate behavior. We systematically classify agentic failure modes, specifically examining how the "lost-in-the-middle" problem and atomic tool misuse impede the verification of complex hierarchical designs. Finally, we establish a framework of functional metrics that moves the field beyond textual similarity benchmarks and rigorously evaluates an agent's ability to resolve first-order verification constraints.