Preliminary Program

Agentic AI for Chip Design Verification: Failure Modes, Metrics, and Coverage Closure

Noah Marosok¹, Marcus Halm², Kevin Immanuel Gubbi², Mohammadnavid Tarighat², Neusha Javidnia³, Soheil Zibakhsh-Shabgahi¹, Ke Huang⁴, Setareh Rafatirad⁵, Hossein Sayadi⁶, Farinaz Koushanfar⁷, Houman Homayoun⁵
¹University of California, San Diego, ²University of California, Davis, ³University of California, San Diego, ⁴San Diego State University, ⁵University of California, Davis, ⁶California State University, Long Beach, ⁷University of California, San Diego

Abstract

As computing enters the post-Moore era, the demand for specialized ASIC architectures has transformed design verification from a logistical hurdle into a primary economic bottleneck, now consuming the majority of engineering resources. While LLMs demonstrate proficiency in syntactic code generation, they remain fundamentally insufficient for the "Last Mile" of verification coverage due to critical limitations in temporal reasoning and state-space management and a tendency toward open-loop hallucination. This work discusses a necessary paradigm shift toward Agentic AI (autonomous systems capable of recursive reasoning and tool execution) to bridge the gap between high-level intent and silicon-accurate behavior. We systematically classify agentic failure modes, specifically examining how the "lost-in-the-middle" problem and atomic tool misuse impede the verification of complex hierarchical designs. Finally, we establish a framework of functional metrics that moves the field beyond textual-similarity benchmarks and rigorously evaluates an agent's ability to resolve first-order verification constraints.