Most defect classification methods rely on deep learning, which demands substantial human effort for image annotation; acquiring large, high-quality labeled datasets is often impractical due to time and cost constraints. In this paper, we propose an approach that leverages a pretrained vision-language model (VLM) with automatic prompt engineering to reduce this dependence on labeled data. By optimizing prompts, our approach enables accurate classification even of previously unseen pad defects. Experimental results demonstrate that our approach achieves higher accuracy than CNN-based models and existing VLM-based methods on a semiconductor company's dataset.
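The core mechanism the abstract refers to — scoring an image against class-descriptive text prompts with a pretrained VLM, where several prompt phrasings per class are ensembled into a class prototype — can be sketched as follows. This is a minimal illustration, not the paper's method: the encoders here are deterministic pseudo-embeddings standing in for a real VLM's text and image encoders (e.g. CLIP's), and the defect class names and prompt variants are hypothetical.

```python
import numpy as np
import zlib

DIM = 128  # embedding width; real VLMs such as CLIP use 512 or more

def _unit(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

def embed_text(prompt: str) -> np.ndarray:
    # Stand-in for a VLM text encoder: a deterministic pseudo-embedding
    # seeded by a stable hash of the prompt string.
    rng = np.random.default_rng(zlib.crc32(prompt.encode()))
    return _unit(rng.normal(size=DIM))

def class_prototype(prompts: list[str]) -> np.ndarray:
    # Prompt ensembling: average the embeddings of several phrasings
    # of the same class, then renormalize.
    return _unit(np.mean([embed_text(p) for p in prompts], axis=0))

def classify(image_emb: np.ndarray, prototypes: dict[str, np.ndarray]) -> str:
    # Zero-shot prediction: the class whose prototype has the highest
    # cosine similarity to the image embedding wins.
    scores = {name: float(image_emb @ proto) for name, proto in prototypes.items()}
    return max(scores, key=scores.get)

# Hypothetical pad-defect classes, each with a few prompt variants.
PROMPTS = {
    "scratch": ["a pad with a scratch", "scratched bonding pad surface"],
    "discoloration": ["a discolored pad", "bonding pad with a stained surface"],
    "normal": ["a clean, defect-free pad", "a normal bonding pad"],
}
prototypes = {c: class_prototype(ps) for c, ps in PROMPTS.items()}

# A real image embedding would come from the VLM's image encoder; here we
# simulate one lying close to the "scratch" prototype.
image_emb = _unit(prototypes["scratch"]
                  + 0.05 * np.random.default_rng(1).normal(size=DIM))
print(classify(image_emb, prototypes))
```

Automatic prompt engineering, in this framing, amounts to searching over the prompt strings so that the resulting prototypes separate the defect classes well — which is why it can help even for defect types absent from any training set.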