Clustering Techniques and Statistical Fault Injection for Selective Mitigation of SEUs in Flip-Flops

Adrian Evans (1), Michael Nicolaidis (2), Shi-Jie Wen (3), Thiago Assis (4)
(1) iRoC Technologies, (2) Institut National Polytechnique de Grenoble, (3) Cisco Systems, (4) Vanderbilt University


Abstract

In large SoCs, managing the effects of soft errors in flip-flops is essential; however, selective mitigation is necessary to minimize the area and power costs. Identifying the optimal set of flip-flops to protect typically requires compute-intensive fault-injection campaigns. We present new techniques that group similar flip-flops into clusters, so that the number of required fault injections can be significantly lower than the total number of flip-flops. In one industrial design with over 100,000 flip-flops, simulating only 2,100 fault injections was sufficient for the technique to identify a set of 4.1% of the flip-flops which, when protected, reduced the critical failure rate by a factor of 7.
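The general idea of cluster-based sampling can be illustrated with a minimal Python sketch. It is not the algorithm used in this work: the function names, the uniform random sampling within each cluster, the `inject_fault` callback (standing in for one fault-injection simulation), and the greedy selection rule are all assumptions made for illustration only.

```python
import random


def estimate_cluster_failure_rates(clusters, inject_fault, samples_per_cluster, seed=0):
    """Estimate a failure probability per cluster from a few sampled injections.

    clusters: dict mapping a cluster id to a list of flip-flop names.
    inject_fault: hypothetical callback; takes a flip-flop name and returns
        True if injecting a fault there caused a critical failure.
    """
    rng = random.Random(seed)
    rates = {}
    for cluster_id, flip_flops in clusters.items():
        if not flip_flops:
            continue  # nothing to sample in an empty cluster
        k = min(samples_per_cluster, len(flip_flops))
        sample = rng.sample(flip_flops, k)
        failures = sum(1 for ff in sample if inject_fault(ff))
        rates[cluster_id] = failures / k
    return rates


def select_clusters_to_protect(clusters, rates, budget):
    """Greedy choice (an assumption): protect the clusters with the highest
    estimated failure rate first, without exceeding a flip-flop budget."""
    chosen, used = [], 0
    for cluster_id in sorted(rates, key=lambda c: rates[c], reverse=True):
        size = len(clusters[cluster_id])
        if rates[cluster_id] > 0 and used + size <= budget:
            chosen.append(cluster_id)
            used += size
    return chosen


if __name__ == "__main__":
    # Toy design: 10 control flip-flops that always fail, 90 data ones that never do.
    toy = {
        "ctrl": [f"c{i}" for i in range(10)],
        "data": [f"d{i}" for i in range(90)],
    }
    toy_rates = estimate_cluster_failure_rates(toy, lambda ff: ff.startswith("c"), 5)
    print(toy_rates)
    print(select_clusters_to_protect(toy, toy_rates, budget=10))
```

In this toy setup, only 10 injections (5 per cluster) are simulated for 100 flip-flops. The saving depends entirely on how well the flip-flops inside each cluster actually resemble one another.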