# **REVIEW ARTICLE**



# ISSN: 2321-7758

# A SURVEY ON FAULT DIAGNOSIS APPROACHES

# AISWARYA A<sup>1\*</sup>, SHIJI A. S<sup>2</sup>, Dr. SREEJA MOLE S. S<sup>3</sup>

<sup>1</sup>PG Student, ECE Department, Narayanaguru College of Engineering, Manjalumoodu, KK District, Tamil Nadu, India

<sup>2</sup>Assistant Professor, ECE Department, Narayanaguru College of Engineering, Manjalumoodu, KK District, Tamil Nadu, India

<sup>3</sup>Head of the Department ECE, Narayanaguru College of Engineering, Manjalumoodu, KK District, Tamil Nadu, India

Article Received: 05/01/2015

Article Revised on: 12/01/2015

Article Accepted on: 13/01/2015




## ABSTRACT

Fault-free circuits are required in several critical application sectors. Even digital circuits built from highly reliable components do not operate forever without developing faults. Current VLSI manufacturing processes yield a large number of defective parts because of the large number of defect types. Fault diagnosis provides a method of identifying the cause of failure in a system under test. Several fault diagnosis techniques have been proposed and evaluated in practice. The fault diagnosis approaches are classified according to the diagnosis algorithm used. This paper presents a survey of fault diagnosis techniques, with the advantages and disadvantages of the basic methods. Based on the literature survey, this paper concludes that an effect-cause based diagnosis algorithm with multiple fault injection and multiple fault simulation in a particle swarm optimization environment improves detection rate and resolution. With this modified algorithm, multiple faults can be diagnosed in a reasonable time.

Keywords - Cause-Effect analysis, Effect-Cause analysis, Fault Diagnosis, Fault Model, Multiple Fault Simulation, VLSI.

## **©KY** Publications

## INTRODUCTION

The introduction of Integrated Circuits (ICs) has enabled the implementation of complex digital systems on a single chip. Advances in VLSI (Very Large Scale Integration) technology have resulted in steadily decreasing dimensions, called feature size, of transistors and interconnecting wires. With feature sizes steadily shrinking, designs are more susceptible to manufacturing variations and defects, which result in faulty chips. A fault is the representation of a defect, reflecting a physical condition that causes a circuit to fail to perform its intended function. A failure is a deviation in the performance of a circuit or system from its expected behaviour. A circuit error is a wrong output signal produced by a defective circuit and is the manifestation of a fault. Faults represent logic deviations, timing deviations, or parametric deviations of a circuit under test.

Physical faults that arise during system operation are best classified according to their stability in time: permanent, intermittent, or transient.

- Permanent faults: Always present after their occurrence in the system. They are caused by irreversible component damage resulting, for example, from improper manufacture or misuse.
- Intermittent faults: Present only during some intervals. Such faults are caused by unstable hardware or varying hardware conditions.
- Transient faults: Characterized by a one-time occurrence caused by a temporary change in some environmental factor such as power-line fluctuation, electromagnetic interference, or radiation. Transient faults occur more often than permanent ones and are harder to detect.

If the Circuit Under Test (CUT) deviates from its expected behaviour, the cause of failure must be identified. Physical faults cannot be treated directly, so they are represented by logical fault models. Fault diagnosis maps the observed failing response of the CUT to physical faults in a structural model of the CUT. It addresses two aspects: fault detection and fault location. Fault detection is the identification of some error in a digital system or circuit; this aspect is concerned with providing input test pattern sets with high fault coverage. Fault location is the process of localizing the faults to components, functional modules, or subsystems; this aspect is concerned with developing good algorithms for particular fault models such as stuck-at [1], bridging [2], transition, delay [3], etc. In the fault diagnosis process, all the tests are executed, and every response is stored in external memory for further analysis.

Given the increased design complexity and density of digital circuits, the single-fault assumption may not hold. In reality, multiple faults may occur on a failing chip. The number of suspected faults grows exponentially with the number of defects:

$$\text{Suspected faults} = (\text{No. of lines})^{(\text{No. of defects})}$$
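To get a feel for this growth, the relation can be evaluated directly. The following is a minimal sketch; the function name and the figure of 1000 lines are illustrative assumptions, not values from the survey.

```python
def suspected_fault_count(num_lines: int, num_defects: int) -> int:
    """Search-space size: (number of lines) ** (number of defects)."""
    return num_lines ** num_defects


# Hypothetical circuit with 1000 lines (assumed figure for illustration only).
for defects in (1, 2, 3):
    print(defects, suspected_fault_count(1000, defects))
```

Even for this modest line count, going from one to three simultaneous defects multiplies the number of suspected faults by a factor of a million, which is why exhaustive enumeration is not practical.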

To deal with this exponential search space and the variety of failures, special diagnostic algorithms have been developed for efficient diagnosis. These algorithms handle single or multiple fault occurrences.

## FAULT DIAGNOSIS APPROACHES

Diagnostic algorithms usually identify a set of potential sources of defects, referred to as fault candidates. Depending on the diagnostic algorithm used, diagnosis approaches can be broadly classified into two types. The first is based on cause-effect analysis. The second is based on the effect-cause principle. In a fault diagnosis environment, faults are the causes that may prevent the circuit from performing as required. Effects are the actual responses obtained from the CUT. Cause-effect analysis starts with the possible causes and obtains their effects. In effect-cause analysis, the effects are processed to identify their possible causes. Both kinds of algorithm can identify single and multiple fault locations.

The efficiency of a diagnostic algorithm can be assessed by several measures. The resolution of an algorithm is the ratio of the actual faults present in the circuit to the total number of reported fault candidates; a better diagnostic tool has a higher resolution. The diagnosability of an algorithm is the fraction of defects that can be correctly identified. The candidate faults identified by the algorithm are ordered by their probability. The First Hit Rank (FHR) is the position, in this ordered list, of the first fault that matches an injected fault. The next step of the diagnostic process is to examine the candidate sites under a microscope in the reported order.
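The three measures can be illustrated with a short sketch. The function names, the fault labels such as `g3/sa0`, and the reading of diagnosability as "fraction of injected faults that appear among the candidates" are assumptions made for illustration; the survey does not prescribe an implementation.

```python
def resolution(actual_faults, candidates):
    """Ratio of actual faults present to the number of reported candidates."""
    return len(actual_faults) / len(candidates)


def first_hit_rank(ordered_candidates, injected_faults):
    """1-based position of the first candidate matching an injected fault."""
    for rank, candidate in enumerate(ordered_candidates, start=1):
        if candidate in injected_faults:
            return rank
    return None  # no candidate matches any injected fault


def diagnosability(injected_faults, candidates):
    """Fraction of injected faults found among the candidates (assumed reading)."""
    return len(set(injected_faults) & set(candidates)) / len(injected_faults)


# Hypothetical experiment: two injected faults, four ranked candidates.
injected = {"g3/sa0", "g7/sa1"}
ranked = ["g2/sa1", "g3/sa0", "g5/sa0", "g7/sa1"]
print(resolution(injected, ranked))      # 0.5
print(first_hit_rank(ranked, injected))  # 2
print(diagnosability(injected, ranked))  # 1.0
```

A first hit rank close to 1 means that the first site inspected under the microscope is likely to be a real defect.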

## **OVERVIEW OF CAUSE-EFFECT METHOD**

The cause-effect approach performs most of the work before the diagnosis experiment. The first step of this method is to build the simulation-response database for the modeled faults. Fault simulation is used to determine the responses in the presence of faults. The database constructed in this step is called a fault dictionary. The next step is to compare this database with the observed failure responses of the CUT in order to determine the probable causes of the failure. The algorithm starts with a precise fault model, but in some cases the real defects in the circuit may differ from the fault model used. In such a situation, the observed response may not match any of the simulated faults. This approach can handle both combinational and sequential circuits in a similar manner.

Considerable work has been done to reduce the size of the fault dictionary [4], [5]. Most of these techniques reduce the size by managing the content and order of the information and the data representation format (encoding) in the dictionary. Other works reduce the size of the dictionary by compacting the test pattern set [6]. When the behavior of the actual defects is similar to that of the assumed fault model, dictionary-based methods provide very good resolution.
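The two steps of the cause-effect method, building the dictionary and then matching the observed response against it, can be sketched as follows. The three-input circuit `y = (a AND b) OR c`, the line names, and the exhaustive test set are hypothetical choices made to keep the example small; the dictionary here stores full responses with no compaction.

```python
from itertools import product

LINES = ["a", "b", "c", "n1", "y"]        # hypothetical circuit lines
TESTS = list(product((0, 1), repeat=3))   # exhaustive (a, b, c) patterns


def simulate(test, fault=None):
    """Evaluate y = (a AND b) OR c; fault is (line, stuck_value) or None."""
    def v(line, value):
        return fault[1] if fault and fault[0] == line else value
    a, b, c = (v(name, x) for name, x in zip("abc", test))
    n1 = v("n1", a & b)
    return v("y", n1 | c)


def build_dictionary():
    """Map each response signature to the single stuck-at faults producing it."""
    dictionary = {}
    for line, stuck in product(LINES, (0, 1)):
        signature = tuple(simulate(t, (line, stuck)) for t in TESTS)
        dictionary.setdefault(signature, []).append((line, stuck))
    return dictionary


def diagnose(dictionary, observed):
    """Return the modeled faults whose simulated response equals the observed one."""
    return dictionary.get(tuple(observed), [])


fault_dictionary = build_dictionary()
observed = [simulate(t, ("a", 0)) for t in TESTS]   # pretend the CUT has a stuck-at-0
print(diagnose(fault_dictionary, observed))
# An observed response that no modeled fault produces yields no candidate:
print(diagnose(fault_dictionary, [1, 0, 0, 0, 0, 0, 0, 0]))
```

The first lookup returns three indistinguishable faults (`a`, `b`, and `n1` stuck-at-0), and the second returns an empty list, which corresponds to the case where the real defect differs from the fault model.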

## **OVERVIEW OF EFFECT-CAUSE METHOD**

Algorithms that use the effect-cause approach observe the actual responses (effects) and determine which fault (cause) might have produced the observed failure. As the name suggests, an effect-cause algorithm directly examines the response of the failing chip and then derives the fault candidates using path-tracing algorithms. This method does not build a fault simulation response database. Each primary output (PO) is traced backward so that the error-propagation paths can be determined for all possible fault candidates. Critical path tracing [7], [8] is one such backtracing algorithm, which determines the faults detected by a set of tests. It starts at the failing PO and works toward the primary inputs (PIs) by tracing each critical line through sensitive gate inputs. A gate input i is sensitive if complementing the value of i changes the value of the gate output. When it reaches a gate with only nonsensitive inputs, the algorithm stops. Effect-cause techniques are more likely to be memory efficient and can be easily integrated into larger designs. Effect-cause analysis can perform both model-dependent and model-independent diagnosis. The advantages and disadvantages of the cause-effect and effect-cause approaches are summarized in Table 1.

Table 1. Advantages and disadvantages of the cause-effect and effect-cause approaches

| Method       | Advantages | Disadvantages |
|--------------|------------|---------------|
| Cause-effect | 1. Handles both combinational and sequential circuits in the same way | 1. Not memory efficient<br>2. Not scalable with the design size<br>3. Huge database<br>4. Model dependent |
| Effect-cause | 1. Memory efficient<br>2. Scalable with the design size<br>3. Handles the multiple-fault model effectively<br>4. No huge database<br>5. Can be model dependent or model independent | 1. Does not handle combinational and sequential circuits in the same way |
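The backward trace through sensitive gate inputs used by critical path tracing can be sketched as follows. This is a simplified illustration that assumes a fanout-free circuit; circuits with fanout need additional stem analysis that is not shown here. The netlist and all names are hypothetical.

```python
GATES = {  # hypothetical fanout-free netlist: output line -> (gate type, inputs)
    "n1": ("AND", ["a", "b"]),
    "y": ("OR", ["n1", "c"]),
}
FUNCS = {"AND": all, "OR": any}


def evaluate(inputs):
    """Compute every line value; GATES is listed in topological order."""
    values = dict(inputs)
    for out, (kind, ins) in GATES.items():
        values[out] = int(FUNCS[kind](values[i] for i in ins))
    return values


def is_sensitive(gate, index, values):
    """An input is sensitive if complementing it changes the gate output."""
    kind, ins = GATES[gate]
    flipped = [values[i] for i in ins]
    flipped[index] ^= 1
    return int(FUNCS[kind](flipped)) != values[gate]


def critical_lines(output, values):
    """Trace backward from a primary output through sensitive inputs only."""
    critical, stack = [], [output]
    while stack:
        line = stack.pop()
        critical.append(line)
        if line in GATES:  # primary inputs have no driving gate
            _, ins = GATES[line]
            stack.extend(i for k, i in enumerate(ins) if is_sensitive(line, k, values))
    return critical


values = evaluate({"a": 1, "b": 1, "c": 0})
print(sorted(critical_lines("y", values)))
```

For the pattern `a=1, b=1, c=0` the trace reaches `n1`, `a`, and `b` but not `c`, because complementing `c` does not change `y`. With `c=1` both inputs of the OR gate are nonsensitive and the trace stops at `y`.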

## **DEDUCTION ALGORITHM**

Effect-cause analysis can be performed by deducing internal signal values in the CUT [9]. Any line for which both 0 and 1 values are deduced can be neither stuck-at-1 nor stuck-at-0, so it is identified as normal (fault-free). Faults can be located only on lines that could not be proved normal. The method alternates between implications and decisions. Internal values are computed by the deduction algorithm, which implements a line justification process whose primary goal is to justify all the values obtained at the POs, given the tests applied at the PIs. The values of a normal PI are the values set by the applied tests, and the values of every other normal line must be justified by the values of its predecessors. With this method, faults are located without knowing the expected output values.
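The final classification step, separating lines proved normal from lines that remain fault suspects, can be sketched as follows. The sketch assumes the deduced values are already available as a mapping from line name to the set of values deduced over all tests; the implication and decision process that produces them is not shown, and the line names are hypothetical.

```python
def classify_lines(deduced_values):
    """Split lines into those proved normal and those that remain suspects.

    deduced_values maps each line to the set of values deduced for it over
    all applied tests (an assumed input format for this illustration).
    """
    normal = sorted(line for line, vals in deduced_values.items() if {0, 1} <= vals)
    suspects = sorted(line for line in deduced_values if line not in normal)
    return normal, suspects


# Hypothetical deduction result: b was only ever seen at 1, c only at 0.
deduced = {"a": {0, 1}, "b": {1}, "n1": {0, 1}, "c": {0}}
normal, suspects = classify_lines(deduced)
print(normal)    # ['a', 'n1']
print(suspects)  # ['b', 'c']
```

A line that takes both values cannot be stuck at either one, so only `b` and `c` remain as possible fault sites in this example.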

## **BOOLEAN SATISFIABILITY BASED METHOD**

The Boolean satisfiability based method (Boolean SAT) provides an efficient and effective solution to design diagnosis on large circuits with multiple faults and multiple design errors [10]. The specification is given as a logic netlist, and the faulty behavior is given as a set of failing test-vector responses. To model the potential presence of a fault on line *l*, a multiplexer with select line *s* is inserted on this line. The original line *l* is attached to the multiplexer's 0-input, and the multiplexer's output is connected to the former fanout of line *l*. A new input line *w* is added and attached to the 1-input of the multiplexer.

The multiplexer, along with the rest of the circuit, is then translated into Conjunctive Normal Form (CNF). This means that the formula is expressed as the product of a set of clauses, where each clause is the sum of a set of literals. A literal is either a variable or its negation. Consider the circuit in **Figure-1(a)**. The potential presence of a fault on line *l*1 can be represented by a multiplexer as shown in **Figure-1(b)**. Observe that the functionality of the original (modified) circuit is selected when the value of the select line *s* is set to 0 (1).



Figure-1. Modeling candidate fault locations: (a) original circuit; (b) modeling a potentially faulty line.

The CNF formula representing the new circuit in Figure-1(b) is

$$C = (x_1 + l_1')(x_2 + l_1')(x_1' + x_2' + l_1)(s + z_1 + l_1')(s + z_1' + l_1)(x_3 + l_2)(z_1 + l_2)(x_3' + z_1' + l_2')(l_2' + y)(l_2 + x_4 + y')(x_4' + y)$$

SAT solvers [11] normally operate on Boolean formulas in CNF. Because the number of faults present in the circuit is not known initially, the algorithm starts by searching for single-fault solutions, then searches for double-fault solutions, and so on. Each diagnosis run generates a CNF formula and solves it with a SAT solver. SAT provides an efficient platform for sequential-logic debugging of large real-life industrial designs.
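The role of the select line can be illustrated by checking the CNF formula for the circuit of Figure-1(b) against a failing test response. The sketch below transcribes the clauses of that formula and uses brute-force enumeration in place of a real SAT solver, which is only workable because the formula has nine variables. The test vector and the observed output are hypothetical.

```python
from itertools import product

# Clauses of the CNF formula C; each literal is (variable, value that satisfies it).
CLAUSES = [
    [("x1", 1), ("l1", 0)],
    [("x2", 1), ("l1", 0)],
    [("x1", 0), ("x2", 0), ("l1", 1)],
    [("s", 1), ("z1", 1), ("l1", 0)],
    [("s", 1), ("z1", 0), ("l1", 1)],
    [("x3", 1), ("l2", 1)],
    [("z1", 1), ("l2", 1)],
    [("x3", 0), ("z1", 0), ("l2", 0)],
    [("l2", 0), ("y", 1)],
    [("l2", 1), ("x4", 1), ("y", 0)],
    [("x4", 0), ("y", 1)],
]
VARIABLES = sorted({var for clause in CLAUSES for var, _ in clause})


def satisfiable(fixed):
    """Brute-force check: can the free variables be set so every clause holds?"""
    free = [v for v in VARIABLES if v not in fixed]
    for values in product((0, 1), repeat=len(free)):
        assignment = dict(fixed, **dict(zip(free, values)))
        if all(any(assignment[v] == val for v, val in clause) for clause in CLAUSES):
            return True
    return False


# Hypothetical failing test: inputs applied and the (wrong) output observed.
failing = {"x1": 1, "x2": 1, "x3": 1, "x4": 0, "y": 1}
print(satisfiable(dict(failing, s=0)))  # fault-free model cannot explain it
print(satisfiable(dict(failing, s=1)))  # freeing line l1 explains the response
```

With `s = 0` the formula is unsatisfiable for this response, and with `s = 1` it is satisfiable, so line *l*1 is reported as a fault candidate. A full implementation would insert one multiplexer per candidate line and bound the number of select lines set to 1, increasing the bound from one run to the next.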

## FAULT SIMULATION

Fault simulation is a more challenging task than logic simulation because of the added dimension of complexity: the behavior of the circuit must be simulated for all the modeled faults. During single-fault simulation, the model of the fault-free circuit $C$ is transformed so that it models the circuit $C_F$ created by a single stuck-at fault $f_i$, and $C_F$ is simulated. Similarly, during multiple-fault simulation, the model of the fault-free circuit $C$ is transformed so that it models the circuit $C_F$ created by injecting all suspected faults.

All simulated faults that give results consistent with the "observed responses" are kept in the set of suspected faults. Faults that lead to an inconsistency between the result of "passing test simulation" and the "observed responses" are deduced to be absent from the CUT, and these faults are transferred from the set of suspected faults to the set of nonexistent faults. When one fault is simulated at a time, the amount of computation is approximately proportional to the circuit size, the number of test patterns, and the number of modeled faults. Because the number of modeled faults itself grows with the number of gates, the overall time complexity of fault simulation is $O(pn^2)$ for $p$ test patterns and $n$ logic gates, which becomes infeasible for large circuits. The works [12], [13] have proposed an incremental multiple-fault simulation strategy.
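The pruning of the suspected-fault set and the injection of several faults at once can be sketched as follows. The circuit `y = (a AND b) OR c`, the fault list, and the function names are hypothetical, and faults are simulated one at a time during pruning rather than incrementally as in [12], [13].

```python
from itertools import product

TESTS = list(product((0, 1), repeat=3))  # exhaustive (a, b, c) patterns


def simulate(test, faults=()):
    """y = (a AND b) OR c with every (line, stuck_value) in faults injected at once."""
    stuck = dict(faults)
    a, b, c = (stuck.get(name, x) for name, x in zip("abc", test))
    n1 = stuck.get("n1", a & b)
    return stuck.get("y", n1 | c)


def prune(suspected, passing_tests, observed):
    """Move faults that contradict a passing test to the nonexistent set."""
    kept, nonexistent = [], []
    for fault in suspected:
        consistent = all(simulate(t, [fault]) == observed[t] for t in passing_tests)
        (kept if consistent else nonexistent).append(fault)
    return kept, nonexistent


# Pretend the CUT contains a stuck-at-0 on line a (hypothetical defect).
observed = {t: simulate(t, [("a", 0)]) for t in TESTS}
passing = [t for t in TESTS if observed[t] == simulate(t)]
suspected = [("a", 0), ("c", 0), ("b", 1), ("n1", 0)]
print(prune(suspected, passing, observed))
# Multiple-fault injection: both faults are active in the same simulation run.
print(simulate((1, 1, 0), [("a", 0), ("c", 1)]))
```

In this example two of the four suspects contradict a passing test and are moved to the nonexistent set, while `a` stuck-at-0 and `n1` stuck-at-0 remain indistinguishable.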

## PROPOSED METHOD

The proposed method is an effect-cause based multiple fault diagnosis approach that uses multiple fault simulation and multiple fault injection. To explore the exponential search space of the multiple fault diagnosis problem, population-based searches such as particle swarm optimization (PSO) [14] can be used. Initially, a list of possible fault candidates is obtained by critical path tracing from each failing primary output and taking the union of the results. If a single perfect fault candidate exists, the method stops and reports the result. Otherwise, the faults are arranged in descending order of the number of test patterns they can explain. The initial particles of the PSO are chosen at random from the possible faulty sites, with higher priority given to the faults with higher ranks. Since the number of faults in each particle is variable, each particle is a set of faults with varying cardinality. The PSO output is a collection of fault sets, each of which explains the entire passing and failing pattern set. The main advantage is that multiple faults can be analyzed simultaneously.
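The steps that precede the swarm search, namely the perfect-candidate check, the ranking, and the rank-biased random initialization of particles, can be sketched as follows. The survey does not specify the weighting scheme, the particle size limit, or the PSO update rules, so the linear rank weights and the `max_faults` parameter are assumptions, and the swarm update itself is not shown.

```python
import random


def perfect_candidates(explained, failing):
    """Faults that explain every failing pattern on their own."""
    return [fault for fault, patterns in explained.items() if patterns == failing]


def rank_candidates(explained):
    """Order faults by the number of patterns they explain, highest first."""
    return sorted(explained, key=lambda fault: len(explained[fault]), reverse=True)


def init_particles(ranked, n_particles, max_faults, rng):
    """Draw fault sets of varying cardinality, favouring higher-ranked faults."""
    weights = [len(ranked) - i for i in range(len(ranked))]  # assumed linear bias
    particles = []
    for _ in range(n_particles):
        size = rng.randint(1, min(max_faults, len(ranked)))
        particle = set()
        while len(particle) < size:
            particle.add(rng.choices(ranked, weights=weights, k=1)[0])
        particles.append(frozenset(particle))
    return particles


# Hypothetical data: fault -> ids of the failing patterns it explains.
explained = {"f1": {1, 2}, "f2": {1}, "f3": {2, 3, 4}}
failing = {1, 2, 3, 4}
ranked = rank_candidates(explained)
print(perfect_candidates(explained, failing))  # [] -> no single perfect candidate
print(ranked)                                  # ['f3', 'f1', 'f2']
particles = init_particles(ranked, n_particles=5, max_faults=2, rng=random.Random(0))
print(particles)
```

Each particle would then be scored by multiple-fault simulation against the full passing and failing pattern set.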

## CONCLUSION

In this paper, different approaches to fault diagnosis are discussed. With feature sizes steadily shrinking, manufacturing defects and parameter variations often cause failures. It is essential that these failures be diagnosed correctly and quickly. Diagnosis is performed to improve the yield of first silicon, to ensure product quality during volume production, and to analyze the failures that caused customer returns. This paper also proposes a novel approach to multiple-fault diagnosis based on multiple fault simulation. The proposed work is suitable for the diagnosis of multiple stuck-at and transition faults. The algorithm is highly efficient, with a very good first hit rank, high diagnosability, and good resolution.

## REFERENCES

- [1]. H. Takahashi, K. O. Boateng, K. K. Saluja, and Y. Takamatsu, "On diagnosing multiple stuck-at faults using multiple and single fault simulation in combinational circuits," IEEE Trans. Computer Aided Design Integr. Circuits Syst., vol. 21, no. 3, pp. 362–368, Mar. 2002.
- [2]. D. B. Lavo, B. Chess, T. Larrabee, and F. J. Ferguson, "Diagnosing realistic bridging faults with single stuck-at information," IEEE Trans. Computer Aided Des. Integr. Circuits Syst., vol. 17, no. 3, pp. 255–268, Mar. 1998.
- [3]. V. J. Mehta, M. Marek-Sadowska, K.H. Tsai, and J. Rajski, "Timing-aware multiple-delay-fault diagnosis," IEEE Trans. Computer Aided Design Integrated Circuits Syst., vol. 28, no. 2, pp. 245–258, Feb. 2009.
- [4]. B. Chess and T. Larrabee, "Creating small fault dictionaries," IEEE Trans. Computer-Aided Design, vol. 18, no. 3, pp. 346–356, 1999.
- [5]. D. Lavo and T. Larrabee, "Making cause-effect effective: low-resolution fault dictionaries," in Proc. Int. Test Conf. (ITC), 2001, pp. 278–286.
- [6]. Y. Higami, K. K. Saluja, H. Takahashi, S. Kobayashi, and Y. Takamatsu, "Compaction of pass/fail-based diagnostic test vectors for combinational and sequential circuits," in Proc. ASPDAC, 2006, pp. 75–80.
- [7]. M. Abramovici, P. R. Menon, and D. T. Miller, "Critical path tracing: An alternative to fault simulation," in IEEE Design Test Comput. Mag., vol. 1, pp. 89–93, Feb. 1984.
- [8]. B. Bosio, P. Girard, S. Pravossoudovitch, and A. Virazel, "A comprehensive framework for logic diagnosis of arbitrary defects," IEEE Trans. Computers, vol. 59, no. 3, pp. 289–300, Mar. 2010.
- [9]. M. Abramovici and M. A. Breuer, "Multiple fault diagnosis in combinational circuits based on an effect-cause analysis," IEEE Trans. Computers, vol. 29, pp. 451–460, June 1980.
- [10]. A. Smith, A. Veneris, M. Ali, and A. Viglas, "Fault diagnosis and logic debugging using Boolean satisfiability," IEEE Trans. Computer Aided Design Integrated Circuits Syst., vol. 24, no. 10, pp. 1606–1621, Oct. 2005.
- [11]. F. Lu, L. C. Wang, K. T. Cheng, and R. C.-Y. Huang, "A circuit SAT solver with signal correlation guided learning," in Proc. ACM/IEEE Design, Automation and Test in Europe, Munich, Germany, 2003, pp. 892–897.
- [12]. Z. Wang, M. Marek-Sadowska, K.-H. Tsai, and J. Rajski, "Analysis and methodology for multiple-fault diagnosis," IEEE Trans. Computer Aided Design Integr. Circuits Syst., vol. 25, no. 3, pp. 558–575, Mar. 2006.
- [13]. J.B. Liu and A. Veneris, "Incremental Fault Diagnosis," IEEE Trans. Computer Aided Design of Integrated Circuits and Systems, vol. 24, no. 2, pp. 240-251, Feb. 2005.
- [14]. J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE Int. Conf. Neural Netw., vol. 4, Nov.–Dec. 1995, pp. 1942–1948.
- [15]. X. Fan, W. Moore, C. Hora, and G. Gronthoud, "Stuck-open fault diagnosis with stuck-at model," in Proc. 10th IEEE Eur. Symposium, May 2005, pp. 182–187.
- [16]. L. M. Huisman, "Diagnosing arbitrary defects in logic designs using single location at a time (SLAT)," IEEE Trans. Computer Aided Des. Integrated Circuits Syst., vol. 23, no. 1, pp. 91–101, Jan. 2004.
- [17]. X. Fan, W. Moore, C. Hora, and G. Gronthoud, "Stuck-Open Fault Diagnosis with Stuck-at Model," Proc. IEEE European Test Symposium, pp. 182-187, 2005.
- [18]. Z. Wang, M. Marek-Sadowska, K. H. Tsai, and J. Rajski, "Delay-fault diagnosis using timing information," IEEE Trans. Computer Aided Design Integrated Circuits Syst., vol. 24, no. 9, pp. 1315–1325, Sep. 2005.