Sizing CMOS Circuits for Increased Transient Error Tolerance
Yuvraj S. Dhillon, Abdulkadir U. Diril and Abhijit Chatterjee Georgia Institute of Technology, Atlanta, GA
{yuvrajsd,utku,chat}@ece.gatech.edu
Adit D. Singh Auburn University, Auburn, AL adsingh@eng.auburn.edu
Abstract
The continuous shrinking of microelectronic device sizes with every technology generation, along with the reduction in supply voltages, is causing a significant decrease in circuit noise margins. This leads to increased susceptibility of circuits to transient errors. In this paper, we propose a methodology to increase the robustness of combinational circuits to transient errors by sizing the gates of the circuit in such a way that the number of errors propagated to the primary output is minimized while the timing requirement is met. Using SPICE simulation, we validate that combinational circuits propagate fewer transient errors to the circuit output after application of our sizing algorithm.
1. Introduction
Transient errors (TEs) are becoming an increasing concern in electronic devices as the operating voltages and device sizes decrease. Reduction in circuit dimensions reduces the capacitance of the circuit nodes, thereby increasing the voltage magnitude of the glitches caused by a noise source such as α-particles or cosmic rays [1]. It also increases the susceptibility of circuits to hard-to-verify design/layout errors such as those that result from increased crosstalk and increased susceptibility to power supply and ground bounce. Low noise margins due to reduced operating voltages further aggravate the problem by allowing even small glitches to propagate. Thus, TE tolerance of a circuit must be considered an important design parameter for circuits meant to operate in environments prone to TE generation.
When a TE occurs in an internal node of a combinational circuit, it may either propagate to the flip-flop (FF) at the end of the combinational block, or it may die out before reaching the FF. If the error propagates to an FF, it may cause the FF to latch wrong data depending on the state of the clock signal to the FF. This kind of error will cause the circuit to compute a wrong value, which may have disastrous consequences in some applications. Many studies have been done both to estimate the behavior of circuits when a TE occurs and to increase the tolerance of circuits to TEs [2][3][4][5][6].
We present an algorithm for computing the optimal sizes of the gates in a CMOS circuit that improves the transient error tolerance of the circuit without changing the delay. We model the transient error tolerance of a circuit by using a cost function that accurately computes the area of the O/P glitches (using SPICE lookup tables) when errors are injected into the internal circuit nodes. The algorithm uses delay-constrained gradient search to minimize the cost function and thereby make the circuit more resilient to transient errors.
The paper is organized as follows. Section 2 describes the model we use for TEs. We will use the terms glitches and errors interchangeably. Section 3 describes the SPICE modeling we use to study glitch propagation in CMOS combinational circuits. Section 4 gives the gate sizing algorithm for increased TE tolerance. Results of the application of our algorithm to some example circuits are given in Section 5. Finally, Section 6 concludes.
Figure 1. Transient error modeled as a voltage glitch.
This work was supported by NSF Information Technology Research Contract, CCR 022-0259
2. Transient Error Model
We model the effect of a transient error striking a circuit node as a voltage glitch at the node, similar to [7]. We assume that the energy of the upset at node $i$, $E_{upset}$, translates into a voltage glitch at that node of amplitude $V_i$ given by:
$E_{upset} = C_i \cdot V_i^2$ (1)
However, a voltage amplitude above $V_{dd}$ will reduce to $V_{dd}$ immediately due to the forward-biased diodes (from drain to bulk). Therefore, the amplitude of the voltage glitch at node $i$ is taken to be:
$V_i = \min\left(V_{dd},\ V_{dd}\sqrt{C_{th}/C_i}\right)$ (2)
where $C_i$ is the capacitance of the node and $C_{th} = E_{upset}/V_{dd}^2$ is the minimum capacitance value for a node to be able to reduce the amplitude of an upset of energy $E_{upset}$. If a node's capacitance is below $C_{th}$, the upset observed at that node will have an amplitude of $V_{dd}$. Otherwise, the upset amplitude will be inversely proportional to the square root of the node capacitance.
Although the energy and duration of the upset depend on environmental factors, for the sake of simplicity we assume both to be constant in this work. Figure 1 shows the error model used in this paper.
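The amplitude model of Eqs. (1) and (2) can be sketched in a few lines of Python. The function names and the numerical values of the supply voltage and upset energy below are illustrative assumptions, not values taken from the paper.

```python
import math


def threshold_capacitance(e_upset: float, vdd: float) -> float:
    """C_th = E_upset / Vdd^2: smallest node capacitance that keeps the glitch below Vdd."""
    if e_upset < 0 or vdd <= 0:
        raise ValueError("energy must be non-negative and Vdd positive")
    return e_upset / vdd ** 2


def glitch_amplitude(e_upset: float, c_node: float, vdd: float) -> float:
    """Glitch amplitude per Eqs. (1)-(2): sqrt(E_upset / C_i), clamped to Vdd."""
    if e_upset < 0 or c_node <= 0 or vdd <= 0:
        raise ValueError("energy must be non-negative; capacitance and Vdd positive")
    return min(vdd, math.sqrt(e_upset / c_node))


# Assumed example values (hypothetical, for illustration only).
VDD = 1.8            # supply voltage in volts
E_UPSET = 16.2e-15   # upset energy in joules

c_th = threshold_capacitance(E_UPSET, VDD)
small_node = glitch_amplitude(E_UPSET, 0.5 * c_th, VDD)  # below C_th: clamped to Vdd
large_node = glitch_amplitude(E_UPSET, 4.0 * c_th, VDD)  # above C_th: Vdd * sqrt(1/4)
```

With these assumed numbers, a node at four times the threshold capacitance sees a glitch of half the supply voltage, which illustrates why upsizing a gate (and hence its node capacitance) attenuates the injected error.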
3. SPICE Modeling
We generated SPICE look-up tables to estimate the propagation characteristics of an upset at any given node to the primary output of the combinational block (i.e. the input of the FF). Level 49 SPICE models for a 0.18 µm TSMC process were used in our simulations. In the simulations, the independent variables for every gate are: (1) Input ramp, (2) Input amplitude, (3) Sizing of the transistors in the gate, (4) Input duration, and (5) Load capacitance of the gate. For various values of the variables, we collect the following information via SPICE simulations: (1) Output ramp, (2) Output amplitude, (3) Output duration, (4) Propagation delay, and (5) Input capacitance of the gate. Since we consider only NOT, NAND, and NOR gates, a positive pulse at the input (i.e. a 0-$V_{in}$-0 transition) may cause a negative pulse at the output (i.e. a $V_{dd}$-$V_{out}$-$V_{dd}$ transition), and a negative pulse at the input may cause a positive pulse at the output, depending on the other input values to the gate. We do simulations for a positive pulse at the input and for a negative pulse at the input. The results of the two simulations are averaged to get a model where both the input and output pulses are positive (i.e. a 0-$V_{in}$-0 transition at the input will create a 0-$V_{out}$-0 transition at the output). Figure 2 gives a summary of some of the data collected via SPICE simulations.
Figure 2. Test circuitry and voltage waveforms in SPICE simulations.
Propagation delay of the gate for different input ramps, output load capacitances, and gate sizings is measured by applying a full-swing pulse (with amplitude $V_{dd}$) of long enough duration to get a full swing at the output, and then measuring the time between the input crossing $V_{dd}/2$ and the output crossing $V_{dd}/2$. Delay is measured for a falling and a rising input and then averaged.
Input capacitance of a gate for different input ramps is measured by applying a full-swing input signal with the given ramp and measuring the integral of the current flowing into (out of) the input node. Then the following formula is used to estimate the input capacitance:
$C_{in} = \frac{\int_{t_1}^{t_2} I \, dt}{V_2 - V_1}$ (3)
where the integral is taken from $t_1$ to $t_2$, and $V_1$ and $V_2$ are the voltages at $t_1$ and $t_2$ respectively. Capacitance is also measured for a rising and a falling input and then averaged.
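As a rough illustration of Eq. (3), the sketch below integrates sampled input current with the trapezoidal rule and divides by the voltage change. The sampled waveform is a made-up ideal case (a 2 fF capacitor driven by a linear ramp), and the function names are hypothetical; in practice the samples would come from a SPICE transient run.

```python
def input_capacitance(times, currents, v1, v2):
    """Estimate C_in = (integral of I dt from t1 to t2) / (V2 - V1), trapezoidal rule."""
    if len(times) != len(currents) or len(times) < 2:
        raise ValueError("need at least two matching time/current samples")
    if v2 == v1:
        raise ValueError("V1 and V2 must differ")
    charge = 0.0
    for k in range(1, len(times)):
        charge += 0.5 * (currents[k] + currents[k - 1]) * (times[k] - times[k - 1])
    return charge / (v2 - v1)


def averaged_input_capacitance(rising, falling):
    """Average the rising-input and falling-input estimates, as described in the text."""
    return 0.5 * (input_capacitance(*rising) + input_capacitance(*falling))


# Assumed ideal waveform: 2 fF charged from 0 V to 1.8 V in 100 ps draws a
# constant current I = C * dV/dt = 36 uA; the falling edge draws -36 uA.
times = [0.0, 50e-12, 100e-12]
rising = (times, [3.6e-5, 3.6e-5, 3.6e-5], 0.0, 1.8)
falling = (times, [-3.6e-5, -3.6e-5, -3.6e-5], 1.8, 0.0)

c_in = averaged_input_capacitance(rising, falling)
```

For this idealized input the estimate recovers the 2 fF that was assumed, which is a useful sanity check before applying the same routine to simulated gate currents.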
Error propagation is done as follows: Given the I/P glitch ramp, I/P glitch amplitude, I/P glitch duration, output load, and size for a gate, the ramp, duration, and amplitude of the glitch at the gate output are looked up in the SPICE table. The values are linearly interpolated if needed. The values for the output ramp, amplitude, and duration are used as input for the gates following the current gate. This process is repeated until the primary output is reached. The error is considered to have died out if
(1) the amplitude at the output is smaller than $V_{dd}/2$, OR
(2) the duration at the output is smaller than the setup time of the FF.
Otherwise, the error is considered to have propagated to the output.
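The propagation procedure described above might be sketched as follows. This is a deliberately simplified illustration: the real look-up tables are indexed by five variables per gate, whereas the hypothetical `make_toy_gate` below collapses them to two one-dimensional tables (amplitude and duration) for a fixed size and load. All table entries, names, and the setup time are assumptions made up for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Glitch:
    ramp: float       # edge transition time (ps)
    amplitude: float  # peak voltage (V)
    duration: float   # pulse width (ps)


def interp1(xs, ys, x):
    """Piecewise-linear interpolation over sorted xs, clamped at the table ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x)
    x0, x1, y0, y1 = xs[k - 1], xs[k], ys[k - 1], ys[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def make_toy_gate(amp_in, amp_out, dur_in, dur_out):
    """Hypothetical stand-in for one gate's SPICE table (fixed size and load)."""
    def gate(g):
        return Glitch(g.ramp,
                      interp1(amp_in, amp_out, g.amplitude),
                      interp1(dur_in, dur_out, g.duration))
    return gate


def propagate(glitch, gates, vdd, t_setup):
    """Push a glitch through successive gates, then apply the two die-out tests."""
    for gate in gates:
        glitch = gate(glitch)
    died_out = glitch.amplitude < vdd / 2 or glitch.duration < t_setup
    return glitch, not died_out


VDD = 1.8       # assumed supply voltage (V)
T_SETUP = 50.0  # assumed FF setup time (ps)
toy_gate = make_toy_gate([0.0, 0.9, 1.8], [0.0, 0.6, 1.8],
                         [0.0, 100.0, 200.0], [0.0, 60.0, 200.0])
chain = [toy_gate, toy_gate]

_, strong_propagates = propagate(Glitch(20.0, 1.8, 150.0), chain, VDD, T_SETUP)
_, weak_propagates = propagate(Glitch(20.0, 0.9, 150.0), chain, VDD, T_SETUP)
```

In this toy setup a full-swing glitch survives both gates, while a half-swing glitch is attenuated below $V_{dd}/2$ and is counted as having died out.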
4. Sizing Algorithm
The optimization procedure for arbitrary transient
errors occurring at arbitrary times within the operation clock cycle of the circuit is very complex. To make the problem “tractable” we make the following simplifications:
• We assume that the transient errors always strike a node during its window of vulnerability, so that every error that manages to propagate to the O/P will be latched by the O/P FF. In reality, only a fraction of the total transient errors, equal to the length of the window of vulnerability divided by the clock period, has to be considered.
• We assume during optimization that the energy of each injected transient error is the same. This allows us to “normalize” error effects during the optimization procedure. After the optimization is done, we show that the redesigned circuit is resilient to other transient errors as well.
Consider a circuit of N gates and P paths from primary inputs to primary outputs. We form a binary matrix, $A$, of P rows and N columns as follows [8]:
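$A_{ji} = \begin{cases} 1 & \text{if gate } i \text{ lies on path } j \\ 0 & \text{otherwise} \end{cases}$
For example, the $A$ matrix corresponding to the circuit in Figure 3 (N=4, P=2) is:
$A_{2 \times 4} = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}$ (4)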
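A minimal sketch of how the path matrix could be built from a list of paths is given below. The two paths used (gates 1, 3, 4 and gates 2, 3, 4) are an assumption about the topology of Figure 3, chosen because they are what the rows of Eq. (4) suggest; the function name is hypothetical.

```python
def path_gate_matrix(paths, n_gates):
    """Build the P x N binary matrix A with A[j][i] = 1 if gate i+1 lies on path j."""
    a = [[0] * n_gates for _ in paths]
    for j, path in enumerate(paths):
        for gate in path:
            if not 1 <= gate <= n_gates:
                raise ValueError(f"gate index {gate} outside 1..{n_gates}")
            a[j][gate - 1] = 1  # gates are numbered from 1 in the text
    return a


# Assumed topology: gates 1 and 2 both feed gate 3, which feeds gate 4.
A = path_gate_matrix([[1, 3, 4], [2, 3, 4]], n_gates=4)
```

Each row of `A` marks the gates on one input-to-output path, so multiplying a row by the vector of gate delays gives that path's total delay.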
Given the initial sizes of the gates, $w_{init} = [w_1\ w_2\ \ldots\ w_N]^T$, and the load capacitance at the O/P of the circuit, $C_{out}$, we can compute the corresponding delay of each gate by visiting the gates in reverse topological sort order, as in procedure compute_gate_delays shown in Figure 4. Note that gates are assumed to be numbered in topological sort order from 1 to N. Since we do not know the I/P ramps to the gates, we initially assume a nominal value of I/P ramp for all gates and then iteratively converge to the actual values. In our experiments, we noticed that 6 iterations were enough to converge to stable ramp and delay values for all gates.
Figure 3. An example circuit with 4 gates and 2 paths (N=4, P=2).
Let $d_{init} = [d_1\ d_2\ \ldots\ d_N]^T$ be the delays corresponding to $w_{init}$ computed using compute_gate_delays. Similar to compute_gate_delays, we have another procedure, compute_gate_widths, that computes the widths of the gates given the delays for the gates. This is shown in Figure 5. In this case, we found that 10 iterations are needed to converge to stable ramp and size values for all gates. The initial delay of each path in the circuit is then given by:
$T_j = \sum_{i \in P_j} d_i$ for all $j$ (5)
We can represent the above equation in vector form as follows:
$T_{init} = A \cdot d_{init}$ (6)
where $T_{init} = [T_1\ T_2\ \ldots\ T_P]^T$ is the vector of initial path delays. The delay constraint for the circuit is obtained from the initial delays of the gates as follows:
$T_d = \max(T_{init})$ (7)
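Equations (5) to (7) amount to a matrix-vector product followed by a maximum. The sketch below shows this for the two-path example; the gate delay values are arbitrary illustrative numbers, not measurements from the paper, and the function names are hypothetical.

```python
def path_delays(a, d):
    """T = A . d: each path delay is the sum of the delays of the gates on that path."""
    if any(len(row) != len(d) for row in a):
        raise ValueError("every row of A must have one entry per gate")
    return [sum(a_ji * d_i for a_ji, d_i in zip(row, d)) for row in a]


def delay_constraint(a, d):
    """T_d = max(T): the delay of the slowest path sets the timing constraint."""
    return max(path_delays(a, d))


A = [[1, 0, 1, 1],
     [0, 1, 1, 1]]
d_init = [1.0, 2.0, 1.5, 0.5]   # assumed gate delays, arbitrary units

T_init = path_delays(A, d_init)
T_d = delay_constraint(A, d_init)
```

Here the second path is the critical one, so it determines the deadline that the resized circuit must continue to meet.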
To find the optimum sizing for the gates that minimizes the number of TEs propagated to the O/P latch, we minimize the cost_function shown in Figure 6. Given a delay assignment for the gates, $d$, this function first computes the corresponding gate sizes using the procedure compute_gate_widths. It then computes the cost of the delay assignment as the sum of the areas of all glitches that reach the O/P. We use the sum of the areas of the O/P glitches, and not the count of glitches propagating to the O/P, as the cost function because the latter is discontinuous and therefore not amenable to gradient search. We minimize cost_function by doing a gradient search on the delay vector, $d$. However, the delay vector is constrained by the path delay constraints ($A \cdot d = T_{init}$). So, in every iteration, we vary $d$ by adding $\Delta$ such that $A \cdot \Delta = 0$. This choice of $\Delta$ satisfies the constraints, as shown below:
$A \cdot d = A \cdot (d_{init} + \Delta) = A \cdot d_{init} + A \cdot \Delta = T_{init}$ (8)
In other words, $\Delta$ has to lie in the nullspace of $A$. The delay vector, $d_{new}$, for a new iteration is obtained from the current delay vector, $d_{curr}$, as follows:
$d_{new} = d_{curr} + k \cdot \nabla_A \mathrm{cost\_function}$ (9)
where $\nabla_A \mathrm{cost\_function}$ is the gradient of cost_function along the nullspace vectors of $A$, and $k$ is chosen such that the new cost, cost_function($d_{new}$, $C_{out}$), is minimum in the direction of the gradient vector. Figure 7 summarizes the algorithm.
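One iteration of the constrained update in Eqs. (8) and (9) can be illustrated as below. This is a sketch under several assumptions: the SPICE-table cost is replaced by a made-up quadratic `toy_cost`, the gradient is estimated by finite differences, the nullspace projection is done by removing row-space components via Gram-Schmidt, and $k$ is picked from a small grid of candidates rather than by an exact line search. None of these choices is claimed to be the authors' implementation.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def orthonormal_row_basis(a, tol=1e-12):
    """Gram-Schmidt on the rows of A; the result spans the row space of A."""
    basis = []
    for row in a:
        v = [float(x) for x in row]
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip rows that depend linearly on earlier ones
            basis.append([vi / norm for vi in v])
    return basis


def project_onto_nullspace(a, g):
    """Remove from g every component in the row space of A, so that A . p = 0."""
    p = [float(x) for x in g]
    for q in orthonormal_row_basis(a):
        c = dot(p, q)
        p = [pi - c * qi for pi, qi in zip(p, q)]
    return p


def numerical_gradient(cost, d, h=1e-6):
    """Central-difference estimate of the gradient of cost at d."""
    grad = []
    for i in range(len(d)):
        up, down = list(d), list(d)
        up[i] += h
        down[i] -= h
        grad.append((cost(up) - cost(down)) / (2 * h))
    return grad


def gradient_step(a, d_curr, cost, candidate_ks):
    """d_new = d_curr + k * (projected gradient), with k chosen to minimize cost."""
    direction = project_onto_nullspace(a, numerical_gradient(cost, d_curr))

    def moved(k):
        return [di + k * gi for di, gi in zip(d_curr, direction)]

    best_k = min(candidate_ks, key=lambda k: cost(moved(k)))
    return moved(best_k)


A = [[1, 0, 1, 1], [0, 1, 1, 1]]
d_curr = [1.0, 2.0, 1.5, 0.5]       # assumed current gate delays
TARGET = [2.0, 2.0, 1.0, 0.5]       # hypothetical minimizer of the toy cost


def toy_cost(d):
    """Made-up smooth cost standing in for the glitch-area cost_function."""
    return sum((di - ti) ** 2 for di, ti in zip(d, TARGET))


candidate_ks = [-0.05 * m for m in range(41)]   # k = 0 is included, so cost never rises
d_new = gradient_step(A, d_curr, toy_cost, candidate_ks)
```

Because the step direction lies in the nullspace of $A$, every path delay $A \cdot d_{new}$ equals the corresponding entry of $A \cdot d_{curr}$ up to rounding, while the toy cost decreases. The sketch does not guard against gate delays becoming non-physical; a practical implementation would need such bounds.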
5. Results
To validate the algorithm, we carried out experiments with a chain of 10 inverters loaded with a capacitance of 10 fF. We resized the circuit assuming a constant energy and duration for all the upsets. Then we randomly injected upsets of various energies and durations in SPICE simulations to validate that the resized circuit indeed had increased error tolerance, not only for the upset energy and duration for which we had done the optimization but also for the various other energies and durations a transient upset may have. We simulated 3000 random upsets with various energies and durations on the original and resized circuits. The circuit node to which the upset is applied is also selected randomly. Since higher node capacitance decreases the amplitude of the upset, if no restrictions are given, our algorithm tends to increase the circuit size significantly. Therefore, we put a restriction in the cost function on the allowed area increase of the circuit. This allowed area increase may be chosen to be any value depending on the application. We allowed our algorithm to increase the area by at most 3X. To show that our algorithm gives better error tolerance than blindly increasing the size of every inverter by 3X, we also simulated an inverter chain in which all the inverters are sized 3X the inverters in the original chain. Table 1 shows the percentages of errors propagated to the output for various upset energy and duration values for the initial circuit, the optimized circuit, and the circuit with every inverter scaled by 3X. On average, the original circuit passes 86.8% of the TEs, the optimized circuit passes 52% of the TEs, and the original circuit with 3X-sized inverters passes 72.7% of the TEs. Table 2 gives the sizes of the inverters in the three circuits.
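As a quick check on how these averages relate to each other, the relative reductions in propagated TEs can be worked out directly from the three percentages quoted above (the rounding to whole percent is ours):

```latex
\begin{align*}
\text{vs. original circuit:} \quad & \frac{86.8 - 52}{86.8} = \frac{34.8}{86.8} \approx 0.40 \quad (\text{about } 40\%) \\
\text{vs. uniform 3X sizing:} \quad & \frac{72.7 - 52}{72.7} = \frac{20.7}{72.7} \approx 0.28 \quad (\text{about } 28\%)
\end{align*}
```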
6. Conclusions
We presented an algorithm for sizing the gates in a combinational circuit such that the transient errors propagated to the O/P flip-flop are minimized. The algorithm uses extensive SPICE look-up tables to accurately compute the costs of different size assignments and minimizes the cost using gradient search. We validated the effectiveness of our approach by applying it to a chain of 10 inverters. SPICE simulations showed that our technique can reduce propagated TEs by 28% compared to a uniformly sized chain of inverters with the same area and by 40% compared to the original circuit when allowed an area increase of 3X. The delay of the circuit remains unchanged for the optimized circuit.
Procedure Gradient_Search
    Input initial sizes, $w_{init}$, and O/P load $C_{out}$;
    Calculate initial delay vector, $d_{init}$, and deadline, $T_d = \max(A \cdot d_{init})$;
    Let $d_{curr} = d_{init}$;
    while (cost_function($d_{curr}$, $C_{out}$) > $\epsilon$)
        Calculate gradient of cost_function at $d_{curr}$, $\nabla_A \mathrm{cost\_function}$;
        $d_{new} = d_{curr} + k \cdot \nabla_A \mathrm{cost\_function}$, where $k$ is chosen such that the new cost, cost_function($d_{new}$, $C_{out}$), is minimum in the direction of the gradient vector;
        $d_{curr} = d_{new}$;
    end
    Output optimum gate widths $w_{opt}$ using compute_gate_widths($d_{curr}$, $C_{out}$)
Figure 7. Sizing algorithm for minimum transient error propagation.
Figure 8. The inverter chain used to validate the algorithm.
References
[1] T. Karnik, B. Bloechel, K. Soumyanath, V. De, and S. Borkar, "Scaling trends of cosmic ray induced soft errors in static latches beyond 0.18µ," 2001 Symposium on VLSI Circuits. Digest of Technical Papers, 14-16 June 2001, Kyoto, Japan, 2001, pp. 61-2.
[2] A. Maheshwari, I. Koren, and N. Burleson, "Techniques for transient fault sensitivity analysis and reduction in VLSI circuits," Proceedings. 18th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, 3-5 Nov. 2003, Boston, MA, USA, 2003, pp. 597-604.
[3] M. Singh and I. Koren, "Reliability enhancement of analog-to-digital converters (ADCs)," Proceedings 2001 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, 24-26 Oct. 2001, San Francisco, CA, USA, 2001, pp. 347-53.
[4] H. Cha and J. H. Patel, "Latch design for transient pulse tolerance," Proceedings 1994 IEEE International Conference on Computer Design: VLSI in Computers and Processors, 10-12 Oct. 1994, Cambridge, MA, USA, 1994, pp. 385-8.
[5] F. L. Yang and R. A. Saleh, "Simulation and analysis of transient faults in digital circuits," IEEE Journal of Solid-State Circuits, vol. 27, pp. 258-64, 1992.
[6] S. M. Kang and D. Chu, "CMOS circuit design for prevention of single event upset," Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers. ICCD '86, 6-9 Oct. 1986, Port Chester, NY, USA, 1986, pp. 385-8.
[7] M. Oman, G. Papasso, D. Rossi, and C. Metra, "A model for transient fault propagation in combinatorial logic," 9th International IEEE On-Line Testing Symposium, 7-9 July 2003, Kos Island, Greece, 2003, pp. 111-15.
[8] Y. S. Dhillon, A. U. Diril, H. S. Lee, and A. Chatterjee, "Algorithm for achieving minimum energy consumption in CMOS circuits using multiple supply and threshold voltages at the module level," International Conference on Computer Aided Design, 2003, pp. 693-700.