U.C.Berkeley —CS278:Computational Complexitylaurel
Handout N25Professor Luca Trevisan 12/1/2004
Notes for Lecture 25
Circuit Lower Bounds for Parity Using Polynomials
In this lecture we prove a lower bound on the size of a constant depth circuit which com-putes the XOR of n bits.大学英语2课后答案
Before we talk about bounds on the size of a circuit,let us first clarify what we mean by circuit depth and circuit size.The depth of a circuit is defined as the length of the longest path from the input to output.The size of a circuit is the number of AND and OR gates in the circuit.Note that,for our purpo,we assume all the gates have unlimited fan-in and fan-out.We define AC 0to be the class of decision problems solvable by circuits of polynomial size and constant depth.We want to prove the result that PARITY is not in AC 0.
There are two known techniques to prove this result.In this class,we will talk about a proof which us polynomials;in the next class we will look at a different proof which us random restrictions.
1Circuit Upper Bounds for PARITY
Before we go into our proof,let us first look at a circuit of constant depth d that computes PARITY.Theorem 1For every constant d ≥2,there are circuits of size 2
O
n 1d −1 that compute parity.
In the next lecture,we will prove a 2n 1
d −1siz
e lower bound,establishing the tightness o
f Theorem 1.Today we will prove a weaker 2n 14d
lower bound.Proof:[Of Theorem 1]Consider the circuit C shown in Figure 1,which computes the PARITY of n variables.The circuit C is a tree of XOR gates,each of which has fan-in n 1d −1;the tree has depth d −1.Now,since each XOR gate is a function of n 1d −1variables,it can be implemented by
a CNF or a DNF of size 2n 1d −1.Let us replace alternating layers of XOR gates in the tree by CNF’s and DNF’s -for example we replace gates in the first layer by their CNF implementation,gates in the cond layer by their DNF implementation,and so on.This gives us a circuit of depth 2(d −1).Now we can u the associativity of OR to collap concutive layers of OR gates into a single layer.The same thing can be done for AND to get a circuit of depth d .
This gives us a circuit of depth d and size
O (n 2n 1d −1)which computes PARITY.2
Figure1:Circuit for Computing XOR of n variables;each small circuit in the tree computes the XOR of k=n1d−1variables
2Overview of the Lower Bound Proof
For our proof,we will utili a property which is common to all circuits of small size and constant depth,which PARITY does not have.1The property is that circuits of small size and constant depth ca
n be reprented by low degree polynomials,with high probability. More formally,we show that if a function f:{0,1}n→{0,1}is computed by a circuit of size s and depth d,then there exists a function g:{0,1}n→R such that Pr x[f(x)=g(x)]≥3
我的女神英文4 andˆgα=0only for|α|≤O((log S)2d),whereˆg is the Fourier transform of g.
Then we will show that if a function g:{0,1}n→R agrees with PARITY on more than
a fraction3
4of its inputs,then there is a coefficientαsuch thatˆgα=0and|α|=Ω(
√
n).
That is,a function which agrees with PARITY on a large fraction of its inputs,has to have high degree.From the two results,it is easy to e that PARITY cannot be computed by circuits of constant depth and small size.
We give a formal definition of degree of a function and then formally state the two results that give our lower bound.
Definition1We say that a function g:{0,1}n→R has degree at most d if there is a polynomial over the reals of degree at most d such that g and the polynomial agree on {0,1}n.
idealistic1Incidentally,the property is fal with high probability for random functions and it is computable in time2O(n)given the truth-table of a function.You may remember that this implies that our lower bound will be a natural proof.
why me歌词An equivalent way of looking at the definition of degree is to consider the size of the largest non-zero coefficient of the Fourier transform of the function.
Fact 2A function g :{0,1}n →R has degree at most d if and only if ˆg α=0for all αsuch that |α|>d .
The following two lemmas are the main results of this lecture.
Lemma 3For every circuit C of size S and depth d ,there is a function g :{0,1}n →R of degree O ((log S )2d )such that g and C agree on at least a 3/4fraction of {0,1}n .
Lemma 4Let g :{0,1}n →R be a function that agrees with PARITY on at least a 3/4fraction of {0,1}n .Then the degree of g is Ω(√n ).
From Lemma 3and Lemma 4it is immediate to derive the following lower bound.
Theorem 5For every constant d ≥2,if C is a circuit of depth d and size S that computes parity,then S ≥2Ω(n 1/4d ).
Proof:From Lemma 3we have that there is a function g :{0,1}n →R that agrees with PARITY on a 3/4fraction of {0,1}n ,and who degree is at most O ((log S )2d ).From Lemma 4we deduce that the degree of g must be at least Ω(√n ),so that
(log S )2d =Ω(√n )
which is equivalent to
S =2Ω(n
1/4d )
23Proof of Lemma 3
Most of the work in the proof of Lemma 3will be in showing how to give a “probabilistic approximation”of a single gate using low-degree functions.
3.1Approximating OR
The following lemma says that we can approximately reprent OR with a polynomial of degree exponentially small in the the fan-in of the OR gate.We’ll u the notation that x is a vector of k bits,x i is the i th bit of x ,and 0is the vector of zeros (of the appropriate size bad on context).
Lemma 6For all k and ,there exists a distribution G over functions g :{0,1}k →R such that
1.g is of degree O ((log 1 )(log k )),and
2.for all x∈{0,1}k,
Pr
g∼G
[g(x)=x1∨...∨x k]≥1− .(1)
Proof Idea:We want a random polynomial p:{0,1}k→R that computes OR.An obvious choice is
p bad(x1,...,x k)=1−
i∈{1,...,k}
(1−x i),(2)
which computes OR with no error.But it has degree k,whereas we’d like it to have logarithmic degree.To accomplish this amazing feat,we’ll replace the tests of all k variables with just a few tests of random batches of variables.This gives us a random polynomial which computes OR with one-sided error:when x=0,we’ll have p(x)=0;and when some x i=1,we’ll almost always(over our choice of p)have p(x)=1.2
Proof:We pick a random collection S of subts of the bits of x.(That is,for each S∈S we have S⊆{1,...,k}).We’ll soon e how to pick S,but once the choice has been made, we define our polynomial as
p(x1,...,x k)=1−
S∈S
1−
i∈S
x i
.(3)
Why does p successfully approximate OR?First,suppo x1∨...∨x k=0.Then we
have x=0,and:
p(0,...,0)=1−
S∈S
1−
i∈S
=0.(4)
So,regardless of the distribution from which we pick S,we have
Pr
S
[p(0)=0]=1.(5)
Next,suppo x1∨...∨x k=1.We have p(x)=1if and only if the product term is zero.The product term is zero if and only if the sum in some factor is1.And that,in turn,happens if and only if there is some S∈S which includes exactly one x i which is1. Formally,for any x∈{0,1}k,x=0,we want the following to be true with high probability.
∃S∈S.(|{i∈S:x i=1}|=1)(6) Given that we do not want S to be very large(so that the degree of the polynomial is small),we’ll have to pick S very carefully.In order to accomplish this,we turn to the Valiant-Vazirani reduction,which you may recall from an earlier class.
Lemma7(Valiant-Vazirani)Let A⊆{1,...,k},let a be such that2a≤|A|≤2a+1, and let H be a family of pairwi independent hash functions of the form h:{1,...,k}→{0,1}a+2.Then if we pick h at random from H,there is probability at least1/8that there is a unique element i∈A such that h(i)=0.Precily,
Pr h∼H [|{i∈A:h(i)=0}|=1]≥
1
8
(7)
With this as a guide,we will define our collection S in terms of pairwi independent hash functions.Let t>0be a value that we will t later in terms of the approximation parameter .Then we let S={S a,j}a∈{0,...,log k},j∈{1,...,t}where the ts S a,j are defined as follows.
•For a∈{0,...,log k}:
–For j∈{1,...,t}:
∗Pick random pairwi independent hash function h a,j:{1,...,k}→{0,1}a+2
∗Define S a,j=h−1(0).That is,S a,j={i:h(i)=0}.
Now consider any x=0which we are feeding to our OR-polynomial p.Let A be the t of bits of x whic
h ,A={i:x i=1},and let a be such that2a≤|A|≤2a+1.Then we have a∈{0,...,log k},so S includes t ts S a,1,...,S a,t.Consider any one such S a,j. By Valiant-Vazirani,we have
Pr h a,j∼H [|{i∈A:h a,j(i)=0}|=1]≥
1
8
(8)
which implies that
Pr h a,j∼H [|{i∈A:i∈S a,j}|=1]≥
1
8
(9)
so the probability that there is some j for which|S a,j∩A|=1is at least1− 7
8
viagra是什么意思t
,which
by the reasoning above tells us that
Pr p [p(x)=x1∨...∨x k]≥1−
7
8
t
.(10)
Now,to get a success probability of1− as required by the lemma,we just pick t=O(log1
).
The degree of p will then be|S|=t(log k)=O((log1
)(log k)),which satisfies the degree
requirement of the lemma.2
Note that given this lemma,we can also approximate AND with an exponentially low degree polynomial.Suppo we have some G which approximates OR within as above. Then we can construct G which approximates AND by drawing g from G and returning g such that
g (x1,...,x k)=1−g(1−x1,...,1−x k).(11) Any such g has the same degree as g.Also,for a particular x∈{0,1}k,g clearly computes AND if g computes OR,which happens with probability at least1− over our choice of g.
3.2Proof of Lemma3
Given a circuit C of size S and depth d,for every gate we pick independently an approx-
imating function g i with parameter =1
4S ,and replace the gate by g i.Then,for a given
五月的英文input,the probability that the new function so defined computes C(x)correctly is at leastune
the probability that the results of all the gates are correctly computed,which is at least3
4.
In particular,there is a function among tho generated this way that agrees with C()on at least a3/4fraction of inputs.Each g i has degree at most O((log S)2,becau the fan-in of each gate is at most S,and the degree of the function defined in the construction is at most O((log S)2d).
4Proof of Lemma 4
Let g :{0,1}n →R be a function of degree at most t that agrees with PARITY on at least a 3/4fraction of inputs.Let G :{−1,1}n →R be defined as协商英语
G (x ):=1−2g 12−12x 1,···,12−12
x n (12)Note that:
•G is still of degree at most t ,
•G agrees with the function Π(x 1,...,x n )=x 1·x 2···x n on at least a 3/4fraction of {−1,1}n .
Define A to be the t of x ∈{−1,1}n such that G (x )=Π(x ).
A := x :G (x )=n i =1
x i .
(13)
Then |A |≥342n ,by our initial assumption.Now consider the t F of all functions f :A →R .The form a vector space of dimension |A |over the reals.We know that any function f in this t can be written as f (x )= αˆf α i ∈αx i (14)Over A ,G (x )= n i =1x i ,and so for x ∈A ,
i ∈αx i =G (x ) i/∈αx i
(15)
By our initial assumption,G (x )is a polynomial of degree at most t .Therefore,for every α,such that |α|≥n 2,we can replace
i ∈αx i by a polynomial of degree less than or equal to t +n 2.Every such function f which belong
to F can be written as a polynomial of degree at most t +n 2.Hence the t i ∈αx i |α|≤t +n 2forms a basis for the t S .As there must be at least |A |such monomials,this means that
t +n
法定代表人英文
2 k =0 n k
≥34·2n (16)
and,in particular,
t +n
2 k =n
2 n k
≥14·2n (17)We know from Stirling’s approximation that every binomial coefficient n k is at most
O (2n /√n ),so we get O t √n ·2n ≥14·2n
(18)And so t =Ω(√n ).