3 The Green–Tao Theorem

Theorem 3.1. For all k3, the primes contain an arithmetic progression of length k.

Theorem 3.2 (Weighted Szemerédi). For all k3 and α>0, there exists c=c(k,α)>0 such that for any function f:N[0,1] satisfying 𝔼f=α,

𝔼x,df(x)f(x+d)f(x+(k1)d)>c(k,α)ok,α(1),

where ok,α(1) is a quantity that goes to 0 as n at a rate that depending on k and α.

There is a standard method that allows converting between a statement like the above about N into a statement about [N]: given a subset of [N], just view it as a subset of 3N (arithmetic progressions in 3N that lie in [N] correspond to genuine arithmetic progressions in [N]).

Insight of Green and Tao: the primes are dense in a certain subset of the naturals.

PIC

We will use some results of Conlon–Fox–Zhao which are sparse versions of the hypergraph regularity results we saw earlier. We will only prove one of these results in lectures, which is the sparse counting lemma.

Definition 3.3. A function ν=ν(N):N[0,) is said to satisfy the k-linear forms condition (k-LFC) if

𝔼x1(0),x1(1)xk(0),xk(1)j=1kω{0,1}[k]{j}ν(i=1k(ij)xi(ωi))nj,ω=1+o(1)

for any choice of exponents nj,ω{0,1}.

Example 3.4. We will omit writing out the nj,ω in this example.

ν satisfies the 2-LFC if

𝔼x,xy,yν(y)ν(y)ν(x)ν(x)=1+o(1).

ν satisfies the 3-LFC if

𝔼x,xy,yz,zν(y+2z)ν(y+2z)ν(y+2z)ν(y+2z)ν(x+z)ν(x+z)ν(x+z)ν(x+z)ν(2xy)ν(2xy)ν(2xy)ν(2xy)=1+o(1).

ν(y+2z)ν(x+z)ν(2xy) relates to three term progressions: it is the arithmetic progression y+2z+n(xyz) for n=0,1,2.

Theorem 3.5 (Relative Szemerédi). Let k3 and α>0 and suppose ν=ν(N):N[0,1] satisfies the k-LFC. Suppose N is sufficiently large and coprime to (k1)!. Let f:N[0,) satisfies 0f(x)ν(x) for all xN and suppose 𝔼fα. Then

𝔼x,df(x)f(x+d)f(x+(k1)d)c(k,α)ok,α,νrate of convergence in k-LFC(1)

where c(k,α) is as in Theorem 3.2. Often refer to ν as a pseudorandom majorant for f.

Note: taking f here to be the indicator of the primes won’t satisfy the conditions above, because we need 𝔼fα. Instead, we’ll use the fact that f is not bounded (takes values in [0,)).

Think of “f(n)=𝟙primes(n)logn.”

In the original Green–Tao proof, they needed an additional condition as well as the k-LFC, and this additional requires a lot of analytic number theory to prove. One of the great contributions of the proof by Conlon–Fox–Zhao was to remove this additional condition, and in particular greatly reducing the amount of analytic number theory needed to prove the result.

Consider the van Mangoldt function

Λ(n)={logpif n=pk for some prime p0otherwise

By the Prime Number Theorem,

𝔼n[N]Λ(n)=1+o(1).

(Remark: we won’t need the Prime Number Theorem to prove Green–Tao)

Problem: Λ is biased with respect to small residue classes. Use W-trick: let w=w(N) be a function with N e.g. log log N. Let W=pwp, and consider only primes 1(modW), by defining

Λ(n)~={φ(W)Wlog(Wn+1)if Wn+1 is prime0otherwise

where φ is the Euler totient function. Can show 𝔼n[N]Λ~(n)=1+o(1), provided w grows sufficiently slowly.

Reminder: instead of Λ, want to consider

Λ~(n)={φ(W)Wlog(Wn+1)if Wn+1 prime0otherwise

Proposition 3.6 (Pseudorandom majorant for the primes). For all k3, there exists δ>0 such that for all sufficiently large N, there exists ν(N):N[0,) satisfying the k-LFC and ν(n)δkΛ~(n) for all n[N2,N). ν is given by

ν(n)={φ(W)WΛχ,R(Wn+1)2cχlogRn[N2,N)1otherwise

where

  • χ:[0,1] is a smooth function supported on [1,1] with χ(0)=1

    PIC

  • cχ=0|χ(x)|2dx

  • R=Nk12k3.

  • Λχ,R(n)=logRd|nμ(d)χ( log d log R).
    Compare with Λ(n)=d|nμ(d) log (nd) and d|ndR log Rd.

Proof. Omitted.

Next goal: “Approximate” 0fν by 0f~1 with 𝔼f=𝔼f~ (transference principle).

Definition 3.7. Let G be an abelian group (which you can think of as being N), let r, and let ψ:GrG a surjective homomorphism, f,f~:G[0,). We say (f,f~) is an (r,𝜀)-discrepancy pair (DP) with respect to ψ if

|𝔼x=(x1,,xr)Gr(f(ψ(x))f~(ψ(x)))i=1rui(x[r]{i}=(x1,,xi1,xi+1,,xr))|𝜀

for all functions u1,,ur:Gr1[0,1].

In words: no function in fewer variables can distinguish f from f~.

Can think of ψ(x)=x1+x2++xr.

Theorem 3.8 (Dense Model Theorem). For all 𝜀>0, there exists k=𝜀O(1) and 𝜀=exp(𝜀O(1)) such that the following holds: Let X be a finite set, let F be a collection of functions φ:X[1,1]. Suppose ν:X[0,) satisfying

|ν1,φ|𝜀φFk={i=1kφi:φiF,kk}

and f:X[0,) satisfies fν and 𝔼f1. Then there exists f~:X[0,1] such that 𝔼f~=𝔼f and |ff~,φ|𝜀 for all φF.

Corollary 3.9. For all 𝜀>0, there exists 𝜀=exp(𝜀O(1)) such that the following holds: Let G be an abelian group, r, ψ:GrG a surjective homomorphism. Let f,ν:G[0,) be such that 0fν, 𝔼f1 and (ν,1) is an (r,𝜀)-discrepancy pair with respect to ψ. Then there exists f~:G[0,1] such that 𝔼f~=𝔼f and (f,f~) is an (r,𝜀)-discrepancy pair with respect to ψ.

Deduction from Theorem 3.8. For any collection of functions u1,,ur:Gr1[0,1], define a generalized convolution with respect to ψ

(u1,,ur)ψ:G[0,1](u1,,ur)ψ(x)=𝔼yGrψ(y)=xi=1rui(y[r]{i})

r=2:

u1u2(x)=𝔼yG2y1+y2=xu1(y1)u2(y2)

So indeed the earlier expression looks like a reasonable definition for a generalised convolution.

Notice that the LHS in ?? 30 is just ff~,(u1ur)ψ.

Let F be the set of functions which can be written as convex combinations of generalised convolutions with respect to ψ. Then by the hypotheses, (ν,1) is an (r,𝜀)-discrepancy pair with respect to ψ, which is equivalent to ν1,φ𝜀 for all φF.

Want: |ff~,φ|𝜀 for all φF.

So it suffices to show that F is closed under multiplication. Indeed,

(u1,,ur)ψ(x)(u1,,ur)ψ(x)=𝔼yGrψ(y)=x𝔼yGrψ(y)=xi=1rui(y[r]{i})i=1rui(y[r]{i}=y[r]{i}+z[r]{i})=𝔼zGrψ(z)=0[𝔼yGrψ(y)=xi=1rui(y[r]{i})ui (y[r]{i})generalised convolution wrt ψ]

Define a norm on functions f:Gr by

f=(f,r)=supφF|f,φ|=supφF(F)f,φ

This has a dual:

φ=supf1f,φ.

By definition, |f,φ|fφ. Note

f=supφF|f,φ|=supφF|𝔼xf(x)φ(x)|f1,

and by duality

φφ.

Lemma 3.10. The unit ball of is just conv(F(F)).

We will use Hahn–Banach to prove this.

Theorem 3.11 (Hahn–Banach). Let K be a closed convex body in n, and suppose 𝜃nK. Then there exists fn such that f,𝜃>1 while f,η1 ηK.

PIC

Proof. Suppose φF(F). Then

φ=supf1f,φ1.

So φ is in the unit ball of . Convex combinations of elements in F(F) are also in the unit ball by the triangle inequality.

For the converse, we need Hahn–Banach. Suppose φconv(F(F)). By Theorem 3.11, there exists f such that f,φ>1 but f,η1 ηconv(F(F)). The latter implies f1. But

1<f,φfφφ.

Hence φ does not lie in the unit ball of .

This implies that the unit ball of is closed under multiplication. Hence, φ,ψ:G, then

φφψψ1φψφψ.

(dual norm is submultiplicative).

Theorem 3.12 (Dense Model (Theorem 3.7’)). For all 𝜀>0, if ν:X[0,) satisfying ν1exp(𝜀O(1)) and f:X[0,) satisfies 0fν, then there exists f~:X[0,1] such that ff~𝜀.

Proof. Given f:X[0,), 0fν, it suffices to show that there exists f~:X[0,1+𝜀2] with ff~𝜀2, assuming that ν1𝜀 for some sufficiently small 𝜀.

Suppose that there is no such f~, i.e. f cannot be written as f=f1+f2, with

f1K1={g:X[0,1+𝜀2]}andf2K2={h:X:h𝜀2}.

In other words: we are supposing that fK1+K2. Note that K1 and K2 are both convex bodies, and both contain 0. Thus K1+K2 is convex and contains K1 and K2.

By Hahn–Banach, there exists ψ:X such that f,ψ>1 and g,ψ1 gK1+K2. Taking g=(1+𝜀2)𝟙ψ>0K1, we note that

(1+𝜀2)𝟙ψ>0,ψ=(1+𝜀2),ψ+1

where ψ+(x)=max{0,ψ(x)}. Thus 1,ψ+(1+𝜀2)1.

On the other hand, if g,ψ1 gK2, so g,ψ2𝜀 for all g:X with g1. So ψ2𝜀.

So far, we have

1<f,ψf,ψ+ν,ψ+=ν1,ψ++1,ψ+(1+𝜀2)1.

So if we had ψ+ bounded above, we would be done. Indeed, by Weierstrass approximation, there exists a polynomial P such that

|P(x)max{0,x}|𝜀8x[2𝜀,2𝜀].

Recall that earlier we used duality to show ψψ2𝜀, which is why we only need the polynomial approximation to hold in the interval [2𝜀,2𝜀].

Let P(x)=adxd++a1x+a0. We can do this with R:=i=0d|ai|(2𝜀)i= exp (𝜀O(1)). Now

Pψi=1d|ai|ψii=1d|ai|(ψ)iR

and

ν1,ψ+=ν1,Pψ+ν1,ψ+Pψν1Pψ𝜀R+ν111ψ+Pψ𝜀8.

Choosing 𝜀 such that 𝜀R𝜀8, we arrive at a contradiction.

X=X1X2X3X4.

Xi means X1Xi1Xi+1X4. xi=(x1,,xi1,xi+1,x4)Xi. (for all i[4])

Work with 4-partite 3-uniform weighted hypergraphs.

g=(gi)i[4]

with each gi:Xi. For two such hypergraphs, g and ν, write gν whenever gi(xi)νi(xi) for all i[4] and all xiXi.

Given a weighted 3-uniform hypergraph h on Xi, define

h,i=supAjXj|𝔼xiXih(xi)j[4]{i}𝟙Aj(xi,j)|.

Given a weighted 4-partite 3-uniform hypergraph on X, define

g=maxi[4]{gi,i}.

Definition 3.13. Say that a 4-partite 3-uniform weighted hypergraph ν satisfies the 3-LFC if

𝔼x1(0),x1(1)X1x4(0),x4(1)X4j=14ω{0,1}[4]{j}ν(xj(ω))nj,w=1+o(1).

Our definition can be recovered by making the substitution

(x(ω))j=i=14(ji)xi(ωi).
        x2 + 2x3 + 3x4− x1 + x3 + 2x4− 2x1 − x2 + x4−−−(((x1x1x1+++xxx222+++xxx−333 3+++xxxx1444 −))) 2x2 − x3

Theorem 3.14 (Sparse counting lemma). For all γ>0, there exists 𝜀>0 such that the following holds: Let ν,g,g~ be weighted hypergraphs on X1X2X3X4. Suppose that ν satisfies the 3-LFC (as in Definition 3.13), 0gν, 0g~1, and gg~<𝜀. Then

|𝔼x1,,x4j=14gj(xj)𝔼x1,,x4j=14g~j(xj)|γ.

The proof consists of the following steps:

(1)

Lemma 3.15 (Telescoping argument). Let 0g~1, and gj1 j[3], g4ν4. If gg~𝜀, then

|𝔼x1,,x4j=14gj(xj)j=14g~j(xj)|4𝜀.

Sketch proof.

𝔼j=14g(xj)=𝔼j=13g(xj)(g(x4)g~(x4))small+𝔼j=13g(xj)g~(x4)

Now we no longer have to worry about g4. Then repeat.

(2)

Lemma 3.16 (Strong linear forms). Suppose ν satisfies the 3-LFC as defined in Definition 3.13 and suppose 0gν, 0g~1. Then

|𝔼x1,x2,x3,x4(0),x4(1)(ν(x4)1)j=13ω{0,1}hj,ω(xj(ω))|,

where x(ω)=(x1,x2,x3,x4(ω)) and hj,ω is either gj or g~j.

Sketch proof. Use LFC and Cauchy-Schwarz.

(3) Next (final) lecture: densification strategy

At each step, if gjνj and gjg~j,j𝜀, find 0gj1 such that gjg~j,j𝜀.

Sketch proof of Theorem 3.5. Given f:N[0,) and ν:N[0,), 0fν, satisfying the 3-LFC:

ψ:G4G{ψ1(x1,,x4)=x2+2x3+3x4ψ2(x1,,x4)=x1+x3+2x4ψ3(x1,,x4)=2x1x2+x4ψ4(x1,,x4)=3x12x2x3

Define the weighted hypergraphs g,ν on (N)4:

gj(xj)=f(ψj(x1,,x4))νj(xj)=ν(ψj(x1,,x4))

and from the Dense Model Theorem, obtain f~:N[0,1] such that ff~𝜀, with corresponding weighted hypergraph

g~j(xj)=f~(ψj(x1,,x4)).

By the sparse counting lemma, we have

Corollary 3.17. For all γ>0, there exists 𝜀>0 such that the following holds: Let ν,f,f~:N[0,) with 0fν, 0f~1 and ff~𝜀. Then

|𝔼x,dj=03f(x+jd)𝔼j=03f~(x+jd)|γ.

But also, by Corollary 3.9, 𝔼f~=𝔼fδ, so by the weighted Szemerédi Theorem we get the result.

Ω