3 Probabilistic Tools

All probability spaces in this course will be finite.

Theorem 3.1 (Khintchine’s inequality). Assuming that:

  • p[2,)

  • X1,X2,,Xn independent random variables

  • (Xi=xi)=12=(Xi=xi)

Then
i=1nXiLp()=O(p12(i=1nXiL2()2)12).

Proof. By nesting of norms, it suffices to prove the case p=2k for some k. Write X=i=1nXi, and assume i=1nXiL()2=1. Note that in fact i=1nXiL2()2=i=1nXiL()2, hence i=1nXiL2()2=1.

By Chernoff’s inequality (Example 2.5), for all 𝜃>0 we have

(|X|𝜃)4exp(𝜃24),

and so using the fact that (|X|t)=0tρX(s)ds we have

XL2k()2k=0t2kρX(t)dt=02kt2k1(|X|t)dtintegration by parts08kt2k1exp(t24)dt=:I(K)

We shall show by induction on k that I(K)22k(2k)k4k. Indeed, when k=1,

0texp(t24)dt=[2exp(t24)]0=22.

For k>1, integrate by parts to find that

I(K)=0t2k2utexp(t24)vdt=[t2k2(2exp(t24))]00(2k2)t2k3(2exp(t24))dt=4(k1)0t2(k1)1exp(t24)dt=4(k1)I(K1)4(k1)22(k1)(2(k1))k14(k1)22k(2k)k4k

Corollary 3.2 (Rudin’s Inequality). Let Γ𝔽2n^ be a linearly independent set and let p[2,). Then for any f^l2(Γ),

γΓf^(γ)γLP(𝔽2n)=O(pf^l2(Γ)).

Corollary 3.3. Let Γ𝔽2n^ be a linearly independent set and let p(1,2]. Then for all fLp(𝔽2n),

f^l2(Γ)=O(pp1fLp(𝔽2n)).

Proof. Let fLp(𝔽2n) and write g=γΓf^(γ)γ. Then

f^l2(Γ)2=γΓ|f^(γ)|2=f^,g^l2(𝔽2n^)=f,gL2(𝔽2n)by Plancherel’s identity

which is bounded above by fLp(𝔽2n)gLp(𝔽2n) where 1p+1p=1, using Hölder’s inequality.

By Rudin’s Inequality,

gLp(𝔽2n)=O(pg^l2(Γ))=O(pp1f^l2(Γ)).

Recall that given A𝔽2n of density α>0, we had |Specρ(𝟙A)ρ2α1. This is best possible as the example of a subspace shows. However, in this case the large spectrum is highly structured.

Theorem 3.4 (Special case of Chang’s Theorem). Assuming that:

  • A𝔽2n of density α>0

  • ρ>0

Then there exists H𝔽2n^ of dimension O(ρ2logα1) such that HSpecρ(𝟙A).

Proof. Let ΓSpecρ(𝟙A) be a maximal linearly independent set. Let H=Specρ(𝟙A). Clearly dim(H)=|Γ|. By Corollary 3.3, for all p(1,2],

(ρα)2|Γ|γΓ|𝟙A^(γ)|2=𝟙A^l2(Γ)2=O(pp1𝟙ALp(𝔽2n)2),

so

|Γ|=O(ρ2α2α2ppp1).

Set p=1+(logα1)1 to get |Γ|=O(ρ2α2(α2e2)(logα1+1)).

Definition 3.5 (Dissociated). Let G be a finite abelian group. We say SG is dissociated if sS𝜀ss=0 for 𝜀{1,0,1}|S|, then 𝜀0.

Clearly, if G=𝔽2n, then SG is dissociated if and only if it is linearly independent.

Theorem 3.6 (Chang’s Theorem). Assuming that:

  • G a finite abelian group

  • AG be of density α>0

  • ΛSpecρ(𝟙A) is dissociated

Then |Λ|=O(ρ2logα1.

We may bootstrap Khintchine’s inequality to obtain the following:

Theorem 3.7 (Marcinkiewicz-Zygmund). Assuming that:

  • p[2,)

  • X1,X2,,XnŁp() independent random variables

  • 𝔼i=1nXi=0

Then
i=1nXiLp()=O(p12i=1n|Xi|2Lp2()12).

Proof. First assume the distribution of the Xi’s is symmetric, i.e. (Xi=a)=(Xi=a) for all a. Partition the probability space Ω into sets Ω1,Ω2,,ΩM, write j for the induced measure on Ωj such that all Xi’s are symmetric and take at most 2 values. By Khintchine’s inequality, for each j[M],

i=1nXiLp(j)p=O(pp2(i=1nXiL2(j)2)p2)=O(pp2i=1n|Xi|2Lp2(j)p2)

so summing over all j and taking p-th roots gives the symmetric case. Now suppose the Xi’s are arbitrary, and let Y1,,Yn be such that YiXi and X1,X2,,Xn,Y1,Y2,,Yn are all independent. Applying the symmetric case to XiYi,

i=1n(XiYi)Lp(×)=O(p12i=1n|XiYi|2Lp2(×)12)=O(p12i=1n|XiYi|2Lp2()12)

But then

i=1nXiLp()=i=1nXi𝔼Yi=1nYi=0Lp()p=𝔼X|Xi𝔼YYi|p=𝔼X|𝔼Y(XiYi)|p𝔼X𝔼Y|(XiYi)|pby Jensen say=(XiYi)Lp(×)p

concluding the proof.

Theorem 3.8 (Croot-Sisask almost periodicity). Assuming that:

  • G a finite abelian group

  • 𝜀>0

  • p[2,)

  • A,BG are such that |A+B|K|A|

  • f:G

Then there exists bB and a set XBb such that |X|21KO(𝜀2p)|B| and
τxfμAfμALp(G)𝜀fLp(G)xX,

where τxg(y)=g(y+x) for all yG, and as a reminder, μA is the characteristic measure of A.

Proof. The main idea is to approximate

fμA(y)=𝔼xf(yx)μA(x)=𝔼xAf(yx)

by 1mi=1mf(yzi), where zi are sampled independently and uniformly from A, and m is to be chosen later.

For each yG, define Zi(y)=τzif(y)fμA(y). For each yG, these are independent random variables with mean 0, so by Marcinkiewicz-Zygmund,

i=1mZi(y)Lp()p=O(pp2i=1m|Zi(y)|2Lp2()p2)=O(pp2𝔼(z1,,zm)Am|i=1m|Zi(y)|2|p2)

By Hölder with 1p+2p=1, we get

|i=1m|Zi(y)|2|p2(i=1m1p)1pp2(i=1m|Zi(y)|2p2)2pp2(i=1m1p)p21(i=1m|Zi(y)|2p2)2pp2=mp21i=1m|Zi(y)|p

so

i=1mZi(y)Lp()p=O(pp2mp21𝔼(z1,,zm)Ami=1m|Zi(y)|p).

Summing over all yG, we have

𝔼yGi=1mZi(y)Lp()p=O(pp2mp21𝔼(z1,,zm)Ami=1m𝔼yG|Zi(y)|p)

with

(𝔼yG|Zi(y)|p)1p=ZiLp(G)=τziffμALp(G),τzifLp(G)+fμALp(G)fLp(G)+fLq(G)μAL1(G)2fLp(G)

by Young / Hölder (fgLr(G)fLp(G)gLq(G) where 1+1r=1p+1q).

So we have

𝔼(z1,,zm)Am𝔼yG|i=1mZi(y)|p=O(pp2mp21i=1m(2fLp(G))p)=O((4p)p2mp2fLp(G)p).

Choose m=O(𝜀2p) so that the RHS is at most (𝜀4fLp(G))p. whence

𝔼(z1,,zm)Am𝔼yG|1mi=1mτzif(y)fμA(y)|p=()=O((4p)p2mp2fLp(G)p)=(𝜀4fLp(G))p.

Write

L={z=(z1,,zm)Am:()(𝜀2fLp(G))p}.

By Markov inequality, since

𝔼()(𝜀4fLp(G))p=2p(𝜀2fLp(G))p,

we have

|AmL||Am|=(()(𝜀2fLp(G))p)(()2p𝔼())2p

so |L|(112p)|A|m12|A|m. Let

D={(b,b,,b)m:bB}.

Now L+D(A+B)m, whence

|L+D||A+B|mKm|A|m2Km|L|.

By Lemma 1.17,

E(L,D)|L|2|D|2|L+D|12Km|D|2|L|

so there are at least |D|22Km pairs (d1,d2)D×D such that rLL(d2d1)>0. In particular, there exists bub and XBb of size |X||D|2Km=|B|2Km such that for all xX, there exists l2(x)L such that for all i[m], l1(x)il2(x)i=x. But then for each xX, by the triangle inequality,

τxfμAfμALp(G)τxfμAτx(1mi=1mτl2(x)if)Lp(G)  +τx(1mi=1mτl2(xi)f)fμALp(G)=fμA1mi=1mτl2(x)ifLp(G)  +1mi=1mτxl2(x)iffμALp(G)2𝜀2fLp(G)

by definion of L.

Theorem 3.9 (Bogolyubov again, after Sanders). Assuming that:

  • A𝔽pn of density α>0

Then there exists a subspace V𝔽pn of codimension O(log4α1) such tht VA+AAA.

Almost periodicity is also a key ingredient in recent work of Kelley and Meka, showing that any A[N] containing no non-trivial 3 term arithmetic progressions has size |A|exp(C log 111N)N.