Introduction to Additive Combinatorics
Daniel Naylor

Contents

1Combinatorial methods
2Fourier-analytic techniques
3Probabilistic Tools
4Further Topics
Index

1 Combinatorial methods

Definition 1.1 (Sumset). Let G be an abelian group. Given A,B G, define the sumset A + B to be

A + B := {a + b : a A,b B}

and the difference set A B to be

A B := {a + b : a A,b B}.

If A and B are finite, then certainly

max {|A|,|B|}|A+B||A||B|.

Example 1.2. Let A = [n] := {1,2,,n} . Then

|A+A| = |{2,,2n}| = 2n 1 = 2|A| 1.

Lemma 1.3. Assuming that:

  • A is finite.

Then |A+A| 2|A| 1, with equality if and only if A is an arithmetic progression.

Proof. Let A = {a1,a2,,an} with a1 < a2 < < an. Then

a1 + a1 < a1 + a2 < a1 + a2 < < a1 + an < a2 + an < < an + an,

so |A+A| 2|A| 1. But we could also have written

a1 + a1 < a1 + a2 < a2 + a2 < a2 + a3 < a2 + a4 < < a2 + an < a3 + an < < an + an.

When |A+A| = 2|A| 1, these two orderings must be the same. So a2 + ai = a1 + ai+1 for all i = 2,,n 1.

Exercise: If A,B , then |A+B||A| + |B| 1 with equality if and only if A and B are arithmetic progressions with the same common difference.

Example 1.4. Let A,B p with p prime. Then |A+B| p + 1A+B = p. Indeed, g A+BA (gB) (note that g B means {g} B). But g p,

|A (gB)| = |A| + |gB||A (gB)||A| + |B| p 1.

Theorem 1.5 (Cauchy-Davenport). Assuming that:

  • p is a prime

  • A,B p nonempty

Then
|A+B| min {p,|A| + |B| 1}.

Proof. Assume |A| + |B| p + 1. Without loss of generality assume that 1 |A||B| and that 0 A. Apply induction on |A|. The case |A| = 1 is trivial. Suppose |A| 2, and let 0a A.

Since {a,2a,3a,,(p 1)a,pa} = p and |A| + |B| p + 1, there must exist m 0 such that ma B but (m + 1)aB. Let B = B ma, so 0 B, aB, |B| = |B|.

But 1 |A B| < |A|, so the inductive hypothesis applies to A B and A B. Since

(A B)+(A B) A+B,

we have

|A+B| = |A+B||(A B)+(A B)||A B| + |A B| + 1 = |A| + |B| + 1.

This fails for general abelian groups (or even general cyclic groups).

Example 1.6. Let p be (fixed, small) prime, and let V 𝔽pn be a subspace. Then V + V = V , so |V + V | = |V |. In fact, if A 𝔽pn is such that |A + A| = |A|, then A must be a coset of a subspace.

Example 1.7. Let A 𝔽pn be such that |A+A| < 3 2|A|. Then there exists V 𝔽pn a subspace such that |V | < 3 2|A| and A is contained in a coset of V . See Example Sheet 1.

Definition 1.8 (Ruzsa distance). Given finite sets A,B G, we define the Ruzsa distance d(A,B) between A and B by

d(A,B) = log |AB| |A||B|

Note that this is symmetric, but is not necessarily non-negative, so we cannot prove that it is a metric. It does, however, satisfy triangle inequality:

Lemma 1.9 (Ruzsa’s triangle inequality). Assuming that:

  • A,B,C G finite

Then
d (A,C) d(A,B) + d(B,C).

Proof. Observe that

|B||AC||AB||BC|.

Indeed, writing each d AC as d = ad cd with ad A, cd C, the map

ϕ : B × (AC) (AB) × (BC) (b,d) (ad b,b cd)

is injective. The triangle inequality now follows from the definition.

Definition 1.10 (Doubling / difference constant). Given a finite A G, we write

σ(A) := |A+A| |A|

for the doubling constant of A and

δ(A) := |AA| |A|

for the difference constant of A.

Then Lemma 1.9 shows, for example, that

logδ(A) = d(A,A) d(A,A) + d(A,A) = 2logσ(A).

So δ(A) σ(A)2, or |AA| |A+A|2 |A|.

Notation. Given A G and l,m 0, we write

lA mA := A+A++Al timesAAAm times.

Theorem 1.11 (Plunnecke’s Inequality). Assuming that:

  • A,B G are finite sets

  • |A+B| K|A| for some K 1

Then l,m 0,
|lB mB| Kl+m|A|.

Proof. Choose a non-empty subset A A such that the ratio |A+B| |A | is minimised, and call this ratio K. Then |A+B| = K|A|, K K, and A A, |A+B| K|A|.

Claim: For every finite C G, |A+B+C| K|A+C|.

Let’s complete the proof of the theorem assuming the claim. We first show that m 0, |A+mB| Km|A|. Indeed, the case m = 0 is trivial, and m = 1 is true by assumption. Suppose m > 1 and the inequality holds for m 1. By the claim with C = (m 1)B, we get

|A+mB| = |A+B+(m 1)B| K|A + (m 1)B| Km|A|.

But as in the proof of Ruzsa’s triangle inequality, l,m 0, we can show

|A||lB mB||A+lB||A+mB| Kl|A|Km|A| = Kl+m|A|2.

Hence |lB mB| Kl+m|A| Kl+m|A|, which completes the proof (assuming the claim).

We now prove the claim by induction on |C|. When |C| = 1 the statement follows from the assumptions. Suppose the claim is true for C, and consider C = C {x} for some xC. Observe that

A+B+C = (A+B+C)+((A+B+x) (D+B+x))

with D = {a A : a+B+x A+B+X}.

By definition of K, |D+B| K|D|, so

|A+B+C| |A+B+C| + |A+B+x||D+B+x| IHK|A+C| + K|A| K|D| = K(|A+C| + |A||D|)

We apply this argument a second time, writing

A+C = (A+C) ((A+x) (E+x))

where E = {a A : a + x A+C} D. We conclude that

|A+C| = |A+C| + |A+x||E+x||A+C| + |A||D|

so

|A+B+C| K(|A+C| + |A||D|) K|A+C|,

proving the claim.

We are now in a position to generalise Example 1.7.

Theorem 1.12 (Freiman-Ruzsa). Assuming that:

  • A 𝔽pn

  • |A + A| K|A| (i.e. σ(A) K)

Then A is contained in a subspace H 𝔽pn of size |H| K2pK4 |A|.

Proof. Choose X 2A A maximal such that the translates x+A with x X are disjoint. Such a set X cannot be too large: x X, x+A 3A A, so by Plunnecke’s Inequality, since |3A A| K4|A|,

|X||A| = | xX(x + A)| |3A A|.

So |X| K4. We next show

2A A X+AA. (∗)

Indeed, if y 2A A and yX, then by maximality of X, y+A x+A for some x X (and if y X, then clearly y X+AA).

It follows from () by induction that l 2,

lA A (l 1)X + A A, (∗∗)

since

lA A = A+(l 1)AA(l2)X+AA (l 2)X+2A A X+AA (l 1)X+AA.

Now let H 𝔽pn be the subgroup generated by A, which we can write as

H = l1(lA A)( )Y + A A

where Y 𝔽pn is the subgroup generated by X.

But every element of Y can be written as a sum of |X| elements of X with coefficients amongst 0,1,,p 1, hence |Y | p|X| pK4 . To conclude, note that

|U||Y ||AA| pK4 pK4 K2|A|,

where we use Plunnecke’s Inequality or even Ruzsa’s triangle inequality.

Example 1.13. Let A = V R where V 𝔽pn is a subspace of dimension K d n K and R consists of K 1 linearly independent vectors not in V .

Then

|A| = |V R| = |V | + |R| = pnk + K 1 pnk = |V |

and

|A + A| = |(V R) + (V R)| = |V (V + R) (R + R)| K|V |.

But any subspace K 𝔽pn containing A must have size at least pnK+(K1) |V | pK, so the exponential dependence on K is necessary.

Theorem 1.14 (Polynomial Freiman-Ruzsa, due to Gowers–Green–Manners–Tao 2024). Assuming that:

  • A 𝔽pn

  • |A + A| K|A|

Then there exists a subspace K 𝔽pn of size at most C1(K)|A| such that for some x 𝔽pn,
|A (x + K)| |A| C2(K),

where C1(K) and C2(K) are polynomial in K.

Proof. Omitted, because the techniques are not relevant to other parts of the course. See Entropy Methods in Combinatorics next term.

Definition 1.15. Given A,B G we define the additive energy between A and B to be

E(A,B) = |{(a,a,b,b) A × A × B × B : a + b = a + b}|.

We refer to the quadruples (a,a,b,b) such that a + b = a + b as additive quadruples.

Example 1.16. Let V 𝔽pn be a subspace. Then E(V ) = E(V,V ) = |V |3.

On the other hand, if A p is chosen at random from p (each element chosen independently with probability α > 0), then with high probability

E (A) = E(A,A) = α4p3 = α|A|3.

Lemma 1.17. Assuming that:

  • A,B G

  • both non-empty

Then
E (A,B) |A|2|B|2 |A + B| .

Proof. Define rA+B(x) = |{(a,b) A × B : a + b = x}| (and notice that this is the same as |A (xB)|). Observe that

E(A,B) = |{(a,a,b,b) A2 × B2 : a + b = a + b} = xGrA+B(x)2 = xA+BrA+B(x)2 ( xA+BrA+B(x))2 |A+B| by Cauchy-Schwarz

but

xG|A (xB)| = xG yG𝟙A(y)𝟙xB(y) = xG yG𝟙A(y)𝟙B(x y) = |A||B|

(As usual, 𝟙A here means the indicator function).

In particular, if |A+A| K|A|, then

E (A) = E(A,A) |A|4 |A+A| |A|3 K .

The converse is not true.

Example 1.18. Let G be your favourite (class of) abelian group(s). Then there exist constants 𝜃,η > 0 such that for all sufficiently large n, there exists A G, with |A| n satisfying E(A) η|A|3 and |A+A| 𝜃|A|2.

Theorem 1.19 (Balog–Szemeredi–Gowers, Schoen). Assuming that:

  • A G is finite

  • E(A) η|A|3 for some η > 0

Then there exists A A of size at least c1(η)|A| such that |A+A| |A| c 2(η), where c1(η) and c2(η) are polynomial in η.

Idea: Find A A such that a,b A such that a b has many representations as (a1 a2) + (a3 a4) with ai A.

We first prove a technical lemma, using a technique called “dependent random choice”.

Definition 1.20 (gamma-popular differences). Given A G and γ > 0, let

Pγ = {x G : |A (x + A)| γ|A|}

be the set of γ-popular differences of A.

Lemma 1.21. Assuming that:

  • A G is finite

  • E(A) η|A|3

  • c > 0

Then there is a subset X A of size |X| η|A|3 such that for all but a (16c)-proportion of pairs (a,b) X2, a b Pcη.

Proof. Let U = {x G : |A (x+A)| 1 2η|A|}. Then

xU|A (x+A)|2 = 1 2η|A| x|A (x+A)| = 1 2η|A|3 = 1 2E(A)

For 0 i log2η1, let

Qi = {x G : |A| 2i+1 < |A (x+A)| |A| 2i } ,

and set δi = η122i. Then

iδi|Qi| = i |Qi| η22i = 1 η|A|2 i |A|2 22i |Qi| = 1 η|A|2 i |A|2 22i xU𝟙{ |A| 2i+1 < |A (x+A)| |A| 2i } 1 η|A|2 xU|A (x+A)|2 1 η|A|2 1 2E(A) ( xU|A (x+A)|2 1 2E(A)) = 1 2|A| ()

Let S = {(a,b) A2 : a bPcη}. Then

i (a,b)S|(Aa) (Ab) Qi| (a,b)S |(Aa) (Ab)|=|A (a b + A)| cη|A| by definition of S |S| cη|A| cη|A|3 2cη|A|2 1 2|A| ()2cη|A|2 iδi|Qi|

Hence there exists i0 such that

(a,b)S|(Aa) (Ab) Qi0| 2cη|A|2δ i0|Qi0|.

Let Q = Qi0, δ = δi0, λ = 2i0. So

(a,b)S|(Aa) (Ab) Q| 2cηδ|A|2|Q|. (∗∗)

Find x such that X = |A (A+x)| is large.

Given x G, let X(x) = A (x + A). Then

𝔼xQ|X(x)| = 1 |Q| xQ|A (x + A)| 1 2λ|A|.

Let T(x) = {(a,b) X(x)2 : a bPcη}. Then

𝔼XQ|T(x)| = 𝔼xQ|{(a,b) (A (xxAaAb + A))2 : a bP cη}| = 1 |Q| xQ|{(a,b) S : x A a A b}| = 1 |Q| (a,b)S|(A a) (A b) Q| 1 |Q|2cη|A|2δ|Q| = 2cηδ|A|2 = 2cλ2|A|2

Therefore,

𝔼xQ|X(x)|2 (16c)1|T(x)| C-S (𝔼xQ|X(x)|)2 (16c)1𝔼 xQ|T(x)| (λ 2 )2|A|2 (16c)12cλ2|A|2 = (λ2 4 λ2 8 )|A|2 = λ2 8 |A|

So there exists x Q such that |X(x)|2 λ2 8|A|2, in which case we have

|X| λ 8|A| η 3|A|

and |T(x)| 16c|X|2.

Proof of Theorem 1.19. Given A G with E(A) η|A|3, apply Lemma 1.21 with c = 27 to otain X A of size |X| η 3 |A| such that for all but 1 8 of pairs (a,b) X2, a b Pη27. In particular, the bipartite graph

G = (X˙X,{(x,y) X × X : x y Pη27})

has at least 7 8|X|2 edges. Let A = {x X : deg (x) 3 4|X|}.

PIC

Clearly, |A| |X| 8 . For any a,b A, there are at least |X| 2 elements y X such that (a,y),(b,y) E(G) (a y,b y Pη27).

Thus a b = (a y) (b y) has at least

η 6|A|choices for y η 27|A| η 27|A| η3 217|A|3

representations of the form a1 a2 (a3 a4) with ai A.

It follows that

η3 217|A|3|AA| |A|4 |AA| 217η3|A| 222η4|A|

Thus |A + A| 244η8|A|.

2 Fourier-analytic techniques

In this chapter we will assume that G is finite abelian.

G comes equipped with a group Ĝ of characters, i.e. homomorphisms γ : G . In fact, Ĝ is isomorphic to G.

See Representation Theory notes for more information about characters and proofs of this as well as some of the facts below.

Example 2.1.

Notation. Given B G nonempty, and any function g : B , let

𝔼xBg(x) = 1 |B| xBg(x).

Lemma 2.2. Assuming that:

  • γ G^

Then
𝔼xG γ(x) = { 1 if γ = 1 0otherwise ,

and for all x G,

γG^γ(x) = { |G^| if x = 0 0 otherwise .

Proof. The first equality in eqch case is trivial. Suppose γ1. Then there exists y G with γ(y)1. Then

γ(y)𝔼zGγ(z) = 𝔼zGγ(y + z) = 𝔼zGγ(z)

So 𝔼zGγ(z) = 0.

For the second part, note that given x0, there must by γ G^ such that γ(x)1, for otherwise G^ would act trivially on x, hence would also be the dual group for Gx, a contradiction.

Definition 2.3 (Fourier transform). Given f : G , define its Fourier transform f^ : G^ by

f^ (γ) = 𝔼xGf(x)γ(x)¯.

It is easy to verify the inversion formula: for all x G,

f(x) = γG^f^(γ)γ(x).

Indeed,

γG^f^(γ)γ(x) = γG^𝔼yGf(y)γ(y)¯γ(x) = 𝔼yGf(y) γG^γ(x y)=|G| iff x = y = f(x) by Lemma 2.2

Given A G, the indicator or characteristic function of A, 𝟙A : G {0,1} is defined as usual.

Note that

𝟙A^ (1) = 𝔼xG𝟙A(x)1(x) = |A| |G|.

The density of A in G (often denoted by α).

Definition (Characteristic measure). Given non-empty A G, the characteristic measure μA : G [0,|G|] is defined by μA(x) = α1𝟙A(x).

Note that 𝔼xGμA(x) = 1 = μA^(1).

Definition (Balanced function). The balanced function fA : G [1,1] is given by fA(x) = 𝟙A(x) α. Note that 𝔼xGfA(x) = 0 = fA^(1).

Example 2.4. Let V 𝔽pn be a subspace. Then for t 𝔽pn^, we have

𝟙V ^(t) = 𝔼x𝔽pn𝟙V (x)e (x t p ) = |V | pn 𝟙V (t)

where V = {t 𝔽pn^ : x t = 0 x V } is the annihilator of V . In other words, 𝟙V ^(t) = μV (t).

Example 2.5. Let R G be such that each x G lies in R independently with probability 1 2. Then with high probability

sup γ1|𝟙R^(γ)| = O ( log |G| |G| ).

This follows from Chernoff’s inequality: Given -valued independent random variables X1,X2,,Xn with mean 0, then for all 𝜃 > 0, we have

(| i=1nX i| 𝜃 i=1n Xi L()2) 4exp (𝜃2 4 ).

Example 2.6. Let Q = {x 𝔽pn : x x = 0} 𝔽pn with p > 2. Then

|Q| pn = 1 p + O(pn 2 )

and sup t0|𝟙Q^(t)| = O(pn 2 ).

Given f,g : G , we write

f,g = 𝔼xGf(x)g(x)¯andf^,g^ = γG^f^(γ)g^(γ)¯.

Consequently,

f L2(G)2 = 𝔼 xG|f(x)|2and f^ l2(G^)2 = γG^|f^(γ)|2.

Lemma 2.7. Assuming that:

  • f,g : G

Then
  • (i) f L2(G)2 = f^l2(G^)2 (Parseval’s identity)
  • (ii) f,g = f^,g^ (Plancherel’s identity)

Proof. Exercise (hopefully easy).

Definition 2.8 (Spectrum). Let 1 ρ > 0 and f : G . Define the ρ-large spectrum of f to be

Spec ρ(f) = {γ G^ : |f^(γ)| ρ f 1}.

Example 2.9. By Example 2.4, if f = 𝟙V with V 𝔽pn, then ρ > 0,

Spec ρ (𝟙V ) = {t 𝔽pn^ : |𝟙V ^(t)| ρ|V | pn } = V .

Lemma 2.10. Assuming that:

  • ρ > 0

Then
|Spec ρ(f)| ρ2 f 22 f 12.

Proof. By Parseval’s identity,

f 22 = f^22 = γG^|f^(γ)|2 γSpec ρ(f)|f^(γ)|2 |Spec ρ(f)|(ρ f 1)2

In particular, if f = 𝟙A for A G, then

f 1 = α = |A| |G| = f 22,

so |Spec ρ(𝟙A)| ρ2α1.

Definition 2.11 (Convolution). Given f,g : G , we define their convolution f g : G by

f g(x) = 𝔼yGf(y)g(x y)x G.

Example 2.12. Given A,B G,

𝟙 A𝟙B(x) = 𝔼yG𝟙A(y)𝟙B(x y) = 𝔼yG𝟙A(y)𝟙x B(y) = |A (x B)| |G| = 1 |G|rA+B(x).

In particular, supp (𝟙A𝟙B) = A+B.

Lemma 2.13. Assuming that:

  • f,g : G

Then
fg^ (γ) = f^(γ)g^(γ)γ G^.

Proof.

fg^(γ) = 𝔼xGfg(x)γ(x)¯ = 𝔼xG𝔼[y]Gf(y)g(x y u)γ(x)¯ = 𝔼uG𝔼[y]Gf(y)g(u)γ(u + y)¯ = f^(γ)g^(γ)

Example 2.14.

𝔼x+y=z+wf(x)f(y)f(z)f(w)¯ = f^l4(G^)4.

In particular,

𝟙A^l4(G^)4 = E(A) |G|3

for any A G.

Theorem 2.15 (Bogolyubov’s lemma). Assuming that:

  • A 𝔽pn be a set of density α

Then there exists V 𝔽pn of codimension 2α2 such that V A+AAA.

Proof. Observe

2A 2A = supp (𝟙A𝟙A𝟙A𝟙A=:g),

so wish to find V 𝔽pn such that g(x) > 0 for all x V . Let S = Spec ρ(𝟙A) with ρ = α 2 and let V = S. By Lemma 2.10, codim (V ) |S| ρ2α1. Fix x V .

g(x) = t𝔽pn^g^(t)e(x tp) = t𝔽pn^|𝟙A^(t)|4e(x tp) by Lemma 2.13 = α4 + t0|𝟙A^(t)|4e(x tp) = α4 + tS{0}|𝟙A^(t)|4e(x tp) (1) + tS|𝟙A^(t)|4e(x tp) (2)

Note (1) (ρα)4 since x t = 0 for all t S and

|(2)| sup tS|𝟙A^(t)|2 tS|𝟙A^|2 sup tS|𝟙A^(t)|2 tS|𝟙A^|2 (ρα)2𝟙A 22 by Parseval’s identity = ρ2α3

hence g(x) > 0 (in fact, α4 2) for all x V and codim (V ) 2α2.

Example 2.16. The set A = {x 𝔽2n : |x| n 2 + n 2 } (where |x| counts the number of 1s in x) has density 1 8, but there is no coset C of any subspace of codimension n such that C A+A(= AA).

Lemma 2.17. Assuming that:

  • A 𝔽pn of density α

  • ρ > 0

  • sup t0|𝟙A^(t)| ρα

Then there exists V 𝔽pn of codimension 1 and x 𝔽pn such that
|A (x + V )| α (1 + ρ 2 )|V |.

PIC

Proof. Let t0 be such that |𝟙A^(t)| ρα, and let V = t. Write vj + V for j [p] = {1,2,,p} for the p distinct cosets vj + V = {x 𝔽pn : x t = j} of V . Then

𝟙A^(t) = fA^(t) = 𝔼x𝔽pn(𝟙A(x) α)e(x tp) = 𝔼j[p]𝔼xvj+V (𝟙A(x) α)e(jp) = 𝔼j[p] ( |A (vj + V )| |vj + V | α=aj) e(jp)

By triangle inequality, 𝔼j[p]|aj| ρα. But note that 𝔼j[p]aj = 0 so 𝔼j[p]aj + |aj| ρα, hence there exists j [p] such that aj + |aj| ρα. Then aj ρα 2 .

Notation. Given f,g,h : G , write

T3(f,g,h) = 𝔼x,dGf(x)g(x + d)h(x + 2d).

Notation. Given A G, write

2 A = {2a : a A},

to be distinguished from 2A = A+A = {a + a : a,a A}.

Lemma 2.18. Assuming that:

  • p 3 prime

  • A 𝔽pn of density α > 0

  • sup t0|𝟙A^(t)| 𝜀

Then the number of 3-term arithmetic progressions in A differs from α3(pn)2 by at most 𝜀(pn)2.

Proof. The number of 3-term arithmetic progressions in A is (pn)2 times

T3(𝟙A,𝟙A,𝟙A) = 𝔼x,d𝔽pn𝟙A(x)𝟙(x + d)𝟙A(x + 2d) = 𝔼x,y𝔽pn𝟙A(x)𝟙A(y)𝟙A(2y x) = 𝔼yG𝟙A(y)𝔼xG𝟙A(x)𝟙A(2y x) = 𝔼yG𝟙A(y)𝟙A𝟙A(2y) = 𝟙2A,𝟙A𝟙A

By Plancherel’s identity and Lemma 2.13, we have

= 𝟙2A^,𝟙A^2 = t𝟙2A^(t)𝟙A^(t)2¯ = α3 + t0𝟙2A^(t)𝟙A^(t)2¯

but

| t0𝟙2A^(t)𝟙A^(t)2| sup t0|𝟙A^(t)| t0|𝟙2A^(t)||𝟙A^(t)| CSsup t0|𝟙A^(t)|( t|𝟙2A^(t)|2 t|𝟙A^(t)|2) 1 2 𝜀𝟙2A^2𝟙A^2 = 𝜀 α

by Parseval’s identity.

Theorem 2.19 (Meshulam’s Theorem). Assuming that:

  • A 𝔽pn a set containing no non-trivial 3 term arithmetic progressions

Then |A| = O ( pn log pn ).

Proof. By assumption,

T 3 (𝟙A,𝟙A,𝟙A) = |A| (pn)2 = α pn.

But as in (the proof of) Lemma 2.18,

|T3(𝟙A,𝟙A,𝟙A) α3| sup t0|𝟙A^(t)| α,

so provided pn 2α2, i.e. T3(𝟙A,𝟙A,𝟙A) α3 2 we have sup t0|𝟙A^(t)| α2 2.

So by Lemma 2.17 with ρ = α 2 , there exists V 𝔽pn of codimension 1 and x 𝔽pn such that |A (x + V )| (α + α2 4 ) |V |.

We iterate this observation: let A0 = A, V 0 = 𝔽pn, α0 = |A0| |V 0|. At the i-th step, we are given a set Ai1 V i1 of density αi1 with no non-trivial 3 term arithmetic progressions. Provided that pdim (V i1) 2αi12, there exists V i V i1 of codimension 1, xi V i1 such that

|(Axi) V i| (αi1 + (αi1)2 4 )|V i|.

Set Ai = (Axi) V i V i, has density αi1 + (αi1)2 4, and is free of non-trivial 3 term arithmetic progressions.

Through this iteration, the density increases from α to 2α in at most α (α2 4) = 4 α1 steps.

2α to 4α in at most 2α ((2α)2 4) = 2α1 steps and so on.

So reaches 1 in at most

4α1 (1 + 1 2 + 1 4 + 1 8 + ) 8α1

steps. The argument must end with dim (V i) n 8α1, at which point we must have had pdim (V i) < 2αi12 2α2, or else we could have continued.

But we may assume that α 2pn 4 (or α2 < 2pn 2 ) whence pn8α1 pn 2 , or n 2 2α1.

At the time of writing, the largest known subset of 𝔽3n containing no non-trivial 3 term arithmetic progressions has size (2.2202)n.

We will prove an upper bound of the form (2.756)n.

Theorem 2.20 (Roth’s theorem). Assuming that:

  • A [N] = {1,,N}

  • A contains no non-trivial 3 term arithmetic progressions

Then |A| = O ( N loglogN ).

Example 2.21 (Behrend’s example). There exists A [N] of size at least |A| exp (c log N)N containing no non-trivial 3 term arithmetic progressions.

Lemma 2.22. Assuming that:

  • A [N] of density α > 0

  • N > 50α2

  • A contains no non-trivial 3 term arithmetic progressions

  • p a prime in [N 3 , 2N 3 ]

  • let A = A [p] p

Then one of the following holds:
  • (i) sup t0|𝟙A^(t)| α2 10 (where the Fourier coefficient is computed in p)
  • (ii) There exists an interval J [N] of length N 3 such that |A J| α (1 + α 400 ) |J|

Proof. We may assume that |A| = |A [p]| α (1 α 200 ) p since otherwise

|A [p + 1,N]| αN (α (1 α 200 )p) = α(N p) + α2 200p (α + α2 400 )(N p)

so we would be in Case (ii) with J = [p + 1,N]. Let A = A[p 3, 2p 3 ]. Note that all 3 term arithmetic progressions of the form (x,x + d,x + 2d) A× A × A are in fact arithmetic progressions in [N].

If |A[p 3 ] | or |A[2p 3 ,p]| were at least 2 5|A|, we would again be in case (ii). So we may assume that |A| |A| 5.

Now as in Lemma 2.18 and Theorem 2.19,

α p = |A| p2 T3(𝟙A,𝟙A,𝟙A) = α(α)2 + t𝟙A^(t)𝟙A^(t)¯𝟙2A^(t)

where α = |A| p and α = |A| p . So as before,

αα 2 sup t0|𝟙A(t)| α ,

provided that α p 1 2α(α)2, i.e. 2 p αα. (Check this is satisfied).

Hence

sup t0|𝟙A^(t)| αα 2 1 2 (α (1 α 200 ))2 2 5 α2 10.

Lemma 2.23. Assuming that:

  • m

  • φ : [m] p be given by xtx for some t0

  • 𝜀 > 0

Then there exists a partition of [m] into progressions Pi of length li [𝜀m 2,𝜀m] such that
diam (φ(Pi)) = max x,yPi|φ(x) φ(y)| 𝜀p

for all i.

Proof. Let u = m and consider 0,t,2t,,ut. By Pigeonhole, there exists 0 v < w usuch that |wt vt| = |(w v)t| p u. Set s = w v, so |st| p u. Divide [m] into residue classes modulo s, each of which has size at least m s m 4 . But each residue class can be divided into arithmetic progressions of the form a,a + s,,a + ds with 𝜀u 2 < d 𝜀u. The diameter of the image of each progression under φ is |dst| dp u 𝜀up u = 𝜀p.

Lemma 2.24. Assuming that:

  • A [N] of density α > 0

  • p a prime in [N 3 , 2N 3 ]

  • let A = A [p] p

  • |𝟙A^(t)| α2 20 for some t0

Then there exists a progression P [N] of length at least α2 N 500 such that |A P| α (1 + α 80 ) |P|.

Proof. Let 𝜀 = α2 40π, and use Lemma 2.23 to partition [p] into progressions Pi of length

𝜀 p 2 α2 40π N 3 2 α2 N 500

and diam (φ(Pi)) 𝜀p. Fix one xi from each of the Pi. Then

α2 20 |fA^(t)| = |1 p i xPifA(x)e(xtp)| = 1 p | i xPifA(x)e(xitp) + i xPifA(x)(e(xtp) e(xitp))| 1 p i | xPifA(x)| + 1 p i xPi|fA(x)||e(xtp) e(xitp) 2π𝜀 since |t(x xi)| 𝜀p |

So

i | xPifA(x)| α2 40p.

Since fA has mean zero,

i (| xPifA(x)| + xPifA(x)) α2 40p,

hence there exists i such that

| xPifA(x) | + xPifA(x) α2 80|Pi|

and so

xPifA(x) α2 160|Pi|.

Definition 2.25 (Bohr set). Let Γ G^ and ρ > 0. By the Bohr set B(Γ,ρ) we mean the set

B(Γ,ρ) = {x G : |γ(x) 1| < ρ γ Γ}.

We call |Γ| the rank of B(Γ,ρ), and ρ its width or radius.

Example 2.26. When G = 𝔽pn, then B(Γ,ρ) = Γ for all sufficiently small ρ.

Lemma 2.27. Assuming that:

  • Γ G^ of size d

  • ρ > 0

Then
|B(Γ,ρ)| (ρ 8 )d|G|.

Proposition 2.28 (Bogolyubov in a general finite abelian group). Assuming that:

  • A G of density α > 0

Then there exists Γ G^ of size at most 2α2 such that A + A A A B(Γ,ρ).

Proof. Recall 𝟙A𝟙A𝟙A𝟙A(x) = γG^|𝟙A^(γ)|4γ(x).

Let Γ Spec α 2 (𝟙A), and note that, for x B (Γ, 1 2 ) and γ Γ, Re (γ(x)) > 0. Hence, for x B (Γ, 1 2 ),

Re γG^|𝟙A^(γ)|4γ(x) = Re γΓ|𝟙A^(γ)|4γ(x) α4 + Re γΓ|𝟙A^(γ)|4γ(x)

and

|Re γΓ|𝟙A^(γ)|4γ(x) | sup γΓ|𝟙A^(γ)|2 γΓ|𝟙A^(γ)|2 ( α 2 α)2 α = α4 2 .

3 Probabilistic Tools

All probability spaces in this course will be finite.

Theorem 3.1 (Khintchine’s inequality). Assuming that:

  • p [2,)

  • X1,X2,,Xn independent random variables

  • (Xi = xi) = 1 2 = (Xi = xi)

Then
i=1nX i Lp() = O (p1 2 ( i=1nX iL2()2) 1 2 ) .

Proof. By nesting of norms, it suffices to prove the case p = 2k for some k . Write X = i=1nXi, and assume i=1nXiL()2 = 1. Note that in fact i=1nXiL2()2 = i=1nXiL()2, hence i=1nXiL2()2 = 1.

By Chernoff’s inequality (Example 2.5), for all 𝜃 > 0 we have

(|X| 𝜃) 4exp (𝜃2 4 ),

and so using the fact that (|X| t) = 0tρX(s)ds we have

XL2k()2k =0t2kρ X(t)dt =02kt2k1(|X| t)dt integration by parts 08kt2k1 exp (t2 4 )dt=:I(K)

We shall show by induction on k that I(K) 22k(2k)k 4k. Indeed, when k = 1,

0texp (t2 4 )dt = [2exp (t2 4 )]0 = 2 2.

For k > 1, integrate by parts to find that

I(K) =0t2k2 u texp (t2 4 )vdt = [t2k2 (2exp (t2 4 ))]00(2k 2)t2k3 (2exp (t2 4 ))dt = 4(k 1)0t2(k1)1 exp (t2 4 )dt = 4(k 1)I(K 1) 4(k 1)22(k1)(2(k 1))k1 4(k 1) 22k(2k)k 4k

Corollary 3.2 (Rudin’s Inequality). Let Γ 𝔽2n^ be a linearly independent set and let p [2,). Then for any f^ l2(Γ),

γΓf^(γ)γ LP(𝔽2n) = O(pf^ l2(Γ)).

Corollary 3.3. Let Γ 𝔽2n^ be a linearly independent set and let p (1,2]. Then for all f Lp(𝔽2n),

f^l2(Γ) = O (p p 1fLp(𝔽2n)) .

Proof. Let f Lp(𝔽2n) and write g = γΓf^(γ)γ. Then

f^l2(Γ)2 = γΓ|f^(γ)|2 = f^,g^l2(𝔽2n^) = f,gL2(𝔽2n) by Plancherel’s identity

which is bounded above by fLp(𝔽2n)gLp(𝔽 2n) where 1 p + 1 p = 1, using Hölder’s inequality.

By Rudin’s Inequality,

gLp(𝔽 2n) = O ( pg^l2(Γ)) = O (p p 1f^l2(Γ)) .

Recall that given A 𝔽2n of density α > 0, we had |Spec ρ(𝟙A) ρ2α1. This is best possible as the example of a subspace shows. However, in this case the large spectrum is highly structured.

Theorem 3.4 (Special case of Chang’s Theorem). Assuming that:

  • A 𝔽2n of density α > 0

  • ρ > 0

Then there exists H 𝔽2n^ of dimension O(ρ2 logα1) such that H Spec ρ(𝟙A).

Proof. Let Γ Spec ρ(𝟙A) be a maximal linearly independent set. Let H = Spec ρ(𝟙A). Clearly dim (H) = |Γ|. By Corollary 3.3, for all p (1,2],

(ρα)2|Γ| γΓ|𝟙A^(γ)|2 = 𝟙A^ l2(Γ)2 = O ( p p 1𝟙ALp(𝔽2n)2) ,

so

|Γ| = O (ρ2α2α2p p p 1 ).

Set p = 1 + (logα1)1 to get |Γ| = O(ρ2α2(α2 e2)(logα1 + 1)).

Definition 3.5 (Dissociated). Let G be a finite abelian group. We say S G is dissociated if sS𝜀ss = 0 for 𝜀 {1,0,1}|S|, then 𝜀 0.

Clearly, if G = 𝔽2n, then S G is dissociated if and only if it is linearly independent.

Theorem 3.6 (Chang’s Theorem). Assuming that:

  • G a finite abelian group

  • A G be of density α > 0

  • Λ Spec ρ(𝟙A) is dissociated

Then |Λ| = O(ρ2 logα1.

We may bootstrap Khintchine’s inequality to obtain the following:

Theorem 3.7 (Marcinkiewicz-Zygmund). Assuming that:

  • p [2,)

  • X1,X2,,Xn Łp() independent random variables

  • 𝔼 i=1nXi = 0

Then
i=1nX i Lp() = O (p1 2 i=1n|X i|2 Lp2() 1 2 ) .

Proof. First assume the distribution of the Xi’s is symmetric, i.e. (Xi = a) = (Xi = a) for all a . Partition the probability space Ω into sets Ω1,Ω2,,ΩM, write j for the induced measure on Ωj such that all Xi’s are symmetric and take at most 2 values. By Khintchine’s inequality, for each j [M],

i=1nX i Lp(j)p = O (pp2 ( i=1nX iL2(j)2) p2) = O (pp2 i=1n|X i|2 Lp2(j)p2)

so summing over all j and taking p-th roots gives the symmetric case. Now suppose the Xi’s are arbitrary, and let Y 1,,Y n be such that Y i Xi and X1,X2,,Xn,Y 1,Y 2,,Y n are all independent. Applying the symmetric case to Xi Y i,

i=1n(X i Y i)Lp(×) = O (p1 2 i=1n|X i Y i|2 Lp2(×) 1 2 ) = O (p1 2 i=1n|X i Y i|2 Lp2() 1 2 )

But then

i=1nX i Lp() = i=1nX i 𝔼Y i=1nY i =0 Lp()p = 𝔼X | Xi 𝔼Y Y i| p = 𝔼X |𝔼Y (Xi Y i)|p 𝔼X𝔼Y | (Xi Y i)|p by Jensen say = (Xi Y i)Lp(×)p

concluding the proof.

Theorem 3.8 (Croot-Sisask almost periodicity). Assuming that:

  • G a finite abelian group

  • 𝜀 > 0

  • p [2,)

  • A,B G are such that |A + B| K|A|

  • f : G

Then there exists b B and a set X B b such that |X| 21KO(𝜀2p) |B| and
τxfμA fμALp(G) 𝜀fLp(G)x X,

where τxg(y) = g(y + x) for all y G, and as a reminder, μA is the characteristic measure of A.

Proof. The main idea is to approximate

fμA(y) = 𝔼xf(y x)μA(x) = 𝔼xAf(y x)

by 1 m i=1mf(y zi), where zi are sampled independently and uniformly from A, and m is to be chosen later.

For each y G, define Zi(y) = τzif(y) fμA(y). For each y G, these are independent random variables with mean 0, so by Marcinkiewicz-Zygmund,

i=1mZ i(y)Lp()p = O (pp2 i=1m|Z i(y)|2 Lp2()p2) = O (pp2𝔼 (z1,,zm)Am | i=1m|Z i(y)|2| p2)

By Hölder with 1 p + 2 p = 1, we get

| i=1m|Z i(y)|2| p2 ( i=1m1p ) 1 pp 2 ( i=1m|Z i(y)|2p2) 2 pp 2 ( i=1m1p ) p 2 1 ( i=1m|Z i(y)|2p2) 2 pp 2 = mp21 i=1m|Z i(y)|p

so

i=1mZ i(y) Lp()p = O (pp2mp21𝔼 (z1,,zm)Am i=1m|Z i(y)|p) .

Summing over all y G, we have

𝔼yG i=1mZ i(y) Lp()p = O (pp2mp21𝔼 (z1,,zm)Am i=1m𝔼 yG|Zi(y)|p)

with

(𝔼yG|Zi(y)|p) 1 p = ZiLp(G) = τzif fμALp(G) ,τzifLp(G) + fμALp(G) fLp(G) + fLq(G)μAL1(G) 2fLp(G)

by Young / Hölder (fgLr(G) fLp(G)gLq(G) where 1 + 1 r = 1 p + 1 q).

So we have

𝔼(z1,,zm)Am𝔼yG | i=1mZ i(y)|p = O (pp2mp21 i=1m(2f Lp(G))p) = O((4p)p2mp2f Lp(G)p).

Choose m = O(𝜀2p) so that the RHS is at most (𝜀 4fLp(G))p. whence

𝔼(z1,,zm)Am 𝔼yG | 1 m i=1mτ zif(y) fμA(y)|p =() = O((4p)p2mp2f Lp(G)p) = (𝜀 4fLp(G)) p.

Write

L = {z = (z1,,zm) Am : () (𝜀 2fLp(G)) p} .

By Markov inequality, since

𝔼() (𝜀 4fLp(G)) p = 2p (𝜀 2fLp(G)) p,

we have

|Am L| |Am| = (() (𝜀 2fLp(G)) p) (() 2p𝔼()) 2p

so |L| (1 1 2p ) |A|m 1 2|A|m. Let

D = {(b,b,,b)m : b B}.

Now L+D (A+B)m, whence

|L+D||A+B|m Km|A|m 2Km|L|.

By Lemma 1.17,

E (L,D) |L|2|D|2 |L+D| 1 2Km|D|2|L|

so there are at least |D|2 2Km pairs (d1,d2) D × D such that rLL(d2 d1) > 0. In particular, there exists b ub and X B b of size |X| |D| 2Km = |B| 2Km such that for all x X, there exists l2(x) L such that for all i [m], l1(x)i l2(x)i = x. But then for each x X, by the triangle inequality,

τxfμA fμALp(G) τxfμA τx ( 1 m i=1mτ l2(x)if )Lp(G)    + τx ( 1 m i=1mτ l2(xi)f ) fμA Lp(G) = fμA 1 m i=1mτ l2(x)if Lp(G)    + 1 m i=1mτ xl2(x)if fμA Lp(G) 2 𝜀 2fLp(G)

by definion of L.

Theorem 3.9 (Bogolyubov again, after Sanders). Assuming that:

  • A 𝔽pn of density α > 0

Then there exists a subspace V 𝔽pn of codimension O(log4α1) such tht V A+AAA.

Almost periodicity is also a key ingredient in recent work of Kelley and Meka, showing that any A [N] containing no non-trivial 3 term arithmetic progressions has size |A| exp (Clog 1 11 N)N.

4 Further Topics

In 𝔽pn, we can do much better.

Theorem 4.1 (Ellenberg-Gijswijt, following Croot-Lev-Pach). Assuming that:

  • A 𝔽3n contains no non-trivial 3 term arithmetic progressions

Then |A| = o(2.756)n.

Notation. Let Mn be the set of monomials in x1,,x2 whose degree in each variable is at most 2. Let V n be the vector space over 𝔽3 whose basis is Mn. For any d [0,2n], write Mnd for the set of monomials in Mn of (total) degree at most d, and V nd for the corresponding vector space. Set md = dim (V nd) = |Mnd|.

Lemma 4.2. Assuming that:

  • A 𝔽3n

  • P V nd is a polynomial

  • P(a + a) = 0 for all aa A

Then
|{a A : P(2a)0}| 2md2.

Proof. Every P V nd can be written as a linear combination of monomials in Mnd, so

P(x+y) = m,mM nd deg (mm)d cm,mm(x)m(y)

for some coefficients cm,m. Clearly at least one of m,m must have degree d 2, whence

P(x + y) = mMnd2m(x)Fm(y) + mMnd2m(y)G m(x),

for some families of polynomials (Fm)mMnd2, (Gm)mMnd2.

Viewing (P(x + y))x,yA as a |A|×|A|-matrix C, we see that C can be written as the sum of at most 2md2 matrices, each of which has rank 1. Thus rank (C) 2md2. But by assumption, C is a diagonal matrix whose rank equals |{a A : P(a + a)0}|.

Proposition 4.3. Assuming that:

  • A 𝔽3n a set containing no non-trivial 3 term arithmetic progressions

Then |A| 3m2n3.

Proof. Let d [0,2n] be an integer to be determined later. Let W be the space of polynomials in V nd that vanish on (2A)c. We have

dim (W) dim (V nd) |(2A)c| = m d (3n |A|).

We claim that there exists P W such that |supp (P)| dim (W). Indeed, pick P W with maximal support. If |supp (P)| < dim (W), then there would be a non-zero polynomial Q W vanishing on supp (P), in which case supp (P + Q) supp (P), contradicting the choice of P.

PIC

Now by assumption,

{a + a : aa A} 2A = .

So any polynomial that vanishes on (2A)c vanishes on {a + a : aa A}. By Lemma 4.2 we now have that,

|A| (3n m d) = md (3n |A|) dim (W) |supp (P)| = |{x 𝔽3n : P(x)0}| = |{a A : P(2a)0}| 2md2

Hence |A| 3n md + 2md2. But the monomials in Mn Mnd are in bijection with the ones in M2nd via x1α1xnαnx12α1xn2αn, whence 3n md = m2nd. Thus setting d = 4n 3 , we have |A|m2n3 + 2m2n3 = 3m2n3.

You will prove Theorem 4.1 on Example Sheet 3.

We do not have at present a comparable bound for 4 term arithmetic progressions. Fourier techniques also fail.

Example 4.4. Recall from Lemma 2.18 that given A G,

|T3(𝟙A,𝟙A,𝟙A) α3| sup γ1|𝟙A^(γ)|.

But it is impossible to bound

T 4 (𝟙A,𝟙A,𝟙A,𝟙A) α4 = 𝔼 xd𝟙A(x)𝟙A(x + d)𝟙A(x + 2d)𝟙A(x + 3d) α4

by sup γ1|𝟙A^(γ)|. Indeed, consider Q = {x 𝔽pn : x x = 0}. By Problem 11(ii) on Sheet 1,

|Q| pn = 1 p + O(pn2)

and

sup t0|𝟙Q^(t)| = O(pn2).

But given a 3 term arithmetic progression x,x + d,x + 2d Q, by the identity

x2 3(x + d)2 + 3(x + 2d)2 (x + 3d)2 = 0x,d,

x + 3d automatically lies in Q, so

T 4 (𝟙A,𝟙A,𝟙A,𝟙A) = T3(𝟙A,𝟙A,𝟙A) = (1 p )3 + O(pn2)

which is not close to (1 p ) 4.

Definition 4.5 (U2-norm). Given f : G , define its U2-norm by the formula

fU2(G)4 = 𝔼 x,a,bGf(x)f(x + a)f(x + b)¯f(x + a + b).

Problem 1(i) on Sheet 2 showed that fU2(G) = f^l4(G^), so this is indeed a norm.

Problem 1(ii) asserted the following:

Lemma 4.6. Assuming that:

  • f1,f2,f3 : G

Then
|T3(f1,f2,f3)| min i[3]fiU2(G) jifjL(G).

Note that

sup γG^|f^(γ)|4 γG^|f^(γ)|4 sup γG^|f^(γ)|2 γG^|f^(γ)|2

and thus by Parseval’s identity,

fU2(G)4 = f^ l(G^)4 f^ l(G^)2f L2(G)2.

Hence

f^l(G^) f^l4(G^) = fU2(G) f^l(G^) 1 2 fL2(G) 1 2 .

Moreover, if f = fAA = 𝟙A α, then

T 3 (f,f,f) = T3(𝟙A α,𝟙A α,𝟙A α) = T3(𝟙A,𝟙A,𝟙A) α3.

We may therefore reformulate the first step in the proof of Meshulam’s Theorem as follows: if pn 2α2, then by Lemma 4.6,

α3 2 |α pn α3| = |T 3(fAA,fAA,fAA)|fAAU2(𝔽pn).

It remains to show that if fAAU2(𝔽pn) is non-trivial, then there exists a subspace V 𝔽pn of bounded codimension on which A has increased density.

Theorem 4.7 (U2 Inverse Theorem). Assuming that:

  • f : 𝔽pn

  • fL(𝔽pn) 1

  • δ > 1

  • fU2(𝔽pn) δ

Then there exists b 𝔽pn such that
|𝔼x𝔽pnf(x)e(x bp)| δ2.

In other words, |f,ϕ| δ2 for ϕ(x) = e(x bp) and we say “f correlates with a linear phase function”.

Proof. We have seen that

fU2(𝔽pn)2 f^ l(𝔽pn^)fL2(𝔽pn) f^l(𝔽pn^),

so

δ2 f^ l(𝔽pn^) = sup t𝔽pn^|𝔼xf(x)e(x tp)|.

PIC

Definition 4.8 (U3 norm). Given f : G , define its U3 norm by

fU3(G)8 := 𝔼x,a,b,cf(x)f(x + a)f(x + b)f(x + c)¯ f(x + a + b)f(x + b + c)f(x + a + c)f(x + a + b + c)¯ = 𝔼x,h1,h2,h3G 𝜀{0,1}3C|𝜀|f(x + 𝜀 h)

where Cg(x) = g(x)¯ and |𝜀| denotes the number of ones in 𝜀.

It is easy to verify that 𝔼cGΔcfU2(G)4 where Δcg(x) = g(x)g(x + c)¯.

Definition 4.9 (U3 inner product). Given functions f𝜀 : G for 𝜀 {0,1}3, define their U3 inner product by

(f𝜀)𝜀{0,1}3U3(G) = 𝔼x,h1,h2,h3G 𝜀{0,1}3C|𝜀|f 𝜀(x + 𝜀 h).

Observe that f,f,f,f,f,f,f,fU3(G) = fU3(G)8.

Lemma 4.10 (Gowers–Cauchy–Schwarz Inequality). Assuming that:

  • f𝜀 : G , 𝜀 {0,1}3

Then
|(f𝜀)𝜀{0,1}3U3(G) 𝜀{0,1}3f𝜀U3(G).

Setting f𝜀 = f for 𝜀 {0,1}2 ×{0} and f𝜀 = 1 otherwise, it follows that fU2(G)4 fU3(G)4 hence fU2(G) fU3(G).

Proposition 4.11. Assuming that:

  • f1,f2,f3,f4 : 𝔽5n

Then
T 4 (f1,f2,f3,f4) min i[4]fiU3(G) jifjL(𝔽5n).

Proof. We additionally assume f = f1 = f2 = f3 = f4 to make the proof easier to follow, but the same ideas are used for the general case. We additionally assume fL(𝔽5n) 1, by rescaling, since the inequality is homogeneous.

Reparametrising, we have

T4(f,f,f,f) = 𝔼a,b,c,d𝔽5nf(3a + 2b + c)f(2a + b d)f(a c 2d)f(b 2c 3d) |T4(f,f,f,f)|8 (𝔼a,b,c|𝔼df(2a + b d)f(a c 2d)f(b 2c 3d)|2)4 = (𝔼d,d𝔼a,bf(2a + b + d)f(2a + b d)¯ 𝔼cf(a c 2d)f(a c 2d)¯f(b 2c 3d)f(b 2c 3d)¯)4 (𝔼d,d𝔼a,b|𝔼cf(a c 2d)f(a c 2d)¯f(b 2c 3d)f(b 2c 3d)¯|2)2 = (𝔼c,c,d,d𝔼af(a c 2d)f(a c 2d)¯f(a c 2d)¯f(a c 2d) 𝔼bf(b 2c 3d)f(b 2c 3d)¯f(b 2c 3d)¯f(b 2c 3d))2 𝔼c,c,d,d,a|𝔼bf(b 2c 3d)f(b 2c 3d)¯f(b 2c 3d)¯f(b 2c 3d)|2 = 𝔼b,b,c,c,d,df(b 2c 3d)f(b 2c 3d)¯f(b 2c 3d)¯f(b 2c 3d) f(b 2c 3d)¯f(b 2c 3d)f(b 2c 3d)f(b 2c 3d)¯

Theorem 4.12 (Szemeredi’s Theorem for 4-APs). Assuming that:

  • A 𝔽5n a set containing no non-trivial 4 term arithmetic progressions

Then |A| = o(5n).

Idea: By Proposition 4.11 with f = fA = 𝟙A α,

T 4 (𝟙AfA+α,𝟙AfA+α,𝟙AfA+α,𝟙AfA+α) α4 = T 4(fA,fA,fA,fA) +

where consists of 14 other terms in which between one and three of the inputs are equal to fA.

These are controlled by

fAU2(𝔽5n) fAU3(G),

whence

|T4(𝟙A,𝟙A,𝟙A,𝟙A) α4| 15f AU3(G).

So if A contains no non-trivial 4 term arithmetic progressions and 5n > 2α3, then fAU3(G) α4 30.

What can we say about functions with large U3 norm?

Example 4.13. Let M be an n × n symmetric matrix with entries in 𝔽5. Then f(x) = e(xMx5) satisfies fU3(G) = 1.

Theorem 4.14 (U3 inverse theorem). Assuming that:

  • f : 𝔽5n

  • fL(𝔽5n) 1

  • fU3(G) δ for some δ > 0

Then there exists a symmetric n × n matrix M with entries in 𝔽5 and b 𝔽5n such that
|𝔼xf(x)e((xMx + bx)p)| c(δ)

where c(δ) is a polynomial in δ. In other words, |f,ϕ| c(δ) for ϕ(x) = e((xMx + bx)p) and we say “f correlates with a quadratic phase function”.

Proof (sketch). Let Δhf(x) denote f(x)f(x + h)¯.

fU3(G) = (𝔼hΔhfU24)1 8 .

STEP 1: If fU3(G)8 = 𝔼hΔhU24 δ8, then for at least a δ8 2-proportion of h 𝔽5n, δ8 2 ΔhfU24 Δhf^l2. So for each such h 𝔽5n, there exists th such that |Δhf^(th)|2 δ8 2.

Proposition 4.15. Assuming that:

  • f : 𝔽5n

  • f 1

  • fU3(G) δ

  • |𝔽5n| = Ωδ(1)

Then there exists S 𝔽5n with |S| = Ωδ(|𝔽5n|) and a function ϕ : S 𝔽5n^ such that
  • (i) |Δhf^(ϕ(h))| = Ωδ(1);
  • (ii) There are at least Ωδ(|𝔽5n|3) quadruples (s1,s2,s3,s4) S4 such that s1 + s2 = s3 + s4 and ϕ(s1) + ϕ(s2) + ϕ(s4).

STEP 2: If S and ϕ are as above, then there is a linear function ψ : 𝔽5n 𝔽5n^ which coincides with ϕ for many elements of S.

Proposition 4.16. Assuming that:

  • S and ϕ given as in Proposition 4.15

Then there exists n × n matrix M with entries in 𝔽5 and b 𝔽5n such that ψ(x) = Mx + b (ψ : 𝔽5n 𝔽5n^) satisfies ψ(x) = ϕ(x) for Ωδ(|𝔽5n|) elements x S.

Proof. Consider the graph of ϕ, Γ = {(h,ϕ(h)) : h S} 𝔽5n ×𝔽5n^. By Proposition 4.15, Γ has Ωδ(|𝔽5n|3) additive quadruples.

By Balog–Szemeredi–Gowers, Schoen, there exists ΓΓ with |Γ| = Ωδ(|Γ|) = Ωδ(|𝔽5n|) and |Γ + Γ| = Oδ(|Γ|). udefine S S by Γ = {(h,ϕ(h)) : h S} and note |S| = Ωδ(|𝔽5n|).

By Freiman-Ruzsa applied to Γ 𝔽5n ×𝔽5n^, there exists a subspace H 𝔽5n ×𝔽5n^ with |H| = Oδ(|Γ|) = Oδ(|𝔽5n|) such that Γ H.

Denote by π : 𝔽5n ×𝔽5n^ 𝔽5n the projection onto the first n coordinates. By construction, π(H) S. Moreover, since |S| = Ωδ(|𝔽5n|),

|ker (π|H)| = |H| |Im (π|H)| = Oδ(|𝔽5n|) |S| = Oδ(1).

We may thus partition H into Oδ(1) cosets of some subspace H such that π|H is injective on each coset. By averaging, there exists a coset x + H such that

|Γ (x + H)| = Ω δ(|Γ|) = Ω δ(|𝔽5n|).

Set Γ = Γ (x + H), and define S accordingly.

Now π|x+ is injective and surjective onto V := Im (π|x+H). This means there is an affine linear map ψ : V 𝔽5n^ such that (h,ψ(h)) Γ for all h S.

Then do steps 3 and 4.

What to do if you have lots of additive quadruples?

Balog-Szemeredi-Gowers

What to do if you have small doubling constant?

Freiman-Ruzsa (or Polynomial-Freiman-Ruzsa)

˙

Index

additive energy

additive quadruple

Bohr set

characteristic measure

Chernoff’s inequality

difference constant

dissociated

doubling constant

Parseval’s identity

Plancherel’s identity

Rusza distance

ρ-large spectrum of f

sumset