Combinatorial methods - Introduction to Additive Combinatorics

1 Combinatorial methods

Definition 1.1 (Sumset). Let $G$ be an abelian group. Given $A, B \subseteq G$ , define the sumset $A + B$ to be

A + B : = {a + b : a \in A, b \in B}

and the difference set $A - B$ to be

A - B : = {a - b : a \in A, b \in B} .

If $A$ and $B$ are finite, then certainly

\max {| A |, | B |} \leq | A + B | \leq | A | | B | .

Example 1.2. Let $A = [n] : = {1, 2, \dots, n} \subseteq ℤ$ . Then

| A + A | = | {2, \dots, 2 n} | = 2 n - 1 = 2 | A | - 1 .
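The definitions and Example 1.2 can be checked directly on small sets. The following is a minimal sketch; the helper names `sumset` and `difference_set` are illustrative and do not come from the notes.

```python
def sumset(A, B):
    """Return A + B = {a + b : a in A, b in B} for finite sets of integers."""
    return {a + b for a in A for b in B}


def difference_set(A, B):
    """Return A - B = {a - b : a in A, b in B} for finite sets of integers."""
    return {a - b for a in A for b in B}


n = 10
A = set(range(1, n + 1))  # A = [n] = {1, ..., n}
AA = sumset(A, A)

# A + A is the full interval {2, ..., 2n}, so |A + A| = 2|A| - 1.
assert AA == set(range(2, 2 * n + 1))
assert len(AA) == 2 * len(A) - 1

# The trivial bounds max(|A|, |B|) <= |A + B| <= |A||B| on another pair.
B = {0, 7, 100}
assert max(len(A), len(B)) <= len(sumset(A, B)) <= len(A) * len(B)
```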

Lemma 1.3. Assuming that:

$A \subseteq ℤ$ is finite.

Then

| A + A | \geq 2 | A | - 1,

with equality if and only if $A$ is an arithmetic progression.

Proof. Let $A = {a_{1}, a_{2}, \dots, a_{n}}$ with $a_{1} < a_{2} < \dots < a_{n}$ . Then

a_{1} + a_{1} < a_{1} + a_{2} < a_{1} + a_{3} < \dots < a_{1} + a_{n} < a_{2} + a_{n} < \dots < a_{n} + a_{n},

so $| A + A | \geq 2 | A | - 1$ . But we could also have written

a_{1} + a_{1} < a_{1} + a_{2} < a_{2} + a_{2} < a_{2} + a_{3} < a_{2} + a_{4} < \dots < a_{2} + a_{n} < a_{3} + a_{n} < \dots < a_{n} + a_{n} .

When $| A + A | = 2 | A | - 1$ , these two orderings must be the same. So $a_{2} + a_{i} = a_{1} + a_{i + 1}$ for all $i = 2, \dots, n - 1$ . □

Exercise: If $A, B \subseteq ℤ$ , then $| A + B | \geq | A | + | B | - 1$ with equality if and only if $A$ and $B$ are arithmetic progressions with the same common difference.

Example 1.4. Let $A, B \subseteq ℤ ∕ p ℤ$ with $p$ prime. Then $| A | + | B | \geq p + 1 ⟹ A + B = ℤ ∕ p ℤ$ . Indeed, $g \in A + B ⟺ A \cap (g - B) \neq \emptyset$ (note that $g - B$ means ${g} - B$ ). But $\forall g \in ℤ ∕ p ℤ$ ,

| A \cap (g - B) | = | A | + | g - B | - | A \cup (g - B) | \geq | A | + | B | - p \geq 1 .
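The pigeonhole argument of Example 1.4 can be confirmed by brute force for a small prime. This is a sketch only; `sumset_mod` and `covers_group_when_large` are hypothetical helper names, and the choice $p = 5$ is just to keep the enumeration small.

```python
from itertools import combinations


def sumset_mod(A, B, p):
    """Return A + B inside Z/pZ, with residues represented as 0, ..., p - 1."""
    return {(a + b) % p for a in A for b in B}


def covers_group_when_large(p):
    """Check that |A| + |B| >= p + 1 forces A + B = Z/pZ, over all nonempty A, B."""
    group = set(range(p))
    subsets = [set(c) for k in range(1, p + 1) for c in combinations(range(p), k)]
    for A in subsets:
        for B in subsets:
            if len(A) + len(B) >= p + 1 and sumset_mod(A, B, p) != group:
                return False
    return True


assert covers_group_when_large(5)
```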

Theorem 1.5 (Cauchy-Davenport). Assuming that:

$p$ is a prime
$A, B \subseteq ℤ ∕ p ℤ$ nonempty

Then

| A + B | \geq \min {p, | A | + | B | - 1} .

Proof. Assume $| A | + | B | \leq p + 1$ . Without loss of generality assume that $1 \leq | A | \leq | B |$ and that $0 \in A$ . Apply induction on $| A |$ . The case $| A | = 1$ is trivial. Suppose $| A | \geq 2$ , and let $0 \neq a \in A$ .

Since ${a, 2 a, 3 a, \dots, (p - 1) a, p a} = ℤ ∕ p ℤ$ and $| A | + | B | \leq p + 1$ , there must exist $m \geq 0$ such that $m a \in B$ but $(m + 1) a \notin B$ . Let $B^{'} = B - m a$ , so $0 \in B^{'}$ , $a \notin B^{'}$ , $| B^{'} | = | B |$ .

But $1 \leq | A \cap B^{'} | < | A |$ , so the inductive hypothesis applies to $A \cap B^{'}$ and $A \cup B^{'}$ . Since

(A \cap B^{'}) + (A \cup B^{'}) \subseteq A + B^{'},

we have

| A + B | = | A + B^{'} | \geq | (A \cap B^{'}) + (A \cup B^{'}) | \geq | A \cap B^{'} | + | A \cup B^{'} | - 1 = | A | + | B | - 1 . □

This fails for general abelian groups (or even general cyclic groups).
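Both the theorem and its failure in a cyclic group of composite order can be seen by exhaustive search. The sketch below uses hypothetical helper names and the small moduli 7 and 6 purely for illustration; in $ℤ ∕ 6 ℤ$ the subgroup ${0, 2, 4}$ is an explicit witness.

```python
from itertools import combinations


def sumset_mod(A, B, n):
    """Return A + B inside Z/nZ, with residues represented as 0, ..., n - 1."""
    return {(a + b) % n for a in A for b in B}


def nonempty_subsets(n):
    """List all nonempty subsets of Z/nZ."""
    return [set(c) for k in range(1, n + 1) for c in combinations(range(n), k)]


def cauchy_davenport_holds(n):
    """True if |A + B| >= min(n, |A| + |B| - 1) for all nonempty A, B in Z/nZ."""
    subsets = nonempty_subsets(n)
    return all(
        len(sumset_mod(A, B, n)) >= min(n, len(A) + len(B) - 1)
        for A in subsets
        for B in subsets
    )


assert cauchy_davenport_holds(7)      # prime modulus: the bound holds
assert not cauchy_davenport_holds(6)  # composite modulus: the bound fails

# Explicit witness in Z/6Z: H = {0, 2, 4} satisfies H + H = H,
# so |H + H| = 3 < min(6, |H| + |H| - 1) = 5.
H = {0, 2, 4}
assert sumset_mod(H, H, 6) == H
```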

Example 1.6. Let $p$ be (fixed, small) prime, and let $V \leq 𝔽_{p}^{n}$ be a subspace. Then $V + V = V$ , so $| V + V | = | V |$ . In fact, if $A \subseteq 𝔽_{p}^{n}$ is such that $| A + A | = | A |$ , then $A$ must be a coset of a subspace.

Example 1.7. Let $A \subseteq 𝔽_{p}^{n}$ be such that $| A + A | < \frac{3}{2} | A |$ . Then there exists $V \leq 𝔽_{p}^{n}$ a subspace such that $| V | < \frac{3}{2} | A |$ and $A$ is contained in a coset of $V$ . See Example Sheet 1.

Definition 1.8 (Ruzsa distance). Given finite sets $A, B \subseteq G$ , we define the Ruzsa distance $d (A, B)$ between $A$ and $B$ by

d (A, B) = \log \frac{| A - B |}{\sqrt{| A | | B |}}

Note that this is symmetric and non-negative, but $d (A, A)$ need not be zero, so it is not a metric in general. It does, however, satisfy the triangle inequality:

Lemma 1.9 (Ruzsa’s triangle inequality). Assuming that:

$A, B, C \subseteq G$ finite

Then

d (A, C) \leq d (A, B) + d (B, C) .

Proof. Observe that

| B | \cdot | A - C | \leq | A - B | \cdot | B - C | .

Indeed, writing each $d \in A - C$ as $d = a_{d} - c_{d}$ with $a_{d} \in A$ , $c_{d} \in C$ , the map

\begin{array}{l} ϕ : B \times (A - C) & \to (A - B) \times (B - C) \\ (b, d) & \mapsto (a_{d} - b, b - c_{d}) \end{array}

is injective. The triangle inequality now follows from the definition. □
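As a sanity check, the counting inequality and the resulting triangle inequality can be tested on random finite subsets of $ℤ$ . This is a minimal sketch with illustrative names (`difference_set`, `ruzsa_distance`); the seed and set sizes are arbitrary choices.

```python
import math
import random


def difference_set(A, B):
    """Return A - B = {a - b : a in A, b in B}."""
    return {a - b for a in A for b in B}


def ruzsa_distance(A, B):
    """Return d(A, B) = log(|A - B| / sqrt(|A||B|)) for nonempty finite sets."""
    return math.log(len(difference_set(A, B)) / math.sqrt(len(A) * len(B)))


rng = random.Random(0)
for _ in range(200):
    A, B, C = (set(rng.sample(range(50), rng.randint(1, 12))) for _ in range(3))
    # Integer form used in the proof: |B| |A - C| <= |A - B| |B - C|.
    lhs = len(B) * len(difference_set(A, C))
    rhs = len(difference_set(A, B)) * len(difference_set(B, C))
    assert lhs <= rhs
    # Logarithmic form, with a small tolerance for floating-point error.
    assert ruzsa_distance(A, C) <= ruzsa_distance(A, B) + ruzsa_distance(B, C) + 1e-12
```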

Definition 1.10 (Doubling / difference constant). Given a finite $A \subseteq G$ , we write

σ (A) : = \frac{| A + A |}{| A |}

for the doubling constant of $A$ and

δ (A) : = \frac{| A - A |}{| A |}

for the difference constant of $A$ .

Then Lemma 1.9 shows, for example, that

\log δ (A) = d (A, A) \leq d (A, - A) + d (- A, A) = 2 \log σ (A) .

So $δ (A) \leq σ {(A)}^{2}$ , or $| A - A | \leq \frac{| A + A |^{2}}{| A |}$ .
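The bound $δ (A) \leq σ {(A)}^{2}$ can be observed on a few concrete subsets of $ℤ$ . The sketch below uses exact rational arithmetic; the function names and the three sample sets are illustrative assumptions, not taken from the notes.

```python
from fractions import Fraction


def doubling_constant(A):
    """Return sigma(A) = |A + A| / |A| as an exact fraction."""
    return Fraction(len({a + b for a in A for b in A}), len(A))


def difference_constant(A):
    """Return delta(A) = |A - A| / |A| as an exact fraction."""
    return Fraction(len({a - b for a in A for b in A}), len(A))


examples = {
    "interval": set(range(1, 11)),
    "powers of two": {2**j for j in range(8)},
    "mixed": {0, 1, 3, 7, 12, 20, 30},
}

for name, A in examples.items():
    sigma, delta = doubling_constant(A), difference_constant(A)
    assert delta <= sigma**2  # consequence of Ruzsa's triangle inequality
    print(name, "sigma =", sigma, "delta =", delta)
```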

Notation. Given $A \subseteq G$ and $l, m \in ℕ_{0}$ , we write

l A - m A : = \underset{l times}{\underset{⏟}{A + A + \dots + A}} - \underset{m times}{\underset{⏟}{A - A - \dots - A}} .

Theorem 1.11 (Plunnecke’s Inequality). Assuming that:

$A, B \subseteq G$ are finite sets
$| A + B | \leq K | A |$ for some $K \geq 1$

Then, for all $l, m \in ℕ_{0}$ ,

| l B - m B | \leq K^{l + m} | A | .

Proof. Choose a non-empty subset $A^{'} \subseteq A$ such that the ratio $\frac{| A^{'} + B |}{| A^{'} |}$ is minimised, and call this ratio $K^{'}$ . Then $| A^{'} + B | = K^{'} | A^{'} |$ , $K^{'} \leq K$ , and $\forall A^{″} \subseteq A$ , $| A^{″} + B | \geq K^{'} | A^{″} |$ .

Claim: For every finite $C \subseteq G$ , $| A^{'} + B + C | \leq K^{'} | A^{'} + C |$ .

Let’s complete the proof of the theorem assuming the claim. We first show that $\forall m \in ℕ_{0}$ , $| A^{'} + m B | \leq {K^{'}}^{m} | A^{'} |$ . Indeed, the case $m = 0$ is trivial, and $m = 1$ is true by assumption. Suppose $m > 1$ and the inequality holds for $m - 1$ . By the claim with $C = (m - 1) B$ , we get

| A^{'} + m B | = | A^{'} + B + (m - 1) B | \leq K^{'} | A^{'} + (m - 1) B | \leq {K^{'}}^{m} | A^{'} | .

But as in the proof of Ruzsa’s triangle inequality, $\forall l, m \in ℕ_{0}$ , we can show

| A^{'} | | l B - m B | \leq | A^{'} + l B | | A^{'} + m B | \leq {K^{'}}^{l} | A^{'} | {K^{'}}^{m} | A^{'} | = {K^{'}}^{l + m} | A^{'} |^{2} .

Hence $| l B - m B | \leq {K^{'}}^{l + m} | A^{'} | \leq {K^{'}}^{l + m} | A |$ , which completes the proof (assuming the claim).

We now prove the claim by induction on $| C |$ . When $| C | = 1$ the statement follows from the assumptions. Suppose the claim is true for $C$ , and consider $C^{'} = C \cup {x}$ for some $x \notin C$ . Observe that

A^{'} + B + C^{'} = (A^{'} + B + C) \cup ((A^{'} + B + x) ∖ (D + B + x))

with $D = {a \in A^{'} : a + B + x \subseteq A^{'} + B + C}$ .

By definition of $K^{'}$ , $| D + B | \geq K^{'} | D |$ , so

\begin{array}{l} | A^{'} + B + C^{'} | & \leq | A^{'} + B + C | + | A^{'} + B + x | - | D + B + x | \\ \overset{IH}{\leq} K^{'} | A^{'} + C | + K^{'} | A^{'} | - K^{'} | D | \\ = K^{'} (| A^{'} + C | + | A^{'} | - | D |) \end{array}

We apply this argument a second time, writing

A^{'} + C^{'} = (A^{'} + C) ⊔ ((A^{'} + x) ∖ (E + x))

where $E = {a \in A^{'} : a + x \in A^{'} + C} \subseteq D$ . We conclude that

| A^{'} + C^{'} | = | A^{'} + C | + | A^{'} + x | - | E + x | \geq | A^{'} + C | + | A^{'} | - | D |

| A^{'} + B + C^{'} | \leq K^{'} (| A^{'} + C | + | A^{'} | - | D |) \leq K^{'} | A^{'} + C^{'} |,

proving the claim. □
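To see the statement of Plunnecke’s Inequality in action, one can take $K = | A + B | ∕ | A |$ (the smallest admissible value) and compare $| l B - m B |$ with $K^{l + m} | A |$ for small $l, m$ . The following is a sketch over $ℤ$ with hypothetical helper names; the bound `max_order` and the sample sets are arbitrary.

```python
from fractions import Fraction


def sumset(A, B):
    """Return A + B for finite sets of integers."""
    return {a + b for a in A for b in B}


def iterated_sumset(B, l):
    """Return lB = B + ... + B (l times), with the convention 0B = {0}."""
    S = {0}
    for _ in range(l):
        S = sumset(S, B)
    return S


def lB_minus_mB(B, l, m):
    """Return the set lB - mB."""
    return {x - y for x in iterated_sumset(B, l) for y in iterated_sumset(B, m)}


def check_plunnecke(A, B, max_order=3):
    """Check |lB - mB| <= K^(l+m) |A| for 0 <= l, m <= max_order, K = |A+B|/|A|."""
    K = Fraction(len(sumset(A, B)), len(A))  # K >= 1 since B is nonempty
    return all(
        len(lB_minus_mB(B, l, m)) <= K ** (l + m) * len(A)
        for l in range(max_order + 1)
        for m in range(max_order + 1)
    )


assert check_plunnecke(set(range(20)), {0, 1, 5})
assert check_plunnecke({0, 3, 4, 9, 11}, {0, 2, 7})
```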

We are now in a position to generalise Example 1.7.

Theorem 1.12 (Freiman-Ruzsa). Assuming that:

$A \subseteq 𝔽_{p}^{n}$
$| A + A | \leq K | A |$ (i.e. $σ (A) \leq K$ )

Then $A$ is contained in a subspace $H \leq 𝔽_{p}^{n}$ of size

| H | \leq K^{2} p^{K^{4}} | A | .

Proof. Choose $X \subseteq 2 A - A$ maximal such that the translates $x + A$ with $x \in X$ are disjoint. Such a set $X$ cannot be too large: $\forall x \in X$ , $x + A \subseteq 3 A - A$ , so by Plunnecke’s Inequality, since $| 3 A - A | \leq K^{4} | A |$ ,

| X | | A | = | ⋃_{x \in X} (x + A) | \leq | 3 A - A | .

So $| X | \leq K^{4}$ . We next show

2 A - A \subseteq X + A - A . (∗)

Indeed, if $y \in 2 A - A$ and $y \notin X$ , then by maximality of $X$ , $(y + A) \cap (x + A) \neq \emptyset$ for some $x \in X$ , so $y \in x + A - A \subseteq X + A - A$ (and if $y \in X$ , then clearly $y \in X + A - A$ ).

It follows from ( $*$ ) by induction that $\forall l \geq 2$ ,

l A - A \subseteq (l - 1) X + A - A, (∗∗)

since

l A - A = A + \underset{\subseteq (l - 2) X + A - A}{\underset{⏟}{(l - 1) A - A}} \subseteq (l - 2) X + \underset{\subseteq X + A - A}{\underset{⏟}{2 A - A}} \subseteq (l - 1) X + A - A .

Now let $H \leq 𝔽_{p}^{n}$ be the subgroup generated by $A$ , which we can write as

H = ⋃_{l \geq 1} (l A - A) \overset{(* *)}{\subseteq} Y + A - A

where $Y \leq 𝔽_{p}^{n}$ is the subgroup generated by $X$ .

But every element of $Y$ can be written as a sum of $| X |$ elements of $X$ with coefficients amongst $0, 1, \dots, p - 1$ , hence $| Y | \leq p^{| X |} \leq p^{K^{4}}$ . To conclude, note that

| H | \leq | Y | | A - A | \leq p^{K^{4}} | A - A | \leq p^{K^{4}} K^{2} | A |,

where we use Plunnecke’s Inequality or even Ruzsa’s triangle inequality. □
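The covering step of the proof can be illustrated computationally. The sketch below works in $𝔽_{3}^{3}$ with vectors stored as tuples; the greedy selection is one way to produce a maximal $X$ (the proof only needs some maximal $X$ to exist), and the prime `P`, the sample set `A` and all function names are assumptions made for the example.

```python
from fractions import Fraction

P = 3  # assumed small prime for the illustration


def add(u, v):
    """Coordinatewise addition in F_P^n."""
    return tuple((a + b) % P for a, b in zip(u, v))


def negate(A):
    """Return -A."""
    return {tuple((-a) % P for a in u) for u in A}


def sumset(A, B):
    """Return A + B in F_P^n."""
    return {add(a, b) for a in A for b in B}


def greedy_disjoint_translates(A):
    """Pick X inside 2A - A greedily so that the translates x + A are disjoint.

    A single pass over all candidates yields a maximal such X: a rejected
    candidate already met the covered region, which only grows afterwards.
    """
    candidates = sorted(sumset(sumset(A, A), negate(A)))
    X, covered = set(), set()
    for x in candidates:
        translate = {add(x, a) for a in A}
        if covered.isdisjoint(translate):
            X.add(x)
            covered |= translate
    return X


A = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)}
K = Fraction(len(sumset(A, A)), len(A))
X = greedy_disjoint_translates(A)

two_A_minus_A = sumset(sumset(A, A), negate(A))
three_A_minus_A = sumset(sumset(sumset(A, A), A), negate(A))
A_minus_A = sumset(A, negate(A))

assert len(X) * len(A) <= len(three_A_minus_A)  # disjoint translates inside 3A - A
assert len(X) <= K**4                           # the bound |X| <= K^4
assert two_A_minus_A <= sumset(X, A_minus_A)    # the inclusion (*)
```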

Example 1.13. Let $A = V \cup R$ where $V \leq 𝔽_{p}^{n}$ is a subspace of dimension $d = n ∕ K$ with $K ≪ d ≪ n - K$ , and $R$ consists of $K - 1$ linearly independent vectors not in $V$ .

Then

| A | = | V \cup R | = | V | + | R | = p^{n ∕ K} + K - 1 \sim p^{n ∕ K} = | V |

and

| A + A | = | (V \cup R) + (V \cup R) | = | V \cup (V + R) \cup (R + R) | \sim K | V | .

But any subspace $H \leq 𝔽_{p}^{n}$ containing $A$ must have size at least $p^{n ∕ K + (K - 1)} \sim | V | \cdot p^{K}$ , so the exponential dependence on $K$ in Theorem 1.12 is necessary.

Theorem 1.14 (Polynomial Freiman-Ruzsa, due to Gowers–Green–Manners–Tao 2024). Assuming that:

$A \subseteq 𝔽_{p}^{n}$
$| A + A | \leq K | A |$

Then there exists a subspace $H \leq 𝔽_{p}^{n}$ of size at most $C_{1} (K) | A |$ such that, for some $x \in 𝔽_{p}^{n}$ ,

| A \cap (x + H) | \geq \frac{| A |}{C_{2} (K)},

where $C_{1} (K)$ and $C_{2} (K)$ are polynomial in $K$ .

Proof. Omitted, because the techniques are not relevant to other parts of the course. See Entropy Methods in Combinatorics next term. □

Definition 1.15. Given $A, B \subseteq G$ we define the additive energy between $A$ and $B$ to be

E (A, B) = | {(a, a^{'}, b, b^{'}) \in A \times A \times B \times B : a + b = a^{'} + b^{'}} | .

We refer to the quadruples $(a, a^{'}, b, b^{'})$ such that $a + b = a^{'} + b^{'}$ as additive quadruples.

Example 1.16. Let $V \leq 𝔽_{p}^{n}$ be a subspace. Then $E (V) = E (V, V) = | V |^{3}$ .

On the other hand, if $A \subseteq ℤ ∕ p ℤ$ is chosen at random from $ℤ ∕ p ℤ$ (each element chosen independently with probability $α > 0$ ), then with high probability

E (A) = E (A, A) = α^{4} p^{3} = α | A |^{3} .
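The additive energy is easy to compute from the number of representations of each sum. The sketch below checks $E (V) = | V |^{3}$ for a subspace of $𝔽_{2}^{3}$ , encoding vectors as 3-bit integers so that addition is bitwise XOR; that encoding, the `add` parameter and the function name are assumptions made for the example.

```python
from collections import Counter
from operator import xor


def additive_energy(A, B, add=lambda x, y: x + y):
    """Return E(A, B), the number of (a, a', b, b') with a + b = a' + b'.

    Counting representations r(x) of each sum x gives E(A, B) = sum of r(x)^2.
    """
    representations = Counter(add(a, b) for a in A for b in B)
    return sum(count * count for count in representations.values())


# Subspace V = {000, 001, 010, 011} of F_2^3, with XOR as the group law.
V = {0, 1, 2, 3}
assert additive_energy(V, V, add=xor) == len(V) ** 3

# For comparison, an interval in Z has energy of order |A|^3 but below |A|^3.
I = set(range(4))
assert additive_energy(I, I) < len(I) ** 3
```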

Lemma 1.17. Assuming that:

$A, B \subseteq G$
both non-empty

Then

E (A, B) \geq \frac{| A |^{2} | B |^{2}}{| A + B |} .

Proof. Define $r_{A + B} (x) = | {(a, b) \in A \times B : a + b = x} |$ (and notice that this is the same as $| A \cap (x - B) |$ ). Observe that

\begin{array}{l} E (A, B) & = | {(a, a^{'}, b, b^{'}) \in A^{2} \times B^{2} : a + b = a^{'} + b^{'}} | \\ = \sum_{x \in G} r_{A + B} {(x)}^{2} \\ = \sum_{x \in A + B} r_{A + B} {(x)}^{2} \\ \geq \frac{{(\sum_{x \in A + B} r_{A + B} (x))}^{2}}{| A + B |} & by Cauchy-Schwarz \end{array}

but

\begin{array}{l} \sum_{x \in G} | A \cap (x - B) | & = \sum_{x \in G} \sum_{y \in G} 𝟙_{A} (y) 𝟙_{x - B} (y) \\ = \sum_{x \in G} \sum_{y \in G} 𝟙_{A} (y) 𝟙_{B} (x - y) \\ = | A | | B | \end{array}

(As usual, $𝟙_{A}$ here means the indicator function). □

In particular, if $| A + A | \leq K | A |$ , then

E (A) = E (A, A) \geq \frac{| A |^{4}}{| A + A |} \geq \frac{| A |^{3}}{K} .
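The Cauchy-Schwarz bound of Lemma 1.17 can be tested in the cleared-denominator form $E (A, B) | A + B | \geq | A |^{2} | B |^{2}$ . This is a sketch over $ℤ$ with random sets; the function names, seed and size ranges are arbitrary choices.

```python
import random
from collections import Counter


def representation_counts(A, B):
    """Return r_{A+B} as a Counter mapping x to |{(a, b) : a + b = x}|."""
    return Counter(a + b for a in A for b in B)


def energy_lower_bound_holds(A, B):
    """Check E(A, B) * |A + B| >= |A|^2 |B|^2 using exact integer arithmetic."""
    r = representation_counts(A, B)
    energy = sum(c * c for c in r.values())
    return energy * len(r) >= len(A) ** 2 * len(B) ** 2


rng = random.Random(1)
for _ in range(200):
    A = set(rng.sample(range(60), rng.randint(1, 15)))
    B = set(rng.sample(range(60), rng.randint(1, 15)))
    r = representation_counts(A, B)
    assert sum(r.values()) == len(A) * len(B)  # total number of pairs (a, b)
    assert energy_lower_bound_holds(A, B)
```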

The converse is not true.

Example 1.18. Let $G$ be your favourite (class of) abelian group(s). Then there exist constants $𝜃, η > 0$ such that for all sufficiently large $n$ , there exists $A \subseteq G$ , with $| A | \geq n$ satisfying $E (A) \geq η | A |^{3}$ and $| A + A | \geq 𝜃 | A |^{2}$ .
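For $G = ℤ$ , one construction of this kind is the union of an interval with a set all of whose pairwise sums are distinct. The notes do not specify a construction, so the sketch below is an assumption: it takes ${0, \dots, n - 1}$ together with ${n 2^{j} : 1 \leq j \leq n}$ , for which $η = 1 ∕ 12$ and $𝜃 = 1 ∕ 8$ work.

```python
from collections import Counter


def interval_plus_sidon(n):
    """Return {0, ..., n-1} together with {n * 2^j : 1 <= j <= n}, of size 2n."""
    interval = set(range(n))
    spread = {n * 2**j for j in range(1, n + 1)}  # pairwise sums are all distinct
    return interval | spread


def additive_energy(A):
    """Return E(A) as the sum of squared representation counts of A + A."""
    r = Counter(a + b for a in A for b in A)
    return sum(c * c for c in r.values())


for n in (5, 10, 20, 40):
    A = interval_plus_sidon(n)
    sumset_size = len({a + b for a in A for b in A})
    assert len(A) == 2 * n
    assert 12 * additive_energy(A) >= len(A) ** 3  # E(A) >= |A|^3 / 12
    assert 8 * sumset_size >= len(A) ** 2          # |A + A| >= |A|^2 / 8
```

The interval alone contributes about $\frac{2}{3} n^{3}$ additive quadruples, while the second half of the set alone contributes $n (n + 1) ∕ 2$ distinct sums.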

Theorem 1.19 (Balog–Szemeredi–Gowers, Schoen). Assuming that:

$A \subseteq G$ is finite
$E (A) \geq η | A |^{3}$ for some $η > 0$

Then there exists $A^{'} \subseteq A$ of size at least $c_{1} (η) | A |$ such that

| A^{'} + A^{'} | \leq \frac{| A^{'} |}{c_{2} (η)},

where $c_{1} (η)$ and $c_{2} (η)$ are polynomial in $η$ .

Idea: Find $A^{'} \subseteq A$ such that for all $a, b \in A^{'}$ , the difference $a - b$ has many representations as $(a_{1} - a_{2}) + (a_{3} - a_{4})$ with $a_{i} \in A$ .

We first prove a technical lemma, using a technique called “dependent random choice”.

Definition 1.20 (gamma-popular differences). Given $A \subseteq G$ and $γ > 0$ , let

P_{γ} = {x \in G : | A \cap (x + A) | \geq γ | A |}

be the set of $γ$ -popular differences of $A$ .
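Since $| A \cap (x + A) |$ equals the number of pairs $(a, b) \in A^{2}$ with $a - b = x$ , the set $P_{γ}$ can be read off from a table of difference counts. The following is a minimal sketch over $ℤ$ ; the function name is illustrative.

```python
from collections import Counter


def popular_differences(A, gamma):
    """Return P_gamma = {x : |A ∩ (x + A)| >= gamma * |A|} for gamma > 0.

    counts[x] is the number of pairs (a, b) in A x A with a - b = x, which
    equals |A ∩ (x + A)|; only x in A - A can be popular when gamma > 0.
    """
    A = set(A)
    counts = Counter(a - b for a in A for b in A)
    return {x for x, c in counts.items() if c >= gamma * len(A)}


# For the interval A = {0, ..., 9}, |A ∩ (x + A)| = 10 - |x| when |x| <= 9,
# so the 1/2-popular differences are exactly those with |x| <= 5.
A = set(range(10))
assert popular_differences(A, 0.5) == set(range(-5, 6))
```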

Lemma 1.21. Assuming that:

$A \subseteq G$ is finite
$E (A) \geq η | A |^{3}$
$c > 0$

Then there is a subset $X \subseteq A$ of size $| X | \geq η | A | ∕ 3$ such that for all but a $(16 c)$ -proportion of pairs $(a, b) \in X^{2}$ , we have $a - b \in P_{c η}$ .

Proof. Let $U = {x \in G : | A \cap (x + A) | \leq \frac{1}{2} η | A |}$ . Then

\begin{array}{l} \sum_{x \in U} | A \cap (x + A) |^{2} & \leq \frac{1}{2} η | A | \sum_{x} | A \cap (x + A) | \\ = \frac{1}{2} η | A |^{3} \\ \leq \frac{1}{2} E (A) \end{array}

For $0 \leq i \leq ⌈ \log_{2} η^{- 1} ⌉$ , let

Q_{i} = {x \in G : \frac{| A |}{2^{i + 1}} < | A \cap (x + A) | \leq \frac{| A |}{2^{i}}},

and set $δ_{i} = η^{- 1} 2^{- 2 i}$ . Then

\begin{array}{l} \sum_{i} δ_{i} | Q_{i} | & = \sum_{i} \frac{| Q_{i} |}{η 2^{2 i}} \\ = \frac{1}{η | A |^{2}} \sum_{i} \frac{| A |^{2}}{2^{2 i}} | Q_{i} | \\ \geq \frac{1}{η | A |^{2}} \sum_{i} \frac{| A |^{2}}{2^{2 i}} \sum_{x \notin U} 𝟙_{{\frac{| A |}{2^{i + 1}} < | A \cap (x + A) | \leq \frac{| A |}{2^{i}}}} \\ \geq \frac{1}{η | A |^{2}} \sum_{x \notin U} | A \cap (x + A) |^{2} \\ \geq \frac{1}{η | A |^{2}} \cdot \frac{1}{2} E (A) & (\sum_{x \in U} | A \cap (x + A) |^{2} \leq \frac{1}{2} E (A)) \\ \geq \frac{1}{2} | A | & (*) \end{array}

Let $S = {(a, b) \in A^{2} : a - b \notin P_{c η}}$ . Then

\begin{array}{l} \sum_{i} \sum_{(a, b) \in S} | (A - a) \cap (A - b) \cap Q_{i} | & \leq \sum_{(a, b) \in S} \underset{= \underset{by definition of S}{\underset{⏟}{| A \cap (a - b + A) | \leq c η | A |}}}{\underset{⏟}{| (A - a) \cap (A - b) |}} \\ \leq | S | \cdot c η | A | \\ \leq c η | A |^{3} \\ \leq 2 c η | A |^{2} \cdot \frac{1}{2} | A | \\ \overset{(*)}{\leq} 2 c η | A |^{2} \sum_{i} δ_{i} | Q_{i} | \end{array}

Hence there exists $i_{0}$ such that

\sum_{(a, b) \in S} | (A - a) \cap (A - b) \cap Q_{i_{0}} | \leq 2 c η | A |^{2} δ_{i_{0}} | Q_{i_{0}} | .

Let $Q = Q_{i_{0}}$ , $δ = δ_{i_{0}}$ , $λ = 2^{- i_{0}}$ . So

\sum_{(a, b) \in S} | (A - a) \cap (A - b) \cap Q | \leq 2 c η δ | A |^{2} | Q | . (∗∗)

Idea: find $x$ such that $X = A \cap (x + A)$ is large.

Given $x \in G$ , let $X (x) = A \cap (x + A)$ . Then

𝔼_{x \in Q} | X (x) | = \frac{1}{| Q |} \sum_{x \in Q} | A \cap (x + A) | \geq \frac{1}{2} λ | A | .

Let $T (x) = {(a, b) \in X {(x)}^{2} : a - b \notin P_{c η}}$ . Then

\begin{array}{l} 𝔼_{x \in Q} | T (x) | & = 𝔼_{x \in Q} | {(a, b) \in {(A \cap (\underset{x \in (A - a) \cap (A - b)}{\underset{⏟}{x}} + A))}^{2} : a - b \notin P_{c η}} | \\ = \frac{1}{| Q |} \sum_{x \in Q} | {(a, b) \in S : x \in (A - a) \cap (A - b)} | \\ = \frac{1}{| Q |} \sum_{(a, b) \in S} | (A - a) \cap (A - b) \cap Q | \\ \leq \frac{1}{| Q |} 2 c η | A |^{2} δ | Q | \\ = 2 c η δ | A |^{2} \\ = 2 c λ^{2} | A |^{2} \end{array}

Therefore,

\begin{array}{l} 𝔼_{x \in Q} (| X (x) |^{2} - {(16 c)}^{- 1} | T (x) |) & \overset{C-S}{\geq} {(𝔼_{x \in Q} | X (x) |)}^{2} - {(16 c)}^{- 1} 𝔼_{x \in Q} | T (x) | \\ \geq {(\frac{λ}{2})}^{2} | A |^{2} - {(16 c)}^{- 1} 2 c λ^{2} | A |^{2} \\ = (\frac{λ^{2}}{4} - \frac{λ^{2}}{8}) | A |^{2} \\ = \frac{λ^{2}}{8} | A |^{2} \end{array}

So there exists $x \in Q$ such that $| X (x) |^{2} - {(16 c)}^{- 1} | T (x) | \geq \frac{λ^{2}}{8} | A |^{2}$ . Since $| T (x) | \geq 0$ , the set $X = X (x)$ satisfies

| X | \geq \frac{λ}{\sqrt{8}} | A | \geq \frac{η}{3} | A |

and $| T (x) | \leq 16 c | X |^{2}$ . □

Proof of Theorem 1.19. Given $A \subseteq G$ with $E (A) \geq η | A |^{3}$ , apply Lemma 1.21 with $c = 2^{- 7}$ to obtain $X \subseteq A$ of size $| X | \geq \frac{η}{3} | A |$ such that for all but a $\frac{1}{8}$ -proportion of pairs $(a, b) \in X^{2}$ , $a - b \in P_{η ∕ 2^{7}}$ . In particular, the bipartite graph

G = (X \dot{\cup} X, {(x, y) \in X \times X : x - y \in P_{η ∕ 2^{7}}})

has at least $\frac{7}{8} | X |^{2}$ edges. Let $A^{'} = {x \in X : \deg (x) \geq \frac{3}{4} | X |}$ .

Clearly, $| A^{'} | \geq \frac{| X |}{8}$ . For any $a, b \in A^{'}$ , there are at least $\frac{| X |}{2}$ elements $y \in X$ such that $(a, y), (b, y) \in E (G)$ ( $a - y, b - y \in P_{η ∕ 2^{7}}$ ).

Thus $a - b = (a - y) - (b - y)$ has at least

\underset{choices for y}{\underset{⏟}{\frac{η}{6} | A |}} \cdot \frac{η}{2^{7}} | A | \cdot \frac{η}{2^{7}} | A | \geq \frac{η^{3}}{2^{17}} | A |^{3}

representations of the form $a_{1} - a_{2} - (a_{3} - a_{4})$ with $a_{i} \in A$ .

It follows that

\begin{array}{l} \frac{η^{3}}{2^{17}} | A |^{3} | A^{'} - A^{'} | & \leq | A |^{4} \\ ⟹ | A^{'} - A^{'} | & \leq 2^{17} η^{- 3} | A | \\ \leq 2^{22} η^{- 4} | A^{'} | \end{array}

Thus, by Plunnecke’s Inequality applied to the pair $(A^{'}, - A^{'})$ with $l = 2$ and $m = 0$ , $| A^{'} + A^{'} | \leq 2^{44} η^{- 8} | A^{'} |$ . □
