6 Dependent Random Choice

Definition 6.1 ((s,k)-rich). Let G be a graph. Say that RV(G) is (s,k)-rich if for every x1,,xsR we have

|N(x1)N(xs)|k.

PIC

Theorem 6.2 (Dependent Random Choice). Assuming that:

  • G a graph on n vertices with e(G)=m

  • s,t,r,k0

  • (2m)tn2t1ns(kn)tr

Then there exists RV(G), |R|r which is (s,k)-rich.

Note that t doesn’t appear in the conclusion. It is a parameter that we can tune appropriately in applications.

Proof. We sample x1,,xtV(G) uniformly at random (with replacement). Then

𝔼x1,,xt|N(x1)N(xt)|=𝔼x1,,xtyV(G)𝟙x1yx2yxty=yV(G)[x1(x1y)]t=yV(G)(d(y)n)tn(2mn2)t=(2m)tn2t1

Let Y be the random variable that counts the number of {y1,,ys}(N(x1)N(xt))(s) with

|N(y1)N(ys)|<k.

We want Y to be not too big. This will be the case (on average) because (y1,,ysN(x1)N(xt)) is an increasing function of |N(y1)N(ys)| (so our probability distribution is biased away from fully selecting tuples that will increase Y). We make this precise as follows:

PIC

Note

Y={y1,,ys}(V(G))(s)|N(y1)N(ys)|<k𝟙y1,,ysN(x1)N(xt),

so

𝔼Y{y1,,ys}(V(G))(s)|N(y1)N(ys)|<k(x1,,xtN(y1)N(ys))ns(kn)t

To finish, define R to be N(x1)N(xt) with a vertex deleted from each {y1,,ys}(i=1tN(xi))(s) with |N(y1)N(ys)|<k. By definition, R is (s,k)-rich, so we now just need to ensure that |R|r holds for some choice of x1,,xt. Indeed,

|R||N(x1)N(xt)|Y(x1,,xt)𝔼|R|(2m)tn2t1ns(kn)tr

The last line follows from the assumption.

Definition 6.3 (Extremal number). Let H be a graph. The extremal numbers of H are

ex(n,H)=max{e(G):GH,G on n vertices}.

Theorem 6.4 (Erdős-Stone). Assuming that:

  • H a graph

  • χ(H)=r3

Then
ex(n,H)=(11r1+o(1))n2.

Definition 6.5 (Complete bipartite graph). Ks,t is the complete bipartite graph with parts of sizes s, t respectively.

PIC

If χ(H)3, then for all s,t, we have Ks,tH. More generally, a complete (χ(H)1)-partite graph will not contain a copy of H. Erdős-Stone says that these constructions are optimal, up to o(1)n2 edges.

Erdős-Stone doesn’t help us when χ(H)=2. In this case, it is true that

cn212tex(n,Kt,t)Ctn21t.

See Part II Graph Theory for proofs of both bounds.

These bounds are essentially the best known for t5.

We now prove a generalisation of this upper bound that holds for graphs other than complete bipartite graphs:

Theorem 6.6 (Füredi). Assuming that:

  • H be a bipartite graph with vertex partition AB

  • deg(x)s for all xB

Then
ex(n,H)CHn21s,

for some CH>0 (CH depends only on H and not on n).

Lemma 6.7. Assuming that:

  • H be a bipartite graph with vertex partition AB

  • deg(x)s for all xB

  • G a graph

  • RV(G) a (s,|V(H)|)-rich set

  • |R||A|

Then GH.

PIC

Proof. First inject AR. We now inductively place the vertices B={y1,,yl}. Assume we have embedded y1,,yi into G. To embed yi+1, simply consider the N(yi+1). These have been embedded inside of R and thus have |V(H)| common neighbours. So there must be a choice for yi+1 that does not clash with the previous choices.

Proof of Füredi. We are given G with e(G)Cn21s. We choose C=2|H|, and will show GH. By lemma, enough to find a (s,|V(H)|)-rich set of size |A|. Let h=|V(H)| and t=s. Then

(2Cn21s)tn2t1ns(hn)t(2C)sns(hn)s=(4h)shsh|A|

as desired.

Definition 6.8 (Hypercube graph). Qd denotes the hypercube graph: the graph with V(Qd)={0,1}d and x,y{0,1}d are adjacent if they differ in exactly one coordinate.

Theorem 6.9. R(Qd)210d.

Proof. Let n=210d and let χ be a colouring of Kn. Let G be the graph of the majority colour. So e(G)12n2n28. So we apply Dependent Random Choice to find a (d,2d)-rich set of size 2d. If t=2d, then

(n24)tn2t1(end)d(2dn)tn42dnd2dtn2dn42d22d2210d22d

Then finish by applying Lemma 6.7.

The reasoning for choosing t=2d is that we wanted to make (2dn)t win against the (end)d. Setting t=d makes the powers of n cancel, but then we are still left with the big 2dt term, so instead we make t a bit bigger than d.

Remark. One can use this proof to get

R(Qd)23d.

Burr-Erdős conjectured that R(Qd)C2d. State of the art: R(Qd)2(2𝜀)d by Tikhomirov 2023.

Say I have a graph H, with max degree d where d fixed and |V(H)|=k. Then it is known that

R(H)Cdk.(∗)

Proposition 6.10. Assuming that:

  • d1, 𝜀>0

  • kk0, for some k0 depending on d and 𝜀

  • H bipartite and max degree d

Then
R(H)k1+𝜀.

Proof. Apply Dependent Random Choice and embedding Lemma 6.7 to the majority colour.

So Dependent Random Choice and Lemma 6.7 are enough to prove an upper bound that is almost linear in k, but not quite powerful enough to prove the linear bound as in ().

Remark. To improve this to get the bound R(H)Cdn, one idea might be to find a (d,k)-rich set where k=Θ(n) (n=Cdk), where the rich set has size Θ(n) on n vertices. But this is impossible.

Proposition 6.11. For all c>0, there exists a graph G with (12+o(1))n2 edges so that every set of size cn contains some x,y with o(n) common neighbours.

PIC

Proof. We won’t prove.

So we have to go back into the guts of the embedding idea. In some sense, the embedding algorithm was a little bit wasteful: for example, we dumped A into the rich set without any clever choice.

The key idea to improve the previous bound will be to perform a more clever embedding algorithm that requires a weaker notion of rich set. We want to use a weaker notion of rich set so that we can find large linear size rich sets.

Definition 6.12 (Approximately (s,k)-rich). Let G be a graph. Say UV(G) is approximately (s,k)-rich if

|{(x1,,xs)Us:d(x1,,xs)<k}|(12s)s|U|s.

Here we use the notation

d(x1,,xs)=|N(x1)N(xs)|.

Lemma 6.13. Assuming that:

Then GH.

Lemma 6.14. Assuming that:

  • 𝜀>0

  • G a graph with 𝜀n22 edges on n vertices

  • n4s𝜀sk

Then there exists UV(G) that is approximately (s,k)-rich and |U|>2k.

Theorem 6.15 (Fox, Sudakov, Conlon). Assuming that:

  • H a bipartite graph on k vertices

  • max degree d

Then
R(H)Cd2dk.

Remark. This also improves our bound on Qd:

R(Qd)Cd22d.

The key difference in proving Lemma 6.13 compared to the previous embedding algorithm is that we will need to embed A in a more clever way. We will need to ensure that for all bB, the neighbourhood of b is not sent to one of the bad subsets of U (the “bad” subsets are those which have <k common neighbours).

PIC

Proof of Lemma 6.13. For x1,,xt, t<s, let

β(x1,,xt)=|{(x1,,xt,xt+1,,xs)Us:d(x1,,xs)<k}|.

Say that (x1,,xt) is healthy if

β(x1,,xt)(12s)st|U|st.

We will embed vertices of A one after the other. We will use this notion of healthiness to make sure that every time we place a vertex, we keep plenty of good extensions.

Observation: If x1,,xt is healthy, then the number of y such that (x1,,xt,y) is not healthy is |U|2s.

Proof of observation: Note

β(x1,,xt)=yUβ(x1,,xt,y).

Let Bx1,,xt={y:(x1,,xt,y) is not healthy}. Then

β(x1,,xt)yBx1,,xtβ(x1,,xt,y)|Bx1,,xt|(12s)st1|U|st1

But if |Bx1,,xt|>|U|2s, I contradict that x1,,xt is healthy. So we have proved the observation.

We now inject f:AU inductively. Assume we have embedded a1,,ai so far. We assume that we have

f(N(b){a1,,ai})

is healthy for all bB. We now embed ai+1. Let b1,,bs be the neighbour of ai+1. We know f(N(bi){a1,..,ai}) are healthy, so there are at most (by observation) |U|2s ways of choosing f(ai+1)U so that one of the neighbours becomes unhealthy. But there are only s neighbourhoods. So there are >|U|2 choices that maintain all healthy neighbours. Since |U|>2k, I can map ai+1 to a new vertex.

We now inject f:BV(G) exactly as we saw in the proof of Lemma 6.7.

Proof of Lemma 6.14. Let x1,,xsV(G) be chosen uniformly at random. Let U=N(x1)N(xs). Let X=|U|. We have 𝔼X𝜀sn.

Let Y=|{(y1,,ys)Us:d(y1,,ys)<k}|. 𝔼Yns(kn)s. Now note that by Jensen,

𝔼[Xs(𝔼X)sY2𝔼Y(𝔼X)s2]=𝔼Xs(𝔼X)s0.

So there is a U such that

|U|s(𝔼X)s2(𝜀sn)s2.

So |U|𝜀sn2>2k.

Y2(𝔼Y)(𝔼X)s|U|s(12s)s|U|s.

Proof of Fox, Sudakov, Conlon. Apply our second dependent random choice lemma (Lemma 6.14) to find an approximately (d,k)-rich set UV(G), with |U|>2k, where G is the graph of the denser colour. Then apply the embedding lemma for approximately (d,k)-rich sets (Lemma 6.13) to find GH.

Theorem 6.16 (Graham, Rödl, Ruciński). Assuming that:

  • H a graph on k vertices with max degree d

Then there exists a constant Cd>0 depending only on d, such that
R(H)Cdk.

We will show

Cd2Cd(logd)2

for some absolute constant C.

Remark. There exists H such that R(H)>2cdk, for c>0.

This is also “almost” the best bound known on these Ramsey numbers. The best known bound is

R(H)2Cd(logd)k,

which was proved by Fox, Conlon, Sudakov.

Definition 6.17 ((p,eps)-dense). Say that a graph G on n vertices is (p,𝜀)-dense if A,BV(G), |A|,|B|𝜀n, we have

e(A,B)p|A||B|.

Lemma 6.18 (Embedding lemma for (p,eps)-dense graphs). Assuming that:

  • 𝜀=pd2d

  • nk𝜀

  • G a graph on n vertices which is (p,𝜀)-dense

  • H a graph on k vertices with maximum degree d

Then GH.

Sketch of how to use embedding lemma: Apply lots of times, with say p=14d. Get lots of parts, with high density between all. Then can just greedily embed.

PIC

The lemma for greedily embedding will be:

Lemma 6.19. Assuming that:

  • G a graph on n>2k vertices

  • δ(G)(114d)n

  • H a graph on k vertices with max degree d

Then GH.

With the statements of these lemmas, we can already get an idea of where the 2Cd(logd)2 comes from. We will use p=14d, so by Embedding lemma for (p,eps)-dense graphs, we can always either find a copy of H, or zoom in by a factor of 2Cdlogd to get a very dense bipartite graph. We will apply this Clogd times to get a highly dense graph, which we can then apply Lemma 6.19 to, and so we will need at least (2Cdlogd)Clogd=2Cd(logd)2 vertices.

We now focus on discussing and proving Embedding lemma for (p,eps)-dense graphs.

Observation: Let G be a (p,𝜀)-dense graph, and let C0,C1,,CdV(G) with |C0|d𝜀n and |Ci|>𝜀n. Then there exists xC0 such that

|N(x)Ci|p|Ci|

for all i=1,,d.

PIC

Proof. Let Bi={xC0:|N(x)Ci|<p|Ci|}. Note

e(Bi,Ci)<p|Bi||Ci|.

Hence Bi<𝜀n for all i. So we must have

C0(B1Bd).

PIC

Proof of Embedding lemma for (p,eps)-dense graphs. The idea of the proof is to embed the vertices one by one. This could go wrong if we place vertices in a way that reduces the possibilities for future vertices by too much. We avoid this issue by embedding each vertex using the above Observation, to make sure that future vertices still have lots of valid places to be embedded into.

Let V(H)={x1,,xk}. We inductively map xivi, viV(G). At each stage we maintain a set of candidates for each xi. In particular,

Ci(j)={candidates for xj at step i},

meaning after we have embedded x1v1, …, xivi, we maintain that after embedding x1,,xi we have

Ci(j)rixjxrN(vr).(∗)

We also require

|Ci(j)|pdi(j)n(∗∗)

where di(j)=|N(xj){x1,,xi}|.

Say at step i, I have mapped x1,,xi preserving adjacencies and preserving () and ().

We now choose vi+1. We want to choose vi+1Ci(i+1). Note

|Ci(i+1)|pdn2d𝜀n.

Let C0=Ci(i+1){v1,,vi} and note

|C0|2d𝜀nk(2d1)𝜀n.

Now apply the observation with C0 and Ci(j1),,Ci(jd), where xixj1,,xjd, to find a vertex vi+1Ci(i+1) such that

|N(vi+1)Ci(j1)|ppdi(j1)n|N(vi+1)Ci(j2)|ppdi(j2)n |N(vi+1)Ci(jd)|ppdi(jd)n

as desired.

Proof of Lemma 6.19. Greedily embed H into G. This is possible, because when we embed xi+1, there are at least

nn4dn4dn4dd times

many possible options (a lot!).

Lemma 6.20. Assuming that:

  • p,𝜀(0,1), t{0}

  • G a graph with no (p,𝜀)-dense subgraph on (𝜀4)tn vertices

Then there exists AV(G) with |A|(𝜀4)tn and
Δ(G[A])(32p+2t)|A|.

We will apply this Lemma with tlogd and p1d, so that we get a subgraph G[A] satisfying the assumptions of the greedy embedding algorithm (Lemma 6.19).

Proof. We work by induction on t.

For t=0: trivial, can just take A=G, since 2t=1.

Assume holds for t1. Let X,YV(G) such that

e(X,Y)p|X||Y|,

where |X|,|Y|=𝜀n, because G is not (p,𝜀)-dense.

If |XY|𝜀n2 then let A=XY, so

e(A,A)e(X,Y)p|X||Y|4p|A|2.

Throw away vertices of degree 16p|A| and we delete at most 12 of A. Call the resulting set A, which is our desired set.

So we can assume that |XY|𝜀n2. So we work with XY and YX.

Fact: If I have a bipartite graph G with bipartition AB and e(G)p|A||B|, then there exists A0A with |A0||A|2 so that

|N(x)B|2p|B|,

for all xA0.

Proof: Let xA chosen uniformly at random. Then

x(|N(x)B|2p|B|)1|A|xAd(x)2p|B|=p|A||B|2p|B||A|=12.

So if we let A0 be the set of vertices satisfying the desired condition, then we have |AA0||A|12 as desired.

Using this fact, we put

A0={xA:|N(x)B|2p|B|}.

Let X0XY be such that

|N(x)(YX)|2p|YX|

and |X0||XY|2𝜀4n.

Now apply induction inside of X0 to find XAX0 with

|XA|(𝜀4)t1(𝜀n4)=(𝜀4)tn

and

Δ(G[XA])(64p+2t)|XA|.

Now consider the bipartite graph between XA and YX. Note

e(XA,YX)=xXA|N(x)YX|2p|N(x)YX||XA|

So apply fact to find Y0YX with |Y0||YX| and |N(x)XA|4p|XA|. Now apply induction inside of Y0 to get YAY0 satisfying the induction hypothesis.

Set A=YAXA. We now check the max degree condition. Let xA and without loss of generality assume xYA.

|N(x)A|(64p+2t+1)|YA|+4p|XA|.

Then

|N(x)A||A|(64p+2t+1)12+4p2(64p+2t)

as desired.

This proof had some lies. At the end, we actually need to perform another step where we delete high degree vertices. Also, for some of the inequalities, we treated them as equality so that we can divide by them nicely. This can be fixed by passing down to a subgraph, because an average subgraph has the same density as the original graph.

The proof is probably simpler if we instead make our inductive claim be about the density of G[A], and then delete high degree vertices at the end to get a maximum degree condition.

Proof of Graham, Rödl, Ruciński. Let n>2Cd(logd)2k. Let χ be a 2-colouring of Kn. Set p=14d and 𝜀=pd4d, let t=2log2d.

Either there is a subset of V(G) that is (p,𝜀)-dense with (𝜀4)tn vertices, in which case I apply Embedding lemma for (p,eps)-dense graphs to find H. Otherwise find a very dense subgraph in the other colour (Lemma 6.20) and apply the greedy embedding algorithm (Lemma 6.19).