1 Diophantine Approximation

Theorem 1.1 (Dirichlet). Assuming that:

  • α is an irrational real number

Then there exist infinitely many $\frac{p}{q} \in \mathbb{Q}$ such that
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}. \]

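Proof. Consider the numbers $0, \alpha, 2\alpha, \ldots, N\alpha$ for some fixed $N \in \mathbb{Z}_{>0}$. Consider them modulo 1, i.e. in $\mathbb{R}/\mathbb{Z} \cong [0,1)$. Note

\[ [0,1) = \left[0, \tfrac{1}{N}\right) \cup \left[\tfrac{1}{N}, \tfrac{2}{N}\right) \cup \cdots \cup \left[\tfrac{N-1}{N}, 1\right). \]

By the box principle (pigeonhole principle), there exist $N \ge n_2 > n_1 \ge 0$ such that $n_2\alpha$ and $n_1\alpha$ belong to the same interval modulo 1. Then:

\[ |n_2\alpha - n_1\alpha - p| < \frac{1}{N} \]

for some $p \in \mathbb{Z}$. Take $q = n_2 - n_1$. Then

\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{Nq} \le \frac{1}{q^2}. \]

Take $N \to \infty$; then you get an infinite sequence of rationals. If $\alpha$ is not rational, then this sequence cannot stabilise, so we get infinitely many $\frac{p}{q}$ as desired.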
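
The pigeonhole argument above is constructive, so it can be run directly. The following Python sketch is only an illustration: the function name `dirichlet_approximation` is invented for this example, and $\alpha$ is passed as an exact `Fraction` (for instance a long decimal truncation of an irrational number) so that every comparison is exact.

```python
from fractions import Fraction
from math import floor


def dirichlet_approximation(alpha: Fraction, N: int):
    """Return (p, q) with 1 <= q <= N and |alpha - p/q| < 1/(N*q).

    Follows the box principle: the N+1 numbers n*alpha mod 1 (n = 0..N)
    fall into the N intervals [j/N, (j+1)/N), so two share an interval.
    """
    seen = {}  # interval index -> the n whose fractional part landed there
    for n in range(N + 1):
        whole = floor(n * alpha)
        frac = n * alpha - whole          # fractional part, in [0, 1)
        box = floor(frac * N)             # index j of [j/N, (j+1)/N)
        if box in seen:
            n1 = seen[box]
            q = n - n1                    # q = n2 - n1
            p = whole - floor(n1 * alpha)
            return p, q
        seen[box] = n
    raise AssertionError("unreachable: N+1 points in N boxes")


# Example: a 16-digit truncation of sqrt(2).
alpha = Fraction(14142135623730951, 10**16)
p, q = dirichlet_approximation(alpha, 1000)
print(p, q, float(abs(alpha - Fraction(p, q))))
```

Letting `N` grow produces the sequence of rationals used at the end of the proof.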

Can we do better?

In particular for $\alpha \in \bar{\mathbb{Q}}$.

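Theorem (Liouville). Assuming that:

  • $\alpha$ is a real algebraic number of degree $d \ge 1$

Then there exists $c = c(\alpha) > 0$ such that for all $\frac{p}{q} \in \mathbb{Q}$ with $\alpha \neq \frac{p}{q}$, we have

\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q^d}. \]

Proof. Let $P \in \mathbb{Z}[x]$ be the minimal polynomial of $\alpha$, so $P(\alpha) = 0$. Now note that $P\left(\frac{p}{q}\right) \neq 0$ (by irreducibility when $d \ge 2$, and for $d = 1$ using the hypothesis that $\alpha \neq \frac{p}{q}$). Then

\[ \left|P\left(\frac{p}{q}\right)\right| \ge \frac{1}{q^d}, \]

since $P\left(\frac{p}{q}\right)$ is a non-zero rational number whose denominator divides $q^d$. On the other hand, by the mean value theorem,

\[ \left|P\left(\frac{p}{q}\right)\right| = \left|P\left(\frac{p}{q}\right) - P(\alpha)\right| \le \left(\max_{x \in [\alpha-1,\alpha+1]} |P'(x)|\right) \left|\alpha - \frac{p}{q}\right|, \]

provided $\left|\alpha - \frac{p}{q}\right| \le 1$, which we may assume. Hence

\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q^d} \]

with $c = \min\left(1, \left(\max_{x \in [\alpha-1,\alpha+1]} |P'(x)|\right)^{-1}\right)$.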

Improvements of the exponent d in Liouville:

Theorem 1.2 (Roth). Assuming that:

  • α is an irrational real algebraic number

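Then for every $\varepsilon > 0$ there exists $c = c(\alpha,\varepsilon) > 0$ such that for all $\frac{p}{q} \in \mathbb{Q}$ we have
\[ \left|\alpha - \frac{p}{q}\right| > \frac{c}{q^{2+\varepsilon}}. \]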

Theorem 1.3 (Thue). Assuming that:

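  • $P(X,Y) \in \mathbb{Z}[X,Y]$ is homogeneous of degree $d \ge 3$

  • without repeated factors

  • $m \in \mathbb{Z}$

Then the equation
\[ P(X,Y) = m \]

has only finitely many solutions in $\mathbb{Z}^2$ with $\gcd(X,Y) = 1$.

Liouville’s theorem corresponds to the bound $|P(p,q)| \ge 1$ (for integers $p, q$ with $P(p,q) \neq 0$).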

Lemma 1.4. Assuming that:

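  • $P \in \mathbb{Z}[X,Y]$ is homogeneous of degree $d$

  • without repeated factors

Then for all $p, q \in \mathbb{Z}$ with $q > 0$, there exists a root $\alpha$ of $P(X,1)$ such that
\[ c\,q^{-d}\,|P(p,q)| \le \left|\alpha - \frac{p}{q}\right| \le C\,q^{-d}\,|P(p,q)|. \]

Here $c$, $C$ depend on $P$ and on a fixed compact set that contains $\frac{p}{q}$.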

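Proof. Let

\[ P(X,1) = a(X-\alpha_1)\cdots(X-\alpha_d), \]

with $\alpha_1,\ldots,\alpha_d$ distinct (since we assumed no repeated factors, and characteristic 0 fields are always separable). Without loss of generality assume that $\alpha_1$ is the closest to $\frac{p}{q}$.

Then $c_0 < \left|\frac{p}{q} - \alpha_j\right| < C_0$ for $j \neq 1$, for some constants $c_0, C_0 > 0$ depending on $P$ and the compact set. So we get lower and upper bounds on $\left|\alpha_1 - \frac{p}{q}\right|$ in terms of $P\left(\frac{p}{q},1\right) = P(p,q)\,q^{-d}$.

Proof of Thue. Suppose $P(p,q) = m$. The lemma tells us that there exists a root $\alpha$ of $P(X,1)$ such that

\[ \left|\frac{p}{q} - \alpha\right| \le C q^{-d}\,|P(p,q)| = C|m|\,q^{-d}. \]

If the degree of $\alpha$ is at least 2, then Roth (or already Thue’s approximation theorem) implies that $q$ must be bounded, hence there are only finitely many solutions.

For $\alpha \in \mathbb{Q}$, we use Liouville.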

Let $(x_1,\ldots,x_n) \in \mathbb{Z}^n$. The height of it is

\[ H(x_1,\ldots,x_n) = \max(|x_1|,\ldots,|x_n|). \]

Theorem 1.5 (Subspace theorem, Archimedean version, Schmidt). Assuming that:

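  • $n \ge 2$

  • $L_1,\ldots,L_n$ are linearly independent linear forms with algebraic coefficients in $n$ variables

Then for all $\varepsilon > 0$ the solutions of
\[ \prod_{j=1}^{n} |L_j(x_1,\ldots,x_n)| < H(x_1,\ldots,x_n)^{-\varepsilon}, \tag{$*$} \]

for $(x_1,\ldots,x_n) \in \mathbb{Z}^n$ are contained in a finite collection of proper linear subspaces of $\mathbb{Q}^n$, which depend only on $L_1,\ldots,L_n,\varepsilon$.

The volume of the region

\[ H(x_1,\ldots,x_n) \le H \quad\text{and}\quad \prod_{j=1}^{n} |L_j(x_1,\ldots,x_n)| < H^{-\varepsilon} \]

is approximately $(\log H)^{n-1} H^{-\varepsilon}$. Consider the parallelepipeds:

\[ |L_j(x_1,\ldots,x_n)| < H^{\kappa_j}, \qquad j = 1,\ldots,n, \]

for some $\kappa_j$ with $\sum_j \kappa_j = -\varepsilon$.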

This implies Roth’s theorem:

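Let $\alpha \in \bar{\mathbb{Q}}$ be real and irrational. Consider the linear forms

\[ L_1(X_1,X_2) = X_1 - \alpha X_2, \qquad L_2(X_1,X_2) = X_2. \]

Let $p, q \in \mathbb{Z}$. Then $(*)$ is equivalent to $|p - \alpha q|\,|q| < \max(|p|,|q|)^{-\varepsilon}$. If $\left|\frac{p}{q} - \alpha\right| < \frac{|\alpha|}{2}$, then $|p|$ and $|q|$ are comparable, so this is equivalent to $\left|\frac{p}{q} - \alpha\right| < C q^{-2-\varepsilon}$ (up to the value of the constant $C$). So Roth’s theorem is true apart from $(p,q)$ contained in a finite collection of subspaces. A subspace is of the form $p + \beta q = 0$ for some $\beta \in \mathbb{Q}$, or maybe $q = 0$. Each such subspace contains only one fraction $\frac{p}{q}$, so only finitely many fractions are excluded.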

Obvious subspaces:

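The set of places of $\mathbb{Q}$ is $M_{\mathbb{Q}}$ and it consists of all prime numbers and $\infty$. For each $v \in M_{\mathbb{Q}}$, we define an absolute value $|\cdot|_v$ on $\mathbb{Q}$. $|\cdot|_\infty$ is the ordinary absolute value. If $v \in M_{\mathbb{Q}}$ is a prime number, this is the $v$-adic absolute value, that is, for $a \in \mathbb{Z} \setminus \{0\}$, $|a|_v = v^{-b}$ where $b$ is maximal such that $v^b \mid a$. For $\frac{a}{b} \in \mathbb{Q}$, we define $\left|\frac{a}{b}\right|_v = \frac{|a|_v}{|b|_v}$. If $x, y \in \mathbb{Q}$, then $|xy|_v = |x|_v |y|_v$ and $|x+y|_v \le |x|_v + |y|_v$.

When $v \neq \infty$,

\[ |x+y|_v \le \max(|x|_v,|y|_v). \]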

This is called the ultrametric inequality.
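
To make the definition concrete, here is a small Python sketch of the $v$-adic absolute value on $\mathbb{Q}$. The helper name `padic_abs` is hypothetical (chosen for this example); it checks the ultrametric inequality on one pair and the fact that the product of all absolute values of a non-zero rational equals 1.

```python
from fractions import Fraction


def padic_abs(x: Fraction, p: int) -> Fraction:
    """|x|_p for a rational x and a prime p (with |0|_p = 0)."""
    if x == 0:
        return Fraction(0)
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator
        num //= p
        e += 1
    while den % p == 0:      # powers of p in the denominator
        den //= p
        e -= 1
    return Fraction(1, p) ** e   # p^(-e)


# Ultrametric inequality for v = 2: |x + y|_2 <= max(|x|_2, |y|_2).
x, y = Fraction(3, 4), Fraction(5, 4)
print(padic_abs(x + y, 2), max(padic_abs(x, 2), padic_abs(y, 2)))

# Product over all places of |a|_v for a = -360/77 (primes 2, 3, 5, 7, 11).
a = Fraction(-360, 77)
total = abs(a)
for prime in (2, 3, 5, 7, 11):
    total *= padic_abs(a, prime)
print(total)
```

Only the primes dividing the numerator or denominator contribute a factor different from 1, which is why a finite loop suffices.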

Theorem 1.6 (Subspace theorem, p-adic version with Q coeffs). Assuming that:

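  • $n \ge 2$

  • $S \subset M_{\mathbb{Q}}$ is finite with $\infty \in S$

  • for each $v \in S$, let $L_1^{(v)},\ldots,L_n^{(v)}$ be linearly independent linear forms with rational coefficients in $n$ variables

Then for all $\varepsilon > 0$ the solutions of
\[ \prod_{v \in S}\prod_{j=1}^{n} |L_j^{(v)}(x_1,\ldots,x_n)|_v < H(x_1,\ldots,x_n)^{-\varepsilon}, \]

with $(x_1,\ldots,x_n) \in \mathbb{Z}^n$ are contained in a finite collection of proper subspaces of $\mathbb{Q}^n$.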

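Example: $n = 2$, $S = \{2,3,\infty\}$, $L_j^{(v)} = X_j$ for $v \in S$, $j = 1,2$. Consider $a \in \mathbb{Z}$. Let $a = 2^k 3^l b$ with $b$ not divisible by 2 or 3. Then

\[ |a|_2\,|a|_3\,|a|_\infty = 2^{-k}\,3^{-l}\,|a| = |b|. \]

Consider $X_1 = 2^k$, $X_2 = 3^l$, then

\[ \prod_{v \in S}\prod_{j=1}^{2} |L_j^{(v)}(2^k,3^l)|_v = 1. \]

What happens if you replace $L_2^{(\infty)}$ with $X_1 - X_2$?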

Proposition 1.7. Assuming that:

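  • $\varepsilon > 0$

Then there exists $c = c(\varepsilon) > 0$ such that for $p, q, k, m \in \mathbb{Z}_{>0}$, we have
\[ |p2^k - q3^m| > c\,\frac{\max(2^k,3^m)^{1-\varepsilon}}{\max(p,q)} \]

or $p2^k = q3^m$.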

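Proof. Take $n = 2$, $S = \{2,3,\infty\}$. Let $L_j^{(v)} = X_j$ for all $j, v$, except: $L_2^{(\infty)} = X_2 - X_1$. Then the solutions of

\[ \prod_{v \in S}\prod_{j=1}^{2} |L_j^{(v)}(x_1,x_2)|_v < H(x_1,x_2)^{-\varepsilon/2} \tag{$**$} \]

with $x_1, x_2 \in \mathbb{Z}$ are contained in the lines $X_1 = 0$, $X_2 = 0$, $X_1 = X_2$, plus finitely many points.

Plug in $x_1 = p2^k$, $x_2 = q3^m$. Then

\[ |L_1^{(\infty)}(x_1,x_2)|_\infty = p2^k, \qquad |L_2^{(\infty)}(x_1,x_2)|_\infty = |p2^k - q3^m| \le \frac{\max(2^k,3^m)^{1-\varepsilon}}{\max(p,q)}, \]

provided $p, k, q, m$ does not satisfy the claim with $c = 1$. Also,

\[ |L_1^{(2)}(x_1,x_2)|_2 \le 2^{-k}, \qquad |L_2^{(2)}(x_1,x_2)|_2 \le 1, \]
\[ |L_1^{(3)}(x_1,x_2)|_3 \le 1, \qquad |L_2^{(3)}(x_1,x_2)|_3 \le 3^{-m}, \]

so the left-hand side of $(**)$ satisfies

\[ \prod_{v \in S}\prod_{j=1}^{2} |L_j^{(v)}(x_1,x_2)|_v \le p \cdot 3^{-m} \cdot \frac{\max(2^k,3^m)^{1-\varepsilon}}{\max(p,q)} \le \max(2^k,3^m)^{1-\varepsilon}\,3^{-m}. \]

Assume $3^m \ge 2^k$ by symmetry. Then

\[ \prod_{v \in S}\prod_{j=1}^{2} |L_j^{(v)}(x_1,x_2)|_v \le \max(2^k,3^m)^{-\varepsilon}. \]

We can assume that $p, q < 3^m$, for otherwise the claim is trivial. Then $H(p2^k,q3^m) < 3^{2m}$. Then

\[ \prod_{v \in S}\prod_{j=1}^{2} |L_j^{(v)}(x_1,x_2)|_v \le 3^{-m\varepsilon} < H(p2^k,q3^m)^{-\varepsilon/2}. \]

Then either $p2^k = q3^m$ or $p, q, k, m$ is one of finitely many exceptions.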

Make c small enough to rule out the exceptions.

For $a, b \in \mathbb{Z}_{>0}$ with $b \ge 2$, let $N(a,b)$ denote the number of non-zero digits in the base $b$ expansion of $a$.

Theorem 1.8 (Senge, Straus). We have $N(a,2) + N(a,3) \to \infty$ as $a \to \infty$.

Despite the fact that this statement looks quite modest, the proof is not so simple.
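
The quantity $N(a,b)$ is easy to compute, which makes it convenient to experiment with the statement. A minimal Python sketch follows; the function name `nonzero_digits` is an assumption of this example, not notation from the notes.

```python
def nonzero_digits(a: int, b: int) -> int:
    """N(a, b): number of non-zero digits of a > 0 written in base b >= 2."""
    count = 0
    while a > 0:
        a, digit = divmod(a, b)   # peel off the lowest base-b digit
        if digit != 0:
            count += 1
    return count


# Powers of 2 have a single non-zero binary digit, so the theorem forces
# their ternary expansions to have many non-zero digits eventually.
for k in (10, 20, 40, 80):
    a = 2 ** k
    print(k, nonzero_digits(a, 2), nonzero_digits(a, 3))
```

Such experiments only illustrate the statement; they say nothing about the rate of growth.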

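Proof. Take $a \in \mathbb{Z}_{>0}$: we assume that $N(a,2) + N(a,3) < N$ for some fixed $N$. Consider its base 2 expansion.

First we will explore the consequences of having a large string of 0s in the base 2 expansion.

Suppose that the binary digits of $a$ in positions $k_2, k_2+1, \ldots, k_1-1$ are all zero. Then $a = p2^{k_1} + e_1$, where $p$ collects the digits above the string of zeroes and $e_1$ the digits below it. We know:

\[ |p| < 2^{\log_2(a) - k_1 + 1}, \qquad |e_1| < 2^{k_2}. \]

Similarly: $a = q3^{m_1} + e_2$ with $|q| < 3^{\log_3(a) - m_1 + 1}$ and $|e_2| < 3^{m_2}$.

We will make sure that $\frac{2^{k_1}}{3^{m_1}}, \frac{2^{k_2}}{3^{m_2}} \in \left[\frac{1}{3}, 3\right]$. Then

\[ |p2^{k_1} - q3^{m_1}| = |e_1 - e_2| < 3 \cdot 2^{k_2}. \]

We want to use the proposition. So we need:

\[ |p2^{k_1} - q3^{m_1}| \max(p,q) < c \max(2^{k_1},3^{m_1})^{1-\varepsilon}. \]

So we want

\[ C\,2^{k_2}\,2^{\log_2(a) - k_1} < c\,2^{k_1(1-\varepsilon)}. \]

We want

\[ \log_2(a) - k_1 < k_1 - k_2 - \varepsilon \log_2(a). \]

Since at most $N$ blocks have a non-zero digit, one of the blocks only has zeroes, which can be used with the above to show that $a$ cannot be too large.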

The constants in all results so far (except Liouville) are ineffective!

Are there any improvements of

|213pq|<100q3

(suppose 100 is the best you can get with Liouville) for q<1010101010101010. No!

To demonstrate what it means that the above results are ineffective:

Suppose that we want to find all the solutions of $x^3 - 2y^3 = 11$. Thue says that there are only finitely many. But because the result is ineffective, we have no bound on how large the largest of these is, so we would struggle to find all solutions, even with an arbitrarily powerful computer (or an army of postdocs).

1.1 Transcendence

Liouville proved that $\alpha = \sum_{n=0}^{\infty} \frac{1}{10^{n!}}$ is transcendental.

What about $e$, $\pi$, $2^{\sqrt{2}}$?

Hermite: $e$ is transcendental.

Lindemann: If $\alpha \neq 0$, then at least one of $\alpha$ or $e^{\alpha}$ is transcendental.

Theorem 1.9 (Lindemann-Weierstrass). Assuming that:

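  • $\alpha_1,\ldots,\alpha_n \in \bar{\mathbb{Q}}$ are distinct

Then $e^{\alpha_1},\ldots,e^{\alpha_n}$ are linearly independent over $\bar{\mathbb{Q}}$ (the algebraic closure of $\mathbb{Q}$).

Hilbert’s 7th problem: Let $\alpha \neq 0, 1$ be algebraic, and $\beta$ irrational algebraic. Then $\alpha^{\beta}$ is transcendental.

Note (for this problem): $\alpha^{\beta} = \exp(\beta \log \alpha)$ where $\log \alpha$ is any complex number with $e^{\log \alpha} = \alpha$. So in the above problem we can think of “$\alpha^{\beta}$ is transcendental” as meaning “any choice for $\alpha^{\beta}$ is transcendental”.

Convention: If $\alpha \in \mathbb{R}_{>0}$, then $\log \alpha \in \mathbb{R}$.

Theorem. Let $\alpha_1, \alpha_2$ be non-zero algebraic numbers. Then $\log \alpha_1, \log \alpha_2$ are linearly independent over $\bar{\mathbb{Q}}$ if and only if they are linearly independent over $\mathbb{Q}$.

Hilbert’s 7th problem follows from the above theorem. Indeed, if $\alpha^{\beta}$ were algebraic, then $\log \alpha$ and $\beta \log \alpha$ would be logarithms of non-zero algebraic numbers that are linearly dependent over $\bar{\mathbb{Q}}$, hence over $\mathbb{Q}$, which forces $\beta \in \mathbb{Q}$.

Theorem (Baker). Let $\log \alpha_1,\ldots,\log \alpha_n$ be $\mathbb{Q}$-linearly independent logarithms of algebraic numbers.

Then $1, \log \alpha_1,\ldots,\log \alpha_n$ are linearly independent over $\bar{\mathbb{Q}}$.

Conjecture 1.10 (Schanuel). Let $\alpha_1,\ldots,\alpha_n \in \mathbb{C}$ be linearly independent over $\mathbb{Q}$. Then the transcendence degree of $\mathbb{Q}(\alpha_1,\ldots,\alpha_n,e^{\alpha_1},\ldots,e^{\alpha_n})$ over $\mathbb{Q}$ is at least $n$.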

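Let $a_1,\ldots,a_n \in \mathbb{Q}_{>0}$, and $b_1,\ldots,b_n \in \mathbb{Z}$. Let $A_j$ be the max of the numerator and the denominator of $a_j$.

Let $B = \max(|b_1|,\ldots,|b_n|)$. Then

\[ b_1 \log a_1 + \cdots + b_n \log a_n \text{ close to } 0 \iff a_1^{b_1} \cdots a_n^{b_n} \text{ close to } 1. \]
If these quantities are non-zero, then
\[ |a_1^{b_1} \cdots a_n^{b_n} - 1| \ge A_1^{-B} \cdots A_n^{-B} = \exp(-(\log A_1 + \cdots + \log A_n)B), \]
\[ |b_1 \log a_1 + \cdots + b_n \log a_n| \ge \tfrac{1}{2} \exp(-(\log A_1 + \cdots + \log A_n)B). \]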
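
The first of these trivial bounds is purely a statement about denominators, so it can be checked in exact arithmetic. The Python sketch below does this for given inputs; the function name `trivial_lower_bound_holds` is invented for this illustration.

```python
from fractions import Fraction


def trivial_lower_bound_holds(a, b):
    """Check |a_1^{b_1} ... a_n^{b_n} - 1| >= A_1^{-B} ... A_n^{-B}, or product == 1.

    a: list of positive Fractions, b: list of integers of the same length.
    A_j is the larger of numerator and denominator of a_j, B = max |b_j|.
    """
    A = [max(x.numerator, x.denominator) for x in a]
    B = max(abs(e) for e in b)
    product = Fraction(1)
    for x, e in zip(a, b):
        product *= x ** e            # exact rational power (e may be negative)
    bound = Fraction(1)
    for A_j in A:
        bound /= A_j ** B            # A_1^{-B} ... A_n^{-B}
    gap = abs(product - 1)
    return gap == 0 or gap >= bound


# 2^19 = 524288 and 3^12 = 531441 are close, but not closer than 6^{-19}.
print(trivial_lower_bound_holds([Fraction(2), Fraction(3)], [19, -12]))
```

The bound holds because $a_1^{b_1}\cdots a_n^{b_n} - 1$ is a rational number whose denominator is at most $A_1^{B}\cdots A_n^{B}$.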

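Notation. Let $\alpha \in \bar{\mathbb{Q}}$; denote its minimal polynomial in $\mathbb{Z}[X]$ by $f_\alpha$.

If $f \in \mathbb{Z}[X]$, then $H(f)$ (the height of $f$) is the maximal absolute value of its coefficients.

Theorem. Let $\alpha_1,\ldots,\alpha_n \in \bar{\mathbb{Q}} \setminus \{0\}$, $\beta_0,\ldots,\beta_n \in \bar{\mathbb{Q}}$. Fix some choices of $\log \alpha_j$. Let $A_j = \max(H(f_{\alpha_j}), \exp(|\log \alpha_j|), 10)$.

Let $\Lambda = \beta_0 + \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n$. Then there exists an effective constant $C$ depending on $n$ and the degree of $\mathbb{Q}(\alpha_1,\ldots,\alpha_n,\beta_0,\beta_1,\ldots,\beta_n)$ such that either $\Lambda = 0$ or

\[ |\Lambda| > \exp(-C (\log A_1) \cdots (\log A_n)(\log B)). \]

Conjecturally: this should be

\[ |\Lambda| > \exp(-C \max(\log A_1,\ldots,\log A_n,\log B)). \]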

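Theorem. Let $\alpha_1,\ldots,\alpha_n \in \bar{\mathbb{Q}} \setminus \{0\}$. Let $\log \alpha_j$ be a choice of their logarithms. Let $b_1,\ldots,b_n \in \mathbb{Z}$. Let

\[ A_j = \max(H(f_{\alpha_j}), \exp(|\log \alpha_j|), 10), \]
\[ B = \max\left(\frac{|b_1|}{\log A_n},\ldots,\frac{|b_{n-1}|}{\log A_n}, |b_n|, 10\right), \]
\[ \Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n \quad \text{(homogeneous)}. \]

Then there is an effective constant $C$ that depends only on $n$ and the degree of $\mathbb{Q}(\alpha_1,\ldots,\alpha_n)$ such that

\[ |\Lambda| > \exp(-C (\log A_1) \cdots (\log A_n)(\log B)) \quad \text{or} \quad \Lambda = 0. \]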

Observe

\[ \exp(\operatorname{Re} \log \alpha_j) = |\alpha_j| \lesssim H(f_{\alpha_j}). \]

Recall

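\[ B = \max(|b_1|,\ldots,|b_n|, \log A_1,\ldots,\log A_n, 10). \]

Typical scenario: $\alpha_1,\ldots,\alpha_{n-1}$ fixed numbers, $b_n = 1$, $b_j \approx \log A_n$.

In the setting of Diophantine approximations, it is possible to show

\[ \left|\alpha - \frac{p}{q}\right| > c(\alpha)\,\frac{1}{q^{d - \varepsilon(\alpha)}}, \]

with $c(\alpha)$ and $\varepsilon(\alpha)$ being effective constants.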

Proposition. There is an effective absolute constant $C$ such that for all $p, q, k, m \in \mathbb{Z}_{>0}$:

\[ |p2^k - q3^m| > \frac{\max(2^k,3^m)}{\max(p,q,10)^{C \log\left(\frac{\max(k,m)}{\log \max(p,q,10)} + 10\right)}}, \]

or $p2^k = q3^m$.

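Proof. Suppose $3^m > 2^k$. Let

\[ \Lambda = k \log 2 - m \log 3 + 1 \cdot \log\left(\frac{p}{q}\right). \]

\[ A_2 = A_1 = 10, \qquad A_3 = \max(p,q,10), \]

\[ B = \frac{\max(k,m)}{\log A_3} + 1. \]

Then:

\[ |\Lambda| > \exp(-C \log A_3 \log B) = A_3^{-C \log B}. \]

\[ |\exp(\Lambda) - 1| > \frac{1}{10}|\Lambda| \]

\[ |\exp(\Lambda) - 1| = \left|2^k 3^{-m} \frac{p}{q} - 1\right| \ge A_3^{-\tilde{C} \log B} \]

Multiply by $q3^m$.

Before:

\[ |p2^k - q3^m| > C\,\frac{\max(2^k,3^m)^{1-\varepsilon}}{\max(p,q)}. \]

The new bound wins when $\max(p,q) < \max(2^k,3^m)^{o(1)}$.

In particular, when $p = q = 1$:

\[ |2^k - 3^m| > \frac{\max(2^k,3^m)}{\max(k,m)^{C}} \quad \text{vs} \quad |2^k - 3^m| > C\,2^{(1-\varepsilon)k}. \]

$p_1 2^{k_1} + p_2 3^{k_2} + p_3 5^{k_3}$ for $k_1, k_2, k_3 \in \mathbb{Z}_{>0}$, $p_1, p_2, p_3 \in \mathbb{Z}$.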

Recall: N(a,b) is the number of non-zero digits in the base b expansion of a.

Theorem (Stewart). There is an effective absolute constant $C$ such that

\[ N(a,2) + N(a,3) \ge \frac{\log \log a}{\log \log \log a + C} - 1, \]

for all sufficiently large $a$.

Digit expansion of $a$:

\[ a = p2^{k_1} + e_1. \]

We need $p^{K} e_1 < 2^{k_1}$ where $K = C \log \log_2 a$ (this is an upper bound for the exponent of $\max(p,q,10)$ in the proposition). Previously we had $p e_1 < 2^{k_1(1-\varepsilon)}$.

Alternative to heights of minimal polynomials (is better behaved under operations like addition):

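Definition 1.11 (Mahler measure). Let $P \in \mathbb{C}[X]$,

\[ P(X) = a_d X^d + a_{d-1} X^{d-1} + \cdots + a_0 = a_d (X - \alpha_1) \cdots (X - \alpha_d). \]

Then we define

\[ M(P) = |a_d| \prod_{j=1}^{d} \max(1,|\alpha_j|). \]

We could define the height of an algebraic number $\alpha$ as

\[ H(\alpha) = M(f_\alpha)^{1/\deg f_\alpha}, \]

but instead we will define it in a different (but equivalent) way.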

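Consider two algebraic integers $\alpha, \beta$ of degrees $d_1$ and $d_2$, and assume

\[ [\mathbb{Q}(\alpha+\beta) : \mathbb{Q}] = [\mathbb{Q}(\alpha) : \mathbb{Q}] \times [\mathbb{Q}(\beta) : \mathbb{Q}]. \]

This means that the Galois-conjugates of $\alpha + \beta$ are $\alpha_i + \beta_j$ where $\alpha_i$ runs through the conjugates of $\alpha$ and $\beta_j$ runs through the conjugates of $\beta$.

Then

\[ M(f_{\alpha+\beta}) = \prod_{i,j} \max(1,|\alpha_i + \beta_j|) \le \prod_{i,j} 2 \max(1,|\alpha_i|) \max(1,|\beta_j|) = 2^{d_1 d_2} \left(\prod_i \max(1,|\alpha_i|)\right)^{d_2} \left(\prod_j \max(1,|\beta_j|)\right)^{d_1} = 2^{d_1 d_2} M(f_\alpha)^{d_2} M(f_\beta)^{d_1}. \]

Recall that we mentioned that we could define

\[ H(\alpha) = M(f_\alpha)^{1/\deg f_\alpha}. \]

Then we would have

\[ H(\alpha+\beta) \le 2 H(\alpha) H(\beta). \]

Similarly,

\[ H(\alpha\beta) \le H(\alpha) H(\beta). \]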

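Proposition. Let $P \in \mathbb{C}[X]$ be of degree $d$. Then

\[ 2^{-d} H(P) \le M(P) \le (d+1) H(P). \]

Proof. For the upper bound, we use

\[ \log M(P) = \int_0^1 \log |P(e^{2\pi i t})|\,dt, \]

known as Jensen’s formula (enough to prove for $P$ of degree 1).

Note that

\[ |P(X)| \le (d+1) H(P) \]

for all $|X| = 1$. This with Jensen’s formula gives the upper bound. For the lower bound, write

\[ P(X) = a_d X^d + \cdots + a_1 X + a_0 = a_d (X - \alpha_1) \cdots (X - \alpha_d). \]

Then

\[ \left|\frac{a_{d-j}}{a_d}\right| = \Bigg|\sum_{\{k_1,\ldots,k_j\} \subset \{1,\ldots,d\}} \alpha_{k_1} \cdots \alpha_{k_j}\Bigg| \le \sum_{\{k_1,\ldots,k_j\} \subset \{1,\ldots,d\}} |\alpha_{k_1}| \cdots |\alpha_{k_j}| \le \binom{d}{j} \frac{M(P)}{|a_d|}. \]

The number of terms is at most $2^d$. Hence $|a_j| \le 2^d M(P)$ for all $j$.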
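
The two inequalities can be sanity-checked on polynomials whose roots are known exactly. The Python sketch below builds $P$ from a leading coefficient and rational roots; the helper names `poly_from_roots`, `mahler_measure` and `height` are assumptions made for this example.

```python
from fractions import Fraction


def poly_from_roots(lead, roots):
    """Coefficients [a_0, ..., a_d] of lead * prod_j (X - r_j)."""
    coeffs = [Fraction(lead)]
    for r in roots:
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c          # contribution of c * X^i * X
            new[i] -= c * r          # contribution of c * X^i * (-r)
        coeffs = new
    return coeffs


def mahler_measure(lead, roots):
    """M(P) = |a_d| * prod_j max(1, |alpha_j|)."""
    m = abs(Fraction(lead))
    for r in roots:
        m *= max(Fraction(1), abs(Fraction(r)))
    return m


def height(coeffs):
    """H(P): maximal absolute value of the coefficients."""
    return max(abs(c) for c in coeffs)


# P = 2 (X - 3)(X + 1/2)(X - 1/3) = 2X^3 - (17/3)X^2 - (4/3)X + 1
lead, roots = 2, [Fraction(3), Fraction(-1, 2), Fraction(1, 3)]
P = poly_from_roots(lead, roots)
d, H, M = len(roots), height(P), mahler_measure(lead, roots)
print(H, M, Fraction(1, 2 ** d) * H <= M <= (d + 1) * H)
```

Rational roots are used only so that everything stays exact; the proposition itself concerns arbitrary complex polynomials.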

Absolute Values

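Let $K$ be a number field. Then a function $|\cdot| : K \to \mathbb{R}_{\ge 0}$ is an absolute value if:

  • $|x| = 0$ if and only if $x = 0$

  • $|xy| = |x||y|$

  • $|x+y| \le |x| + |y|$

Example.

  • If $\sigma : K \to \mathbb{C}$ is an embedding, then $|\alpha|_\sigma = |\sigma(\alpha)|$ is an absolute value.

  • If $\mathfrak{P} \subset O_K$ is a prime ideal lying above the rational prime $p$, with ramification index $e_{\mathfrak{P}}$, then $|\alpha|_{\mathfrak{P}} = p^{-\operatorname{ord}_{\mathfrak{P}}(\alpha)/e_{\mathfrak{P}}}$ is an absolute value.

Comment on the normalisation: for $\alpha \in \mathbb{Q}$, we have $|\alpha|_\sigma = |\alpha|_\infty$, and $|\alpha|_{\mathfrak{P}} = |\alpha|_p$.

The set of places $M_K$ of $K$ comprises the embeddings $K \to \mathbb{C}$ (up to complex conjugation) and the non-zero prime ideals of $O_K$.

For $v \in M_K$, $|\cdot|_v$ denotes the absolute value given above.

Infinite places: $M_{K,\infty}$: embeddings.

Finite places: $M_{K,f}$: prime ideals.

For $v \in M_K$, we define $d_v$ as follows:

  • $d_v = 1$ if $v$ is a real embedding

  • $d_v = 2$ if $v$ is a complex (non-real) embedding

  • $d_v = e_v f_v$ if $v$ is a prime ideal with ramification index $e_v$ and inertia degree $f_v$

Comment:

\[ d_v = [K_v : \mathbb{Q}_p] \]

where $p$ is the place of $\mathbb{Q}$ below $v$.

Let $L/K$ be an extension of number fields. Then $w \in M_L$ lies above $v \in M_K$, in notation $w \mid v$, if either both are embeddings and $w|_K = v$ or $w|_K = \bar{v}$, or both are finite and $w$ lies over $v$ as prime ideals, i.e. $w \mid vO_L$.

Remark. $\sum_{v \mid \infty} d_v = [K : \mathbb{Q}]$, $\sum_{v \mid p} d_v = [K : \mathbb{Q}]$.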

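Proposition (Product formula). Let $K$ be a number field. Then for all $\alpha \in K \setminus \{0\}$, we have

\[ \prod_{v \in M_K} |\alpha|_v^{d_v} = 1. \]

Proof. We compute $N(\alpha O_K)$ in two ways.

\[ N(\alpha O_K) = \prod_{v \in M_{K,f}} N(v)^{\operatorname{ord}_v(\alpha)} = \prod_{v \in M_{K,f}} p^{f_v \operatorname{ord}_v(\alpha)}, \]

where $p$ (depending on $v$) is the rational prime lying below $v$.

Recall $|\alpha|_v = p^{-\operatorname{ord}_v(\alpha)/e_v} = p^{-\operatorname{ord}_v(\alpha) f_v / d_v}$. So

\[ N(\alpha O_K) = \prod_{v \in M_{K,f}} |\alpha|_v^{-d_v}. \]

Also,

\[ N(\alpha O_K) = |N_{K/\mathbb{Q}}(\alpha)| = \prod_{v \in M_{K,\infty}} |\alpha|_v^{d_v}. \]

Dividing the equations gives the desired result.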

Now we define

\[ H(\alpha) = \left(\prod_{v \in M_K} \max(1,|\alpha|_v)^{d_v}\right)^{1/[K:\mathbb{Q}]}. \]

We will also use $h(\alpha) = \log H(\alpha)$. We won’t be using $h$ much, but we mention it mostly because it is used in the literature.

H is known as “multiplicative height”, while h is known as “logarithmic / absolute / Weil height”.

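Proposition 1.12. Let $L/K$ be an extension of number fields. Let $\alpha \in K$. Then $H(\alpha)$ as defined above is the same for $K$ and $L$.

Proof. Claim 1: If $w \in M_L$, $v \in M_K$ are such that $w \mid v$, then $|\alpha|_w = |\alpha|_v$ for all $\alpha \in K$.

Claim 2: $\sum_{w \mid v} d_w = [L:K]\,d_v$.

Assuming these claims are true, then for $\alpha \in K$

\[ \prod_{w \mid v} \max(1,|\alpha|_w)^{d_w} = \max(1,|\alpha|_v)^{[L:K] d_v}. \]

Then

\[ \left(\prod_{w \mid v} \max(1,|\alpha|_w)^{d_w}\right)^{1/[L:\mathbb{Q}]} = \max(1,|\alpha|_v)^{d_v/[K:\mathbb{Q}]}. \]

Taking the product over all $v \in M_K$ gives the desired result.

Proof of Claim 1: If $v, w$ are embeddings then

\[ |\alpha|_w = |w(\alpha)| = |v(\alpha)| = |\alpha|_v. \]

If $w, v$ are prime ideals, then we need

\[ \frac{\operatorname{ord}_w(\alpha)}{e_w} = \frac{\operatorname{ord}_v(\alpha)}{e_v}. \]

For this, note that for all ideals $I \subset O_K$, we have

\[ \operatorname{ord}_w(I O_L) = \operatorname{ord}_w(v O_L) \operatorname{ord}_v(I). \]

Use this for $pO_K$ and $\alpha O_K$ in the role of $I$:

\[ e_w = \operatorname{ord}_w(v O_L)\,e_v, \qquad \operatorname{ord}_w(\alpha) = \operatorname{ord}_w(v O_L) \operatorname{ord}_v(\alpha). \]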

Proof of Claim 2: Omitted.

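Proposition. Let $\alpha \in \bar{\mathbb{Q}} \setminus \{0\}$. Then

\[ H(\alpha) = M(f_\alpha)^{1/\deg(f_\alpha)}. \]

Remark. Recall $2^{-d} H(f_\alpha) \le H(\alpha)^d \le (1+d) H(f_\alpha)$, where $d = \deg f_\alpha$.

Proof. Write $f_\alpha = a_d(X - \alpha_1) \cdots (X - \alpha_d)$. It is enough to prove

\[ |a_d|^{[K:\mathbb{Q}]/d} = \prod_{v \in M_{K,f}} \max(1,|\alpha|_v)^{d_v}, \]

where $K$ is a number field with $\alpha \in K$.

Indeed, the infinite places contribute $\prod_{v \in M_{K,\infty}} \max(1,|\alpha|_v)^{d_v} = \left(\prod_{j=1}^{d} \max(1,|\alpha_j|)\right)^{[K:\mathbb{Q}]/d}$; if $K = \mathbb{Q}(\alpha)$, then this is immediate from the definitions.

For a polynomial $P \in K[X]$, we write $|P|_v$ for the maximum $|\cdot|_v$ of all the coefficients of $P$.

A variant of Gauss’s lemma can be stated as follows: Let $Q_1, Q_2 \in K[X]$. Then $|Q_1 Q_2|_v = |Q_1|_v |Q_2|_v$ for $v \in M_{K,f}$.

Observe that $|f_\alpha|_v = 1$ (for all $v \in M_{K,f}$) because the coefficients are coprime rational integers. We take $K$ to be the splitting field of $f_\alpha$. Gauss’s lemma gives

\[ \prod_{v \in M_{K,f}} |a_d|_v^{d_v} \prod_{v \in M_{K,f}} \prod_{j=1}^{d} \max(1,|\alpha_j|_v)^{d_v} = 1. \]

Let $\sigma$ be an automorphism of $K$ such that $\sigma\alpha_j = \alpha$ for some fixed $j$. This permutes $M_{K,f}$. That is, for every $v \in M_{K,f}$, there exists $\sigma v \in M_{K,f}$ such that $|\sigma\beta|_{\sigma v} = |\beta|_v$. So

\[ \prod_{v \in M_{K,f}} \max(1,|\alpha_j|_v)^{d_v} = \prod_{v \in M_{K,f}} \max(1,|\sigma\alpha_j|_{\sigma v})^{d_{\sigma v}} = \prod_{v \in M_{K,f}} \max(1,|\alpha|_v)^{d_v}. \]

By the product formula:

\[ \prod_{v \in M_{K,f}} |a_d|_v^{d_v} = \prod_{v \in M_{K,\infty}} |a_d|_v^{-d_v} = |a_d|^{-[K:\mathbb{Q}]}. \]

So, with $d = [\mathbb{Q}(\alpha):\mathbb{Q}]$,

\[ \left[\prod_{v \in M_{K,f}} \max(1,|\alpha|_v)^{d_v}\right]^{d} = |a_d|^{[K:\mathbb{Q}]}. \]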

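Lemma. Let $\alpha \in \bar{\mathbb{Q}} \setminus \{0\}$, and $k \in \mathbb{Z}$. Then

\[ H(\alpha^k) = H(\alpha)^{|k|}. \]

Proof. If $k > 0$, then this is immediate from the definition. So we just need to consider $k = -1$:

\[ H(\alpha^{-1})^d = \prod_{v \in M_K} \max(1,|\alpha|_v^{-1})^{d_v} \]

($d = \deg \alpha$, $K = \mathbb{Q}(\alpha)$). We multiply this by

\[ \prod_{v \in M_K} |\alpha|_v^{d_v} = 1. \]

So

\[ H(\alpha^{-1})^d = \prod_{v \in M_K} \max(|\alpha|_v, 1)^{d_v} = H(\alpha)^d. \]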

Let P be a polynomial in possibly several variables, with complex coefficients. Then L(P) is defined to be the sum of the absolute values of all the coefficients. This is sometimes called the length of P.

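Proposition. Let $k > 1$, $n_1,\ldots,n_k \in \mathbb{Z}_{>0}$. Let $P, Q \in \mathbb{Z}[X_1,\ldots,X_k]$ be of degree at most $n_j$ in $X_j$. Let $\alpha_1,\ldots,\alpha_k \in \bar{\mathbb{Q}}$ with $Q(\alpha_1,\ldots,\alpha_k) \neq 0$. Then:

\[ H\left(\frac{P(\alpha_1,\ldots,\alpha_k)}{Q(\alpha_1,\ldots,\alpha_k)}\right) \le \max(L(P),L(Q)) \prod_{j=1}^{k} H(\alpha_j)^{n_j}. \]

In particular: $H(\alpha\beta) \le H(\alpha) H(\beta)$ and $H(\alpha+\beta) \le 2 H(\alpha) H(\beta)$.

Proof. Let $K$ be a number field containing all $\alpha_i$. Writing $P(\cdot)$ for $P(\alpha_1,\ldots,\alpha_k)$, and similarly for $Q$,

\[ H\left(\frac{P(\cdot)}{Q(\cdot)}\right)^{[K:\mathbb{Q}]} = \prod_{v \in M_K} \max\left(1, \left|\frac{P(\cdot)}{Q(\cdot)}\right|_v\right)^{d_v} = \prod_{v \in M_K} \max(|Q(\cdot)|_v, |P(\cdot)|_v)^{d_v}, \]

where the last step is from the product formula for $Q(\cdot)$.

Let first $v \in M_{K,f}$. Then

\[ |P(\alpha_1,\ldots,\alpha_k)|_v \le \max_{\substack{j_1 = 0,\ldots,n_1 \\ \cdots \\ j_k = 0,\ldots,n_k}} |\alpha_1|_v^{j_1} \cdots |\alpha_k|_v^{j_k} = \prod_{i=1}^{k} \max(1,|\alpha_i|_v^{n_i}). \]

For $v \in M_{K,\infty}$:

\[ |P(\alpha_1,\ldots,\alpha_k)|_v \le L(P) \prod_{i=1}^{k} \max(1,|\alpha_i|_v^{n_i}). \]

So

\[ H\left(\frac{P(\cdot)}{Q(\cdot)}\right)^{[K:\mathbb{Q}]} \le \max(L(P),L(Q))^{[K:\mathbb{Q}]} \prod_{i=1}^{k} \prod_{v \in M_K} \max(1,|\alpha_i|_v)^{n_i d_v}. \]

Then taking the $[K:\mathbb{Q}]$-th root of both sides gives the desired inequality.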

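Lemma. Let $\alpha \in \bar{\mathbb{Q}} \setminus \{0\}$. Then:

\[ H(\alpha)^{-\deg \alpha} \le |\alpha| \le H(\alpha)^{\deg \alpha}. \]

This is sometimes known as the “trivial bound” or “Liouville’s bound”.

Proof. With $K = \mathbb{Q}(\alpha)$,

\[ H(\alpha)^{\deg \alpha} = \prod_{v \in M_K} \max(1,|\alpha|_v)^{d_v} \ge |\alpha|. \]

Apply this for $\alpha^{-1}$ (with $d = \deg \alpha$):

\[ |\alpha^{-1}| \le H(\alpha^{-1})^d = H(\alpha)^d \implies |\alpha| \ge H(\alpha)^{-d}. \]

Theorem (Siegel). Let $\alpha$ be a real algebraic irrational number of degree $d$. Then for all $\varepsilon > 0$, there exists $c = c(\alpha,\varepsilon) > 0$ such that

\[ \left|\alpha - \frac{p}{q}\right| \ge c\,q^{-2\sqrt{d} - \varepsilon} \]

for all $p \in \mathbb{Z}$, $q \in \mathbb{Z}_{>0}$.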

We will spend the next 3-5 lectures proving this.

We will spend today’s lecture discussing an outline of the proof, discussing why certain parts are necessary and also some intuition as to why one would expect this method to work.

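1 variable is not enough: let $P(X) \in \mathbb{Z}[X]$ be of degree $n$. Then $P$ may vanish at $\alpha$ to order at most $\frac{n}{d}$. Then we have a lower bound of

\[ \left|P\left(\frac{p}{q}\right)\right| \ge \frac{1}{q^n}, \]

and we might hope for an upper bound like

\[ \left|P\left(\frac{p}{q}\right)\right| \lesssim \left|\alpha - \frac{p}{q}\right|^{n/d}. \]

To get a contradiction, we need $\left|\alpha - \frac{p}{q}\right|^{n/d} < \frac{1}{q^n}$, i.e. $\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^d}$.

Lower bound:

\[ \left|P\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right)\right| \ge \frac{1}{q_1^{n_1} q_2^{n_2}} \]

where $n_1$ is the degree in $X_1$ and $n_2$ is the degree in $X_2$.

Upper bound:

\[ P\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right) = \sum_{j_1,j_2} P_{j_1,j_2}(\alpha,\alpha) \left(\frac{p_1}{q_1} - \alpha\right)^{j_1} \left(\frac{p_2}{q_2} - \alpha\right)^{j_2} \]

where $P_{j_1,j_2}(X_1,X_2) = \frac{1}{j_1!\,j_2!} \frac{\partial^{j_1+j_2}}{\partial X_1^{j_1} \partial X_2^{j_2}} P(X_1,X_2)$. Note

\[ \left|\alpha - \frac{p_1}{q_1}\right|^{j_1} \left|\alpha - \frac{p_2}{q_2}\right|^{j_2} \le \frac{1}{q_1^{j_1(2\sqrt{d}+\varepsilon)}} \cdot \frac{1}{q_2^{j_2(2\sqrt{d}+\varepsilon)}} = \exp(-(2\sqrt{d}+\varepsilon)(j_1 \log q_1 + j_2 \log q_2)). \]

Index of $P$ at $(\beta_1,\beta_2)$ with respect to the weights $w_1, w_2$:

\[ I_P(\beta_1,\beta_2;w_1,w_2) = \min\{ j_1 w_1 + j_2 w_2 : P_{j_1,j_2}(\beta_1,\beta_2) \neq 0 \}. \]

Use $w_1 = \log q_1$, $w_2 = \log q_2$. With this, we get the upper bound

\[ \left|P\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right)\right| \lesssim \exp(-(2\sqrt{d}+\varepsilon) I_P(\alpha,\alpha)). \]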
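
For small examples the index can be computed by brute force from its definition. The sketch below is illustrative only: it represents a polynomial as a dictionary mapping exponent pairs to coefficients, and the names `divided_derivative` and `index` are assumptions of this example. It evaluates the normalised derivatives $P_{j_1,j_2}$ via binomial coefficients.

```python
from fractions import Fraction
from math import comb


def divided_derivative(P, j1, j2, x1, x2):
    """Value of P_{j1,j2} = (1/(j1! j2!)) d^{j1+j2} P / dX1^{j1} dX2^{j2} at (x1, x2).

    P is a dict {(i1, i2): coefficient of X1^i1 X2^i2}.
    """
    total = Fraction(0)
    for (i1, i2), a in P.items():
        if i1 >= j1 and i2 >= j2:
            total += (a * comb(i1, j1) * comb(i2, j2)
                      * x1 ** (i1 - j1) * x2 ** (i2 - j2))
    return total


def index(P, x1, x2, w1, w2):
    """min of j1*w1 + j2*w2 over (j1, j2) with P_{j1,j2}(x1, x2) != 0 (P non-zero)."""
    n1 = max(i1 for i1, _ in P)
    n2 = max(i2 for _, i2 in P)
    return min(j1 * w1 + j2 * w2
               for j1 in range(n1 + 1) for j2 in range(n2 + 1)
               if divided_derivative(P, j1, j2, x1, x2) != 0)


# P = (X1 - X2)^2 vanishes to second order along the diagonal.
P = {(2, 0): 1, (1, 1): -2, (0, 2): 1}
half, third = Fraction(1, 2), Fraction(1, 3)
print(index(P, half, half, 1, 2))    # on the diagonal
print(index(P, half, third, 1, 2))   # off the diagonal
```

Integer weights are used here to keep the arithmetic exact; in the argument above the weights are $\log q_1$ and $\log q_2$.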

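How big can $I_P(\alpha,\alpha)$ be made? We look for $P$ in the form

\[ P(X_1,X_2) = \sum_{i_1=0}^{n_1} \sum_{i_2=0}^{n_2} a_{i_1,i_2} X_1^{i_1} X_2^{i_2}. \]

The condition that $P_{j_1,j_2}(\alpha,\alpha) = 0$ is a linear equation for the $a_{i_1,i_2}$ over $\mathbb{Q}(\alpha)$.

By picking a basis of $\mathbb{Q}(\alpha)$ over $\mathbb{Q}$, this becomes a system of $d$ linear equations over $\mathbb{Q}$. To find $P$ such that $I_P(\alpha,\alpha) \ge I$ we need to solve

\[ d \cdot |\{(j_1,j_2) : j_1 \log q_1 + j_2 \log q_2 \le I\}| \approx d\,\frac{I^2}{2 \log q_1 \log q_2} \]

linear equations.

I can choose $n_1, n_2, I$, and I want to do the following:

\[ d\,\frac{I^2}{2 \log q_1 \log q_2} \le n_1 n_2, \]
\[ \exp(-(2\sqrt{d}+\varepsilon) I) \le \frac{1}{q_1^{n_1} q_2^{n_2}}, \]
\[ (2\sqrt{d}+\varepsilon) I \ge n_1 \log q_1 + n_2 \log q_2. \]

Take $n_k \approx \frac{2\sqrt{d}+\varepsilon}{2} \cdot \frac{I}{\log q_k}$ for some large $I$.

Subtleties that still need to be considered:

The coefficient of $X_1^{i_1} X_2^{i_2}$ in $P_{j_1,j_2}$ is $a_{i_1+j_1,i_2+j_2} \binom{i_1+j_1}{i_1} \binom{i_2+j_2}{j_2}$, where $a_{i_1+j_1,i_2+j_2}$ is the coefficient of $X_1^{i_1+j_1} X_2^{i_2+j_2}$ in $P$. Hence

\[ H(P_{j_1,j_2}) \le 2^{n_1+n_2} H(P). \]

Thue: $P(X,Y) = R_1(X) + Y R_2(X)$.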

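Let $L$ be a linear form in $K[X_1,\ldots,X_N]$ where $K$ is a number field.

For $v \in M_K$: $|L|_v = \max_j(|a_j|_v)$ where $L = a_1 X_1 + \cdots + a_N X_N$. Then define

\[ H(L) = \left(\prod_{v \in M_K} |L|_v^{d_v}\right)^{1/[K:\mathbb{Q}]}. \]

By the product formula, this is invariant under multiplication by an element $\alpha \in K^{\times}$:

\[ |\alpha L|_v = |\alpha|_v |L|_v, \]

so

\[ H(\alpha L)^{[K:\mathbb{Q}]} = \prod_{v \in M_K} |\alpha L|_v^{d_v} = H(L)^{[K:\mathbb{Q}]} \prod_{v \in M_K} |\alpha|_v^{d_v} = H(L)^{[K:\mathbb{Q}]}. \]

Lemma (Siegel’s lemma). Let $K$ be a number field of degree $D$. Let $M, N \in \mathbb{Z}_{>0}$ be such that $N > MD$ and let $H \ge 1$. Let $L_1,\ldots,L_M \in K[X_1,\ldots,X_N]$ be linear forms such that $H(L_j) \le H$. Then there exist $x_1,\ldots,x_N \in \mathbb{Z}$ (not all 0) such that $L_j(x_1,\ldots,x_N) = 0$ for $j = 1,\ldots,M$ and

\[ |x_i| \le (NH)^{\frac{MD}{N-MD}}. \]

In particular, if $N \ge 2MD$, then the bound is at most $NH$.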

There is a refinement of this lemma which is due to Bombieri and Vaaler.

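Corollary. Let $\alpha$ be an algebraic number of degree $D$. Let $w_1, w_2, \delta > 0$, and let $I > 0$. Let $n_1, n_2 \in \mathbb{Z}_{>0}$. Suppose that

\[ |\{(i_1,i_2) \in \mathbb{Z}_{\ge 0}^2 : i_1 w_1 + i_2 w_2 < I\}| \le \frac{(n_1+1)(n_2+1)}{(1+\delta)D}. \]

Then there exists $P \neq 0$ in $\mathbb{Z}[X_1,X_2]$ of degree at most $n_j$ in $X_j$ such that $I_P(\alpha,\alpha;w_1,w_2) \ge I$ and

\[ H(P) \le (4H(\alpha))^{(n_1+n_2)\delta^{-1}}, \]

where $H(P)$ is the maximal absolute value of the coefficients.

Proof. For $(i_1,i_2)$ consider:

\[ L_{i_1,i_2} = \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} \binom{j_1}{i_1} \binom{j_2}{i_2} a_{j_1,j_2}\,\alpha^{j_1 - i_1 + j_2 - i_2}, \]

where the $a_{j_1,j_2}$ are the variables of $L_{i_1,i_2}$. Then

\[ L_{i_1,i_2}((a_{j_1,j_2})_{j_1,j_2}) = 0 \iff P_{i_1,i_2}(\alpha,\alpha) = 0, \]

where

\[ P = \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} a_{j_1,j_2} X_1^{j_1} X_2^{j_2}. \]

We need to find $(a_{j_1,j_2})_{j_1,j_2}$ such that $L_{i_1,i_2}((a_{j_1,j_2})) = 0$ for all $i_1, i_2$ with $i_1 w_1 + i_2 w_2 < I$.

Apply Siegel’s lemma:

\[ N = (n_1+1)(n_2+1), \qquad M \le \frac{N}{(1+\delta)D}. \]

Then

\[ \frac{MD}{N - MD} \le \frac{MD}{(1+\delta)MD - MD} = \delta^{-1}. \]

We need to estimate $H(L_{i_1,i_2})$. For finite places $v$,

\[ |L_{i_1,i_2}|_v \le \max(1,|\alpha|_v)^{n_1+n_2}. \]

For infinite places:

\[ |L_{i_1,i_2}|_v \le 2^{n_1} 2^{n_2} \max(1,|\alpha|_v)^{n_1+n_2}. \]

Then

\[ H(L_{i_1,i_2}) \le 2^{n_1+n_2} H(\alpha)^{n_1+n_2} =: H. \]

Then Siegel’s lemma gives the bound

\[ \left[2^{n_1+n_2} H(\alpha)^{n_1+n_2} (n_1+1)(n_2+1)\right]^{\delta^{-1}} \le (4H(\alpha))^{(n_1+n_2)\delta^{-1}}, \]

using $(n_1+1)(n_2+1) \le 2^{n_1+n_2}$.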

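Proof of Siegel’s lemma for $K = \mathbb{Q}$. We can assume that the coefficients of each $L_j$ are integers, and that they are relatively prime. Then each coefficient is bounded by $H$. Take $Y = (NH)^{\frac{M}{N-M}}$ (here $D = 1$).

Consider $(y_1,\ldots,y_N) \in \{0,1,\ldots,Y\}^N$. Evaluating $L_j$ at all such $(y_1,\ldots,y_N)$ we have

\[ \max L_j(y_1,\ldots,y_N) - \min L_j(y_1,\ldots,y_N) \le YHN. \]

The number of possible values of $L_j(y_1,\ldots,y_N)$ is at most $YHN + 1$.

Claim: $(YHN+1)^M < (Y+1)^N$.

Indeed:

\[ Y = (NH)^{\frac{M}{N-M}} \implies Y + 1 > (NH)^{\frac{M}{N-M}} \implies (Y+1)^N > (NH)^M (Y+1)^M. \]

The claim follows by

\[ NHY + 1 \le NH(Y+1). \]

Note that the above line uses the fact that $H \ge 1$!

By the box principle, there exist $(y_1,\ldots,y_N) \neq (z_1,\ldots,z_N)$, with entries bounded by $Y$, such that

\[ L_j(y_1,\ldots,y_N) = L_j(z_1,\ldots,z_N), \qquad j = 1,\ldots,M. \]

Then $x = y - z$ is the required solution.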
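
For small systems the box principle argument can be carried out literally by enumeration. The following Python sketch assumes integer coefficients (the case $K = \mathbb{Q}$) and uses the integer part of $Y$; the function name `siegel_small_solution` is made up for this illustration, and the search is exponential in $N$, so it is only meant for toy inputs.

```python
from itertools import product


def siegel_small_solution(rows):
    """Find a non-zero integer solution of M homogeneous equations in N > M unknowns.

    rows: list of M lists of N integer coefficients.
    Returns (x, Y) with L_j(x) = 0 for all j and |x_i| <= Y, where
    Y = floor((N*H)^(M/(N-M))) and H bounds the coefficients.
    """
    M, N = len(rows), len(rows[0])
    H = max(1, max(abs(c) for row in rows for c in row))
    # Largest integer Y with Y^(N-M) <= (N*H)^M, computed without floats.
    Y = 0
    while (Y + 1) ** (N - M) <= (N * H) ** M:
        Y += 1
    seen = {}  # value vector (L_1(y), ..., L_M(y)) -> first y attaining it
    for y in product(range(Y + 1), repeat=N):
        image = tuple(sum(c * yi for c, yi in zip(row, y)) for row in rows)
        if image in seen:
            z = seen[image]
            return [yi - zi for yi, zi in zip(y, z)], Y
        seen[image] = y
    raise AssertionError("unreachable by the counting argument")


x, Y = siegel_small_solution([[3, 5, -7]])
print(x, Y)
```

The counting claim in the proof guarantees that a collision occurs before the enumeration is exhausted.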

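In the $K = \mathbb{Q}$ case, a key step is that for $L \in \mathbb{Z}[X_1,\ldots,X_N]$ and $H(L) \le H$, the points $L(y_1,\ldots,y_N)$ are integers confined in an interval of length $NHY$ (where $y_1,\ldots,y_N = 0,\ldots,Y$).

In the general case, consider the map:

\[ \Phi : K \to \mathbb{R}^{D}, \qquad \alpha \mapsto (v(\alpha))_{v \in M_{K,\infty}}. \]

The $v$-component of $\Phi(L(y_1,\ldots,y_N))$ is confined in an interval (or box) of size $NY|L|_v$.

Let $\alpha = L(y_1,\ldots,y_N) - L(z_1,\ldots,z_N) \neq 0$. By the product formula,

\[ \prod_{v \in M_{K,\infty}} |\alpha|_v^{d_v} = \prod_{v \in M_{K,f}} |\alpha|_v^{-d_v} \ge \prod_{v \in M_{K,f}} |L|_v^{-d_v}. \]

Make sure $\prod_v l_v <$ RHS of above.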

Non-vanishing:

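Proposition. For every $\varepsilon > 0$, there exists $C = C(\varepsilon)$ such that the following holds. Let $n_1, n_2 \in \mathbb{Z}_{>0}$, and let $\frac{p_1}{q_1}, \frac{p_2}{q_2} \in \mathbb{Q}$. Suppose that

\[ \exp(n_1+n_2) < q_j^{n_j/C} \]

for $j = 1,2$, and that $\log q_2 > C \log q_1$.

Let $P \neq 0$ in $\mathbb{Z}[X_1,X_2]$ be of degree at most $n_j$ in $X_j$ for $j = 1,2$ such that

\[ H(P) < q_j^{n_j/C} \]

for $j = 1,2$. Then

\[ I_P\left(\frac{p_1}{q_1},\frac{p_2}{q_2};\log q_1,\log q_2\right) \le \varepsilon(n_1 \log q_1 + n_2 \log q_2). \]

Note: from now on, whenever we say $\frac{p}{q} \in \mathbb{Q}$, we also mean $\gcd(p,q) = 1$.

When we apply this we will have $n_1 \log q_1 \approx n_2 \log q_2$.

Without the asymmetry assumption ($\log q_2 > C \log q_1$), we have the counterexample: $P = (X_1 - X_2)^n$, with $\frac{p_1}{q_1} = \frac{p_2}{q_2}$.

Alternatively: $P = (R(X_1) - X_2 Q(X_1))^n$ (for $R, Q$ some small degree polynomials) for any $\frac{p_1}{q_1}, \frac{p_2}{q_2}$ such that

\[ \frac{p_2}{q_2} = \frac{R\left(\frac{p_1}{q_1}\right)}{Q\left(\frac{p_1}{q_1}\right)}. \]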

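Lemma. Let $F, F^{(1)}, F^{(2)} \in \mathbb{C}[X_1,X_2]$, and let $i_1, i_2 \in \mathbb{Z}_{\ge 0}$. Let $\alpha_1, \alpha_2 \in \mathbb{C}$ and $w_1, w_2 > 0$. Then the following holds:

\[ I_{F_{i_1,i_2}}(\alpha_1,\alpha_2) \ge I_F(\alpha_1,\alpha_2) - i_1 w_1 - i_2 w_2, \]
\[ I_{F^{(1)}+F^{(2)}}(\alpha_1,\alpha_2) \ge \min_{j=1,2} I_{F^{(j)}}(\alpha_1,\alpha_2), \]
\[ I_{F^{(1)}F^{(2)}}(\alpha_1,\alpha_2) = I_{F^{(1)}}(\alpha_1,\alpha_2) + I_{F^{(2)}}(\alpha_1,\alpha_2). \]

Baby case: $P(X_1,X_2) = F(X_1)G(X_2)$ for some polynomials $F, G$.

In this case if $I_P \ge \varepsilon(n_1 \log q_1 + n_2 \log q_2)$ then either $I_F \ge \varepsilon n_1 \log q_1$ or $I_G \ge \varepsilon n_2 \log q_2$.

If $F$ vanishes at $\frac{p_1}{q_1}$ to order $m$ for some $m$, then

\[ (q_1 X_1 - p_1)^m \mid F. \]

The leading coefficient of $F$ is divisible by $q_1^m$. In particular, $H(F) \ge q_1^m$. Then $H(F) \ge q_1^{\varepsilon n_1}$ or $H(G) \ge q_2^{\varepsilon n_2}$.

Hence $H(P) \ge \min(q_1^{\varepsilon n_1}, q_2^{\varepsilon n_2})$, which contradicts the assumptions.

In general, we can always write

\[ P(X_1,X_2) = F^{(1)}(X_1)G^{(1)}(X_2) + \cdots + F^{(h)}(X_1)G^{(h)}(X_2) \]

with $h \le n_2 + 1$.

Consider $h = 2$.

\[ P(X_1,X_2) = F^{(1)}(X_1)G^{(1)}(X_2) + F^{(2)}(X_1)G^{(2)}(X_2), \qquad \partial_{X_2} P = F^{(1)} \partial_{X_2} G^{(1)} + F^{(2)} \partial_{X_2} G^{(2)}, \]
\[ (\partial_{X_2} G^{(2)}) P - G^{(2)} \partial_{X_2} P = F^{(1)} \left(G^{(1)} \partial_{X_2} G^{(2)} - (\partial_{X_2} G^{(1)}) G^{(2)}\right). \]

We will later have to worry about whether the resulting polynomial is 0.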

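For any $h$:

\[
\begin{vmatrix}
P & G^{(2)} & \cdots & G^{(h)} \\
\partial_{X_2} P & \partial_{X_2} G^{(2)} & \cdots & \partial_{X_2} G^{(h)} \\
\vdots & \vdots & & \vdots \\
\partial_{X_2}^{h-1} P & \partial_{X_2}^{h-1} G^{(2)} & \cdots & \partial_{X_2}^{h-1} G^{(h)}
\end{vmatrix}
=
\begin{vmatrix}
F^{(1)} G^{(1)} & G^{(2)} & \cdots & G^{(h)} \\
F^{(1)} \partial_{X_2} G^{(1)} & \partial_{X_2} G^{(2)} & \cdots & \partial_{X_2} G^{(h)} \\
\vdots & \vdots & & \vdots \\
F^{(1)} \partial_{X_2}^{h-1} G^{(1)} & \partial_{X_2}^{h-1} G^{(2)} & \cdots & \partial_{X_2}^{h-1} G^{(h)}
\end{vmatrix}
= F^{(1)}
\begin{vmatrix}
G^{(1)} & G^{(2)} & \cdots & G^{(h)} \\
\partial_{X_2} G^{(1)} & \partial_{X_2} G^{(2)} & \cdots & \partial_{X_2} G^{(h)} \\
\vdots & \vdots & & \vdots \\
\partial_{X_2}^{h-1} G^{(1)} & \partial_{X_2}^{h-1} G^{(2)} & \cdots & \partial_{X_2}^{h-1} G^{(h)}
\end{vmatrix}
\]

The degree increases $h$-fold, but not the index.

\[
\begin{vmatrix}
P_{0,0} & P_{0,1} & \cdots & P_{0,h-1} \\
\vdots & \vdots & & \vdots \\
P_{h-1,0} & P_{h-1,1} & \cdots & P_{h-1,h-1}
\end{vmatrix}
=
\begin{vmatrix}
F^{(1)} & F^{(2)} & \cdots & F^{(h)} \\
F_1^{(1)} & F_1^{(2)} & \cdots & F_1^{(h)} \\
\vdots & \vdots & & \vdots \\
F_{h-1}^{(1)} & F_{h-1}^{(2)} & \cdots & F_{h-1}^{(h)}
\end{vmatrix}
\cdot
\begin{vmatrix}
G^{(1)} & G_1^{(1)} & \cdots & G_{h-1}^{(1)} \\
G^{(2)} & G_1^{(2)} & \cdots & G_{h-1}^{(2)} \\
\vdots & \vdots & & \vdots \\
G^{(h)} & G_1^{(h)} & \cdots & G_{h-1}^{(h)}
\end{vmatrix}
\]

where $P_{ij} = \frac{1}{i!\,j!} \frac{\partial^{i+j}}{\partial X_1^{i} \partial X_2^{j}} P$, $F_i = \frac{1}{i!} \frac{\partial^{i}}{\partial X_1^{i}} F$, and similarly $G_j = \frac{1}{j!} \frac{\partial^{j}}{\partial X_2^{j}} G$.

Lemma. Let $F^{(1)}, F^{(2)},\ldots,F^{(h)}$ be $\mathbb{Q}$-linearly independent polynomials in $\mathbb{Q}[X]$. Then

\[
\begin{vmatrix}
F^{(1)} & F^{(2)} & \cdots & F^{(h)} \\
F_1^{(1)} & F_1^{(2)} & \cdots & F_1^{(h)} \\
\vdots & \vdots & & \vdots \\
F_{h-1}^{(1)} & F_{h-1}^{(2)} & \cdots & F_{h-1}^{(h)}
\end{vmatrix}
\neq 0.
\]

(Wronskian)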

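Proof of Proposition assuming the lemma. Suppose to the contrary that the proposition does not hold for some $P, \frac{p_1}{q_1}, \frac{p_2}{q_2}$. Write $P = F^{(1)}G^{(1)} + \cdots + F^{(h)}G^{(h)}$, with $F^{(i)} \in \mathbb{Q}[X_1]$ and $G^{(i)} \in \mathbb{Q}[X_2]$, such that $h$ is minimal. Then $h \le n_2 + 1$ and the $F^{(1)},\ldots,F^{(h)}$ and $G^{(1)},\ldots,G^{(h)}$ are $\mathbb{Q}$-linearly independent. Then consider

\[
\mathcal{P} =
\begin{vmatrix}
P_{0,0} & \cdots & P_{0,h-1} \\
\vdots & & \vdots \\
P_{h-1,0} & \cdots & P_{h-1,h-1}
\end{vmatrix}
\]

and

\[ \mathcal{F} = \det\left[F_i^{(j)}\right]_{i=0,\ldots,h-1;\ j=1,\ldots,h}, \qquad \mathcal{G} = \det\left[G_j^{(i)}\right]_{i=1,\ldots,h;\ j=0,\ldots,h-1}. \]

Then $\mathcal{P}(X_1,X_2) = \mathcal{F}(X_1)\mathcal{G}(X_2) \neq 0$ by the above Lemma.

Note $\deg_{X_j} \mathcal{P} \le h n_j$, $\deg \mathcal{F} \le h n_1$, $\deg \mathcal{G} \le h n_2$. Also

\[ H(\mathcal{P}) \le \underbrace{h!}_{\text{ways to multiply entries}} \Big(\underbrace{(n_1+1)(n_2+1)}_{\text{monomials in the entries}}\Big)^{h} \Big(\underbrace{2^{n_1+n_2} H(P)}_{\text{coefficients of entries}}\Big)^{h} \le h!\,2^{(n_1+n_2)h}\,2^{(n_1+n_2)h}\,q_j^{h n_j/C} \]

for $j = 1,2$.

$H(\mathcal{P}) = H(\mathcal{F}) H(\mathcal{G})$. Then, using $h! \le 2^{(n_1+n_2)h}$ and $\exp(n_1+n_2) < q_j^{n_j/C}$,

\[ H(\mathcal{F}) \le \left(8^{n_1+n_2} q_1^{n_1/C}\right)^{h} \le q_1^{10 h n_1/C}, \]

\[ H(\mathcal{G}) \le \left(8^{n_1+n_2} q_2^{n_2/C}\right)^{h} \le q_2^{10 h n_2/C}. \]

We have $I_{P_{i,j}} \ge I_P - i \log q_1 - j \log q_2$. Consider the columns with $j \le \frac{\varepsilon h}{10} + 1$, and recall that we may take $\log q_1 < \frac{\varepsilon}{10} \log q_2$. By the indirect assumption
\[ I_P \ge \varepsilon(n_1 \log q_1 + n_2 \log q_2), \]
so for these columns
\[ I_{P_{i,j}} \ge \frac{\varepsilon}{2} n_2 \log q_2 + \frac{\varepsilon}{2} n_1 \log q_1, \]
and hence
\[ I_{\mathcal{P}}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right) \ge \frac{\varepsilon^2}{20} h (n_1 \log q_1 + n_2 \log q_2). \]

If $\mathcal{F}$ vanishes to order $m$ at $\frac{p_1}{q_1}$, then $q_1^m$ divides the leading coefficient of $\mathcal{F}$. In particular, $q_1^m \le H(\mathcal{F})$.

Then

\[ I_{\mathcal{F}}\left(\frac{p_1}{q_1};\log q_1\right) \le \log H(\mathcal{F}) \le \frac{10 h n_1 \log q_1}{C}, \]
\[ I_{\mathcal{G}}\left(\frac{p_2}{q_2};\log q_2\right) \le \log H(\mathcal{G}) \le \frac{10 h n_2 \log q_2}{C}. \]

If $C$ is sufficiently large in terms of $\varepsilon$, then

\[ I_{\mathcal{F}}\left(\frac{p_1}{q_1}\right) + I_{\mathcal{G}}\left(\frac{p_2}{q_2}\right) < I_{\mathcal{P}}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right). \]

This contradicts $I_{\mathcal{P}} = I_{\mathcal{F}} + I_{\mathcal{G}}$.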

Now we prove the lemma from earlier:

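Lemma. Let $F^{(1)}, F^{(2)},\ldots,F^{(h)}$ be $\mathbb{Q}$-linearly independent polynomials in $\mathbb{Q}[X]$. Then

\[
\begin{vmatrix}
F^{(1)} & F^{(2)} & \cdots & F^{(h)} \\
F_1^{(1)} & F_1^{(2)} & \cdots & F_1^{(h)} \\
\vdots & \vdots & & \vdots \\
F_{h-1}^{(1)} & F_{h-1}^{(2)} & \cdots & F_{h-1}^{(h)}
\end{vmatrix}
\neq 0.
\]

(Wronskian)

Proof. The statement does not change if we replace $F^{(j)}$ by $aF^{(i)} + bF^{(j)}$ for some $a, b \in \mathbb{Q}$ and $i \in \{1,\ldots,h\}$, $i \neq j$, provided $b \neq 0$.

Then we may assume: $F^{(i)} = X^{m_i} + \text{lower order terms}$ and the $m_i$ are distinct.

We will prove that:

\[
\begin{vmatrix}
X^{m_1} & \cdots & X^{m_h} \\
\binom{m_1}{1} X^{m_1-1} & \cdots & \binom{m_h}{1} X^{m_h-1} \\
\vdots & & \vdots \\
\binom{m_1}{h-1} X^{m_1-h+1} & \cdots & \binom{m_h}{h-1} X^{m_h-h+1}
\end{vmatrix}
\neq 0.
\]

This is the leading term of the Wronskian, so this will prove the claim. The determinant is equal to:

\[
\begin{vmatrix}
\binom{m_1}{0} & \cdots & \binom{m_h}{0} \\
\vdots & & \vdots \\
\binom{m_1}{h-1} & \cdots & \binom{m_h}{h-1}
\end{vmatrix}
X^{M}, \qquad M = m_1 + \cdots + m_h - \frac{h(h-1)}{2}.
\]

Suppose to the contrary that a non-trivial linear combination of the rows is $(0,0,\ldots,0)$. Now the $i$-th row is a polynomial of degree $i-1$ evaluated at $m_1,\ldots,m_h$. Then the linear combination of the rows is a non-zero polynomial of degree at most $h-1$ evaluated at $m_1,\ldots,m_h$. Such a polynomial cannot vanish at $h$ distinct points, a contradiction.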
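
The non-vanishing of the binomial determinant for distinct exponents can be checked in exact arithmetic. The Python sketch below is an illustration with invented helper names (`det`, `binomial_determinant`); the determinant is computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction
from math import comb


def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)           # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                 # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def binomial_determinant(ms):
    """det of the h x h matrix with entry binom(m_j, i) in row i, column j."""
    h = len(ms)
    return det([[comb(m, i) for m in ms] for i in range(h)])


print(binomial_determinant([0, 2, 5]))      # distinct exponents
print(binomial_determinant([1, 1, 4]))      # repeated exponent
```

With a repeated exponent two columns coincide, so the determinant vanishes; this is why the reduction to distinct $m_i$ is needed.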

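Theorem. Let $\alpha$ be an irrational, real algebraic number of degree $d \ge 2$. Then for all $\varepsilon > 0$, there exists $C = C(\alpha,\varepsilon) > 0$ such that

\[ \left|\alpha - \frac{p}{q}\right| > C\,q^{-2\sqrt{d} - \varepsilon}, \]

for all $\frac{p}{q} \in \mathbb{Q}$.

Proof. Suppose to the contrary that there are infinitely many $\frac{p}{q} \in \mathbb{Q}$ with

\[ \left|\alpha - \frac{p}{q}\right| < q^{-2\sqrt{d} - \varepsilon}. \]

Then fix $\varepsilon_0 > 0$ sufficiently small in terms of $\alpha, \varepsilon$ and let $C$ be the constant when the proposition is applied with $\varepsilon_0$ in place of $\varepsilon$.

Now let $\frac{p_1}{q_1}, \frac{p_2}{q_2}$ be such that

\[ \left|\alpha - \frac{p_j}{q_j}\right| < q_j^{-2\sqrt{d} - \varepsilon} \qquad (j = 1,2) \]

and

\[ \log q_1 > C \varepsilon_0^{-1}, \qquad \log q_2 > C \log q_1. \]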

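We use Siegel’s lemma to construct $P(X_1,X_2)$ that vanishes at $(\alpha,\alpha)$ to high order.

We choose $n_1, n_2$ such that

\[ n_1 \log q_1 \le n_2 \log q_2 \le n_1 \log q_1 + \log q_1. \]

We want a polynomial $P$ such that

\[ I_P(\alpha,\alpha) \ge I := \frac{n_1 \log q_1 + n_2 \log q_2}{2\sqrt{d} + \frac{\varepsilon}{10}}. \]

For this we need to estimate

\[ |\{(i_1,i_2) \in \mathbb{Z}_{\ge 0}^2 : i_1 \log q_1 + i_2 \log q_2 \le I\}| \le \frac{(I + \log q_1 + \log q_2)^2}{2 \log q_1 \log q_2} \le \frac{(n_1+1)(n_2+1)}{(1+\delta)d}. \]

This is because

\[ I \approx \frac{2 n_1 \log q_1}{2\sqrt{d}} \approx \frac{2 n_2 \log q_2}{2\sqrt{d}}, \]

so

\[ \frac{I^2}{2 \log q_1 \log q_2} \approx \frac{2 n_1 \cdot 2 n_2}{2 \cdot 2^2 d} = \frac{n_1 n_2}{2d}. \]

So we find $P \in \mathbb{Z}[X_1,X_2]$ such that $I_P(\alpha,\alpha;\log q_1,\log q_2) \ge I$ and $H(P) \le (4H(\alpha))^{\delta^{-1}(n_1+n_2)}$. We need:

\[ H(P),\ \exp(n_1+n_2) \le q_j^{n_j/C} \approx q_1^{n_1/C} \]

for $j = 1,2$ and $\log q_2 > C \log q_1$. This will be fine if $(4H(\alpha))^{\delta^{-1}} < q_1^{1/C}$. This is fine if $\varepsilon_0$ is sufficiently small with respect to $\alpha$ and $\delta$.

Then $I_P\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right) \le \varepsilon_0(n_1 \log q_1 + n_2 \log q_2)$ by the proposition. Then there exists $\tilde{P}$, a partial derivative of $P$, such that

\[ H(\tilde{P}) \le (8H(\alpha))^{\delta^{-1}(n_1+n_2)}, \]
\[ I_{\tilde{P}}(\alpha,\alpha) \ge I - \varepsilon_0(n_1 \log q_1 + n_2 \log q_2) \ge \frac{n_1 \log q_1 + n_2 \log q_2}{2\sqrt{d} + \frac{\varepsilon}{5}}, \]

if $\varepsilon_0$ is sufficiently small.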

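$\tilde{P}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right) \neq 0$. Then, since this is a non-zero rational number whose denominator divides $q_1^{n_1} q_2^{n_2}$,

\[ \left|\tilde{P}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right)\right| \ge \frac{1}{q_1^{n_1} q_2^{n_2}}. \]

Taylor’s formula:

\[ \tilde{P}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right) = \sum_{i_1,i_2} \tilde{P}_{i_1,i_2}(\alpha,\alpha) \left(\frac{p_1}{q_1} - \alpha\right)^{i_1} \left(\frac{p_2}{q_2} - \alpha\right)^{i_2}. \]

If $i_1, i_2$ are such that $\tilde{P}_{i_1,i_2}(\alpha,\alpha) \neq 0$, then

\[ i_1 \log q_1 + i_2 \log q_2 \ge \frac{n_1 \log q_1 + n_2 \log q_2}{2\sqrt{d} + \frac{\varepsilon}{5}}, \]

hence

\[ \left|\alpha - \frac{p_1}{q_1}\right|^{i_1} \left|\alpha - \frac{p_2}{q_2}\right|^{i_2} < \exp\left(-(2\sqrt{d}+\varepsilon)\,\frac{n_1 \log q_1 + n_2 \log q_2}{2\sqrt{d} + \frac{\varepsilon}{5}}\right) = \left(q_1^{n_1} q_2^{n_2}\right)^{-\frac{2\sqrt{d}+\varepsilon}{2\sqrt{d}+\varepsilon/5}}. \]

The exponent is smaller than $-1$!

Now estimate the coefficients:

\[ |\tilde{P}_{i_1,i_2}(\alpha,\alpha)| \le (n_1+1)(n_2+1)\,(8H(\alpha))^{\delta^{-1}(n_1+n_2)} \max(1,|\alpha|)^{n_1+n_2} < C_1(\alpha,\varepsilon)^{n_1+n_2} \]

and

\[ \left|\tilde{P}\left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right)\right| \le (n_1+1)(n_2+1)\,C_1(\alpha,\varepsilon)^{n_1+n_2} \left(q_1^{n_1} q_2^{n_2}\right)^{-\frac{2\sqrt{d}+\varepsilon}{2\sqrt{d}+\varepsilon/5}} \le (2C_1(\alpha,\varepsilon))^{n_1+n_2} \left(q_1^{n_1} q_2^{n_2}\right)^{-\frac{2\sqrt{d}+\varepsilon}{2\sqrt{d}+\varepsilon/5}} < \left(q_1^{n_1} q_2^{n_2}\right)^{-1}. \]

Contradiction.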

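Theorem (Gelfond-Schneider). Let $\lambda_1, \lambda_2$ be logarithms of non-zero algebraic numbers. Then $\lambda_1, \lambda_2$ are linearly independent over $\bar{\mathbb{Q}}$ if and only if they are linearly independent over $\mathbb{Q}$.

We will prove this by assuming $\frac{\lambda_1}{\lambda_2} \in \bar{\mathbb{Q}} \setminus \mathbb{Q}$, and then showing that a particular determinant is both equal to zero and not equal to zero, hence getting a contradiction.

Before doing this, we will discuss how the previous proof could instead have been phrased using determinants.

We considered some functions $\varphi_1,\ldots,\varphi_L$ which were some enumeration of the monomials $X_1^{j_1} X_2^{j_2}$. Then we used Siegel’s lemma to find $a_1,\ldots,a_L \in \mathbb{Z}$ such that $P = a_1\varphi_1 + \cdots + a_L\varphi_L$ vanishes at $u_1 = (\alpha,\alpha)$ to some order. (Note that $P$ also vanishes at all Galois-conjugates of $(\alpha,\alpha)$: $u_2,\ldots,u_d$.) Then we found an argument to show that $P$ also vanishes at $u_{d+1} = \left(\frac{p_1}{q_1},\frac{p_2}{q_2}\right)$ to some order.

This means that for $i = 1,\ldots,L$ there exist $k(i) \in \{1,\ldots,d+1\}$ and some partial differentiation operator $\partial_i$ such that $\partial_i P(u_{k(i)}) = 0$. We also showed that $P$ with so much vanishing cannot exist.

Let:

\[
M =
\begin{pmatrix}
\partial_1 \varphi_1(u_{k(1)}) & \cdots & \partial_L \varphi_1(u_{k(L)}) \\
\vdots & & \vdots \\
\partial_1 \varphi_L(u_{k(1)}) & \cdots & \partial_L \varphi_L(u_{k(L)})
\end{pmatrix}
\]

Then $P$ having all that vanishing is equivalent to

\[ (a_1,\ldots,a_L) M = (0,\ldots,0). \]

Now the existence of $P$ is equivalent to $\det M = 0$.

Let $\lambda_1, \lambda_2 \in \mathbb{C} \setminus \{0\}$, and $\alpha_1 = e^{\lambda_1}$, $\alpha_2 = e^{\lambda_2} \in \bar{\mathbb{Q}}$. Let $\beta = \frac{\lambda_2}{\lambda_1} \in \bar{\mathbb{Q}} \setminus \mathbb{Q}$. So we assume that Gelfond-Schneider is false. We aim for a contradiction.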

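Let $T_0, T_1, S \in \mathbb{Z}_{>0}$ with

\[ L := (T_0+1)(2T_1+1) = (2S+1)^2. \]

Consider the “monomials”

\[ X^{\tau} \exp(t\lambda_1 X) \]

for $\tau = 0,\ldots,T_0$, $t = -T_1,\ldots,T_1$ and the points $s_1 + \beta s_2$ for $s_1, s_2 = -S,\ldots,S$.

Notation. $[\,\cdot\,]_{(\tau,t),(s_1,s_2)}$ means a matrix with rows indexed by $\tau, t$ and columns indexed by $s_1, s_2$.

Let

\[ \Delta = \det\left[(s_1+\beta s_2)^{\tau} \exp(t\lambda_1(s_1+\beta s_2))\right]_{(\tau,t),(s_1,s_2)} = \det\left[(s_1+\beta s_2)^{\tau} \alpha_1^{t s_1} \alpha_2^{t s_2}\right]_{(\tau,t),(s_1,s_2)}. \]

Steps:

  (1) An upper bound for $|\Delta|$, using analytic methods.

  (2) A lower bound for $|\Delta|$ from Liouville’s bound, valid if $\Delta \neq 0$.

  (3) Non-vanishing: $\Delta \neq 0$.

Steps (1) and (2) will be done in such a way that together they will give $\Delta = 0$. Then this will contradict (3).

We will alternate between viewing $(s_1+\beta s_2)^{\tau} \exp(t\lambda_1(s_1+\beta s_2))$ as a function of a single variable (function of $s_1+\beta s_2$) and thinking of it as a function of two variables (function of $s_1$ and $s_2$).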

Upper bound

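Proposition. For $n \in \mathbb{Z}_{>0}$, there exists $c = c(n) > 0$ such that the following holds:

Let $L \in \mathbb{Z}_{>0}$, $E > 1$. Let $f_1,\ldots,f_L : \mathbb{C}^n \to \mathbb{C}$ be analytic functions (here, analytic means convergent power series on $\mathbb{C}^n$). Let $\xi_1,\ldots,\xi_L \in \mathbb{C}^n$. Let $r = \max_{s=1,\ldots,L;\ j=1,\ldots,n} |\xi_{s,j}|$. Then

\[ \left|\det[f_t(\xi_s)]_{t=1,\ldots,L;\ s=1,\ldots,L}\right| \le E^{-cL^{1+\frac{1}{n}}}\,L! \prod_{t=1}^{L} |f_t|_{Er}. \]

Notation. $|f|_R = \max_{|x_1|,\ldots,|x_n| \le R} |f(x_1,\ldots,x_n)|$.

Corollary. With $\Delta, T_0, T_1, S, L$ as above, there exist $c, C > 0$ depending only on $\beta, \lambda_1$, such that for all $E \ge e$:

\[ |\Delta| \le \exp(-cL^2 \log E + CLT_0 \log(ES) + CLT_1 ES). \]

Proof. We take $n = 1$ and some $E \ge e$. We have $|s_1+\beta s_2| < C_0 S$ with $C_0 = C_0(\beta)$.

\[ |z^{\tau} \exp(t\lambda_1 z)| < \exp(C_1 T_0 \log ES + C_1 T_1 ES) \]

for $|z| < EC_0 S$, with $C_1 = C_1(\beta,\lambda_1)$.

One possible choice of the parameters: $E = e$, $S \approx L^{\frac{1}{2}}$, $T_0 \approx L^{1-\varepsilon}$, $T_1 \approx L^{\varepsilon}$. In this case:

\[ |\Delta| \le \exp(-cL^2) \]

(for large $L$, after decreasing $c$).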

Lemma (Schwarz’s Lemma). Let $f$ be a holomorphic function on $D_R$, the disc of radius $R$, with a zero of order $k$ at 0. Then for all $z \in D_R$:

\[ |f(z)| \le |z|^{k}\,|f|_R\,R^{-k}. \]

Proof. The maximum modulus principle for $\frac{f(z)}{z^{k}}$.

[Proof of Proposition. ] We apply Schwarz’s Lemma for

\[ F(z) = \det[f_t(z\xi_s)]_{t,s} \]

and $R = E$, evaluating at $z = 1$. Note: $|F|_E \le L! \prod_{t=1}^{L} |f_t|_{Er}$.

So the proposition follows if we show that $F$ vanishes to order $cL^{1+\frac{1}{n}}$ at 0. We prove this. It is enough to do it when each $f_t$ is of the form $z_1^{a_1} \cdots z_n^{a_n}$ for some $a_1,\ldots,a_n$ depending on $t$.

This is because all $f_t$s are infinite linear combinations of such monomials, and hence the determinant can be written as an infinite combination of special determinants. Furthermore we may assume that the $(a_1,\ldots,a_n)$ are distinct for different $t$s, for otherwise the determinant is 0.

Observe: $\det[f_t(z\xi_s)]_{t,s} = z^{\sum_t \deg f_t} \det[f_t(\xi_s)]_{t,s}$ if each $f_t$ is of the special form.

The number of monomials with degree less than $d$ is at most $d^n$. We take $d = \left(\frac{L}{2}\right)^{\frac{1}{n}}$. Then at least half of the $f_t$s have degree at least $d$. So $\sum_t \deg f_t \ge \frac{L}{2} d \ge cL^{1+\frac{1}{n}}$.

Proposition (1). Let $T_0, T_1, S$ be non-negative integers with $S = (T_0+1)T_1$. Let $w_1,\ldots,w_{T_1}$ and $\xi_1,\ldots,\xi_S$ be two sets of distinct real numbers.

Then

\[ \det\left[\xi_s^{\tau} \exp(w_t \xi_s)\right]_{(\tau,t),s} \neq 0, \]

with: $\tau = 0,\ldots,T_0$, $t = 1,\ldots,T_1$, $s = 1,\ldots,S$.

alternant / interpolation determinant

Proposition (2). Let $T \ge 1$, let $w_1,\ldots,w_T$ be distinct real numbers. Let $P_1,\ldots,P_T \in \mathbb{R}[X]$ be non-zero. Then the function

\[ F(x) = P_1(x) e^{w_1 x} + \cdots + P_T(x) e^{w_T x} \]

has at most $\deg P_1 + \cdots + \deg P_T + T - 1$ real zeroes counting multiplicities.

Proposition (2) $\implies$ Proposition (1). Suppose to the contrary that $\det = 0$. Then there exist $a_{\tau,t}$, not all 0, such that

\[ \sum_{\tau,t} a_{\tau,t}\,x^{\tau} \exp(w_t x) \]

vanishes for all $x = \xi_1,\ldots,\xi_S$. This is a function of the type in Proposition (2). Each polynomial is of degree at most $T_0$, and there are at most $T_1$ many of them, so there can be no more than $T_0 T_1 + T_1 - 1 < S$ zeroes.

Lemma 1.13. Let $f$ be a $C^{\infty}$ function on $\mathbb{R}$ with $N$ real zeroes (counted with multiplicity). Then $f'$ has at least $N - 1$ zeroes.

Corollary of Rolle’s Theorem.

Proof of Proposition (2). By induction on $N := \deg P_1 + \cdots + \deg P_T + T - 1$. If $N = 0$, then $T = 1$ and $\deg P_1 = 0$. So $F(x) = a \exp(w_1 x)$ for some $a \neq 0$. This indeed has no zeroes.

Suppose $N > 0$ and the claim holds for $N - 1$.

We assume as we may that $w_1 = 0$ (if not, then replace $w_j$ by $w_j - w_1$, which has the effect of replacing $F$ by $F e^{-w_1 x}$).

Then by the lemma, $F$ has at most one more zero than

\[ F'(x) = \underbrace{P_1'(x)}_{\deg = \deg P_1 - 1} + \underbrace{(P_2'(x) + w_2 P_2(x))}_{\deg = \deg P_2} e^{w_2 x} + \cdots \]

By the induction hypothesis, $F'$ has at most $N - 1$ zeroes, so $F$ has at most $N$ zeroes.

Now we return to proving Gelfond-Schneider.

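Let $\lambda_1, \lambda_2 \in \mathbb{C} \setminus \{0\}$ be such that $\alpha_j = e^{\lambda_j} \in \bar{\mathbb{Q}}$ for $j = 1,2$, and suppose that $\beta = \frac{\lambda_2}{\lambda_1} \in \bar{\mathbb{Q}}$ is irrational.

We aim for a contradiction. We have integers $L, T_0, T_1, S$ such that

\[ L = (T_0+1)(2T_1+1) = (2S+1)^2. \]

Let

\[ \Delta = \det\left[(s_1+\beta s_2)^{\tau} \exp(\lambda_1 t(s_1+\beta s_2))\right]_{(\tau,t),(s_1,s_2)}. \]

Last time:

\[ \log|\Delta| \le -cL^2 \log E + CLT_0 \log ES + CLT_1 ES \]

where $E > 1$ is arbitrary.

Apply Proposition (1) with $\xi_s = s_1 + \beta s_2$ for some enumeration of the pairs $(s_1,s_2)$, and $w_t = \lambda_1 t$ (this requires $\lambda_1$ and $\beta$ to be real, so that the proposition applies). Then $\Delta \neq 0$.

Recall:

\[ \Delta = \det\left[(s_1+\beta s_2)^{\tau} \alpha_1^{t s_1} \alpha_2^{t s_2}\right]_{(\tau,t),(s_1,s_2)}. \]

Then

\[ \Delta = P(\beta,\alpha_1,\alpha_2) \]

for some $P \in \mathbb{Z}[X,Y,Z]$. So:

\[ H(\Delta) \le L(P)\,H(\beta)^{T_0 L}\,H(\alpha_1)^{T_1 S L}\,H(\alpha_2)^{T_1 S L}, \]

using

\[ L(P_1 P_2) \le L(P_1) L(P_2) \]

and

\[ L\left(\sum_j P_j\right) \le \sum_j L(P_j) \]

we get

\[ L(P) \le L!\,(2S)^{T_0 L}. \]

Liouville bound:

\[ \log|\Delta| > -C(\log L! + T_0 L \log S + T_1 L S). \]

Take: $E = 10$.

Then we have a contradiction if

\[ -cL^2 + CLT_0 \log S + CLT_1 S < -C(L \log L + T_0 L \log S + T_1 L S). \]

I want:

\[ L^2 > C(T_0 L \log S + L T_1 S). \]

Take: $S \approx L^{\frac{1}{2}}$, $T_0 \approx L^{1-\varepsilon}$, $T_1 \approx L^{\varepsilon}$.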

Theorem (Nesterenko). Let T0,T1,N,M>0. Let Σ1,Σ22 such that |Σ1|=N, |Σ2|=M, and the exponentials of the second coordinates of Σ1 and the first coordinates of Σ2 are distinct. Let P[X,Y] of degree T0 in X, and T1 in Y. Suppose that P(X,exp(y)) vanishes on Σ1+Σ2. Then

NT1orMT0(T1+1).

Proof. If P(X,Y)=P~(X,Y)Y, then P(X,exp(y)) vanishes at exactly the same places as P~(X,exp(y)). So we may assume YP(X,Y). Suppose that N>T1, and write Σ1={(ξ1,η1),,(ξN,ηN)}. Then P(ξj+X,exp(ηj+y)) vanishes on Σ2 for all j=1,,N. We write P(X,Y)=R1(X)Yk1++RKYkK with 0=k1<k2<<kKT1.

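Then

\[ P(\xi_j+X,\exp(\eta_j+y))=R_1(\xi_j+X)\exp(\eta_j)^{k_1}\exp(y)^{k_1}+\cdots. \]

Write

\[ Q_{i,j}(X)=R_i(\xi_j+X)\exp(\eta_j)^{k_i}. \]

Then

\[ P(\xi_j+X,\exp(\eta_j+y))=\sum_{i=1}^{K}Q_{i,j}(X)(\exp(y))^{k_i}. \]

I look for polynomials $A_1,\dots,A_K\in\mathbb{C}[X]$ (note that $K\le T_1+1\le N$, so there are enough points $(\xi_j,\eta_j)$) such that

\[ \sum_{j=1}^{K}A_j(X)P(\xi_j+X,\exp(\eta_j+y))=B(X)\in\mathbb{C}[X]\qquad(\ast) \]

with $\deg B\le T_0(T_1+1)$ and $B\ne0$. Then, since $B$ vanishes at the first coordinates of $\Sigma_2$, $M\le T_0(T_1+1)$ will follow.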

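Lemma. Let $Q_{ij}\in\mathbb{C}[X]$ for $i,j=1,\dots,K$ for some $K\in\mathbb{Z}_{>0}$. Then there exist $A_1,\dots,A_K\in\mathbb{C}[X]$ such that

\[ \sum_iA_iQ_{ij}=\begin{cases}\det[Q_{ij}]&\text{if }j=1,\\0&\text{otherwise.}\end{cases} \]

Proof. Let $[\tilde Q_{ij}]$ be the adjugate of $[Q_{ij}]$. Then

\[ [\tilde Q_{ij}][Q_{jk}]=\det[Q_{jk}]\cdot\mathrm{id}. \]

Let $A_1,\dots,A_K$ be the first row of $[\tilde Q_{ij}]$.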
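For orientation, the case $K=2$ can be written out by hand. The display below is only an illustrative check of the lemma using the standard $2\times2$ adjugate; nothing in it is specific to the $Q_{ij}$ of the main proof.

```latex
\[
[Q_{ij}]=\begin{pmatrix}Q_{11}&Q_{12}\\ Q_{21}&Q_{22}\end{pmatrix},\qquad
[\tilde Q_{ij}]=\begin{pmatrix}Q_{22}&-Q_{12}\\ -Q_{21}&Q_{11}\end{pmatrix},\qquad
(A_1,A_2)=(Q_{22},-Q_{12}),
\]
\[
A_1Q_{11}+A_2Q_{21}=Q_{11}Q_{22}-Q_{12}Q_{21}=\det[Q_{ij}],\qquad
A_1Q_{12}+A_2Q_{22}=Q_{22}Q_{12}-Q_{12}Q_{22}=0.
\]
```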

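Now write the identities for $j=1,\dots,K$ in matrix form (rows indexed by the points, columns by the exponents $k_i$):

\[ \begin{pmatrix}Q_{11}(X)&\cdots&Q_{1K}(X)\\\vdots&&\vdots\\Q_{K1}(X)&\cdots&Q_{KK}(X)\end{pmatrix}\begin{pmatrix}\exp(y)^{k_1}\\\vdots\\\exp(y)^{k_K}\end{pmatrix}=\begin{pmatrix}P(\xi_1+X,\exp(\eta_1+y))\\\vdots\\P(\xi_K+X,\exp(\eta_K+y))\end{pmatrix} \]

Premultiply this by the row vector $(A_1(X),\dots,A_K(X))$. Since $k_1=0$, we get $(\ast)$ with $B=\det[Q_{ij}]$, and

\[ \deg B\le T_0K\le T_0(T_1+1). \]

We need to make sure that $B\ne0$.

The leading term of $Q_{ij}$ is $a_i\exp(\eta_j)^{k_i}X^{\deg R_i}$, where $a_i$ is the leading coefficient of $R_i$.

To show $B\ne0$, we will consider the leading term of $B$:

\[ \det\big[a_i\exp(\eta_j)^{k_i}X^{\deg R_i}\big]_{ij}=\det\big[\exp(\eta_j)^{k_i}\big]_{ij}\cdot X^{\sum_i\deg R_i}\prod_ia_i. \]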

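Lemma. Let $K\in\mathbb{Z}_{\ge1}$, and let $0=k_1<\dots<k_K$ be integers. Let $A\subset\mathbb{C}$ be such that $|\{\exp(\eta):\eta\in A\}|>k_K$. Then there exists a choice of $\eta_1,\dots,\eta_K\in A$ such that

\[ \det\big[\exp(\eta_i)^{k_j}\big]\ne0. \]

Proof. By induction on $K$. The case $K=1$ is true.

Suppose $K>1$, and the claim holds for $K-1$. Choose $\eta_1,\dots,\eta_{K-1}$ by the induction hypothesis and consider the determinant:

\[ D(z)=\begin{vmatrix}\exp(\eta_1)^{k_1}&\cdots&\exp(\eta_1)^{k_K}\\\vdots&&\vdots\\\exp(\eta_{K-1})^{k_1}&\cdots&\exp(\eta_{K-1})^{k_K}\\z^{k_1}&\cdots&z^{k_K}\end{vmatrix} \]

which has the property that the upper left $(K-1)\times(K-1)$ minor is $\ne0$.

Now $D$ is a polynomial of degree $k_K$ which is $\ne0$ (its leading coefficient is that minor), so it has at most $k_K$ many zeroes. Choose $\eta_K$ such that $\exp(\eta_K)$ is not one of them.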
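The lemma only claims that some choice of the $\eta_i$ works, and this cannot be improved to all choices. Here is a small sketch with integer nodes standing in for the values $\exp(\eta_i)$ (an assumption made so that the arithmetic is exact; the helper names `det` and `power_matrix` are illustrative).

```python
from itertools import permutations

def det(matrix):
    """Determinant via the Leibniz formula (fine for the tiny matrices used here)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation, computed from its inversion count.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

def power_matrix(nodes, exponents):
    """Matrix [x_i ** k_j], rows indexed by the nodes x_i (playing the role of exp(eta_i))."""
    return [[x ** k for k in exponents] for x in nodes]

good = det(power_matrix([1, 2, 3], [0, 1, 3]))  # three distinct positive nodes
bad = det(power_matrix([1, -1], [0, 2]))        # distinct nodes, but both rows are (1, 1)
fixed = det(power_matrix([1, 2], [0, 2]))       # another choice from {1, -1, 2}
print(good, bad, fixed)
```

With exponents $(0,2)$ the nodes $1$ and $-1$ give a vanishing determinant, while the set $\{1,-1,2\}$ has more than $k_K=2$ elements and contains a good pair, as the lemma predicts.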

Theorem. Let $d\ge3$. Let $F(X,Y)\in\mathbb{Z}[X,Y]$ be a homogeneous polynomial of degree $d$ without repeated factors. Let $G(X,Y)\in\mathbb{Z}[X,Y]$ be of degree $\le d-1$. Assume $F-G$ is irreducible. Then

\[ F(X,Y)=G(X,Y),\qquad X,Y\in\mathbb{Z} \]

has at most finitely many solutions.

Schinzel proved this only assuming that $F$ is not of the form $aQ^n$ for some irreducible $Q$ of degree $\le2$. He used Siegel's theorem on integral points: if an algebraic curve has infinitely many integral points, then it has genus $0$ and at most $2$ points at infinity. Our proof is based on an argument of Corvaja and Zannier for proving Siegel's theorem.

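Subspace theorem: Let $V$ be a vector space of dimension $n$ over $\overline{\mathbb{Q}}$. Let $e_1^{(0)},\dots,e_n^{(0)}$ and $e_1,\dots,e_n$ be two bases of $V$. Then for all $\varepsilon>0$, there exists a finite number of elements $f_1,\dots,f_m\in V\setminus\{0\}$ such that every $\varphi\in V^*$ that solves

\[ \prod_{i=1}^{n}|\varphi(e_i)|<H(\varphi(e_1^{(0)}),\dots,\varphi(e_n^{(0)}))^{-\varepsilon}\qquad(\ast) \]

with $\varphi(e_i^{(0)})\in\mathbb{Z}$ for all $i=1,\dots,n$ satisfies $\varphi(f_j)=0$ for some $j\in\{1,\dots,m\}$.

To compare with the usual formulation, take $\alpha_{ij}\in\overline{\mathbb{Q}}$ such that

\[ e_i=\sum_j\alpha_{ij}e_j^{(0)} \]

and $L_i=\alpha_{i1}X_1+\dots+\alpha_{in}X_n$. Then $\varphi$ satisfies $(\ast)$ if and only if

\[ (x_1,\dots,x_n)=(\varphi(e_1^{(0)}),\dots,\varphi(e_n^{(0)}))\in\mathbb{Z}^n \]

satisfies

\[ \prod_{i=1}^{n}|L_i(x_1,\dots,x_n)|<H(x_1,\dots,x_n)^{-\varepsilon}. \]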

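Let $F,G$ be as in the theorem, and write $P=F-G$.

We assume that $Y\nmid F$.

Then there exist distinct $\alpha_1,\dots,\alpha_d\in\overline{\mathbb{Q}}$ and $a\ne0$ such that

\[ F(X,Y)=a(X-\alpha_1Y)\cdots(X-\alpha_dY). \]

Write $\Gamma$ for the set of $(x,y)\in\mathbb{R}^2$ with $P(x,y)=0$. Then for $(x,y)\in\Gamma$ we have

\[ |F(x,y)|\le C(|x|+|y|)^{d-1}. \]

By a similar argument to the lemma for Thue's equation, for all $\varepsilon>0$ there exists $R=R(P,\varepsilon)$ such that if $(x,y)\in\Gamma$ with $|x|+|y|>R$, then $|\frac xy-\alpha_j|<\varepsilon$ for some $j$.

We pick a small $\varepsilon>0$, in particular with $|\alpha_i-\alpha_j|>2\varepsilon$ for $i\ne j$. We define

\[ \Gamma_0=\{(x,y)\in\Gamma:|x|+|y|<R\},\qquad\Gamma_j=\{(x,y)\in\Gamma:|x|+|y|\ge R,\ |\tfrac xy-\alpha_j|<\varepsilon\} \]

for $j=1,\dots,d$.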


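$\Gamma_0$ is bounded so only has finitely many integer points. We want to show this also for $\Gamma_1,\dots,\Gamma_d$. Write $I=P\cdot\overline{\mathbb{Q}}[X,Y]$ for the ideal generated by $P$. Take some $D\in\mathbb{Z}_{\ge1}$ large enough. Write $\overline{\mathbb{Q}}[X,Y]^{(\le D)}$ for the polynomials of degree $\le D$. We will apply the subspace theorem in the vector space

\[ V=\overline{\mathbb{Q}}[X,Y]^{(\le D)}/(I\cap\overline{\mathbb{Q}}[X,Y]^{(\le D)}). \]

Elements $f\in V$ can be evaluated on $\Gamma$.

In particular, for $(x,y)\in\Gamma$, the map $f\mapsto f(x,y)$ is an element of $V^*$. Reference basis: the monomials $X^kY^m$ for $k+m\le D$ span $V$. Pick a linearly independent family for $e_1^{(0)},\dots,e_n^{(0)}$, where $n=\dim V$.

If $(x,y)\in\Gamma\cap\mathbb{Z}^2$, then $e_i^{(0)}(x,y)\in\mathbb{Z}$. Also,

\[ H(e_1^{(0)}(x,y),\dots,e_n^{(0)}(x,y))<C|y|^D. \]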

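We need to find some $l_j$'s that decay on a fixed $\Gamma_i$.

For $j=1,\dots,d$ we introduce a symbol $p_j$ and call these the "points of $\Gamma$ at infinity". We define for $f\in V$:

\[ \operatorname{ord}_{p_j}(f)=\sup\{m\in\mathbb{Z}:f(x,y)y^m\text{ is bounded on }\Gamma_j\}. \]

Note $\operatorname{ord}_{p_j}(f)\ge-D$.

Lemma. Let $f\in V$ and let $j\in\{1,\dots,d\}$. If $\operatorname{ord}_{p_j}(f)<\infty$, then the limit

\[ \lim_{(x,y)\in\Gamma_j,\ |y|\to\infty}f(x,y)y^{\operatorname{ord}_{p_j}(f)} \]

exists and is $\ne0$. In addition, we have

\[ \lim_{(x,y)\in\Gamma_j,\ |y|\to\infty}(x-\alpha y)y^{-1}=\alpha_j-\alpha \]

for all $\alpha\in\overline{\mathbb{Q}}$.

It can be proved that $\operatorname{ord}_{p_j}(f)=\infty$ if and only if $f=0$.

$Z/Y$ is a local uniformiser at $p_j=(\alpha_j:1:0)$.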

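Proof. Let $j=1$, and by taking the substitution $X-\alpha_1Y\to X$, we may assume $\alpha_1=0$.

First, we show $X$ is bounded on $\Gamma_1$. To this end:

\[ X=\frac{G(X,Y)}{a(X-\alpha_2Y)\cdots(X-\alpha_dY)}. \]

Note

\[ |a(X-\alpha_2Y)\cdots(X-\alpha_dY)|\ge c|Y|^{d-1} \]

on $\Gamma_1$, with some $c=c(P)>0$. We may write $P=0$ as:

\[ aXY^{d-1}+bY^{d-1}+\tilde P(X,Y)=0 \]

($\tilde P$ of degree $\le d-2$ in $Y$). Here $a$ is not the same as in the factorisation of $F$ and $a\ne0$, but $b$ may be $0$.

This gives:

\[ X=-\frac ba+Y^{-1}\underbrace{Q(X,Y^{-1})}_{\text{bounded}}\qquad(\ast\ast) \]

for some polynomial $Q$. Then $\lim X=-\frac ba$ on $\Gamma_1$.

Proving the first claim, suppose we can write

\[ f(X,Y)=R_1(X)Y^k+R_2(X)Y^{k-1}+\cdots.\qquad(\ast\ast\ast) \]

Here, negative exponents of $Y$ are allowed, but the sum must be finite. You can always do this with $k=D$. If $R_1(-\frac ba)\ne0$, then $f(X,Y)Y^{-k}\to R_1(-\frac ba)\ne0$ and $\operatorname{ord}_{p_1}(f)=-k$ and the claim holds.

If $R_1(-\frac ba)=0$, then use $(\ast\ast)$ to write $(\ast\ast\ast)$ with $k$ replaced by $k-1$.

Iterate this.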

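Lemma. For each $j=1,\dots,d$, there is a basis $l_1,\dots,l_n$ ($n=\dim V$) of $V$ such that

\[ \operatorname{ord}_{p_j}(l_i)\ge-D+i-1. \]

Proof. By induction, we show that there are $l_1,\dots,l_{i-1}$ and $V_i\subset V$ such that

\[ V=\overline{\mathbb{Q}}l_1\oplus\dots\oplus\overline{\mathbb{Q}}l_{i-1}\oplus V_i,\qquad\operatorname{ord}_{p_j}(l_k)\ge-D+k-1\text{ for }k=1,\dots,i-1,\qquad\operatorname{ord}_{p_j}(f)\ge-D+i-1\text{ for }f\in V_i. \]

The case $i=1$ is trivial: $V=V_1$.

So suppose $i>1$ and the claim holds for $i-1$. We define $l_{i-1}$ to be an element in $V_{i-1}$ of minimal order at $p_j$. Let $V_i=\{f\in V_{i-1}:\operatorname{ord}_{p_j}(f)>\operatorname{ord}_{p_j}(l_{i-1})\}$.

We just need to show: $V_{i-1}=\overline{\mathbb{Q}}l_{i-1}\oplus V_i$. To this end, let $g\in V_{i-1}$. Write $m=\operatorname{ord}_{p_j}(l_{i-1})$. Then

\[ \lim_{\Gamma_j}gY^m=:b<\infty,\qquad f=g-\frac{b}{\lim_{\Gamma_j}l_{i-1}Y^m}\,l_{i-1}. \]

Then

\[ \lim_{\Gamma_j}fY^m=0 \]

so by the previous lemma, $\operatorname{ord}_{p_j}f>m$. So $f\in V_i$.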

For this to be useful, we need $n$ to be large. (We need $n\ge2D+2$.)

Lemma.

\[ \dim V\ge dD-d(d-1). \]

Remark. Thinking about $\Gamma$ as a projective curve, $V$ is the space of rational functions with poles of order at most $D$ at each point at $\infty$. By Riemann-Roch: $\dim V=dD-g+1$, provided $D$ is large enough.

Proof. Let $R(X,Y)=\prod_j(X-\alpha_jY)$ ($=F(X,Y)/a$). The point is that the polynomials

\[ Q_{jl}(X,Y)=\frac{R(X,Y)}{X-\alpha_jY}\,Y^l\in V \]

are linearly independent in $V$, where $j=1,\dots,d$ and $l=1,\dots,D-d+1$. Suppose $Q=\sum_{j,l}\beta_{jl}Q_{jl}$ for some $\beta_{jl}\in\overline{\mathbb{Q}}$ not all $0$.

We want to show $Q\ne0$ in $V$. To that end, let $\beta_{j,l}\ne0$ be such that $l$ is maximal with this property.

We can show that:

\[ \lim_{\Gamma_j}Q(X,Y)Y^{-l-d+1}=\beta_{j,l}\prod_{i\in\{1,\dots,d\}\setminus\{j\}}(\alpha_j-\alpha_i)\ne0. \]

This uses the first lemma today.
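Combining the lower bound $\dim V\ge dD-d(d-1)$ with the requirement $n\ge2D+2$ shows how large $D$ has to be. The sketch below (the function names are illustrative, not from the lecture) finds the smallest $D$ for which the lower bound alone already guarantees $n\ge2D+2$; the true dimension may be larger, so a smaller $D$ might also work.

```python
def dim_lower_bound(d, D):
    """Lower bound d*D - d*(d-1) for dim V from the lemma."""
    return d * D - d * (d - 1)

def smallest_guaranteed_D(d):
    """Smallest D >= 1 with d*D - d*(d-1) >= 2*D + 2 (requires d >= 3)."""
    if d < 3:
        raise ValueError("the argument needs d >= 3")
    D = 1
    while dim_lower_bound(d, D) < 2 * D + 2:
        D += 1
    return D

for d in (3, 4, 5):
    print(d, smallest_guaranteed_D(d))
```

The loop terminates exactly because $d\ge3$: the bound grows like $dD$, which eventually beats $2D+2$. This is one place where the hypothesis $d\ge3$ enters.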

Lemma. Let $f,P\in\mathbb{Q}[X,Y]$ be without common factors in $\mathbb{Q}[X,Y]$. Then the system of equations $f(X,Y)=P(X,Y)=0$ has only finitely many solutions.

Proof. $\mathbb{Q}[X,Y]\cong\mathbb{Q}[X][Y]$ (polynomials in $Y$ with coefficients in $\mathbb{Q}[X]$). So $f,P$ have no common factors in $\mathbb{Q}[X][Y]$. Then Gauss's lemma gives us that they have no common factors in $\mathbb{Q}(X)[Y]$. This is because $\mathbb{Q}[X]$ is a UFD and $\mathbb{Q}(X)$ is its quotient field.

Since $\mathbb{Q}(X)[Y]$ is a Euclidean domain, there exist $F,G\in\mathbb{Q}(X)[Y]$ such that

\[ FP+Gf=1. \]

Multiply by the common denominator $D\in\mathbb{Q}[X]$ of $F,G$, and we get

\[ \tilde FP+\tilde Gf=D(X) \]

for some $\tilde F,\tilde G\in\mathbb{Q}[X][Y]$. Hence the common solutions of $f=P=0$ have only finitely many $X$-coordinates. Then swap $X$ and $Y$.
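A small worked instance of this elimination may help. The polynomials $f=Y-X^2$ and $P=X^2+Y^2-2$ below are a hypothetical example chosen for illustration; here one can even take $\tilde F=1$.

```latex
\begin{align*}
(Y+X^2)\,f &= Y^2-X^4,\\
1\cdot P-(Y+X^2)\,f &= X^2+Y^2-2-(Y^2-X^4)=X^4+X^2-2=(X^2-1)(X^2+2)=:D(X).
\end{align*}
```

So any common zero of $f$ and $P$ has $X$-coordinate among the four roots of $D$, and for each such $X$ the equation $f=0$ determines $Y=X^2$.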

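Theorem. Let $F\in\mathbb{Z}[X,Y]$ be homogeneous of degree $d$, without repeated factors. Let $G\in\mathbb{Z}[X,Y]$ be of degree $<d$. Assume $F-G$ is irreducible in $\mathbb{Q}[X,Y]$. Then there are at most finitely many solutions of $F(X,Y)=G(X,Y)$ with $X,Y\in\mathbb{Z}$.

Recall the notation: $F(X,Y)=a(X-\alpha_1Y)\cdots(X-\alpha_dY)$. $\Gamma=\Gamma_0\cup\Gamma_1\cup\dots\cup\Gamma_d$. $P=F-G$, $I=P\cdot\overline{\mathbb{Q}}[X,Y]$. $V=\overline{\mathbb{Q}}[X,Y]^{(\le D)}/(I\cap\overline{\mathbb{Q}}[X,Y]^{(\le D)})$. $\operatorname{ord}_{p_j}(f)=\sup\{t:f(X,Y)Y^t\text{ bounded on }\Gamma_j\}$. $n=\dim V$.

For all $j$ there exists a basis $l_1,\dots,l_n\in V$ such that $\operatorname{ord}_{p_j}(l_i)\ge-D+i-1$. Moreover $n=\dim V\ge dD-d(d-1)$.

Subspace Theorem: Let $V$ be a vector space of dimension $n$ over $\overline{\mathbb{Q}}$. Let $l_1,\dots,l_n$ and $l_1^{(0)},\dots,l_n^{(0)}\in V$ be two bases. For all $\varepsilon>0$ there exist $f_1,\dots,f_m\in V\setminus\{0\}$ such that if $\varphi\in V^*$ has $\varphi(l_i^{(0)})\in\mathbb{Z}$ for all $i$ and satisfies

\[ \prod_{i=1}^{n}|\varphi(l_i)|<H(\varphi(l_1^{(0)}),\dots,\varphi(l_n^{(0)}))^{-\varepsilon} \]

then $\varphi(f_j)=0$ for some $j=1,\dots,m$.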

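Proof of Schinzel's Theorem. We show that $\mathbb{Z}^2\cap\Gamma_j$ is finite for any $j=1,\dots,d$. Let $l_1,\dots,l_n\in V$ be a basis with $\operatorname{ord}_{p_j}(l_i)\ge-D+i-1$. Then on $\Gamma_j$

\[ \prod_{i=1}^{n}|l_i(X,Y)|\le C|Y|^{-\sum_i\operatorname{ord}_{p_j}(l_i)}\le C|Y|^{-D-1} \]

if $n\ge2D+2$, since then $\sum_{i=1}^n(-D+i-1)=n(n-1-2D)/2\ge D+1$. We set $D$ to be large enough so that this holds.

Recall the reference basis $l_j^{(0)}$ are suitable monomials of degree $\le D$, so

\[ |l_i^{(0)}(X,Y)|<C|Y|^D. \]

Then for $(x,y)\in\mathbb{Z}^2\cap\Gamma_j$, we have:

\[ H(l_1^{(0)}(x,y),\dots,l_n^{(0)}(x,y))\le C|y|^D. \]

Hence

\[ \prod_{i=1}^{n}|l_i(x,y)|<H(l_1^{(0)}(x,y),\dots,l_n^{(0)}(x,y))^{-1} \]

provided $|y|$ is sufficiently large.

By the subspace theorem (with $\varepsilon=1$), $f_i(x,y)=0$ for some $i=1,\dots,m$.

To apply the lemma, we need $f_i\in\mathbb{Q}[X,Y]$. This can be assumed: indeed, multiplying $f_i$ by an element of $\overline{\mathbb{Q}}$, we can make the leading coefficient to be in $\mathbb{Z}$, and all other coefficients will be algebraic integers. Then replace $f_i$ by the sum of its Galois conjugates.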

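Theorem. For $q\in\mathbb{Z}_{>0}$ with $\gcd(q,6)=1$, we write $\operatorname{ord}(q)$ for the order of the multiplicative group generated by $2,3$ in $(\mathbb{Z}/q\mathbb{Z})^\times$.

Then:

\[ \lim_{q\to\infty}\frac{\operatorname{ord}(q)}{(\log q)^2}=\infty. \]

Remark. We have $2^n3^m<q$ for $0\le n<\frac12\log_2q$, $0\le m<\frac12\log_3q$, so these numbers are pairwise distinct modulo $q$. Hence

\[ \operatorname{ord}(q)\ge(\tfrac12\log_2q)(\tfrac12\log_3q). \]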
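For small $q$ the quantity $\operatorname{ord}(q)$ and the trivial bound from the remark can be computed directly. The following is a brute-force sketch (function names are illustrative); it closes $\{1\}$ under multiplication by $2$ and $3$ modulo $q$, which in a finite group yields the generated subgroup.

```python
from math import gcd

def ord_23(q):
    """Size of the subgroup of (Z/qZ)^* generated by 2 and 3, for q > 1 coprime to 6."""
    if q <= 1 or gcd(q, 6) != 1:
        raise ValueError("q must be > 1 and coprime to 6")
    seen = {1}
    frontier = [1]
    while frontier:
        x = frontier.pop()
        for g in (2, 3):
            y = (x * g) % q
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return len(seen)

def small_products_count(q):
    """Number of pairs (n, m), n, m >= 0, with 2^n < sqrt(q) and 3^m < sqrt(q)."""
    n_count = 0
    while 4 ** n_count < q:   # 2^n < sqrt(q) is the same as 4^n < q
        n_count += 1
    m_count = 0
    while 9 ** m_count < q:   # 3^m < sqrt(q) is the same as 9^m < q
        m_count += 1
    return n_count * m_count

for q in (5, 7, 11, 35):
    print(q, ord_23(q), small_products_count(q))
```

The pairs counted by `small_products_count` give integers $2^n3^m<q$ that are distinct, hence distinct residues, which is exactly the argument of the remark.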

Theorem (Corvaja, Zannier; Hernández, Luca). Write $S=\{2^n3^m:n,m\in\mathbb{Z}_{\ge0}\}$. Then for all $\varepsilon>0$, there are only finitely many pairs of multiplicatively independent $a,b\in S$ such that

\[ \gcd(a-1,b-1)\ge\max(a,b)^{\varepsilon}. \]

Here $a,b$ are multiplicatively independent if there do not exist $n,m\in\mathbb{Z}$, not both zero, such that $a^n=b^m$.

Fact: there exist infinitely many $n$ such that

\[ \gcd(2^n-1,3^n-1)\ge3^{n^{c/\log\log n}}. \]
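To get a feeling for these quantities, here is a short sketch computing $\gcd(2^n-1,3^n-1)$ for a few $n$ (the choice of exponents is arbitrary and only for illustration). Highly divisible $n$ tend to give larger values, since $p\mid\gcd(2^n-1,3^n-1)$ whenever $p\ge5$ is a prime with $(p-1)\mid n$.

```python
from math import gcd

def gcd_2_3(n):
    """gcd(2^n - 1, 3^n - 1) for a positive integer n."""
    return gcd(2 ** n - 1, 3 ** n - 1)

values = {n: gcd_2_3(n) for n in (1, 2, 4, 6, 12)}
print(values)
```

For $n=12$ the primes $5,7,13$ all satisfy $(p-1)\mid12$, which accounts for the value $455=5\cdot7\cdot13$.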

Theorem (1). If $2,3\nmid q$, then

\[ \frac{\operatorname{ord}(q)}{(\log q)^2}\to\infty. \]

Theorem (2). For all $\varepsilon>0$, there are only finitely many pairs of multiplicatively independent $a,b\in S$ such that

\[ \gcd(a-1,b-1)>\max(a,b)^{\varepsilon}. \]

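Proof of Theorem 1 using Theorem 2. Let

\[ \Lambda=\{(n,k)\in\mathbb{Z}^2:2^n3^k\equiv1\pmod q\}. \]

This is a subgroup of $\mathbb{Z}^2$, and $|\mathbb{Z}^2/\Lambda|=\operatorname{ord}(q)$. The volume of $\mathbb{R}^2/\Lambda$ is $\operatorname{ord}(q)$.

Our aim is to find $(n_1,k_1),(n_2,k_2)\in\Lambda\cap\mathbb{Z}_{\ge0}^2$ linearly independent and with $n_1,k_1,n_2,k_2\le C\frac{\operatorname{ord}(q)}{\log q}$, where $C$ is absolute.

If we can do this, then $q\mid\gcd(2^{n_1}3^{k_1}-1,2^{n_2}3^{k_2}-1)$. By Theorem (2), since $2^{n_1}3^{k_1}$ and $2^{n_2}3^{k_2}$ are multiplicatively independent, we would get

\[ q<\max(2^{n_1}3^{k_1},2^{n_2}3^{k_2})^{\varepsilon}<\exp\Big(C\frac{\operatorname{ord}(q)}{\log q}\Big)^{\varepsilon}. \]

Taking log:

\[ \log q<C\varepsilon\frac{\operatorname{ord}(q)}{\log q}\quad\Longrightarrow\quad\operatorname{ord}(q)>C^{-1}\varepsilon^{-1}(\log q)^2 \]

provided $q$ is sufficiently large in terms of $\varepsilon$.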

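Now to the proof of the above stated aim: Let $(\tilde n_1,\tilde k_1),(\tilde n_2,\tilde k_2)\in\Lambda$ be generators of $\Lambda$ such that their angle is as close to $\frac\pi2$ as possible.

Then this angle is between $\frac\pi3$ and $\frac{2\pi}3$.

The area of the parallelogram spanned by $(\tilde n_1,\tilde k_1)$ and $(\tilde n_2,\tilde k_2)$ is $\operatorname{ord}(q)$, and it is at least $\frac{\sqrt3}2$ times the product of their lengths, so

\[ \tfrac{\sqrt3}2\|(\tilde n_1,\tilde k_1)\|_2\,\|(\tilde n_2,\tilde k_2)\|_2\le\operatorname{ord}(q). \]

Compare Minkowski's second theorem in the geometry of numbers.

We know that $q\mid2^{|\tilde n_1|}3^{|\tilde k_1|}-1$ or $q\mid2^{|\tilde n_1|}-3^{|\tilde k_1|}$. Then either $|\tilde n_1|$ or $|\tilde k_1|$ has to be $\ge\frac12\log_3(q)$. In particular $\|(\tilde n_1,\tilde k_1)\|_2\ge c\log q$ (for some absolute constant $c$), and likewise for $(\tilde n_2,\tilde k_2)$.

Then $\|(\tilde n_1,\tilde k_1)\|_2,\|(\tilde n_2,\tilde k_2)\|_2\le C\frac{\operatorname{ord}(q)}{\log q}$.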

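Proposition 1.14. Let $L\in\mathbb{Z}[X_1,\dots,X_n]$ be a linear form. Then there exists $C=C(L)$ such that any solution $x_1,\dots,x_n\in S$ of $L(x_1,\dots,x_n)=0$ satisfies

\[ |x_i-x_j|\,|x_i-x_j|_2\,|x_i-x_j|_3<C\qquad(\ast) \]

for some $i\ne j\in\{1,\dots,n\}$.

Remark. For $x\in\mathbb{Z}$ such that $x=2^n3^ky$ with $n,k\in\mathbb{Z}_{\ge0}$, $2,3\nmid y$, we have

\[ |x|\,|x|_2\,|x|_3=|y|. \]

Note that $(\ast)$ is invariant under multiplication by elements of $S$.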
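The quantity $|x|\,|x|_2\,|x|_3$ from the remark is easy to compute exactly. Below is a minimal sketch using exact rational arithmetic (the function names `p_adic_abs` and `s_free_part` are illustrative), which also checks the invariance under multiplication by elements of $S$ on an example.

```python
from fractions import Fraction

def p_adic_abs(x, p):
    """p-adic absolute value |x|_p of a nonzero integer x."""
    if x == 0:
        raise ValueError("x must be nonzero")
    v = 0
    x = abs(x)
    while x % p == 0:
        x //= p
        v += 1
    return Fraction(1, p ** v)

def s_free_part(x):
    """|x| * |x|_2 * |x|_3, i.e. |y| when x = 2^n 3^k y with 2, 3 not dividing y."""
    return abs(x) * p_adic_abs(x, 2) * p_adic_abs(x, 3)

print(s_free_part(360))                  # 360 = 2^3 * 3^2 * 5
print(s_free_part(-72))                  # 72 = 2^3 * 3^2
print(s_free_part(360 * 2 ** 5 * 3 ** 2))
```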

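Theorem. Let $V$ be a vector space of dimension $n$ over $\overline{\mathbb{Q}}$. Let $S\subset M_{\mathbb{Q}}$ be finite with $\infty\in S$. For each $v\in S$, let $\Lambda_1^{(v)},\dots,\Lambda_n^{(v)}$ be a basis of $V^*$. Furthermore, let $\Lambda_1^{(0)},\dots,\Lambda_n^{(0)}$ be another basis. Fix an extension of each $|\cdot|_v$ from $\mathbb{Q}$ to $\overline{\mathbb{Q}}$.

Then for all $\varepsilon>0$, there are finitely many $\varphi_1,\dots,\varphi_m\in V^*\setminus\{0\}$ such that all solutions $x\in V$ of

\[ \prod_{v\in S}\prod_{j=1}^{n}|\Lambda_j^{(v)}(x)|_v<H(\Lambda_1^{(0)}(x),\dots,\Lambda_n^{(0)}(x))^{-\varepsilon} \]

with $\Lambda_1^{(0)}(x),\dots,\Lambda_n^{(0)}(x)\in\mathbb{Z}$ satisfy $\varphi_i(x)=0$ for some $i=1,\dots,m$.

Proof of Proposition. By induction on $n$. Suppose $n=2$. As we observed, the conclusion is invariant under dividing $x_1,x_2$ by the same element of $S$. Now $\gcd(x_1,x_2)\in S$. So it is enough to prove the claim for solutions with $\gcd(x_1,x_2)=1$.

Let $L(X_1,X_2)=aX_1+bX_2$. Then $ax_1+bx_2=0$ with $\gcd(x_1,x_2)=1$ implies $x_1\mid b$ and $x_2\mid a$.

So there are finitely many possibilities for $x_1,x_2$ in terms of $L$. Pick $C$ that works for all.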

(to be continued).

“generalised S-unit equations”.

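Let $K$ be a number field: $O_K=\{x\in K:|x|_v\le1\text{ for all }v\in M_{K,f}\}$. Let $S\subset M_K$ be a finite set containing $M_{K,\infty}$: $O_{K,S}=\{x\in K:|x|_v\le1\text{ for all }v\notin S\}$ ("$S$-integers"). $O_{K,S}^\times$ is the group of units in $O_{K,S}$ ("$S$-units").

Unit equation: $x+y=1$ with $x,y$ units.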

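Proof (continued). Induction on $d$ (the number of variables). The case $d=2$ was checked before.

Suppose $d>2$, and the claim holds for $d-1$. We make some simplifying assumptions to be specified later. We apply the subspace theorem on $\overline{\mathbb{Q}}^{d-1}=V$. The reference basis is $\Lambda_j^{(0)}=X_j$, $j=1,\dots,d-1$. As a first approximation, we try $\Lambda_j^{(v)}=X_j$ for all $j,v$. Let $S=\{\infty,2,3\}$. Let $x=(x_1,\dots,x_d)$ be a solution of $L(x_1,\dots,x_d)=0$. Then

\[ \prod_{v\in S}\prod_{j=1}^{d-1}|\Lambda_j^{(v)}(x)|_v=1. \]

We can replace $\Lambda_1^{(w)}$ by

\[ \frac{a_1}{a_d}X_1+\dots+\frac{a_{d-1}}{a_d}X_{d-1}, \]

where $L=a_1X_1+\dots+a_dX_d$; its value at $x$ is $-x_d$. Then we replace $|x_1|_w$ by $|x_d|_w$ in the product. We do this for some choice of $w$.

Now back to the simplifying assumptions: We assume that $|x_j|_\infty$ is maximal for $j=d$. Then $|x_d|_2|x_d|_3=|x_d|_\infty^{-1}$. So let $w\in\{2,3\}$ be such that $|x_d|_w\le|x_d|_\infty^{-1/2}$. We may also assume $|x_1|_w=1$. For this, we may need to divide $x$ by the common divisor, and rearrange the indices.

For these augmented $\Lambda_j^{(v)}$'s, we get

\[ \prod_{v\in S}\prod_{j=1}^{d-1}|\Lambda_j^{(v)}(x)|_v\le|x_d|_\infty^{-1/2}\le H(\Lambda_1^{(0)}(x),\dots,\Lambda_{d-1}^{(0)}(x))^{-1/2}. \]

So the subspace theorem applies with $\varepsilon=\frac12$. So $x_1,\dots,x_{d-1}$ satisfies one of finitely many linear equations. Apply the induction hypothesis for each of them.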

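Theorem 1.15. For all $\varepsilon>0$, there exist only finitely many multiplicatively independent pairs $a,b\in S$ such that

\[ \gcd(a-1,b-1)>\max(a,b)^{\varepsilon}. \]

Proof. Fix some $\varepsilon>0$. Let $a,b\in S$ be multiplicatively independent and such that

\[ d=\gcd(a-1,b-1)>\max(a,b)^{\varepsilon}. \]

Our goal is to show $d<C$ for some $C=C(\varepsilon)$. Note: $2,3\nmid d$, because otherwise $2\nmid a,b$ or $3\nmid a,b$. Then $a$ and $b$ would be powers of the same prime, which is not possible due to multiplicative independence.

Fix some $n\in\mathbb{Z}_{>0}$ sufficiently large depending on $\varepsilon$. We apply the subspace theorem on $V=\overline{\mathbb{Q}}^{n^2}/\{(x,\dots,x):x\in\overline{\mathbb{Q}}\}$.

We will evaluate our functionals at the point $\frac ed=(\frac{e_1}d,\dots,\frac{e_{n^2}}d)$ where $e_1,\dots,e_{n^2}$ is an enumeration of $a^kb^l$ for $k=0,\dots,n-1$, $l=0,\dots,n-1$ such that $e_1=1$, $e_{n^2}=a^{n-1}b^{n-1}$.

Note: $\frac{e_i}d-\frac{e_j}d\in\mathbb{Z}$. This is because $e_i\equiv1\pmod d$. Also: $|\frac{e_i}d-\frac{e_j}d|_v\le\max(|\frac{e_i}d|_v,|\frac{e_j}d|_v)$ for all $v\in S=\{\infty,2,3\}$. The coordinates on $\overline{\mathbb{Q}}^{n^2}$ will be denoted by $Y_1,\dots,Y_{n^2}$. All our linear forms on $V$ will be of the form $Y_i-Y_j$ for some $i\ne j$. This is indeed well defined on $V$. Reference basis: $\Lambda_j^{(0)}=Y_j-Y_{n^2}$.

\[ H(\Lambda_1^{(0)}(\tfrac ed),\dots,\Lambda_{n^2-1}^{(0)}(\tfrac ed))\le a^nb^n. \]

For $v=\infty$ we take $\Lambda_j^{(\infty)}=Y_{j+1}-Y_1$, and for $v\in\{2,3\}$ we take $\Lambda_j^{(v)}=Y_j-Y_{n^2}$. Then

\[ |\Lambda_j^{(\infty)}(\tfrac ed)|\le|\tfrac{e_{j+1}}d|,\qquad|\Lambda_j^{(v)}(\tfrac ed)|_v\le|e_j|_v, \]

\[ \prod_{j=1}^{n^2-1}|\Lambda_j^{(\infty)}(\tfrac ed)|\le\Big(\prod_{j=1}^{n^2}|\tfrac{e_j}d|\Big)d,\qquad\prod_{j=1}^{n^2-1}|\Lambda_j^{(v)}(\tfrac ed)|_v\le\frac{\prod_{j=1}^{n^2}|\frac{e_j}d|_v}{|a^{n-1}b^{n-1}|_v}, \]

\[ \prod_{v\in S}\prod_{j=1}^{n^2-1}|\Lambda_j^{(v)}(\tfrac ed)|_v\le\frac{d\,(a^{n-1}b^{n-1})}{d^{n^2}},\qquad\text{using }|\tfrac{e_j}d|\,|\tfrac{e_j}d|_2\,|\tfrac{e_j}d|_3=\tfrac1d. \]

Recall: $d=\gcd(a-1,b-1)$ where $a,b\in S$ are multiplicatively independent. We assume $d>\max(a,b)^{\varepsilon}$ for some $\varepsilon>0$. Our goal is to prove $d<C(\varepsilon)$. We have

\[ \prod_{v\in S=\{\infty,2,3\}}\prod_{j=1}^{n^2-1}|\Lambda_j^{(v)}(\tfrac ed)|_v\le\frac{d\,a^{n-1}b^{n-1}}{d^{n^2}},\qquad(\ast) \]

where $e_1,\dots,e_{n^2}$ is an enumeration of $a^kb^l$, $k,l=0,\dots,n-1$. So

\[ (\ast)\le\max(a,b)^{2n-2}\max(a,b)^{-\varepsilon(n^2-1)}. \]

Let's take $n>3\varepsilon^{-1}$; then $(\ast)<\max(a,b)^{-n}$. Moreover

\[ H(\Lambda_1^{(0)}(\tfrac ed),\dots,\Lambda_{n^2-1}^{(0)}(\tfrac ed))\le a^{n-1}b^{n-1}, \]

so $(\ast)<H(\cdots)^{-1/2}$. The subspace theorem applies, hence there exists a linear relation between $e_1,\dots,e_{n^2}\in S$ (these are distinct by multiplicative independence of $a,b$).

The Proposition implies

\[ |e_i-e_j|\,|e_i-e_j|_2\,|e_i-e_j|_3<C=C(\varepsilon) \]

for some $i\ne j$. Then $e_i\ne e_j$ so $e_i-e_j\ne0$. However, $d\mid e_i-e_j$ and $2,3\nmid d$, so

\[ d\le|e_i-e_j|\,|e_i-e_j|_2\,|e_i-e_j|_3<C. \]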

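Theorem 1.16 (Feldman). Let $\alpha\in\overline{\mathbb{Q}}$ be of degree $d\ge3$. Then there exist effective $C=C(\alpha)>0$ and $\varepsilon=\varepsilon(\alpha)>0$ such that for all $\frac pq\in\mathbb{Q}$,

\[ \Big|\alpha-\frac pq\Big|>\frac C{q^{d-\varepsilon}}. \]

Remark. This is enough to solve the Thue equation $P(x,y)=m$, where $P$ is a degree $d$ homogeneous polynomial without repeated factors.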

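Proposition. Let $K$ be a number field. Then there exist $r\in\mathbb{Z}_{\ge0}$, $u_1,\dots,u_r\in O_K^\times$ and a constant $C=C(K)$ such that for all nonzero $\alpha\in O_K$, there exist $\tilde\alpha\in O_K$ and $b_1,\dots,b_r\in\mathbb{Z}$ such that

\[ H(\tilde\alpha)\le C|N_{K/\mathbb{Q}}(\alpha)|^{1/[K:\mathbb{Q}]},\qquad|b_1|,\dots,|b_r|\le C\log H(\alpha),\qquad\alpha=\tilde\alpha u_1^{b_1}\cdots u_r^{b_r}. \]

Define $\Phi:K^\times\to\mathbb{R}^{M_{K,\infty}}$ by $(\Phi(\alpha))_v=d_v\log|\alpha|_v$ (logarithmic embedding). Note that here $K^\times$ is a group under multiplication, while $\mathbb{R}^{M_{K,\infty}}$ is an additive group.

For nonzero $\alpha\in O_K$ we have

\[ |N_{K/\mathbb{Q}}(\alpha)|=\exp\Big(\sum_{v\in M_{K,\infty}}(\Phi(\alpha))_v\Big),\qquad H(\alpha)^{[K:\mathbb{Q}]}=\exp\Big(\sum_{v\in M_{K,\infty}}\max(0,(\Phi(\alpha))_v)\Big), \]

and $\sum_v(\Phi(\alpha))_v\ge0$. Then:

\[ \exp\Big(\frac{\|\Phi(\alpha)\|_1}2\Big)\le H(\alpha)^{[K:\mathbb{Q}]}\le\exp(\|\Phi(\alpha)\|_1). \]

For $\alpha\in O_K^\times$, $|N_{K/\mathbb{Q}}(\alpha)|=1$. So

\[ \Phi(\alpha)\in W=\Big\{x\in\mathbb{R}^{M_{K,\infty}}:\sum_vx_v=0\Big\}. \]

Kronecker's theorem: the elements of $O_K$ in $\Phi^{-1}(0)=\ker\Phi$ are the roots of unity.

Dirichlet's unit theorem: $\Phi(O_K^\times)$ is a lattice in $W$, that is, a $\mathbb{Z}$-module of rank $\dim W=r$ which spans $W$.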


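Let $u_1,\dots,u_r$ be a fundamental system of units, that is, $\Phi(u_1),\dots,\Phi(u_r)$ is a basis for the lattice $\Phi(O_K^\times)$. Fix some nonzero $\alpha\in O_K$. Pick some $x\in\mathbb{R}_{\ge0}^{M_{K,\infty}}$ such that

\[ \sum_vx_v=\log|N_{K/\mathbb{Q}}(\alpha)|\ge0. \]

Then $x\in\Phi(\alpha)+W$, so there exist $y_1,\dots,y_r\in\mathbb{R}$ such that

\[ x=\Phi(\alpha)+y_1\Phi(u_1)+\dots+y_r\Phi(u_r). \]

There exists $C=C(K)$ such that $|y_j|\le C\|\Phi(\alpha)\|_1$.

Let $b_j\in\mathbb{Z}$ with $|y_j-b_j|\le1$ and $|b_j|\le|y_j|$. This gives $|b_j|\le C\|\Phi(\alpha)\|_1\le C\log H(\alpha)$.

Take $\tilde\alpha=\alpha u_1^{b_1}\cdots u_r^{b_r}$ (so the proposition holds with $b_j$ replaced by $-b_j$). Then

\[ \Phi(\tilde\alpha)=\Phi(\alpha)+b_1\Phi(u_1)+\dots+b_r\Phi(u_r)=x+\underbrace{(b_1-y_1)\Phi(u_1)+\dots+(b_r-y_r)\Phi(u_r)}_{(\ast)}. \]

Here $(\ast)$ is in a fixed, compact region of $W$. So

\[ \|\Phi(\tilde\alpha)\|_1\le C+\|x\|_1 \]

and

\[ H(\tilde\alpha)^{[K:\mathbb{Q}]}\le\exp(\|\Phi(\tilde\alpha)\|_1)\le\exp(C)\,|N_{K/\mathbb{Q}}(\alpha)|. \]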

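Theorem. Let $\alpha$ be algebraic of degree $d\ge3$. Then there exist $C=C(\alpha)>0$, $\varepsilon=\varepsilon(\alpha)>0$ such that for all $\frac pq\in\mathbb{Q}$:

\[ \Big|\alpha-\frac pq\Big|>Cq^{-(d-\varepsilon)}. \]

Proof. Fix some $\alpha$ and $\varepsilon>0$ small enough. Suppose that

\[ \Big|\alpha-\frac pq\Big|<q^{-(d-\varepsilon)} \]

for some $\frac pq\in\mathbb{Q}$. We aim to show that $q<C=C(\alpha)$. We assume as we may that $\alpha$ is an algebraic integer.

Let

\[ P(X)=(X-\alpha_1)\cdots(X-\alpha_d) \]

be the minimal polynomial of $\alpha=\alpha_1$. Then:

\[ (p-\alpha_1q)\cdots(p-\alpha_dq)=Q,\qquad|Q|<Cq^{\varepsilon}, \]

with $Q\in\mathbb{Z}$. Then

\[ N_{\mathbb{Q}(\alpha_j)/\mathbb{Q}}(p-\alpha_jq)\mid Q^d. \]

In particular:

\[ |N(p-\alpha_jq)|<Cq^{d\varepsilon}. \]

Therefore there exist $\tilde\alpha_j$, $u_1,\dots,u_r$ and $b_1,\dots,b_r$ such that $p-\alpha_jq=\tilde\alpha_ju_1^{b_1}\cdots u_r^{b_r}$ with

\[ H(\tilde\alpha_j)<Cq^{\varepsilon},\qquad|b_i|<C\log q. \]

Use $|p-\alpha_1q|<q^{-(d-1-\varepsilon)}$.

Then $p-\alpha_jq$ is very close to $(\alpha_1-\alpha_j)q$. Consider $(\alpha_1-\alpha_2)(\alpha_1-\alpha_3)q$, which is close to both $(p-\alpha_2q)(\alpha_1-\alpha_3)$ and $(\alpha_1-\alpha_2)(p-\alpha_3q)$.

Now more formally:

\[ \Big|1-\frac{(p-\alpha_2q)(\alpha_1-\alpha_3)}{(\alpha_1-\alpha_2)(p-\alpha_3q)}\Big|=\Big|1-\frac{((p-\alpha_1q)+(\alpha_1-\alpha_2)q)(\alpha_1-\alpha_3)}{(\alpha_1-\alpha_2)((p-\alpha_1q)+(\alpha_1-\alpha_3)q)}\Big|<Cq^{-(d-\varepsilon)}, \]

using the general estimate

\[ \Big|\frac{A+\kappa_1}{B+\kappa_2}-\frac AB\Big|<C\,\frac{\max(|\kappa_1|,|\kappa_2|)}q\qquad\text{for }A\approx B\approx q. \]

Now use the proposition:

\[ p-\alpha_2q=\tilde\alpha_2u_1^{b_1}\cdots u_r^{b_r},\qquad p-\alpha_3q=\tilde\alpha_3w_1^{e_1}\cdots w_r^{e_r}, \]

\[ H(\tilde\alpha_2),H(\tilde\alpha_3)\le Cq^{\varepsilon},\qquad|b_1|,\dots,|b_r|,|e_1|,\dots,|e_r|<C\log q. \]

Writing $\alpha=\frac{\tilde\alpha_2(\alpha_1-\alpha_3)}{\tilde\alpha_3(\alpha_1-\alpha_2)}$ we have:

\[ |1-\alpha\,u_1^{b_1}\cdots u_r^{b_r}w_1^{-e_1}\cdots w_r^{-e_r}|<Cq^{-(d-\varepsilon)}. \]

Here $H(\alpha)<Cq^{2\varepsilon}$. Take $\log$ to be the principal branch, that is, $|\operatorname{Im}\log(\cdot)|\le\pi$. Warning: $\log(xy)\ne\log(x)+\log(y)$ in general. The logarithm is Lipschitz around $1$, so we get

\[ |\log(\alpha)+b_1\log(u_1)+\dots+b_r\log(u_r)-e_1\log(w_1)-\dots-e_r\log(w_r)+2k\underbrace{\log(-1)}_{=\pi i}|<Cq^{-(d-\varepsilon)} \]

for a suitable $k\in\mathbb{Z}$ with $|k|<C\log q$.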

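Reminder:

Theorem. Let $n\in\mathbb{Z}_{\ge1}$. Let $\alpha_1,\dots,\alpha_n\in\overline{\mathbb{Q}}\setminus\{0\}$, and let $\log\alpha_j$ be any choice of the log of $\alpha_j$. Let $b_1,\dots,b_n\in\mathbb{Z}$ and let $\Lambda=b_1\log\alpha_1+\dots+b_n\log\alpha_n$. Let

\[ A_j=\max(H(\alpha_j),\exp(|\log\alpha_j|),10),\qquad B=\max\Big(\frac{|b_1|}{\log A_n},\dots,\frac{|b_{n-1}|}{\log A_n},|b_n|,10\Big). \]

Then there exists an effective constant $C$ (a function of $n$ and the degree of $\mathbb{Q}(\alpha_1,\dots,\alpha_n)$) such that $\Lambda\ne0$ implies

\[ |\Lambda|>\exp(-C\log(A_1)\cdots\log(A_n)\log(B)). \]

We apply the theorem with $\alpha_n=\alpha$. Then $A_1,\dots,A_{n-1}<C$, $A_n<Cq^{2\varepsilon}$, $B\le C\frac{\log q}{\log A_n}\le C\varepsilon^{-1}$, and $|k|<C\log q$. So the lower bound gives:

\[ |\Lambda|>\exp(-C\varepsilon\log q\log\varepsilon^{-1})>q^{-C\varepsilon\log\varepsilon^{-1}}. \]

For $\varepsilon$ small enough and $q$ large this contradicts the upper bound $|\Lambda|<Cq^{-(d-\varepsilon)}$.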

We still need to consider $\Lambda=0$. This is equivalent to:

\[ 1=\frac{(p-\alpha_2q)(\alpha_1-\alpha_3)}{(\alpha_1-\alpha_2)(p-\alpha_3q)}. \]

Solving this equation gives $\alpha_2=\alpha_3$ or $p=\alpha_1q$. Neither is the case.
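The solution of this equation can be checked by cross-multiplying and expanding; the short computation below is included only as a verification of that step.

```latex
\begin{align*}
(p-\alpha_2q)(\alpha_1-\alpha_3)&=(\alpha_1-\alpha_2)(p-\alpha_3q)\\
p\alpha_1-p\alpha_3-\alpha_1\alpha_2q+\alpha_2\alpha_3q&=p\alpha_1-\alpha_1\alpha_3q-p\alpha_2+\alpha_2\alpha_3q\\
p(\alpha_2-\alpha_3)-\alpha_1q(\alpha_2-\alpha_3)&=0\\
(\alpha_2-\alpha_3)(p-\alpha_1q)&=0.
\end{align*}
```

The first factor is nonzero because the conjugates are distinct, and the second because $\alpha_1$ is irrational.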

If we use the weaker bound for $|\Lambda|$, then we would prove:

\[ \Big|\alpha-\frac pq\Big|>Cq^{-(d-\frac{\varepsilon}{\log\log q})}. \]
