Analytic Number Theory
Daniel Naylor

Contents

1 Estimating Primes
1.1 Asymptotic Notation
1.2 Partial Summation
1.3 Arithmetic Functions and Dirichlet convolution
1.4 Dirichlet Series
2 Elementary Estimates for Primes
2.1 Mertens' Theorems
2.2 Sieve Methods
2.3 Selberg Sieve
3 The Riemann Zeta Function
3.1 Partial fraction approximation of ζ
3.2 Zero-free region
Index

What is analytic number theory?

What kind of problems are studied?

A variety of problems about integers, especially primes.

Key feature: to show that a set (for example, a set of primes) is infinite, we want to estimate the number of its elements up to $x$.

Definition. Define

$$\pi(x) = |\{\text{primes } p \le x\}| = \sum_{p\le x}1.$$

Euclid showed: $\lim_{x\to\infty}\pi(x) = \infty$.

Theorem (Prime number theorem).

$$\lim_{x\to\infty}\frac{\pi(x)\log x}{x} = 1.$$

Equivalently, $\pi(x) \sim \frac{x}{\log x}$. (Conjectured by Legendre and Gauss; proved by Hadamard and de la Vallée Poussin.)
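As a quick sanity check (my own illustration, not from the lectures), one can compute $\pi(x)$ with a sieve of Eratosthenes and watch the ratio $\pi(x)\log x/x$ drift towards $1$; the convergence is notoriously slow:

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

x = 10**6
pi_x = len(primes_up_to(x))       # pi(10^6) = 78498
ratio = pi_x * math.log(x) / x    # about 1.08, still some way from 1
```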

1 Estimating Primes

Theorem (Euler). $\sum_p \frac{1}{p} = \infty$.

Proof. Consider
$$P_N = \prod_{p\le N}\Big(1 + \frac1p + \frac1{p^2} + \cdots + \frac1{p^N}\Big)$$
for $N\in\mathbb{N}$. Expanding out the product, every $n\le N$ arises as a product of the prime powers occurring, so we have:
$$P_N \ge \sum_{n=1}^N\frac1n \ge \sum_{n=1}^N\int_n^{n+1}\frac{dt}{t} = \int_1^{N+1}\frac{dt}{t} = \log(N+1).$$

On the other hand, using $1+x\le e^x$,
$$P_N \le \prod_{p\le N}\exp\Big(\frac1p + \frac1{p^2} + \cdots + \frac1{p^N}\Big) = \exp\Big(\sum_{p\le N}\Big(\frac1p + \frac1{p^2} + \cdots + \frac1{p^N}\Big)\Big) \le \exp\Big(\sum_{p\le N}\Big(\frac1p + \frac{2}{p^2}\Big)\Big) \le \exp\Big(C + \sum_{p\le N}\frac1p\Big),$$
where $C = 2\sum_p p^{-2} < \infty$ (the tail $\frac1{p^2}+\cdots+\frac1{p^N}$ is at most $\frac{2}{p^2}$ by the geometric series).

Comparing these two bounds gives
$$\sum_{p\le N}\frac1p \ge \log\log(N+1) - C.$$

Then letting $N\to\infty$ gives the desired result. $\square$
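A numerical sketch (mine, not the notes') of this divergence: the partial sums $\sum_{p\le N}\frac1p$ track $\log\log N$ quite closely; Mertens' theorem, proved in Chapter 2, makes this precise.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

N = 10**5
partial_sum = sum(1 / p for p in primes_up_to(N))  # about 2.71
loglog = math.log(math.log(N))                     # about 2.44
```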

Theorem (Chebyshev's Theorem).

$$\pi(x) \le \frac{cx}{\log x}$$

(for $x\ge 2$, where $c$ is an absolute constant).

Proof. Consider the central binomial coefficient
$$S_N = \binom{2N}{N} = \frac{(2N)!}{(N!)^2}$$
for $N\in\mathbb{N}$. We have
$$S_N \le \sum_{j=0}^{2N}\binom{2N}{j} = (1+1)^{2N} = 4^N.$$

On the other hand,
$$S_N = \prod_{p\le 2N}p^{\alpha_p(N)},$$
where $\alpha_p(N)$ is the largest $j$ such that $p^j \,\big|\, \binom{2N}{N}$. We have $\alpha_p(N)\ge 1$ for $p\in(N,2N]$. So
$$\sum_{N<p\le 2N}\log p \le \log S_N \le (\log 4)N.$$

Take $N = \lceil\frac{x}{2}\rceil$, for $x\ge 2$. Hence
$$\sum_{x<p\le 2x}\log p \le (\log 4)\Big\lceil\frac{x}{2}\Big\rceil + \log x \le \frac{(\log 4)x}{2} + \log 4 + \log x.$$

Then
$$\sum_{p\le x}\log p \le \sum_{0\le j\le\frac{\log x}{\log 2}}\Big(\frac{(\log 4)x}{2^{j+1}} + \log 4 + \log x\Big) \le (\log 4)x + O((\log x)^2)$$
(telescoping summation: apply the previous bound with $x$, $\frac{x}{2}$, $\frac{x}{2^2}$, $\frac{x}{2^3},\ldots$). So for $x\ge 2$ and a suitable large enough constant $c$, we have
$$\sum_{p\le x}\log p \le (\log 4)x + c(\log x)^2.$$

Hence
$$\log\Big(\frac{x}{(\log x)^2}\Big)\Big(\pi(x) - \pi\Big(\frac{x}{(\log x)^2}\Big)\Big) \le \sum_{\frac{x}{(\log x)^2}<p\le x}\log p \le (\log 4)x + c(\log x)^2,$$
so
$$\pi(x) \le \frac{x}{(\log x)^2} + \frac{(\log 4)x + c(\log x)^2}{\log\frac{x}{(\log x)^2}} \le \frac{(\log 4 + \varepsilon)x}{\log x}$$
for any $\varepsilon>0$, as long as $x\ge x(\varepsilon)$.

Take $\varepsilon = 1$; for the remaining range $2\le x\le x(1)$, choose the constant in the theorem large enough. $\square$
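The bound can be tested numerically (an illustration of mine, not part of the proof): the quantity $\pi(x)\log x/x$ stays below a small absolute constant; its maximum over $2\le x\le 10^5$ is about $1.26$, attained near $x = 113$.

```python
import math

N = 10**5
sieve = bytearray([1]) * (N + 1)
sieve[0:2] = b"\x00\x00"
for p in range(2, int(N**0.5) + 1):
    if sieve[p]:
        sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))

count = 0    # running value of pi(x)
best = 0.0   # max of pi(x) log(x) / x seen so far
for x in range(2, N + 1):
    count += sieve[x]
    best = max(best, count * math.log(x) / x)
```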

1.1 Asymptotic Notation

Definition 1.1 (Big O and little o notation). Let $f,g,h : S\to\mathbb{C}$, $S\subseteq\mathbb{R}$.

Write $f(x) = O(g(x))$ if there is $c>0$ such that $|f(x)|\le c|g(x)|$ for all $x\in S$.

Write $f(x) = o(g(x))$ if for any $\varepsilon>0$ there is $x_\varepsilon>0$ such that $|f(x)|\le\varepsilon|g(x)|$ for all $x\in S$ with $|x|\ge x_\varepsilon$.

Write $f(x) = g(x) + O(h(x))$ if $f(x)-g(x) = O(h(x))$, and write $f(x) = g(x) + o(h(x))$ if $f(x)-g(x) = o(h(x))$.

Definition 1.2 (Vinogradov notation). Let $f,g,h : S\to\mathbb{C}$, $S\subseteq\mathbb{R}$.

Write $f(x)\ll g(x)$ or $g(x)\gg f(x)$ if $f(x) = O(g(x))$.

Example.

Lemma. Let $f,g,h,u : S\to\mathbb{C}$.

Proof. Follows from the definition in a straightforward way. Example:

1.2 Partial Summation

Lemma 1.3 (Partial Summation). Assuming that:

  • $(a_n)_{n\in\mathbb{N}}$ are complex numbers

  • $x\ge y\ge 0$

  • $f : [y,x]\to\mathbb{C}$ is continuously differentiable

Then
$$\sum_{y<n\le x}a_nf(n) = A(x)f(x) - A(y)f(y) - \int_y^x A(t)f'(t)\,dt,$$

where we define
$$A(t) = \sum_{n\le t}a_n = \sum_{n=1}^{\lfloor t\rfloor}a_n$$
(so $A(t) = 0$ for $t<1$).

Proof. It suffices to prove the y = 0 case, since then

$$\sum_{y<n\le x}a_nf(n) = \sum_{0<n\le x}a_nf(n) - \sum_{0<n\le y}a_nf(n).$$

Suppose y = 0. By the fundamental theorem of calculus,

$$f(n) = f(x) - \int_n^x f'(t)\,dt = f(x) - \int_0^x f'(t)\,\mathbb{1}_{[n,x]}(t)\,dt.$$

Summing over n x, we get

$$\sum_{n\le x}a_nf(n) = A(x)f(x) - \int_0^x f'(t)\Big(\sum_{n\le x}\mathbb{1}_{[n,x]}(t)\,a_n\Big)dt = A(x)f(x) - \int_0^x f'(t)A(t)\,dt. \qquad\square$$

Lemma. If x 1, then

$$\sum_{n\le x}\frac1n = \log x + \gamma + O\Big(\frac1x\Big),$$

where $\gamma$ is Euler's constant, given by $\gamma = \lim_{N\to\infty}\big(\sum_{k=1}^N\frac1k - \log N\big)$.

Proof. Apply Partial Summation with $a_n = 1$, $f(t) = \frac1t$, $y = \frac12$. Clearly $A(t) = \lfloor t\rfloor$. Then,
$$\sum_{n\le x}\frac1n = \frac{\lfloor x\rfloor}{x} + \int_1^x\frac{\lfloor t\rfloor}{t^2}\,dt = \frac{\lfloor x\rfloor}{x} + \int_1^x\frac{t-\{t\}}{t^2}\,dt = 1 + O\Big(\frac1x\Big) + \log x - \int_1^x\frac{\{t\}}{t^2}\,dt = 1 + \log x - \int_1^\infty\frac{\{t\}}{t^2}\,dt + O\Big(\frac1x\Big).$$

The last equality holds since $\int_x^\infty\frac{\{t\}}{t^2}\,dt \le \int_x^\infty\frac{dt}{t^2} = \frac1x$.

Let $\gamma = 1 - \int_1^\infty\frac{\{t\}}{t^2}\,dt$. Then we have the asymptotic formula as desired.

Taking x in the formula, we see that γ is equal to the formula for Euler’s constant, as desired.
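The $O(\frac1x)$ error term is visible numerically; a quick check (my own) of the asymptotic against $\gamma = 0.5772\ldots$:

```python
import math

x = 10**6
harmonic = sum(1.0 / n for n in range(1, x + 1))
gamma_approx = harmonic - math.log(x)   # = gamma + O(1/x)
```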

Lemma. For $x\ge 1$,

$$\sum_{p\le x}\frac1p = \int_1^x\frac{\pi(t)}{t^2}\,dt + O(1).$$

Proof. Apply Lemma 1.3 with $a_n = \mathbb{1}_{\mathbb{P}}(n)$ (where $\mathbb{P}$ is the set of primes), $f(t) = \frac1t$, and $y = 1$.

We get $A(t) = \pi(t)$, and then
$$\sum_{p\le x}\frac1p = \frac{\pi(x)}{x} + \int_1^x\frac{\pi(t)}{t^2}\,dt = \int_1^x\frac{\pi(t)}{t^2}\,dt + O(1),$$
since $\pi(x)\le x$. $\square$

1.3 Arithmetic Functions and Dirichlet convolution

Definition (Arithmetic function). An arithmetic function is a function $f : \mathbb{N}\to\mathbb{C}$.

Definition (Multiplicative). An arithmetic function f is multiplicative if f(1) = 1 and f(mn) = f(m)f(n) whenever m,n are coprime.

Moreover, f is completely multiplicative if f(mn) = f(m)f(n) for all m,n.

Example.

On the space of arithmetic functions, we have operations:

$$(f+g)(n) = f(n)+g(n), \qquad (fg)(n) = f(n)g(n), \qquad (f*g)(n) = \text{Dirichlet convolution}.$$

Definition (Dirichlet convolution). For $f,g : \mathbb{N}\to\mathbb{C}$, we define

$$(f*g)(n) = \sum_{d|n}f(d)\,g\Big(\frac{n}{d}\Big),$$

where $\sum_{d|n}$ means a sum over the divisors of $n$.
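The definition translates directly into code; a small sketch (names are my own) that computes $f*g$ for all $n$ up to a bound by looping over divisors, and recovers $\tau = 1*1$:

```python
def dirichlet_convolution(f, g, N):
    """h[n] = (f*g)(n) = sum over d | n of f(d) g(n/d), for 1 <= n <= N.
    Index 0 of the returned list is unused."""
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        fd = f(d)
        for n in range(d, N + 1, d):  # n runs over multiples of d
            h[n] += fd * g(n // d)
    return h

one = lambda n: 1
tau = dirichlet_convolution(one, one, 12)  # tau(n) = number of divisors of n
```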

Lemma 1.4. The space of arithmetic functions with the operations $+$, $*$ is a commutative ring.

Proof. Since arithmetic functions with + form an abelian group, it suffices to show:

Proofs:

Lemma. The set of arithmetic functions $f$ with $f(1)\ne 0$ forms an abelian group under the operation $*$, with identity $I(n) = \mathbb{1}_{n=1}$.

Proof. We need $g$ such that $f*g = I$. From
$$(f*g)(1) = f(1)g(1) = 1, \quad\text{take}\quad g(1) = \frac{1}{f(1)}.$$

Assume $g(m)$ has been defined for $m<n$. We will define $g(n)$. From
$$(f*g)(n) = g(n)f(1) + \sum_{\substack{d|n\\ d\ne 1}}f(d)\,g\Big(\frac{n}{d}\Big), \quad\text{take}\quad g(n) = -\frac{1}{f(1)}\sum_{\substack{d|n\\ d\ne 1}}f(d)\,g\Big(\frac{n}{d}\Big),$$
noting that $\frac{n}{d} < n$ in the sum. $\square$

Lemma. Multiplicative arithmetic functions form an abelian group under $*$. Moreover, for completely multiplicative $f$, the Dirichlet inverse is $\mu f$.

Proof. For the first part, it suffices to show closedness. Let $f,g : \mathbb{N}\to\mathbb{C}$ be multiplicative, and $m,n$ coprime.

If $mn = ab$, we can write $a = a_1a_2$, $b = b_1b_2$ where $a_1 = (a,m)$, $a_2 = (a,n)$, $b_1 = (b,m)$ and $b_2 = (b,n)$. Therefore we have:
$$(f*g)(mn) = \sum_{mn=ab}f(a)g(b) = \sum_{\substack{m=a_1b_1\\ n=a_2b_2}}f(a_1a_2)\,g(b_1b_2) = \sum_{\substack{m=a_1b_1\\ n=a_2b_2}}f(a_1)f(a_2)\,g(b_1)g(b_2) = (f*g)(m)\,(f*g)(n),$$
using the above observation and multiplicativity. (One also needs to check that inverses are multiplicative.)

Now, it remains to show that for completely multiplicative $f$, we have $f*(\mu f) = I$.

Note that $f*(\mu f)$ is multiplicative by the first part. So it is enough to show $(f*(\mu f))(p^k) = I(p^k)$ for prime powers $p^k$. Calculate:
$$(f*\mu f)(p) = f(p) + \mu(p)f(p) = f(p) - f(p) = 0 = I(p),$$
and for $k\ge 2$:
$$(f*\mu f)(p^k) = f(p^k) + \mu(p)f(p)f(p^{k-1}) = f(p)^k - f(p)f(p)^{k-1} = 0 = I(p^k),$$
since $\mu(p^j) = 0$ for $j\ge 2$. $\square$

Example. Let $\tau_k(n) = \sum_{n=n_1\cdots n_k}1$. Then

$$\tau_k = \underbrace{1*1*\cdots*1}_{k\text{ times}}.$$

So $\tau_k$ is multiplicative by the previous result.

Definition (von Mangoldt function). The von Mangoldt function $\Lambda : \mathbb{N}\to\mathbb{R}$ is

$$\Lambda(n) = \begin{cases}\log p & \text{if } n = p^k \text{ for some prime } p \text{ and } k\ge 1\\ 0 & \text{otherwise}\end{cases}$$

Then $\log = 1*\Lambda$, since

$$\log n = \log\prod_{p|n}p^{\alpha_p(n)} = \sum_{p|n}\alpha_p(n)\log p = (1*\Lambda)(n).$$

Since $\mu$ is the inverse of $1$, we have

$$\log*\mu = 1*\Lambda*\mu = \Lambda*1*\mu = \Lambda*I = \Lambda.$$
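The identity $\log = 1*\Lambda$ is easy to verify numerically; a sketch (helper names are mine) using trial division to compute $\Lambda$:

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:          # p = smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0                  # n = 1

def log_via_convolution(n):
    """Compute (1 * Lambda)(n) = sum of Lambda(d) over divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
```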

1.4 Dirichlet Series

For a sequence $(a_n)_{n\in\mathbb{N}}$, we want to associate a generating function that carries information about $(a_n)_{n\in\mathbb{N}}$. One might consider

$$(a_n)_{n\in\mathbb{N}} \mapsto \sum_{n=1}^\infty a_n x^n.$$

If we do this, then $\sum_p x^p$ is hard to control. So this is not very useful.

The following series has nicer number-theoretic properties:

Definition (Formal series). For $f : \mathbb{N}\to\mathbb{C}$, define a (formal) series

$$D_f(s) = \sum_{n=1}^\infty f(n)\,n^{-s}$$

for $s\in\mathbb{C}$.

Lemma. Assuming that:

  • $f : \mathbb{N}\to\mathbb{C}$ satisfies $|f(n)| \ll n^{o(1)}$

Then $D_f(s)$ converges absolutely for $\mathrm{Re}(s)>1$ and defines an analytic function for $\mathrm{Re}(s)>1$.

Proof. Let $\varepsilon>0$ (fixed), $N\in\mathbb{N}$ and $\mathrm{Re}(s)\ge 1+2\varepsilon$.

Then
$$\sum_{n=N}^\infty|f(n)\,n^{-s}| = \sum_{n=N}^\infty|f(n)|\,n^{-\mathrm{Re}(s)} \ll \sum_{n=N}^\infty n^{\varepsilon-\mathrm{Re}(s)} \le \sum_{n=N}^\infty n^{-1-\varepsilon} \le N^{-1-\varepsilon} + \int_N^\infty t^{-1-\varepsilon}\,dt \ll N^{-\varepsilon}.$$

Hence we have absolute convergence for $\mathrm{Re}(s)\ge 1+2\varepsilon$. Also, $D_f(s)$ is a uniform limit of the functions $\sum_{n=1}^N f(n)n^{-s}$. From complex analysis, a uniform limit of analytic functions is analytic. Hence $D_f(s)$ is analytic for $\mathrm{Re}(s)\ge 1+2\varepsilon$.

Now let $\varepsilon\to 0$. $\square$

Theorem (Euler product). Assuming that:

  • $f : \mathbb{N}\to\mathbb{C}$ is multiplicative with $|f(n)| \ll n^{o(1)}$

Then
$$D_f(s) = \prod_p\Big(1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \cdots\Big),$$

for $\mathrm{Re}(s)>1$. Furthermore, if $f$ is completely multiplicative, then

$$D_f(s) = \prod_p\Big(1 - \frac{f(p)}{p^s}\Big)^{-1},$$

for $\mathrm{Re}(s)>1$.

Proof. Let $N\in\mathbb{N}$, $\mathrm{Re}(s) = \sigma > 1$. Let

$$D_f(s,N) = \prod_{p\le N}\Big(1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \cdots\Big).$$

Note that the series defining the factors are absolutely convergent, since

$$\sum_{k=1}^\infty\frac{|f(p^k)|}{p^{k\sigma}} \ll \sum_{k=1}^\infty p^{-k(\sigma-\varepsilon)} < \infty$$

(a geometric series, using $|f(p^k)| \ll p^{k\varepsilon}$).

Therefore, multiplying out,

$$D_f(s,N) = \sum_{n=1}^\infty a(n,N)\,f(n)\,n^{-s}$$

where

$$a(n,N) = \#\{\text{ways to write } n \text{ as a product of prime powers, where the primes are } \le N\}.$$

The fundamental theorem of arithmetic tells us that $a(n,N)\in\{0,1\}$ and $a(n,N) = 1$ for $n\le N$.

Now,

$$|D_f(s) - D_f(s,N)| \le \sum_{n=N+1}^\infty|f(n)|\,n^{-\sigma} \xrightarrow{N\to\infty} 0$$

(since $\sum_{n=1}^\infty|f(n)|n^{-\sigma} < \infty$). Hence $D_f(s) = \lim_{N\to\infty}D_f(s,N)$.

Finally, for $f$ completely multiplicative, use the geometric series formula:

$$\sum_{k=0}^\infty\frac{f(p)^k}{p^{ks}} = \frac{1}{1-\frac{f(p)}{p^s}}. \qquad\square$$

Lemma. Assuming that:

  • $f,g : \mathbb{N}\to\mathbb{C}$

  • $|f(n)|, |g(n)| \ll n^{o(1)}$

Then
$$D_{f*g}(s) = D_f(s)\,D_g(s)$$

for $\mathrm{Re}(s)>1$.

Proof. We know $D_f(s)$ and $D_g(s)$ are absolutely convergent, so we can expand out the product:

$$D_f(s)D_g(s) = \sum_{a,b=1}^\infty f(a)g(b)(ab)^{-s} = \sum_{n=1}^\infty\Big(\sum_{n=ab}f(a)g(b)\Big)n^{-s} = \sum_{n=1}^\infty(f*g)(n)\,n^{-s} = D_{f*g}(s). \qquad\square$$

Definition (Riemann zeta function). For $\mathrm{Re}(s)>1$, define

$$\zeta(s) = \sum_{n=1}^\infty n^{-s}.$$

Example.

Dirichlet hyperbola method

Problem: How many lattice points $(a,b)\in\mathbb{N}^2$ satisfy $ab\le x$?


Note that this number is $\sum_{n\le x}\tau(n) = \sum_{ab\le x}1$, where $\tau(n) = \tau_2(n)$ is the number of divisors of $n$.

Dirichlet proved that for x 2,

$$\sum_{n\le x}\tau(n) = x\log x + (2\gamma-1)x + O(x^{1/2}),$$

where γ is Euler’s constant.

We will see a proof of this shortly.

Conjecture: the error term can be improved to $O_\varepsilon(x^{\frac14+\varepsilon})$. The current best exponent is about $0.314$.
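The claim is easy to test before proving it (a check of mine, not from the notes), using the identity $\sum_{n\le x}\tau(n) = \sum_{d\le x}\lfloor x/d\rfloor$ (count the lattice points column by column):

```python
import math

GAMMA = 0.5772156649015329   # Euler's constant
x = 10**5
divisor_sum = sum(x // d for d in range(1, x + 1))   # sum of tau(n) over n <= x
main_term = x * math.log(x) + (2 * GAMMA - 1) * x
error = divisor_sum - main_term   # should be well inside the O(sqrt(x)) bound
```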

First, we prove a lemma:

Lemma (Dirichlet hyperbola method). Assuming that:

  • $f,g : \mathbb{N}\to\mathbb{C}$

  • $x\ge y\ge 1$

Then
$$\sum_{n\le x}(f*g)(n) = \sum_{d\le y}f(d)\sum_{m\le\frac{x}{d}}g(m) + \sum_{m\le\frac{x}{y}}g(m)\sum_{y<d\le\frac{x}{m}}f(d).$$

Proof. $\sum_{n\le x}(f*g)(n) = \sum_{dm\le x}f(d)g(m)$. Split this sum into the parts with $d\le y$ and $d>y$ to get the conclusion. $\square$

Theorem 1.5 (Dirichlet’s divisor problem). For x 2,

$$\sum_{n\le x}\tau(n) = x\log x + (2\gamma-1)x + O(x^{1/2}),$$

where γ is Euler’s constant.

Proof. We use the Dirichlet hyperbola method with $y = x^{1/2}$. Note that $\tau = 1*1$. Then,
$$\sum_{n\le x}\tau(n) = \sum_{d\le x^{1/2}}\sum_{m\le\frac{x}{d}}1 + \sum_{m\le x^{1/2}}\sum_{x^{1/2}<d\le\frac{x}{m}}1 = \sum_{d\le x^{1/2}}\Big(\frac{x}{d}+O(1)\Big) + \sum_{m\le x^{1/2}}\Big(\frac{x}{m} - x^{1/2} + O(1)\Big)$$
$$= x\sum_{d\le x^{1/2}}\frac1d + x\sum_{m\le x^{1/2}}\frac1m - \sum_{m\le x^{1/2}}x^{1/2} + O(x^{1/2}).$$

Recall from Lecture 2 that
$$\sum_{d\le y}\frac1d = \log y + \gamma + O\Big(\frac1y\Big).$$

Taking $y = x^{1/2}$, the previous expression becomes
$$2x\Big(\frac12\log x + \gamma + O(x^{-1/2})\Big) - x + O(x^{1/2}) = x\log x + (2\gamma-1)x + O(x^{1/2}). \qquad\square$$

2 Elementary Estimates for Primes

Recall from Lecture 1:

2.1 Mertens' Theorems

Theorem 2.1 (Mertens' Theorem). Assuming that:

  • $x\ge 3$

Then
  • (i) $\sum_{p\le x}\frac{\log p}{p} = \log x + O(1)$
  • (ii) $\sum_{p\le x}\frac1p = \log\log x + M + O\big(\frac{1}{\log x}\big)$ (for some constant $M\in\mathbb{R}$)
  • (iii) $\prod_{p\le x}\big(1-\frac1p\big) = \frac{c+o(1)}{\log x}$ (for some $c>0$)

Remark. One can show that $c = e^{-\gamma}$.

Proof.
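Numerically (an illustration of mine, not part of the notes), (iii) with $c = e^{-\gamma}$ is already visible at modest $z$:

```python
import math

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

z = 10**5
product = 1.0
for p in primes_up_to(z):
    product *= 1.0 - 1.0 / p
value = product * math.log(z)   # should be near e^{-gamma} = 0.5615...
```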

2.2 Sieve Methods

Definition (Sieve problem). Let $\mathcal{P}\subseteq\mathbb{P}$ be a set of primes. Let

$$P(z) = \prod_{\substack{p\in\mathcal{P}\\ p\le z}}p,$$

and let $A\subseteq\mathbb{N}$. Denote

$$S(A,\mathcal{P},z) = |\{n\in A : (n,P(z)) = 1\}|.$$

Problem: Estimate $S(A,\mathcal{P},z)$.

Note that if $A\subseteq[\frac{x}{2},x]\cap\mathbb{Z}$, then

$$S(A,\mathbb{P},x^{1/2}) = |A\cap\mathbb{P}|, \qquad S(A,\mathbb{P},x^{1/3}) = |A\cap(\mathbb{P}\cup\{p_1p_2 : p_1,p_2 > x^{1/3}\text{ prime}\})|.$$

Sieve hypothesis: There exist a multiplicative $g : \mathbb{N}\to[0,1]$ and $R_d\in\mathbb{R}$ such that

$$|A_d| := |\{n\in A : n\equiv 0\ (\mathrm{mod}\ d)\}| = g(d)|A| + R_d$$

for all square-free $d$ (no repeated prime factors).

Example.

Lemma. We have

$$S(A,\mathcal{P},z) = \sum_{d|P(z)}\mu(d)\,|A_d|.$$

Proof. Recall that

$$\mathbb{1}_{n=1} = (\mu*1)(n)$$

(since $\mu$ is the inverse of $1$). Hence,

$$S(A,\mathcal{P},z) = \sum_{n\in A}\mathbb{1}_{(n,P(z))=1} = \sum_{n\in A}\sum_{\substack{d|P(z)\\ d|n}}\mu(d) = \sum_{d|P(z)}\mu(d)\,|A_d|. \qquad\square$$

Example. Let

$$\pi(x,z) = |\{n\le x : (n,P(z)) = 1\}|.$$

Let $\mathcal{P} = \mathbb{P}$ and $A = [1,x]\cap\mathbb{Z}$. Then

$$|A_d| = \frac{x}{d} + O(1).$$

By the previous lemma,

$$\pi(x,z) = S(A,\mathbb{P},z) = \sum_{d|P(z)}\mu(d)\Big(\frac{x}{d}+O(1)\Big) = x\sum_{d|P(z)}\frac{\mu(d)}{d} + O\Big(\sum_{d|P(z)}|\mu(d)|\Big)$$
$$= x\prod_{p\le z}\Big(1+\frac{\mu(p)}{p}\Big) + O(2^{\pi(z)}) = \frac{c+o(1)}{\log z}\,x + O(2^z),$$
using the fundamental theorem of arithmetic (each $d\,|\,P(z)$ is a square-free product of primes $\le z$, so the sum factorises) and Mertens' theorem, for some $c>0$.

For $2\le z\le\log x$, this gives

$$\pi(x,z) = \frac{c+o(1)}{\log z}\,x.$$
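The inclusion–exclusion identity behind this example can be checked directly for small parameters (a sketch of mine; $x = 10^4$, $z = 10$, so $P(z) = 210$):

```python
import math
from itertools import combinations

x = 10**4
primes = [2, 3, 5, 7]          # primes up to z = 10
P = 2 * 3 * 5 * 7              # P(z) = 210

# direct count of n <= x with (n, P(z)) = 1
direct = sum(1 for n in range(1, x + 1) if math.gcd(n, P) == 1)

# Legendre: sum over d | P(z) of mu(d) * floor(x/d); mu(d) = (-1)^(# prime factors)
legendre = 0
for k in range(len(primes) + 1):
    for subset in combinations(primes, k):
        legendre += (-1) ** k * (x // math.prod(subset))
```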

Theorem (Sieve of Eratosthenes–Legendre). Assuming that:

  • $A\subseteq[1,x]$

  • $2\le z\le x$

  • the Sieve Hypothesis

Then
$$S(A,\mathcal{P},z) = |A|\prod_{\substack{p\le z\\ p\in\mathcal{P}}}(1-g(p)) + O\Bigg(x^{1/2}(\log x)^{1/2}\,2^{-\frac{\log x}{4\log z}}\Big(\sum_{\substack{d\le x\\ d|P(z)}}|R_d|^2\Big)^{1/2} + |A|\,e^{-\frac{\log x}{\log z}}\prod_{\substack{p\le z\\ p\in\mathcal{P}}}(1+g(p))^e\Bigg).$$

Proof. Recall from the previous lecture that
$$S(A,\mathcal{P},z) = \sum_{\substack{d|P(z)\\ d\le x}}\mu(d)|A_d| \qquad (\text{since } A_d = \emptyset \text{ for } d>x)$$
$$= \sum_{\substack{d|P(z)\\ d\le x}}\mu(d)g(d)|A| + O\Big(\sum_{\substack{d\le x\\ d|P(z)}}|R_d|\Big) \qquad (\text{sieve hypothesis})$$
$$= |A|\sum_{d|P(z)}\mu(d)g(d) + O\Big(\sum_{\substack{d\le x\\ d|P(z)}}|R_d| + |A|\sum_{\substack{d|P(z)\\ d>x}}g(d)\Big) = |A|\prod_{\substack{p\le z\\ p\in\mathcal{P}}}(1-g(p)) + O\Big(\sum_{\substack{d\le x\\ d|P(z)}}|R_d| + |A|\sum_{\substack{d|P(z)\\ d>x}}g(d)\Big).$$

We estimate the first error term using Cauchy–Schwarz:

$$\sum_{\substack{d\le x\\ d|P(z)}}|R_d| \le \Big(\sum_{\substack{d\le x\\ d|P(z)}}|R_d|^2\Big)^{1/2}\Big(\sum_{\substack{d\le x\\ d|P(z)}}1\Big)^{1/2}.$$

Note that if $d|P(z)$ and $d>x^{1/2}$, then

$$\omega(d) \ge \frac{\log x^{1/2}}{\log z}$$

($\omega(d)$ is the number of distinct prime factors of $d$), since $z^{\omega(d)}\ge d\ge x^{1/2}$.

Hence $2^{\omega(d)} \ge 2^{\frac{\log x}{2\log z}}$ for $d|P(z)$, $d>x^{1/2}$. Now we get

$$\sum_{\substack{d\le x\\ d|P(z)}}1 \le x^{1/2} + 2^{-\frac{\log x}{2\log z}}\sum_{\substack{d\le x\\ d|P(z)}}\underbrace{2^{\omega(d)}}_{=\tau(d)} \ll 2^{-\frac{\log x}{2\log z}}\sum_{d\le x}\tau(d) \ll 2^{-\frac{\log x}{2\log z}}\,x\log x$$

(the last step is by Dirichlet's divisor problem: $\sum_{d\le x}\tau(d) = (1+o(1))x\log x$; the term $x^{1/2}$ is dominated, since $2^{-\frac{\log x}{2\log z}}\ge x^{-1/2}$).

Substituting this in, the first error term becomes as desired.

Now we estimate the second error term. We have
$$\sum_{\substack{d|P(z)\\ d>x}}g(d) \le x^{-\frac{1}{\log z}}\sum_{d|P(z)}g(d)\,d^{\frac{1}{\log z}} \qquad (\text{since } d>x \text{ in the sum: Rankin's trick})$$
$$= x^{-\frac{1}{\log z}}\prod_{\substack{p\le z\\ p\in\mathcal{P}}}\Big(1+g(p)\underbrace{p^{\frac{1}{\log z}}}_{\le z^{1/\log z}=e}\Big) \le x^{-\frac{1}{\log z}}\prod_{\substack{p\le z\\ p\in\mathcal{P}}}(1+e\,g(p)) \le \underbrace{x^{-\frac{1}{\log z}}}_{=e^{-\frac{\log x}{\log z}}}\prod_{\substack{p\le z\\ p\in\mathcal{P}}}(1+g(p))^e$$
(using $1+ey\le(1+y)^e$).

Combining the error terms, the claim follows. $\square$

Example. Take $A = [1,x]\cap\mathbb{Z}$, $\mathcal{P} = \mathbb{P}$. Then $g(d) = \frac1d$, $R_d = O(1)$, so the sieve gives us

$$S(A,\mathbb{P},z) = \pi(x,z) = (x+O(1))\prod_{p\le z}\Big(1-\frac1p\Big) + O\Big(x(\log x)^{1/2}\,2^{-\frac{\log x}{4\log z}} + x\,e^{-\frac{\log x}{\log z}}\prod_{p\le z}\Big(1+\frac1p\Big)^e\Big).$$

By Mertens' Theorem,

$$\prod_{p\le z}\Big(1-\frac1p\Big) = \frac{C+o(1)}{\log z}, \qquad \prod_{p\le z}\Big(1+\frac1p\Big) \le \prod_{p\le z}\Big(1-\frac1p\Big)^{-1} = \Big(\frac1C+o(1)\Big)\log z.$$

Hence,

$$\pi(x,z) = \frac{c+o(1)}{\log z}\,x + O\Big(x(\log x)^{1/2}\,2^{-\frac{\log x}{4\log z}} + x\,e^{-\frac{\log x}{\log z}}(\log z)^e\Big).$$

Hence, for $2\le z\le\exp\big(\frac{\log x}{10\log\log x}\big)$,

$$\pi(x,z) = \frac{c+o(1)}{\log z}\,x.$$

This asymptotic in fact holds for $z\le x^{o(1)}$.

In particular, the Eratosthenes–Legendre sieve gives

$$\pi(x) \le \pi(x,z) + z \ll \frac{x\log\log x}{\log x}$$

for $z = \exp\big(\frac{\log x}{10\log\log x}\big)$. This is not quite the Chebyshev bound $\pi(x)\ll\frac{x}{\log x}$.

2.3 Selberg Sieve

The Eratosthenes–Legendre sieve gives asymptotics, while the Selberg sieve gives a good upper bound for primes.

Theorem 2.2 (Selberg sieve). Assuming that:

  • $z\ge 2$

  • $A\subseteq\mathbb{N}$ finite

  • $\mathcal{P}\subseteq\mathbb{P}$

  • the sieve hypothesis

  • $h : \mathbb{N}\to[0,\infty)$ the multiplicative function supported on square-free numbers, given on the primes by

    $$h(p) = \begin{cases}\frac{g(p)}{1-g(p)} & p\in\mathcal{P}\\ 0 & p\notin\mathcal{P}\end{cases}$$

Then
$$S(A,\mathcal{P},z) \le \frac{|A|}{\sum_{d\le z}h(d)} + \sum_{\substack{d\le z^2\\ d|P(z)}}\tau_3(d)\,|R_d|.$$

Sieve hypothesis: There is a multiplicative $g : \mathbb{N}\to[0,1]$ and $R_d\in\mathbb{R}$ such that

$$|A_d| = g(d)|A| + R_d$$

for all square-free $d\ge 1$.

Proof. Let $(\rho_d)_{d\in\mathbb{N}}$ be real numbers with

$$\rho_1 = 1, \qquad \rho_d = 0 \text{ for } d>z. \qquad (*)$$

Then,

$$\mathbb{1}_{(n,P(z))=1} \le \Big(\sum_{\substack{d|n\\ d|P(z)}}\rho_d\Big)^2.$$

(If $(n,P(z)) = 1$, the inner sum is $\rho_1 = 1$; otherwise use $0\le x^2$.)

Summing over $n\in A$,

$$S(A,\mathcal{P},z) = \sum_{n\in A}\mathbb{1}_{(n,P(z))=1} \le \sum_{n\in A}\Big(\sum_{\substack{d|n\\ d|P(z)}}\rho_d\Big)^2 = \sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}\sum_{\substack{n\in A\\ [d_1,d_2]|n}}1 = |A|\sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}\,g([d_1,d_2]) + \underbrace{\sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}R_{[d_1,d_2]}}_{=:E}$$

by the sieve hypothesis.

([m,n] means lcm (m,n)).

We first estimate $E$:

$$|E| \le \max_k|\rho_k|^2\sum_{d_1,d_2|P(z)}|R_{[d_1,d_2]}| = \max_k|\rho_k|^2\sum_{\substack{d\le z^2\\ d|P(z)}}\Big(\sum_{\substack{d_1,d_2\\ [d_1,d_2]=d}}1\Big)|R_d| \qquad (d = [d_1,d_2]).$$

We have

$$\sum_{\substack{d_1,d_2\\ [d_1,d_2]=d}}1 = \sum_{\substack{\ell d_1'd_2'=d\\ (d_1',d_2')=1}}1 \le \tau_3(d) \qquad (\ell = (d_1,d_2),\ d_1 = \ell d_1',\ d_2 = \ell d_2').$$

Therefore

$$|E| \le \max_k|\rho_k|^2\sum_{\substack{d\le z^2\\ d|P(z)}}\tau_3(d)\,|R_d|.$$

Now it suffices to prove that there is a choice of $(\rho_d)_{d\in\mathbb{N}}$ satisfying $(*)$ such that:

Claim 1: $\displaystyle\sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}\,g([d_1,d_2]) = \frac{1}{\sum_{d\le z}h(d)}$.

Claim 2: $|\rho_k|\le 1$ for all $k\in\mathbb{N}$.

Proof of Claim 1:

We have, writing $k = (d_1,d_2)$, $d_i = kd_i'$,

$$\sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}\,g([d_1,d_2]) = \sum_{k|P(z)}\mu(k)^2\sum_{\substack{d_1',d_2'|\frac{P(z)}{k}\\ (d_1',d_2')=1}}\rho_{kd_1'}\rho_{kd_2'}\,g(kd_1'd_2') = \sum_{k|P(z)}\mu(k)^2g(k)\sum_{\substack{d_1',d_2'|\frac{P(z)}{k}\\ (d_1',d_2')=1}}\rho_{kd_1'}\rho_{kd_2'}\,g(d_1')g(d_2')$$

by multiplicativity. We have

$$\mathbb{1}_{(d_1',d_2')=1} = \sum_{\substack{c|d_1'\\ c|d_2'}}\mu(c)$$

(since $\mu*1 = I$), so the previous expression becomes

$$\sum_{k|P(z)}\mu(k)^2g(k)\sum_{c|\frac{P(z)}{k}}\mu(c)\Big(\sum_{\substack{d'|\frac{P(z)}{k}\\ c|d'}}\rho_{kd'}\,g(d')\Big)^2 = \sum_{m\le z}h(m)^{-1}\Big(\sum_{\substack{d|P(z)\\ d\equiv 0\ (\mathrm{mod}\ m)}}\rho_d\,g(d)\Big)^2$$

(by multiplicativity, writing $m = ck$, $d = md'$, and using $\rho_d = 0$ for $d>z$), since $\frac1h = \mu^2\big(\frac1g*\mu\big)$ on square-free numbers (check on primes: $\frac{1}{h(p)} = \frac{1}{g(p)} - 1$).

Now we get

$$\sum_{d_1,d_2|P(z)}\rho_{d_1}\rho_{d_2}\,g([d_1,d_2]) = \sum_{m\le z}h(m)^{-1}\zeta_m^2,$$

where

$$\zeta_m = \sum_{\substack{d|P(z)\\ d\equiv 0\ (\mathrm{mod}\ m)}}\rho_d\,g(d).$$

We want to minimise this subject to $(*)$.

We need to translate the condition $\rho_1 = 1$ to the $\zeta_m$. Note that for $e\,|\,P(z)$,

$$\sum_{\substack{m\le z\\ e|m}}\mu(m)\zeta_m = \sum_{d|P(z)}\rho_d\,g(d)\sum_{\substack{m|d\\ e|m}}\mu(m) = \mu(e)\rho_e\,g(e),$$

since $\sum_{\substack{m|d,\ e|m}}\mu(m) = \mu(e)\mathbb{1}_{d=e}$ for square-free $d$. Hence,

$$\rho_e = \frac{\mu(e)}{g(e)}\sum_{\substack{m\le z\\ e|m}}\mu(m)\zeta_m.$$

Now, the condition $\rho_1 = 1$ reads

$$1 = \sum_{m\le z}\mu(m)\zeta_m.$$

By Cauchy–Schwarz, we then get

$$1 = \Big(\sum_{m\le z}\mu(m)\zeta_m\Big)^2 \le \Big(\sum_{m\le z}h(m)^{-1}\zeta_m^2\Big)\Big(\sum_{m\le z}\mu(m)^2h(m)\Big).$$

Hence

$$\sum_{m\le z}h(m)^{-1}\zeta_m^2 \ge \frac{1}{\sum_{m\le z}\mu(m)^2h(m)} = \frac{1}{\sum_{m\le z}h(m)},$$

since $h$ is supported on square-free numbers. Equality holds for $\zeta_m = \frac{\mu(m)h(m)}{G(z)}$, where $G(z) = \sum_{m\le z}h(m)$. We now check that with these $\zeta_m$, we have $\rho_d = 0$ for $d>z$.

Note that

$$\rho_c = \frac{\mu(c)}{g(c)G(z)}\sum_{\substack{m\le z\\ c|m}}\mu(m)^2h(m).$$

Hence $\rho_c = 0$ for $c>z$ (the sum is then empty).

This proves Claim 1.

Now we prove Claim 2 ($|\rho_c|\le 1$).

Note that any square-free $m$ has at most one representation as $m = cm'$, where $c|d$ and $(m',d) = 1$ (for any fixed $d\in\mathbb{N}$), namely with $c = (m,d)$.

Now,

$$G(z) \ge \sum_{c|d}\sum_{\substack{m\le\frac{z}{c}\\ (m,d)=1}}h(cm) = \sum_{c|d}h(c)\sum_{\substack{m\le\frac{z}{c}\\ (m,d)=1}}h(m) \ge \sum_{c|d}h(c)\sum_{\substack{m\le\frac{z}{d}\\ (m,d)=1}}h(m).$$

Now, for square-free $d$,

$$\rho_d = \frac{\mu(d)h(d)}{g(d)G(z)}\sum_{\substack{m\le\frac{z}{d}\\ (m,d)=1}}\mu(m)^2h(m).$$

Substituting the lower bound for $G(z)$,

$$|\rho_d| \le \frac{h(d)}{g(d)\sum_{c|d}h(c)} = 1,$$

since $1*h = \frac{h}{g}$ (check on primes: $1+h(p) = \frac{1}{1-g(p)} = \frac{h(p)}{g(p)}$). $\square$

Lemma 2.3. Assuming that:

  • $z\ge 3$

  • $g : \mathbb{N}\to[0,1]$ multiplicative

  • for some $\kappa, A > 0$ we have

    $$\sum_{p\le z}g(p)\log p \le \kappa\log z + A.$$

Then
$$\frac{1}{\sum_{m\le z}h(m)} \le 2\prod_{\substack{p\le z^{1/(e\kappa+1)}\\ p\in\mathcal{P}}}(1-g(p)),$$

where $h$ is defined in terms of $g$ as in Selberg's sieve.

Proof. Note that for any $c\in(0,1)$,

$$\sum_{m\le z}h(m) \ge \sum_{\substack{m\le z\\ m|P(z^c)}}h(m) =: G(z,c).$$

Then

$$\prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}(1-g(p))^{-1} - G(z,c) = \prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}(1+h(p)) - \sum_{\substack{m\le z\\ m|P(z^c)}}h(m) = \sum_{\substack{m>z\\ m|P(z^c)}}h(m),$$

using $(1-g(p))^{-1} = 1+h(p)$ and expanding out the product. By Rankin's trick,

$$\prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}(1-g(p))\sum_{\substack{m>z\\ m|P(z^c)}}h(m) \le \prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}(1-g(p))\,z^{-\frac{\lambda}{\log z}}\sum_{m|P(z^c)}h(m)\,m^{\frac{\lambda}{\log z}} \qquad (\text{for any }\lambda>0)$$
$$= e^{-\lambda}\prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}(1-g(p))\big(1+h(p)p^{\frac{\lambda}{\log z}}\big) = e^{-\lambda}\prod_{\substack{p\le z^c\\ p\in\mathcal{P}}}\big(1-g(p)+g(p)p^{\frac{\lambda}{\log z}}\big)$$
$$\le e^{-\lambda}\exp\Big(\sum_{p\le z^c}g(p)\big(p^{\frac{\lambda}{\log z}}-1\big)\Big) \le \exp\Big(-\lambda + \frac{\lambda e^{c\lambda}}{\log z}\sum_{p\le z^c}g(p)\log p\Big) \le \exp\Big(-\lambda + ce^{c\lambda}\lambda\kappa + \frac{\lambda e^{c\lambda}A}{\log z}\Big),$$

using $1+t\le e^t$ and $p^{\frac{\lambda}{\log z}} - 1 \le \frac{\lambda}{\log z}(\log p)\,p^{\frac{\lambda}{\log z}} \le \frac{\lambda}{\log z}(\log p)\,e^{c\lambda}$ for $p\le z^c$.

Choose $c = \frac1\lambda$ and $\lambda = e\kappa+1$ to get the claim. $\square$

The Brun-Titchmarsh Theorem

Theorem 2.4 (Brun–Titchmarsh Theorem). Assuming that:

  • $x\ge 0$, $y\ge 2$

  • $\varepsilon>0$ and $y$ is large in terms of $\varepsilon$

Then
$$\pi(x+y) - \pi(x) \le \frac{(2+\varepsilon)y}{\log y}.$$

Remark. We expect

$$\pi(x+y) - \pi(x) = \frac{(1+o(1))y}{\log x}$$

in a wide range of $y$ (e.g. $y\ge x^\varepsilon$ for some fixed $\varepsilon>0$). The prime number theorem gives this for $y\asymp x$. The Brun–Titchmarsh Theorem gives an upper bound of the expected order for $y\ge x^\varepsilon$.

Proof. Apply the Selberg sieve with $A = [x,x+y]\cap\mathbb{Z}$, $\mathcal{P} = \mathbb{P}$. Note that for any $d\ge 1$,

$$|\{a\in A : a\equiv 0\ (\mathrm{mod}\ d)\}| = \frac{y}{d} + O(1).$$

Hence, the sieve hypothesis holds with $g(d) = \frac1d$, $R_d = O(1)$.

Now, the function $h$ in the Selberg sieve is given on primes by

$$h(p) = \frac{g(p)}{1-g(p)} = \frac{1}{p-1} = \frac{1}{\varphi(p)},$$

where $\varphi$ is the Euler totient function. In general,

$$h(d) = \mu(d)^2\,\frac{1}{\varphi(d)}$$

(since $\varphi$ is multiplicative). Now, for any $z\ge 2$, the Selberg sieve yields

$$S(A,\mathbb{P},z) \le \frac{y}{\sum_{d\le z}\frac{\mu(d)^2}{\varphi(d)}} + O\Big(\sum_{d\le z^2}\tau_3(d)\Big).$$

By Problem 11 on Example Sheet 1, the error term is O(z2(logz)2).

Take $z = y^{\frac12-\frac{\varepsilon}{10}}$. Then

$$z^2(\log z)^2 \ll \big(y^{\frac12-\frac{\varepsilon}{10}}\big)^2(\log y)^2 \ll y^{1-\frac{\varepsilon}{20}}$$

(for $y\ge y_0(\varepsilon)$). We estimate

$$\sum_{d\le z}\frac{\mu(d)^2}{\varphi(d)} = \sum_{d\le z}\frac{\mu(d)^2}{d}\cdot\frac{d}{\varphi(d)} = \sum_{d\le z}\frac{\mu(d)^2}{d}\prod_{p|d}\underbrace{\Big(1+\frac1p+\frac1{p^2}+\cdots\Big)}_{=\frac{p}{p-1}=\frac{p}{\varphi(p)}} \ge \sum_{n\le z}\frac1n,$$

since any $n\le z$ has at least one representation as

$$n = d\,p_1^{a_1}\cdots p_k^{a_k},$$

where $d\le z$ is square-free, the $p_i\,|\,d$ are primes and $a_i\ge 0$.

We have proved that

$$\sum_{n\le z}\frac1n = \log z + O(1) \ge \Big(1-\frac{\varepsilon}{10}\Big)\log z$$

for $z\ge z_0(\varepsilon)$.

Putting everything together gives us

$$\pi(x+y) - \pi(x) \le S(A,\mathbb{P},z) + z \le \frac{y}{(1-\frac{\varepsilon}{10})\log z} + z + y^{1-\frac{\varepsilon}{20}} \le \frac{(2+\varepsilon)y}{\log y}$$

for $y\ge y_0(\varepsilon)$. $\square$

3 The Riemann Zeta Function

Recall that the Riemann zeta function is defined by

$$\zeta(s) = \sum_{n=1}^\infty\frac{1}{n^s}$$

for Re (s) > 1.

Remarkable properties:

Notation. $f(x)\asymp g(x)$ means $f(x)\ll g(x)\ll f(x)$. $\ll_\sigma$ means that the implied constant in $\ll$ may depend on $\sigma$.

Lemma 3.1. Assuming that:

  • $\sigma>1$

  • $t\in\mathbb{R}$

Then $|\zeta(\sigma+it)| \asymp_\sigma 1$.

Proof. By the Euler product formula,

$$\zeta(\sigma+it) = \prod_p(1-p^{-\sigma-it})^{-1},$$

hence

$$|\zeta(\sigma+it)| = \prod_p|1-p^{-\sigma-it}|^{-1}.$$

By the triangle inequality,

$$1-p^{-\sigma} \le |1-p^{-\sigma-it}| \le 1+|p^{-\sigma-it}| = 1+p^{-\sigma}.$$

Hence,

$$\prod_p(1+p^{-\sigma})^{-1} \le |\zeta(\sigma+it)| \le \prod_p(1-p^{-\sigma})^{-1}.$$

Note that these products converge if and only if $\sum_p p^{-\sigma}$ converges, and this sum converges by the comparison test. $\square$
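The two bounds can be tested numerically: note $\prod_p(1+p^{-\sigma})^{-1} = \frac{\zeta(2\sigma)}{\zeta(\sigma)}$, so at $\sigma = 2$ the lemma sandwiches $|\zeta(2+it)|$ between $\frac{\zeta(4)}{\zeta(2)}$ and $\zeta(2)$. A sketch (mine) at $t = 3$, using a long partial sum:

```python
import math

sigma, t = 2.0, 3.0
N = 200000
# partial sum for zeta(sigma + it); the tail is at most sum of n^-2 over n > N, i.e. < 1/N
value = abs(sum(n ** complex(-sigma, -t) for n in range(1, N + 1)))

zeta2 = math.pi**2 / 6     # zeta(2)
zeta4 = math.pi**4 / 90    # zeta(4)
lower, upper = zeta4 / zeta2, zeta2   # the two Euler-product bounds at sigma = 2
```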

Lemma 3.2 (Polynomial growth of zeta in half-planes).

  • (i)
    $\zeta$ extends to a meromorphic function on $\mathbb{C}$, with the only pole being a simple pole at $s = 1$.
  • (ii)
    Let $k\ge 0$ be an integer. Then, for $\mathrm{Re}(s)\ge -k$ and $|s-1|\ge\frac1{10}$ we have $|\zeta(s)| \ll_k |s|^{k+2}+1$.

Proof. Let $k\ge 0$ be an integer.

We claim that there exist polynomials $P_k$, $Q_k$ of degree at most $k+1$ such that for $\mathrm{Re}(s)>1$,

$$\zeta(s) = \frac{1}{s-1} + Q_k(s) + s(s+1)\cdots(s+k)\int_1^\infty\frac{P_k(\{t\})}{t^{s+k+1}}\,dt. \qquad (*)$$

First assume $(*)$ holds.

Then, since $P_k(\{t\})\ll_k 1$, for $\mathrm{Re}(s)\ge -k+\frac12$ and $|s-1|\ge\frac1{10}$, $(*)$ gives $|\zeta(s)|\ll_k|s|^{k+2}+1$; applying this with $k+1$ in place of $k$ covers the range $\mathrm{Re}(s)\ge -k$, so (ii) follows.

For (i), using analytic continuation and $(*)$, it suffices to show that the RHS of $(*)$ is meromorphic for $\mathrm{Re}(s)>-k$, with the only pole a simple one at $s = 1$.

It suffices to show that

$$\int_1^\infty\frac{P_k(\{t\})}{t^{s+k+1}}\,dt$$

is analytic for $\mathrm{Re}(s)>-k$.

This follows from the following criterion: if $U\subseteq\mathbb{C}$ is open, $f : U\times\mathbb{R}\to\mathbb{C}$ is piecewise continuous, and $s\mapsto f(s,t)$ is analytic in $U$ for any $t\in\mathbb{R}$, then $\int_{\mathbb{R}}f(s,t)\,dt$ is analytic in $U$, provided that $\int_{\mathbb{R}}|f(s,t)|\,dt$ is bounded on compact subsets of $U$.

Applying this with $f(s,t) = \frac{P_k(\{t\})}{t^{s+k+1}}\mathbb{1}_{[1,\infty)}(t)$ concludes the proof of (i), assuming $(*)$.

We are left with proving $(*)$. We use induction on $k$.

Case $k = 0$: By Partial Summation (applied to $\sum_{n\le x}n^{-s}$, letting $x\to\infty$),

$$\zeta(s) = \sum_{n=1}^\infty\frac{1}{n^s} = s\int_1^\infty\frac{\lfloor u\rfloor}{u^{s+1}}\,du = s\int_1^\infty u^{-s}\,du - s\int_1^\infty\frac{\{u\}}{u^{s+1}}\,du = \frac{1}{s-1} + 1 - s\int_1^\infty\frac{\{u\}}{u^{s+1}}\,du.$$

Take $P_0(u) = -u$, $Q_0(s) = 1$.

Case $k+1$ assuming $k$: Let

$$c_k = \int_0^1P_k(t)\,dt.$$

For $\mathrm{Re}(s)>1$, we have

$$\zeta(s) = \frac{1}{s-1} + Q_k(s) + c_k\,s(s+1)\cdots(s+k-1) + s(s+1)\cdots(s+k)\int_1^\infty\frac{P_k(\{t\})-c_k}{t^{s+k+1}}\,dt,$$

since $s(s+1)\cdots(s+k)\int_1^\infty c_k\,t^{-s-k-1}\,dt = c_k\,s(s+1)\cdots(s+k-1)$. Let

$$P_{k+1}(u) = \int_0^u(P_k(t)-c_k)\,dt.$$

This is a polynomial of degree at most $k+2$, with $P_{k+1}(1) = 0$. By integration by parts,

$$\int_1^\infty\frac{P_k(\{t\})-c_k}{t^{s+k+1}}\,dt = (s+k+1)\int_1^\infty\frac{P_{k+1}(\{u\})}{u^{s+k+2}}\,du.$$

Substituting this into the previous equation, case $k+1$ follows. $\square$

The Gamma function

Definition (Gamma function). For $\mathrm{Re}(s)>0$, let

$$\Gamma(s) = \int_0^\infty t^{s-1}e^{-t}\,dt.$$

Lemma 3.3. $\Gamma(s)$ is analytic for $\mathrm{Re}(s)>0$.

Proof. Apply the same criterion for an integral of analytic functions being analytic as in the previous lemma, taking $f(s,t) = t^{s-1}e^{-t}\mathbb{1}_{[0,\infty)}(t)$.

Note that for $0<\sigma_1\le\mathrm{Re}(s)\le\sigma_2$,

$$\int_{\mathbb{R}}|f(s,t)|\,dt \le \int_0^1t^{\sigma_1-1}e^{-t}\,dt + \int_1^\infty t^{\sigma_2-1}e^{-t}\,dt < \infty. \qquad\square$$

Lemma (Functional equation for Γ). The Γ function extends meromorphically to $\mathbb{C}$, with the only poles being simple poles at $s = 0,-1,-2,\ldots$.

Moreover:

Proof.

Theorem 3.4 (Functional equation for zeta). Assuming that:

  • $\xi(s) = \frac12 s(s-1)\pi^{-s/2}\,\Gamma\big(\frac{s}{2}\big)\zeta(s)$ for $s\in\mathbb{C}$

Then $\xi$ is an entire function and $\xi(s) = \xi(1-s)$ for $s\in\mathbb{C}$. Hence
$$\pi^{-\frac{s}{2}}\Gamma\Big(\frac{s}{2}\Big)\zeta(s) = \pi^{-\frac{1-s}{2}}\Gamma\Big(\frac{1-s}{2}\Big)\zeta(1-s)$$

for $s\in\mathbb{C}\setminus\{0,1\}$.

Proof. Let $\mathrm{Re}(s)>1$. Then,

$$\Gamma\Big(\frac{s}{2}\Big) = \int_0^\infty t^{\frac{s}{2}-1}e^{-t}\,dt.$$

Make the change of variables $t = \pi n^2u$ to get

$$\Gamma\Big(\frac{s}{2}\Big) = \pi^{\frac{s}{2}}n^s\int_0^\infty u^{\frac{s}{2}-1}e^{-\pi n^2u}\,du.$$

Hence

$$\pi^{-\frac{s}{2}}n^{-s}\,\Gamma\Big(\frac{s}{2}\Big) = \int_0^\infty u^{\frac{s}{2}-1}e^{-\pi n^2u}\,du.$$

Summing over $n$ and using Fubini,

$$\pi^{-\frac{s}{2}}\Gamma\Big(\frac{s}{2}\Big)\zeta(s) = \sum_{n=1}^\infty\int_0^\infty u^{\frac{s}{2}-1}e^{-\pi n^2u}\,du = \frac12\int_0^\infty u^{\frac{s}{2}-1}(\theta(u)-1)\,du, \qquad \theta(u) = \sum_{n\in\mathbb{Z}}e^{-\pi n^2u},$$
$$= \frac12\int_0^1u^{\frac{s}{2}-1}(\theta(u)-1)\,du + \frac12\int_1^\infty u^{\frac{s}{2}-1}(\theta(u)-1)\,du. \qquad (*)$$

By the functional equation

$$\theta(u) = \frac{1}{\sqrt{u}}\,\theta\Big(\frac1u\Big),$$

we have

$$\int_0^1u^{\frac{s}{2}-1}(\theta(u)-1)\,du = \int_0^1u^{\frac{s}{2}-\frac32}\theta\Big(\frac1u\Big)\,du - \int_0^1u^{\frac{s}{2}-1}\,du = \int_1^\infty v^{-\frac{s+1}{2}}\theta(v)\,dv - \frac{2}{s} \qquad \Big(u = \frac1v\Big)$$
$$= \int_1^\infty v^{-\frac{s+1}{2}}(\theta(v)-1)\,dv + \underbrace{\int_1^\infty v^{-\frac{s+1}{2}}\,dv}_{=\frac{2}{s-1}} - \frac{2}{s}.$$

Plugging this into $(*)$, we get

$$\pi^{-\frac{s}{2}}\Gamma\Big(\frac{s}{2}\Big)\zeta(s) = \frac12\int_1^\infty\big(u^{-\frac{s+1}{2}} + u^{\frac{s}{2}-1}\big)(\theta(u)-1)\,du + \frac{1}{s(s-1)}.$$

Hence

$$\xi(s) = \frac12 + \frac14 s(s-1)\int_1^\infty\big(u^{-\frac{s+1}{2}} + u^{\frac{s}{2}-1}\big)(\theta(u)-1)\,du. \qquad (**)$$

Since $|\theta(u)-1|\ll e^{-\pi u}$ for $u\ge 1$, applying the criterion for integrals of analytic functions being analytic, we see that $\xi(s)$ is entire. So by analytic continuation, $(**)$ holds for all $s\in\mathbb{C}$.

Moreover, the expression $(**)$ for $\xi(s)$ is symmetric with respect to $s\mapsto 1-s$ (note $u^{-\frac{(1-s)+1}{2}} = u^{\frac{s}{2}-1}$), so $\xi(s) = \xi(1-s)$ for all $s\in\mathbb{C}$. $\square$

Corollary 3.5 (Zeroes and poles of zeta). The $\zeta$ function extends to a meromorphic function on $\mathbb{C}$, and it has

  • (i)
    only one pole, which is a simple pole at $s = 1$, with residue $1$;
  • (ii)
    simple zeroes at $s = -2,-4,-6,\ldots$;
  • (iii)
    any other zeroes satisfying $0\le\mathrm{Re}(s)\le 1$.

Proof.

3.1 Partial fraction approximation of ζ

This is a formula for $\frac{\zeta'(s)}{\zeta(s)}$.

For the proof we need a lemma. We write, for $z_0\in\mathbb{C}$, $r>0$,

$$B(z_0,r) = \{z : |z-z_0|<r\}, \qquad \overline{B(z_0,r)} = \{z : |z-z_0|\le r\}, \qquad \partial B(z_0,r) = \{z : |z-z_0| = r\}.$$

Lemma 3.6 (Borel–Caratheodory Theorem). Assuming that:

  • $0<r<R$

  • $f$ analytic on $\overline{B(0,R)}$, with $f(0) = 0$

Then
$$\sup_{|z|\le r}|f(z)| \le \frac{2r}{R-r}\sup_{|z|=R}\mathrm{Re}(f(z)).$$

Proof. This is Exercise 10 on Example Sheet 2.

Lemma 3.7 (Landau). Assuming that:

  • $z_0\in\mathbb{C}$ and $r>0$

  • $f$ analytic on $B(z_0,r)$

  • for some $M>1$ we have $|f(z)| < e^M|f(z_0)|$ for all $z\in B(z_0,r)$

Then for $z\in B(z_0,\frac{r}{4})$,
$$\Big|\frac{f'(z)}{f(z)} - \sum_{\rho\in\mathcal{Z}}\frac{1}{z-\rho}\Big| \le \frac{96M}{r},$$

where $\mathcal{Z}$ is the set of zeroes of $f$ in $\overline{B(z_0,\frac{r}{2})}$, counted with multiplicities.

Note. If $f$ is a polynomial, we can factorise

$$f(z) = a\prod_\rho(z-\rho),$$

and then

$$\frac{f'(z)}{f(z)} = (\log f(z))' = \Big(\log a + \sum_\rho\log(z-\rho)\Big)' = \sum_\rho\frac{1}{z-\rho}.$$

Proof. Let

$$g(z) = \frac{f(z)}{\prod_{\rho\in\mathcal{Z}}(z-\rho)}.$$

Then $g$ is analytic and non-vanishing in $\overline{B(z_0,\frac{r}{2})}$. Note that

$$\frac{g'(z)}{g(z)} = \frac{f'(z)}{f(z)} - \sum_{\rho\in\mathcal{Z}}\frac{1}{z-\rho}$$

(using $\frac{(f_1\cdots f_n)'}{f_1\cdots f_n} = \sum_{i=1}^n\frac{f_i'}{f_i}$).

Hence, it suffices to prove

$$\Big|\frac{g'(z)}{g(z)}\Big| \le \frac{96M}{r}$$

for $z\in B(z_0,\frac{r}{4})$. Write

$$h(z) = \frac{g(z_0+z)}{g(z_0)}.$$

Then $h$ is analytic and non-vanishing in $\overline{B(0,\frac{r}{2})}$, and $h(0) = 1$. We want to show

$$\Big|\frac{h'(z)}{h(z)}\Big| \le \frac{96M}{r}$$

for $z\in B(0,\frac{r}{4})$.

For $|z| = r'$ with $r' < r$ we have

$$|h(z)| = \Big|\frac{f(z_0+z)}{f(z_0)}\prod_{\rho\in\mathcal{Z}}\frac{z_0-\rho}{z_0+z-\rho}\Big| \le \Big|\frac{f(z_0+z)}{f(z_0)}\Big|\prod_{\rho\in\mathcal{Z}}\frac{r/2}{r'-r/2},$$

since

$$|z_0-\rho| \le \frac{r}{2} \quad\text{and}\quad |z_0+z-\rho| \ge r' - \frac{r}{2}.$$

By the maximum modulus principle (applied on $\overline{B(0,r')}$ and letting $r'\to r$), $|h(z)| < e^M$ for $z\in\overline{B(0,\frac{r}{2})}$, so

$$\mathrm{Re}\log h(z) = \log|h(z)| < M.$$

By the Borel–Caratheodory Theorem with radii $\frac{3r}{8}$, $\frac{r}{4}$ we have, for $z\in\overline{B(0,\frac{3r}{8})}$,

$$|\log h(z)| \le \frac{2\cdot\frac{r}{4}}{\frac{3r}{8}-\frac{r}{4}}\,M = 4M.$$

Now, for $z\in B(0,\frac{r}{4})$, Cauchy's theorem gives us

$$\Big|\frac{h'(z)}{h(z)}\Big| = \Big|\frac{1}{2\pi i}\oint_{\partial B(0,\frac{3r}{8})}\frac{\log h(w)}{(w-z)^2}\,dw\Big| \le \frac{1}{2\pi}\cdot 2\pi\cdot\frac{3r}{8}\cdot\frac{4M}{(\frac{3r}{8}-\frac{r}{4})^2} = \frac{96M}{r}. \qquad\square$$

Theorem 3.8 (Partial fraction approximation of $\frac{\zeta'}{\zeta}$).

  • (i) Let $s = \sigma+it$ with $|\sigma|\le 10$, $s\ne 1$ and $\zeta(s)\ne 0$. Then
    $$\frac{\zeta'}{\zeta}(s) = -\frac{1}{s-1} + \sum_{|\rho-s|\le\frac1{10}}\frac{1}{s-\rho} + O(\log(|t|+2)),$$

    where the sum is over the zeroes $\rho$ of $\zeta$ counted with multiplicity.

  • (ii) For any $T\ge 0$, there are $\ll\log(T+2)$ many zeroes $\rho$ of $\zeta$ (counted with multiplicity) with $|\mathrm{Im}(\rho)|\in[T,T+1]$.

Proof. We apply Landau's lemma with $z_0 = 2+it$, $r = 50$ and $f(s) = (s-1)\zeta(s)$.

By the lemma on polynomial growth of $\zeta$ in half-planes, for $s\in B(z_0,50)$ we have

$$|f(s)| \le C(|t|+52)^{51} = C\exp\big(51\log(|t|+52)\big) \le \exp\big(C'\log(|t|+2)\big)|f(z_0)|$$

(using $|f(z_0)| = |1+it|\,|\zeta(2+it)| \gg 1$).

Let $s\in B(z_0,\frac{25}{2})$. Then Landau's lemma (with $M\ll\log(|t|+2)$) gives us

$$\frac{f'}{f}(s) = \frac{1}{s-1} + \frac{\zeta'}{\zeta}(s) = \sum_{|\rho-z_0|\le 25}\frac{1}{s-\rho} + O(\log(|t|+2)). \qquad (*)$$

Since $B(z_0,\frac{25}{2})$ contains all the points $s = \sigma+it$ with $|\sigma|\le 10$, it suffices to show

$$\sum_{\substack{|\rho-z_0|\le 25\\ |\rho-s|>\frac1{10}}}\frac{1}{s-\rho} = O(\log(|t|+2)). \qquad (**)$$

Substituting $s = z_0$ in $(*)$, we get

$$\sum_{|\rho-z_0|\le 25}\frac{1}{z_0-\rho} = O\Big(\Big|\frac{\zeta'}{\zeta}(z_0)\Big| + 1 + \log(|t|+2)\Big) = O(\log(|t|+2)),$$

since

$$\Big|\frac{\zeta'}{\zeta}(2+it)\Big| = \Big|\sum_{n=1}^\infty\Lambda(n)n^{-2-it}\Big| \le \sum_{n=1}^\infty\Lambda(n)n^{-2} = -\frac{\zeta'}{\zeta}(2) \ll 1.$$

Taking real parts,

$$\sum_{|\rho-z_0|\le 25}\frac{2-\mathrm{Re}(\rho)}{|z_0-\rho|^2} = O(\log(|t|+2)).$$

Since $\mathrm{Re}(\rho)\le 1$,

$$\sum_{|\rho-z_0|\le 25}\frac{1}{|z_0-\rho|^2} = O(\log(|t|+2)).$$

This proves part (ii) (each zero with $|\mathrm{Im}(\rho)|\in[T,T+1]$ contributes $\gg 1$ to the sum at $t = T$). It also gives $(**)$, since the sum there contains $O(\log(|t|+2))$ zeroes, each contributing a term of modulus at most $10$. $\square$

3.2 Zero-free region

Proposition 3.9. Assuming that:

  • $\sigma>1$ and $t\in\mathbb{R}$

Then
$$3\Big(-\frac{\zeta'}{\zeta}(\sigma)\Big) + 4\,\mathrm{Re}\Big(-\frac{\zeta'}{\zeta}(\sigma+it)\Big) + \mathrm{Re}\Big(-\frac{\zeta'}{\zeta}(\sigma+2it)\Big) \ge 0. \qquad (\dagger)$$

Proof. Recall that

$$-\frac{\zeta'}{\zeta}(s) = \sum_{n=1}^\infty\frac{\Lambda(n)}{n^s}$$

for $\mathrm{Re}(s)>1$, where $\Lambda$ is the von Mangoldt function.

Taking linear combinations, the LHS of $(\dagger)$ becomes

$$\sum_{n=1}^\infty\Lambda(n)\,\frac{3 + 4\,\mathrm{Re}(n^{-it}) + \mathrm{Re}(n^{-2it})}{n^\sigma} = \sum_{n=1}^\infty\Lambda(n)\,\frac{3 + 4\cos(t\log n) + \cos(2t\log n)}{n^\sigma}$$

($\mathrm{Re}(n^{-iu}) = \cos(u\log n)$).

We are done by the inequality:

$$3 + 4\cos\alpha + \cos 2\alpha = 2(1+\cos\alpha)^2 \ge 0$$

for $\alpha\in\mathbb{R}$. $\square$
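The identity behind the inequality is elementary ($\cos 2\alpha = 2\cos^2\alpha - 1$); a quick numerical confirmation (mine):

```python
import math

def three_four_one(alpha):
    """Return (3 + 4 cos a + cos 2a, 2 (1 + cos a)^2); the two are equal."""
    lhs = 3 + 4 * math.cos(alpha) + math.cos(2 * alpha)
    rhs = 2 * (1 + math.cos(alpha)) ** 2
    return lhs, rhs
```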

Theorem 3.10 (Zero-free region). There is a constant $c>0$ such that $\zeta(\sigma+it)\ne 0$ whenever $\sigma \ge 1 - \frac{c}{\log(|t|+2)}$. In particular, $\zeta(s)\ne 0$ for $\mathrm{Re}(s) = 1$.


Proof. Let $\sigma\in(1,2]$ and $t\in\mathbb{R}$. Suppose $\zeta(\beta+it) = 0$; we know that $\beta\le 1$. We also know that $\zeta$ has no zeroes in some ball $B(1,r)$, $r>0$ (otherwise the zeroes of the analytic function $(s-1)\zeta(s)$ would have an accumulation point).

Choosing $c>0$ small enough, we may assume that $|t|\ge\frac{r}{2}$. By the key inequality for $-\frac{\zeta'}{\zeta}$ (Proposition 3.9),

$$3\Big(-\frac{\zeta'}{\zeta}(\sigma)\Big) + 4\,\mathrm{Re}\Big(-\frac{\zeta'}{\zeta}(\sigma+it)\Big) + \mathrm{Re}\Big(-\frac{\zeta'}{\zeta}(\sigma+2it)\Big) \ge 0.$$

Apply the partial fraction approximation of $\frac{\zeta'}{\zeta}$:

$$-\frac{\zeta'}{\zeta}(s) = \frac{1}{s-1} - \sum_{|\rho-s|\le\frac1{10}}\frac{1}{s-\rho} + O(\log(|t'|+2)) \qquad (t' = \mathrm{Im}(s)).$$

Since $\mathrm{Re}(\rho)\le 1$ for any zero $\rho$ and $\sigma>1$,

$$\mathrm{Re}\,\frac{1}{\sigma+iu-\rho} = \frac{\sigma-\mathrm{Re}(\rho)}{|\sigma+iu-\rho|^2} \ge 0.$$

Discarding terms with non-negative real part (and keeping the zero $\rho = \beta+it$ in the term at $\sigma+it$), we get

$$\frac{3}{\sigma-1} - \frac{4}{\sigma-\beta} + C\log(|t|+2) \ge 0.$$

Take $\sigma = 1+\frac{5c}{\log(|t|+2)}$, and assume $\beta \ge 1-\frac{c}{\log(|t|+2)}$, to get

$$\frac{3\log(|t|+2)}{5c} - \frac{4\log(|t|+2)}{6c} + C\log(|t|+2) \ge 0,$$

i.e. $\frac{1}{15c}\le C$. Take $c = \frac{1}{16C}$ to get a contradiction. $\square$

Theorem 3.11 (Bounding $\frac{\zeta'}{\zeta}$). Assuming that:

  • $c>0$ sufficiently small

  • $T\ge 0$

  • $\mathrm{Re}(s)\ge -10$

  • $|\mathrm{Im}(s)|\in[T,T+1]$

  • $s$ is at distance at least $\frac{c}{\log(T+2)}$ from any zero or pole of $\zeta$

Then
$$\Big|\frac{\zeta'}{\zeta}(s)\Big| \ll_c (\log(T+2))^2.$$

Proof. If $s = \sigma+it$ with $\sigma>10$, then

$$\Big|\frac{\zeta'}{\zeta}(s)\Big| = \Big|\sum_{n=1}^\infty\Lambda(n)n^{-s}\Big| \le \sum_{n=1}^\infty\Lambda(n)n^{-\sigma} \ll 1.$$

Assume then that $\mathrm{Re}(s)\in[-10,10]$. Apply the partial fraction approximation (Theorem 3.8(i)). Each term satisfies

$$\frac{1}{|s-\rho|} \le \frac{\log(T+2)}{c}, \qquad \frac{1}{|s-1|} \le \frac{\log(T+2)}{c}.$$

We know that there are $O(\log(T+2))$ zeroes (with multiplicity) having imaginary part in $[T-2,T+2]$, and these include all the zeroes appearing in the sum. The claim follows from the triangle inequality. $\square$


Index

arithmetic function

Chebyshev’s Theorem

completely multiplicative

Euler’s constant

multiplicative

sieve hypothesis

Euler’s Theorem

von Mangoldt function