Definition (Entropy). The entropy of a discrete random variable $X$ is a real-valued quantity $H[X]$ with the following properties:

  • (i) Normalisation: If $X$ is uniform on $\{0,1\}$, then $H[X]=1$.
  • (ii) Invariance: If $X$ takes values in $A$, $Y$ takes values in $B$, $f$ is a bijection from $A$ to $B$, and for every $a \in A$ we have $\mathbb{P}[X=a]=\mathbb{P}[Y=f(a)]$, then $H[Y]=H[X]$.
  • (iii) Extendability: If $X$ takes values in a set $A$, $B$ is disjoint from $A$, $Y$ takes values in $A \cup B$, and for every $a \in A$ we have $\mathbb{P}[Y=a]=\mathbb{P}[X=a]$, then $H[Y]=H[X]$.
  • (iv) Maximality: If $X$ takes values in a finite set $A$ and $Y$ is uniformly distributed on $A$, then $H[X] \le H[Y]$.
  • (v) Continuity: $H$ depends continuously on $X$ with respect to the total variation distance, defined by $d(X,Y)=\sup_{E}\,\lvert \mathbb{P}[X \in E]-\mathbb{P}[Y \in E]\rvert$, where the supremum is over all events $E$.
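
These axioms are stated abstractly, but the familiar Shannon formula $H[X]=-\sum_x \mathbb{P}[X=x]\log_2 \mathbb{P}[X=x]$ is the standard quantity satisfying all of them. As a quick illustration (not part of the definition), the following Python sketch checks (i), (ii) and (iv) numerically; the function name and the test distributions are just illustrative choices.

```python
import math
import random

def shannon_entropy(probs):
    """Shannon entropy (base 2) of a probability vector, ignoring zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# (i) Normalisation: uniform on {0, 1} has entropy 1.
assert abs(shannon_entropy([0.5, 0.5]) - 1.0) < 1e-12

# (ii) Invariance: entropy depends only on the multiset of probabilities,
# so relabelling the values by a bijection leaves it unchanged.
p = [0.2, 0.3, 0.5]
assert abs(shannon_entropy(p) - shannon_entropy(list(reversed(p)))) < 1e-12

# (iv) Maximality: a distribution on a set of size n has entropy at most
# log2(n), the entropy of the uniform distribution on that set.
for _ in range(1000):
    n = random.randint(2, 8)
    w = [random.random() for _ in range(n)]
    q = [x / sum(w) for x in w]
    assert shannon_entropy(q) <= math.log2(n) + 1e-9
```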

For the last axiom we need a definition:

Let $X$ and $Y$ be random variables. The conditional entropy $H[X\,|\,Y]$ of $X$ given $Y$ is

$$H[X\,|\,Y]=\sum_{y}\mathbb{P}[Y=y]\,H[X\,|\,Y=y].$$

  • (vi) Additivity: $H[X,Y]=H[Y]+H[X\,|\,Y]$, where $H[X,Y]$ denotes the entropy of the pair $(X,Y)$.
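
Assuming again the Shannon formula used in the sketch above, additivity is the familiar chain rule, and it can be checked numerically for a randomly generated joint distribution: $H[X,Y]$ is computed from the joint probabilities, $H[Y]$ from the marginal of $Y$, and $H[X\,|\,Y]$ as the weighted average in the displayed definition. The grid size and seed below are arbitrary illustrative choices.

```python
import math
import random

def H(probs):
    # Shannon entropy (base 2) of a probability vector, ignoring zero entries.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A random joint distribution of (X, Y) on a 3 x 4 grid.
random.seed(0)
weights = [[random.random() for _ in range(4)] for _ in range(3)]
total = sum(sum(row) for row in weights)
joint = [[w / total for w in row] for row in weights]  # joint[x][y] = P[X=x, Y=y]

p_y = [sum(joint[x][y] for x in range(3)) for y in range(4)]  # marginal of Y

# H[X|Y] = sum_y P[Y=y] * H[X | Y=y], where H[X | Y=y] is the entropy of the
# conditional distribution P[X=x | Y=y] = joint[x][y] / P[Y=y].
H_X_given_Y = sum(
    p_y[y] * H([joint[x][y] / p_y[y] for x in range(3)])
    for y in range(4) if p_y[y] > 0
)

H_XY = H([joint[x][y] for x in range(3) for y in range(4)])  # entropy of the pair
assert abs(H_XY - (H(p_y) + H_X_given_Y)) < 1e-9             # (vi) Additivity
```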