12.2 Information-Theoretic Arguments
Looking at the game of 20 questions another way, if there were an
algorithm A that could always win the game with 19 or fewer
questions, then we could represent A with a binary decision tree:
The internal nodes of the tree are questions, and the leaves are
"verdicts" -- decisions that some guess is correct -- like "the
number is 21,345".
If the algorithm A exists, then the tree has height at most 19
and has at least one million leaves.
But we know that a binary tree of height k has at most 2^k leaves.
2^19 = 524,288 is less than one million, so the algorithm A can't exist.
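The arithmetic behind this counting argument can be checked directly. The short Python sketch below is only an illustration; the names CANDIDATES and QUESTIONS are made up for this example, and it assumes the answer space holds exactly one million numbers.

```python
# Assumed for illustration: one million possible answers, 19 yes/no questions.
CANDIDATES = 1_000_000
QUESTIONS = 19

# A decision tree of height 19 can distinguish at most 2^19 verdicts.
max_leaves = 2 ** QUESTIONS
print(max_leaves)               # 524288
print(max_leaves < CANDIDATES)  # True

# Smallest k with 2^k >= CANDIDATES, computed with integer arithmetic only.
min_questions = (CANDIDATES - 1).bit_length()
print(min_questions)            # 20
```

So under these assumptions 19 questions fall short, while 20 are enough to give every candidate its own leaf.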
THEOREM: A binary tree of height H has at most 2^H leaves.
(height=H)==>(#leaves<=2^H)
PROOF: If H=0, then the tree has one node, and one leaf, and 2^H=1.
Now assume that H>=1, and for j=0,1,2,..,H-1 a binary tree of height j has
at most 2^j leaves. Suppose T is a binary tree of height H. Since
H>=1, T has a root node and left and right sub-trees L and R. If not
empty, the heights of L and R are between 0 and H-1, inclusive. Therefore by
the induction hypothesis, L and R each have no more than 2^(H-1)
leaves. Every leaf of T is either a leaf of L or a leaf of R (Note we used
here the fact that L and R can't both be empty). Therefore T has no more than
2^(H-1) + 2^(H-1) leaves (=2^H).
COROLLARY: Any binary tree with N leaves must have height at
least lg(N).
PROOF: Suppose that a binary tree has N leaves and height H. By the theorem,
the tree has no more than 2^H leaves. Therefore
N<=2^H, and so lg(N)<=H. QED
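The leaf bound and its corollary can also be spot-checked by brute force on small trees. The sketch below is not part of the proof. It assumes a representation chosen just for this example: None is the empty tree and a pair (left, right) is a node. The helper names all_trees, height, leaves and check_leaf_bound are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def all_trees(n):
    """Return every binary tree with exactly n nodes (None is the empty tree)."""
    if n == 0:
        return (None,)
    trees = []
    for left_size in range(n):
        for left in all_trees(left_size):
            for right in all_trees(n - 1 - left_size):
                trees.append((left, right))
    return tuple(trees)

def height(tree):
    # A single node has height 0; -1 for the empty tree is only a convention.
    if tree is None:
        return -1
    return 1 + max(height(tree[0]), height(tree[1]))

def leaves(tree):
    if tree is None:
        return 0
    if tree[0] is None and tree[1] is None:
        return 1
    return leaves(tree[0]) + leaves(tree[1])

def check_leaf_bound(max_nodes):
    """Assert #leaves <= 2^height for every tree with 1..max_nodes nodes."""
    checked = 0
    for n in range(1, max_nodes + 1):
        for tree in all_trees(n):
            assert leaves(tree) <= 2 ** height(tree)
            checked += 1
    return checked

print(check_leaf_bound(7))  # 625 trees checked, none violates the bound
```

Since N <= 2^H is the same statement as lg(N) <= H, the same loop also exercises the corollary.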
THEOREM 12.2.1 Any binary tree with at least 2^H leaves has an
average height of at least H.
(#leaves>=2^H)==>(average height>=H)
THEOREM 12.2.1 (logarithmic form) Any binary tree with N leaves
has an average height of at least lg N.
PROOF: We proceed by induction on N.
If the number of leaves = N = 1, then the tree must look like one
of these pictures:
...   *     *     *     *     *   ...
     /     /             \     \
    *     *               *     *
   /                             \
  *                               *
The (average) height is 0 or more,
and 0 = lgN = lg(1).
Now suppose that N>=2, and that for j=1,..,N-1 whenever a
binary tree has j leaves, its average height is at least lg(j).
If T is a binary tree with N leaves, then T has left and right
sub-trees L and R, not both empty. Let p be the number of leaves
in L, and N-p the number of leaves in R.
If averageHeight(T) denotes the average height of the tree T,
then N*averageHeight(T) is the sum of the lengths of all the
paths from the root of T to the leaves. If we calculate this by
looking at the paths from the roots of L and R to the respective
leaves, we see that
N*averageHeight(T)
  = p + p*averageHeight(L) + (N-p) + (N-p)*averageHeight(R)
  = N + p*averageHeight(L) + (N-p)*averageHeight(R)
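To see the path-length identity on a concrete case, here is a small Python sketch. The particular tree, the tuple representation (None for an empty tree, a pair for a node) and the helper names leaf_depths and average_height are all assumptions made for this example.

```python
from fractions import Fraction

LEAF = (None, None)  # a node with two empty sub-trees

def leaf_depths(tree, depth=0):
    """List the path length from the root to each leaf."""
    if tree is None:
        return []
    left, right = tree
    if left is None and right is None:
        return [depth]
    return leaf_depths(left, depth + 1) + leaf_depths(right, depth + 1)

def average_height(tree):
    depths = leaf_depths(tree)
    return Fraction(sum(depths), len(depths))

L = (LEAF, (LEAF, LEAF))   # 3 leaves, at depths 1, 2, 2 within L
R = (LEAF, LEAF)           # 2 leaves, at depths 1, 1 within R
T = (L, R)

N = len(leaf_depths(T))    # 5
p = len(leaf_depths(L))    # 3

lhs = N * average_height(T)
rhs = N + p * average_height(L) + (N - p) * average_height(R)
print(lhs, rhs)            # 12 12
```

Each of the N leaves of T sits one edge deeper in T than in its own sub-tree, which is where the extra term N comes from.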
Assume for the moment that neither L nor R is empty.
---------------------------
Start Argument "A"
Then 1 <= p,N-p <= N-1. By the induction hypothesis,
averageHeight(L) >= lg(p) and averageHeight(R) >= lg(N-p).
Therefore:
(1) N*averageHeight(T) >= N + p*lg(p) + (N-p)*lg(N-p)
By analyzing the real-valued function
g(x) = x*lg(x) + (N-x)*lg(N-x), for 0 < x < N,
we can determine that the derivative is
g'(x) = lg(x) - lg(N-x),
which is zero just when x=N/2. The second derivative is
g''(x) = (1/ln2)[(1/x)+(1/(N-x))]
which is positive for all 0 < x < N, so g is convex and x = N/2 is
its minimum on that interval. Therefore:
p*lg(p) + (N-p)*lg(N-p) >= (N/2)lg(N/2) + (N/2)lg(N/2)
  = N*lg(N/2) = N*lg(N) - N,
and so relation (1) implies that
N*averageHeight(T) >= N + N*lg(N) - N = N*lg(N).
In other words, averageHeight(T) >= lg(N).
End Argument "A"
---------------------------
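The inequality at the heart of "Argument A" can be checked numerically for small N. The following Python sketch is only a sanity check, not a substitute for the calculus; the function names g and check_minimum and the floating-point tolerance 1e-9 are assumptions of this example.

```python
import math

def g(x, n):
    """g(x) = x*lg(x) + (n-x)*lg(n-x), defined for 0 < x < n."""
    return x * math.log2(x) + (n - x) * math.log2(n - x)

def check_minimum(max_n):
    """Check p*lg(p) + (N-p)*lg(N-p) >= N*lg(N) - N for all 1 <= p <= N-1."""
    for n in range(2, max_n + 1):
        lower_bound = n * math.log2(n) - n   # the value of g at x = N/2
        for p in range(1, n):
            # Small tolerance because the comparison uses floating point.
            assert g(p, n) >= lower_bound - 1e-9
    return True

print(check_minimum(200))  # True
```

For even N the bound is attained at p = N/2; for odd N every integer p gives a value strictly above it.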
Now return to the possibility that L or R is empty. Suppose
for the sake of definiteness that R is empty. Then
averageHeight(T) = 1 + averageHeight(L), so it suffices to show
that averageHeight(L) >= lg(N). If L has two non-empty
sub-trees, then "Argument A" applies to L. If not, it suffices to
show that the one sub-tree of L has averageHeight >= lg(N).
Since the number of leaves in T is N, and N >= 2, we must
eventually reach a descendant of the root of T that has more than
one child. Let G be the tree rooted at that descendant. G has N
leaves and two non-empty sub-trees, so "Argument A" applies to G
and proves that averageHeight(G) >= lg(N). Since
averageHeight(T) = v + averageHeight(G), where v is the number of
steps required to reach G from the root of T, this also proves
that averageHeight(T) >= lg(N).
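The induction can be mirrored by a small computation. The Python sketch below tabulates the smallest possible sum of root-to-leaf path lengths for each leaf count, using the recurrence N + best(p) + best(N-p) suggested by the path-length identity, and compares the resulting average with lg(N). It assumes, as the last paragraph argues, that a root with an empty sub-tree only lengthens paths, so only splits into two non-empty sub-trees are tried. The names min_path_sums and check_average_height are hypothetical.

```python
import math

def min_path_sums(max_leaves):
    """best[n] = smallest total root-to-leaf path length with n leaves."""
    best = [0, 0]   # best[0] is unused; a single leaf has path sum 0
    for n in range(2, max_leaves + 1):
        # Split the n leaves as p on the left and n - p on the right.
        best.append(n + min(best[p] + best[n - p] for p in range(1, n)))
    return best

def check_average_height(max_leaves):
    """Assert best[n] / n >= lg(n) for every n up to max_leaves."""
    best = min_path_sums(max_leaves)
    for n in range(1, max_leaves + 1):
        # Small tolerance because the comparison uses floating point.
        assert best[n] / n >= math.log2(n) - 1e-9
    return best

best = check_average_height(64)
print(best[5], best[8])  # 12 24
```

When N is a power of two the minimum average is exactly lg(N), so the bound in the theorem cannot be improved in general.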