12.3.1 Finding the Maximum of an Array

Suppose that we have an array T[1..N] of N integers. Let X be the problem of finding the index of the largest element of the array. Suppose further that the only value tests allowed are comparisons of the form T[i]<T[j].

The "obvious" algorithm to solve X is:


int function maxIndex(T[1..N])
{
  maxLocation = 1;                  // index of the largest element seen so far
  for (position = 2; position <= N; position++)
      // one comparison per iteration, N-1 comparisons in total
      if (T[maxLocation] < T[position])
          maxLocation = position;
  return maxLocation;
}
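As a runnable illustration, here is a minimal Python sketch of the same scan that also counts its comparisons. It assumes Python's 0-based indexing, so the returned index is one less than the index in the T[1..N] notation used here. The name max_index and the comparison counter are additions made for this sketch.

```python
def max_index(t):
    """Return (index of a largest element, number of comparisons used).

    Uses 0-based indices, unlike the 1-based T[1..N] of the notes.
    """
    if not t:
        raise ValueError("array must be non-empty")
    comparisons = 0
    max_location = 0
    for position in range(1, len(t)):
        comparisons += 1
        # Only comparisons of the form T[i] < T[j] are used.
        if t[max_location] < t[position]:
            max_location = position
    return max_location, comparisons


print(max_index([3, 9, 2, 9, 5]))  # (1, 4)
```

On an array of N=5 elements the scan uses N-1=4 comparisons. Because the test is a strict "<", it reports the first position of the largest value when that value occurs more than once.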

We can use an adversary argument to prove that no comparison-based algorithm can solve problem X without performing at least N-1 comparisons.

We can't get this information from a decision tree analysis. If we imagine that a comparison-based algorithm Al has been transformed into a decision tree, then the tree has at least N leaves, because there are N possible indices at which the max value could be found. Section 12.2 of these notes contains a proof that a binary tree with N leaves has height H>=lg(N). Therefore the number of comparisons required by Al in the worst case is at least lg(N). That is a lower bound on the complexity of problem X, but it is a very weak one.
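To get a feel for how far apart the two bounds are, the following small Python sketch tabulates the decision tree bound, rounded up to a whole number of comparisons, next to N-1. The function names are hypothetical, chosen only for this illustration.

```python
def decision_tree_bound(n):
    """Smallest integer >= lg(n), for n >= 1."""
    # (n - 1).bit_length() equals ceil(lg n) and avoids floating point.
    return (n - 1).bit_length()


def adversary_bound(n):
    """The N-1 bound that the adversary argument establishes."""
    return n - 1


for n in (2, 16, 1024):
    print(n, decision_tree_bound(n), adversary_bound(n))
# 2 1 1
# 16 4 15
# 1024 10 1023
```

The two bounds agree only for very small N. For N=1024 the decision tree analysis guarantees just 10 comparisons, while the true requirement is 1023.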

Here is an adversary argument that provides a sharper bound:

Suppose Al is an algorithm that finds the index of the max.

Every time that Al does a comparison, the "daemon" answers in a manner consistent with the hypothesis that T[i]=i for all i. In other words, when Al asks "Is T[i]<T[j]?", the daemon answers "yes" if i<j, and "no" otherwise.

(Note that the question "Is T[i]<=T[j]?" is equivalent to "Is T[j]<T[i]?" The answer to the first question is "yes" if and only if the answer to the second is "no". Therefore we can assume that all of Al's comparisons are of the form: "Is T[i]<T[j]?")

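The daemon also keeps track of a set of "losers": the indices r that, according to the daemon's answers, cannot hold a value larger than every other element. The set is initialized to the empty set. Each time the daemon says "yes" to a question of the form "Is T[i]<T[j]?", it places i into the set of "losers", because it has said that T[i]<T[j]. Each time it says "no" to such a question with i different from j, it places j into the set, because it has said that T[j]<=T[i]. Either way, a comparison adds at most one index to the set: the smaller of i and j.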

If an index p is not among the losers, it means that the daemon has never stated that T[p]<T[j] -- not for any value of j. From Al's point of view, this means that it could be true that T[p]>=T[j] for every index j. Thus, Al has not been able to rule out the possibility that T[p] is the max value of the array.

If Al stops before making N-1 comparisons, then "losers" contains fewer than N-1 elements, because each comparison adds at most one index to the set. Therefore there are still at least two indices p and q that are not losers, and, for all Al knows, either T[p] or T[q] could be the max.

If Al stops after doing fewer than N-1 comparisons and claims that k is the index of the max, the daemon can choose one of the non-loser indices p or q that is not equal to k (say it's p for definiteness), and claim that T[i]=i for all values of i except that T[p]=N+1. Every answer that the daemon has made is consistent with that hypothesis, so the array could really have been that one, and in it the max is at index p, not k. Al's answer is therefore wrong on that input. It follows that a correct algorithm cannot stop before making at least N-1 comparisons.
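The argument can be acted out in code. The Python sketch below is only an illustration under stated assumptions: the class Daemon, its methods, and the deliberately incomplete algorithm lazy_max are hypothetical names invented here. Indices are 1-based as in the notes. One bookkeeping choice is also an assumption of the sketch: the smaller index of every compared pair is recorded as a loser, whether the answer was "yes" or "no", so that giving a non-loser the value N+1 can never contradict an earlier answer.

```python
class Daemon:
    """Adversary that answers "Is T[i] < T[j]?" as if T[i] = i.

    Indices are 1-based, as in the notes. The class and method names
    are hypothetical, chosen for this sketch.
    """

    def __init__(self, n):
        self.n = n
        self.losers = set()  # indices that can no longer be given the value N+1
        self.log = []        # every (i, j, answer) the daemon has given

    def less(self, i, j):
        """Answer the question "Is T[i] < T[j]?"."""
        answer = i < j
        if i != j:
            # Assumption of this sketch: the smaller index becomes a
            # loser on "yes" and on "no" alike.
            self.losers.add(min(i, j))
        self.log.append((i, j, answer))
        return answer

    def refute(self, k):
        """Build an array, consistent with all answers, whose max is not at k.

        Returns a dict mapping index -> value, or None when every index
        other than k is already a loser.
        """
        spare = [p for p in range(1, self.n + 1)
                 if p not in self.losers and p != k]
        if not spare:
            return None
        t = {i: i for i in range(1, self.n + 1)}
        t[spare[0]] = self.n + 1
        return t

    def consistent(self, t):
        """Check that array t agrees with every answer given so far."""
        return all((t[i] < t[j]) == answer for i, j, answer in self.log)


def lazy_max(daemon):
    """Hypothetical algorithm that never looks at T[N]: N-2 comparisons."""
    best = 1
    for position in range(2, daemon.n):
        if daemon.less(best, position):
            best = position
    return best


daemon = Daemon(5)
k = lazy_max(daemon)
t = daemon.refute(k)
print(len(daemon.log), k, t, daemon.consistent(t))
# 3 4 {1: 1, 2: 2, 3: 3, 4: 4, 5: 6} True
```

Here lazy_max stops after N-2=3 comparisons and claims index 4. The daemon replies with an array that agrees with all three of its answers but has its max at index 5. If the algorithm instead scans all of T[1..N] with N-1 comparisons, every index except the claimed one is a loser and refute returns None.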