THE BIG-O MEASURE OF COMPLEXITY:
We talk about the big-O of mathematical FUNCTIONS, usually
positive real-valued functions defined on the set of positive
integers. That is because we are usually interested in functions
that count the number of steps, or the amount of time, required
to carry out a given algorithm on a data set of a given size.
The size of the data set is always a positive integer, and the
number of steps or amount of time is always a positive real
number.
NOTATION:
The set of positive integers is usually denoted by the symbol N.
R denotes the set of real numbers -- the familiar set of numbers
corresponding to the points on a line. R+ denotes the set of
POSITIVE real numbers.
A positive real-valued function defined on N is a correspondence
or "rule" which ASSIGNS (pairs) A POSITIVE REAL NUMBER TO EACH
POSITIVE INTEGER.
Take as an example the case of a program that sorts lists. The
CORRESPONDENCE of a number to the average TIME required for the
program to sort a list containing that number of items is a
positive real-valued function on N. The various times are the
positive real numbers assigned by the rule.
We often denote functions using lowercase letters, particularly
"f", "g", "h", and "k". A familiar shorthand which states that
the function f is a positive real-valued function defined on N is
"f: N --> R+".
We customarily denote the assigned value of a function by using
the notation f(s), where s is any symbol we are using to denote a
positive integer. For example, if f is the function described
above, then "f(3)" would stand for the average time required to
sort a list of 3 items, and "f(5)" would stand for the average
time needed to sort a list of 5 items. In general, if "s"
stands for a positive integer, then "f(s)" stands for the average
time required to sort a list of s items.
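Such a function can be sketched in code. The function
average_sort_time below is hypothetical (the name, the trial
count, and the choice of sort are mine, and the numbers returned
depend entirely on the machine), but it illustrates the idea:
each positive integer s is assigned a positive real number.

```python
import random
import time

def average_sort_time(s, trials=20):
    """f(s): the average time, in seconds, to sort a random list
    of s items -- a hypothetical f: N --> R+ for illustration."""
    total = 0.0
    for _ in range(trials):
        data = [random.random() for _ in range(s)]
        start = time.perf_counter()
        sorted(data)                         # the sort being measured
        total += time.perf_counter() - start
    return total / trials

# f(3): the average time for a 3-item list; always a positive real.
print(average_sort_time(3))
```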
DEFINITION OF BIG-O:
Suppose that f: N --> R+ and g: N --> R+.
If for SOME positive constants m and C,
(1) f(n) < [C * g(n)],
for EVERY positive integer n that is greater than m, then we say
that f is big-O of g (also written "f is O(g)").
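A finite spot-check of this definition can be sketched in Python.
The name is_big_o_witness and the cutoff upper are my own
inventions, and checking finitely many values of n is evidence
rather than proof, since the definition quantifies over EVERY
n greater than m.

```python
def is_big_o_witness(f, g, C, m, upper=10_000):
    """Check f(n) < C * g(n) for every integer n with m < n <= upper.

    Only a finite spot-check of condition (1), not a proof."""
    return all(f(n) < C * g(n) for n in range(m + 1, upper + 1))

# Example: f(n) = 2n + 3 is O(g) for g(n) = n, with C = 3 and m = 3,
# because 2n + 3 < 3n exactly when n > 3.
f = lambda n: 2 * n + 3
g = lambda n: n
print(is_big_o_witness(f, g, C=3, m=3))
```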
For example, suppose that Frank's car is capable of great speed,
but takes quite a while to accelerate to its top speed. Suppose
that Gary's motorcycle is very quick to accelerate, but not
capable of the top speed of Frank's car. Gary may be able to
beat Frank in some short races, but there must be a number of
meters m such that Frank can beat Gary in any race of more than m
meters.
Let s be any positive integer. If f is the function that assigns
the time f(s) that Frank's car requires to race s meters, and if
g is the function that assigns the time g(s) that Gary's
motorcycle requires to race s meters, then according to our
definition of big-O, f is O(g).
Since Frank's elapsed time is less than Gary's elapsed time in
any race of more than m meters, f(n) < 1 * g(n) whenever n
> m. So the condition given in line (1) holds, with the
constant C having the value "1".
Now suppose that whenever Gary and Frank race, Frank is never
able to go 3 times as fast as Gary. Then Frank's times are more
than 1/3 of Gary's times, which is to say that g(s) < 3 * f(s)
for every positive integer s. Thus g is O(f).
It may seem odd, but two positive real valued functions on N can
each be big-O of the other, and such pairs of functions are
considered roughly EQUIVALENT -- in mathematical parlance,
"ASYMPTOTICALLY PROPORTIONATE". f is big-O of g if and only if
f(n) < [C*g(n)] whenever n > m,
for some positive constants m and C. This is true if and only if
f(n)/g(n) < C whenever n > m,
and if and only if
g(n)/f(n) > 1/C when n > m.
On the other hand, if g is big-O of f, then for some positive K
and r,
g(a) < [K*f(a)] whenever a > r,
and this is the same as
g(a)/f(a) < K, and f(a)/g(a) > 1/K, whenever a > r.
So
C > f(s)/g(s) > 1/K
whenever s is greater than both m and r. If f(s)/g(s) were a
constant, not depending at all on s, then we would say that f and
g are proportionate. The last inequality above shows that the
ratio of f(s) to g(s), while quite possibly not a constant, does
become "trapped" between C and 1/K for large enough values of s.
This phenomenon is called "asymptotic proportionality". We think
of the two functions as being "roughly proportionate".
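A small numeric sketch, using a pair of functions of my own
choosing, shows the trapping. For f(n) = 2n + 5 and g(n) = n,
the ratio f(n)/g(n) = 2 + 5/n is never constant, yet once n > 5
it stays between 1/K = 1 and C = 3.

```python
f = lambda n: 2 * n + 5  # f is O(g): f(n) < 3*g(n) once n > 5 (C=3, m=5)
g = lambda n: n          # g is O(f): g(n) < 1*f(n) for all n   (K=1, r=1)

for n in (1, 10, 100, 1000, 1000000):
    # ratio 2 + 5/n: not constant, but trapped between
    # 1/K = 1 and C = 3 for large enough n
    print(n, f(n) / g(n))
```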
Computer scientists classify algorithms by how they compare in
the "big-O" sense. Typically the functions k(n)=log(n), f(n)=n,
h(n)=n*log(n), and g(n)=n² are used as yardsticks to measure
algorithms.
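Tabulating these yardsticks for a few values of n makes the
growth gaps concrete. I use logarithms base 2 here, but under
big-O the base does not matter, since logs in different bases
differ only by a constant factor.

```python
import math

yardsticks = [
    ("log n",   lambda n: math.log2(n)),
    ("n",       lambda n: float(n)),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: float(n ** 2)),
]

for n in (10, 100, 1000):
    cells = "  ".join(f"{name}={fn(n):>9.0f}" for name, fn in yardsticks)
    print(f"n={n:>4}:  {cells}")
```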
If you know that a sorting algorithm requires big-O of n²
steps to sort n items, then generally speaking you can say that
it is inefficient. On the other hand, a sort that requires only
big-O of n*log(n) steps is generally considered quite efficient.
A more detailed analysis is needed to find out which method is
the better for any specific sorting task, but knowledge of the
big-O information is almost always the starting point for such an
analysis.
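One reason the more detailed analysis matters is that big-O hides
constant factors, which can decide the winner for small inputs.
The sketch below uses made-up step counts (n² steps for a simple
quadratic sort versus 8*n*log2(n) for a merge-style sort; the
factor 8 is purely illustrative): the O(n²) method can win for
small n, but the O(n*log(n)) method must win once n is large
enough.

```python
import math

quadratic = lambda n: n ** 2              # hypothetical step count
merge_like = lambda n: 8 * n * math.log2(n)  # hypothetical step count

for n in (4, 16, 64, 256, 1024):
    winner = "quadratic" if quadratic(n) < merge_like(n) else "n log n"
    print(f"n={n:5}: n^2={quadratic(n):9.0f}  "
          f"8n*log2(n)={merge_like(n):9.0f}  -> {winner}")
```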