12.5.1 The Classes P and NP

DEFINITION: For our present purposes, we say that an algorithm is efficient if it has a worst-case work function W(N) such that W is O(p(N)) for some polynomial p(N) in N.

For example, algorithms that are big-O of N, N^2, N^3, N^4, N^5, N^6, and so on are considered efficient.

DEFINITION: We call such efficient algorithms polynomial time algorithms.

Efficiency is a relative notion. It is true that some polynomial time algorithms are useless in practice, but generally a polynomial time algorithm is more efficient in practice than an algorithm that is not polynomial time. For example, an algorithm that is Θ(2^N) is not a polynomial time algorithm.
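To make the contrast concrete, the following minimal sketch tabulates the step counts of two hypothetical work functions, one cubic and one exponential. The function names and the choice of N^3 as the polynomial are assumptions made only for illustration.

```python
def polynomial_steps(n: int) -> int:
    """Step count for a hypothetical algorithm whose work function is N^3."""
    return n ** 3


def exponential_steps(n: int) -> int:
    """Step count for a hypothetical algorithm whose work function is 2^N."""
    return 2 ** n


# Print both step counts side by side for a few input sizes.
for n in (10, 20, 40, 80):
    print(f"N={n:>2}  N^3={polynomial_steps(n):>7}  2^N={exponential_steps(n)}")
```

At N = 10 the two counts are close (1000 versus 1024), but at N = 80 the cubic count is 512000 while 2^N exceeds 10^24.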

A major goal of complexity theory is to distinguish problems that can be solved efficiently from those that cannot. Much is unknown, but some interesting results are known. To understand what is known, we have to begin by learning the vocabulary of the field:

DEFINITION: A decision problem is a problem whose answer, for every input, is either "yes" or "no".

DEFINITION: A deterministic algorithm is one whose actions and outputs are completely determined by its input. When a deterministic algorithm is executed twice on the same input, it always performs exactly the same set of actions in the same sequence and produces the same output.

DEFINITION: P is the class of decision problems that can be solved by a deterministic polynomial time algorithm.

EXAMPLE: A problem in P. Consider the problem that takes as input any undirected graph G with N nodes and asks whether or not G is connected, with output "yes" or "no". This problem can be solved with an O(N^2) depth-first search.
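One way such a connectivity test might look is sketched below. It assumes the graph is given as an N-by-N adjacency matrix of 0s and 1s (an input format chosen here for illustration), and it treats the empty graph as connected by convention. Each node is taken off the stack once and its row of N entries is scanned once, which gives the O(N^2) bound.

```python
from typing import List


def is_connected(adj: List[List[int]]) -> bool:
    """Decide whether the undirected graph with adjacency matrix adj is connected."""
    n = len(adj)
    if n == 0:
        return True  # assumption: the empty graph counts as connected

    visited = [False] * n
    visited[0] = True
    stack = [0]  # explicit stack for an iterative depth-first search

    while stack:
        u = stack.pop()
        # Scan row u of the matrix: N entries per node, N nodes in total.
        for v in range(n):
            if adj[u][v] and not visited[v]:
                visited[v] = True
                stack.append(v)

    # The graph is connected exactly when the search reached every node.
    return all(visited)


# A path on three nodes (0 - 1 - 2) is connected.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
# Two nodes with no edge between them are not.
isolated = [[0, 0],
            [0, 0]]
print(is_connected(path), is_connected(isolated))
```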

CONVENTION: When we refer to a decision problem X, we also let X denote the set of inputs x for which the correct answer is "yes".

EXAMPLE: If X is the problem discussed above, of deciding whether an undirected graph G is connected, then we would use the convention that the set X is the set of all inputs representing connected undirected graphs.

DEFINITION: NP is a class of decision problems. A decision problem X is in NP if the following three conditions hold:
  1. There exists for each yes-instance x in X a "proof", q(x), of the fact that x is a yes-instance.

  2. There is a polynomial g(n) such that sizeOf(q(x)) is O(g(sizeOf(x))). (The storage taken up by the "proof" q(x) is at most polynomially larger than the storage taken up by the input x itself.)

  3. There is a polynomial p(n) and an algorithm Al such that Al can input x and q(x), and decide whether x is in X in polynomial time p(sizeOf(q(x)) + sizeOf(x)). Al must never answer "yes" when x is not in X, no matter what purported proof it is given. (The proof can be checked in polynomial time.)

EXAMPLE: The question of whether a given undirected graph contains a Hamiltonian cycle is in NP, since the "proof" q(x) could be a list of edges of the graph G that together form a Hamiltonian cycle. This proof would generally take up less space than the graph itself, and could be verified to be a Hamiltonian cycle in polynomial time.
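A checker for such a proof might be sketched as follows. For convenience this sketch assumes the nodes are numbered 0 to N-1 and that the proof is given as the order in which the cycle visits the nodes, which determines the list of edges; the function name and input format are illustrative choices, not part of the definition.

```python
from typing import Iterable, Sequence, Tuple


def verify_hamiltonian_cycle(n: int,
                             edges: Iterable[Tuple[int, int]],
                             order: Sequence[int]) -> bool:
    """Check that `order` visits every node of the graph once and closes into a cycle."""
    if n < 3:
        return False  # a simple undirected graph needs at least 3 nodes for a cycle

    # Store each undirected edge as an unordered pair for constant-time lookup.
    edge_set = {frozenset(e) for e in edges}

    # The proof must list each of the n nodes exactly once.
    if len(order) != n or set(order) != set(range(n)):
        return False

    # Every consecutive pair, including last-to-first, must be an edge of the graph.
    return all(
        frozenset((order[i], order[(i + 1) % n])) in edge_set
        for i in range(n)
    )


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))  # a valid proof
print(verify_hamiltonian_cycle(4, square, [0, 2, 1, 3]))  # uses a non-edge (0, 2)
```

The check does one pass over the edges and one pass over the proposed order, so its running time is polynomial in the size of the input and the proof.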

THEOREM: P is contained in NP.

PROOF: Suppose that X is in P. Let Al be a polynomial time algorithm that decides whether x is in X. All one has to do is assign an arbitrary q(x), say q(x)=0, to each x in X. Then clearly conditions 1 and 2 of the definition of NP above are met. Condition 3 can be met merely by modifying Al slightly so that it will accept q(x)=0, in addition to x, as a part of its input. Then all Al has to do is what it did before: decide whether x is in X. QED
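The construction in the proof can be sketched in a few lines. Here `is_even` stands in for the decider Al of some problem in P, and `make_verifier` wraps it so that it also accepts a proof argument that it never looks at. Both names, and the choice of evenness as the problem, are hypothetical.

```python
from typing import Any, Callable

TRIVIAL_PROOF = 0  # the arbitrary proof q(x) = 0 assigned to every x


def make_verifier(decide: Callable[[Any], bool]) -> Callable[[Any, Any], bool]:
    """Turn a decider for X into a verifier that takes x together with a proof q."""
    def verify(x: Any, q: Any) -> bool:
        # The proof q is ignored; the original decider does all the work.
        return decide(x)
    return verify


def is_even(x: int) -> bool:
    """Stand-in for Al: a polynomial time decider for a simple problem."""
    return x % 2 == 0


verify_even = make_verifier(is_even)
print(verify_even(4, TRIVIAL_PROOF), verify_even(3, TRIVIAL_PROOF))
```

Because the wrapper only adds a constant amount of work to the decider, the verifier runs in polynomial time whenever the decider does.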

As Brassard and Bratley say, the central open question is:

Is NP contained in P?

Evidently, no one has ever been able to answer that question. (If someone knows the answer, the word hasn't gotten out in the scholarly community!)

Basically the question means this: Suppose we have a way to efficiently verify proposed solutions to a certain problem. Does that imply that there is an efficient way to find a solution to such a problem too?
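The gap between checking and finding can be illustrated with the Hamiltonian cycle problem. The sketch below finds a cycle by trying candidate node orders one at a time and checking each with a fast test. It assumes nodes numbered 0 to N-1 and fixes node 0 as the starting point, so it may examine up to (N-1)! candidates. This brute-force search is only an illustration of the obvious approach, not a claim about the best possible algorithm.

```python
from itertools import permutations
from typing import Iterable, List, Optional, Tuple


def find_hamiltonian_cycle(n: int,
                           edges: Iterable[Tuple[int, int]]) -> Optional[List[int]]:
    """Search for a Hamiltonian cycle by exhaustively checking candidate orders."""
    if n < 3:
        return None
    edge_set = {frozenset(e) for e in edges}

    # Every cycle passes through node 0, so only the other n-1 nodes are permuted.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        # Checking one candidate is fast: n edge lookups.
        if all(frozenset((order[i], order[(i + 1) % n])) in edge_set
               for i in range(n)):
            return list(order)
    return None  # no candidate passed the check


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(find_hamiltonian_cycle(4, square))
print(find_hamiltonian_cycle(4, [(0, 1), (1, 2), (2, 3)]))
```

Each individual check takes polynomial time, but the number of candidates grows factorially with N, which is why an efficient checker does not by itself yield an efficient solver.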

Most people would conjecture that P ≠ NP. In other words, they would guess that there are problems which are inherently difficult to solve but easy to check. Stated still another way, they think there must be problems whose solutions can be validated in polynomial time, but which cannot be solved in polynomial time. Nevertheless, there is no known proof of that conjecture.