(rev. December 13, 2020)
Five Problems
- In the ideal approach to problem-solving,
- we find the most efficient algorithm,
- we prove it works,
- we compute how efficient it is, and
- we prove that no other algorithm is more efficient.
- For a few kinds of problems we can come pretty close
to attaining the ideal goals described above. When it comes to a lot
of other problems, there's much we don't understand. We don't know
how close to the ideal we can ever get.
- In section 1.2 of our text, the authors, Kleinberg and Tardos, consider
five problems that help illustrate the range of difficulty
of problems we will see in this course. Computer scientists know
how to solve some of the five problems efficiently, but not all.
- Interval Scheduling is the first problem: There is some resource
that cannot be shared - for example a CPU. We are given N requests for
reservations for using the resource. Each request is for an interval of
time during which the resource may be used:
(s_1,f_1), (s_2,f_2), ...,
(s_N,f_N). Each request is an ordered pair
(s_i,f_i), where s_i is a start time and
f_i is a finish time. The goal is to maximize the number of
requests accepted. Because the resource is not sharable, no two
accepted requests may overlap. (In one version of the problem, an interval
may begin exactly where the previous accepted interval ends.) Compared to
what are considered really hard problems, this one turns out to be
relatively easy. There is an O(N log N) greedy solution.
Greedy algorithms are the subject of chapter 4.
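One well-known greedy rule for this problem is to accept requests in order of earliest finish time. The sketch below is a minimal illustration of that rule, not necessarily the presentation used in the text. The function name `schedule_intervals` and the representation of requests as `(start, finish)` tuples are assumptions made for the example, and it adopts the version of the problem in which an interval may start exactly where the previous one ends.

```python
def schedule_intervals(requests):
    """Return a maximum-size list of pairwise compatible (start, finish) requests.

    Assumes the version of the problem in which a request may start
    exactly when the previously accepted request finishes.
    """
    accepted = []
    last_finish = float("-inf")
    # Sorting by finish time dominates the running time: O(N log N).
    for start, finish in sorted(requests, key=lambda r: r[1]):
        # Accept a request only if it does not overlap the last accepted one.
        if start >= last_finish:
            accepted.append((start, finish))
            last_finish = finish
    return accepted


requests = [(0, 6), (1, 4), (3, 5), (3, 8), (4, 7), (5, 9), (6, 10), (8, 11)]
print(schedule_intervals(requests))  # [(1, 4), (4, 7), (8, 11)]
```

After the sort, a single pass over the requests is enough, which is where the O(N log N) bound comes from.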
- Weighted Interval Scheduling: This is like the previous problem
except that each request has a value, which we may think of as the reward
for performing the work to be done during the requested interval. Here
the goal is to choose a set of pairwise disjoint intervals with maximum total
value. No natural greedy algorithm is known for it, but we do know an O(N log N)
algorithm that solves the problem. The algorithm uses a technique called
dynamic programming. It involves building up partial solutions
and keeping track of them in a table. Dynamic programming is the subject
of chapter 6.
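To make the idea of a table of partial solutions concrete, here is a minimal dynamic programming sketch. The name `max_weight_schedule` and the `(start, finish, value)` tuple format are assumptions made for the example. It also assumes that each request has start < finish and that an interval may begin exactly where another ends.

```python
from bisect import bisect_right


def max_weight_schedule(requests):
    """requests: list of (start, finish, value). Returns (best_total, chosen)."""
    jobs = sorted(requests, key=lambda r: r[1])  # order by finish time
    finishes = [finish for _, finish, _ in jobs]
    n = len(jobs)
    # best[j] = maximum total value obtainable from the first j jobs.
    best = [0] * (n + 1)
    # prev[j] = how many of the first j-1 jobs finish by the start of job j.
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        start, _, value = jobs[j - 1]
        prev[j] = bisect_right(finishes, start, 0, j - 1)  # binary search
        # Either skip job j, or take it together with the best compatible prefix.
        best[j] = max(best[j - 1], value + best[prev[j]])
    # Walk the table backwards to recover one optimal set of requests.
    chosen = []
    j = n
    while j > 0:
        value = jobs[j - 1][2]
        if value + best[prev[j]] > best[j - 1]:
            chosen.append(jobs[j - 1])
            j = prev[j]
        else:
            j -= 1
    chosen.reverse()
    return best[n], chosen
```

The sort and the N binary searches each cost O(N log N), and filling the table is a single O(N) pass.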
- Bipartite Matching: A bipartite graph is an undirected graph, G = <N,E>,
in which there is a way to partition the node set N into two disjoint
subsets X and Y such that every edge e ∈ E has one end
in X and one end in Y. A matching is a subset M ⊆ E
of the edge set E such that each node appears in at most one of the edges in M.
(In other words, the edges in M determine a one-to-one correspondence between
a subset of X and a subset of Y.) The problem of finding a perfect
matching is to produce a subset M of the edges such that every node appears
in exactly one of the edges in M. More generally, the bipartite matching
problem is to find a matching of maximum size. This problem seems harder
than the previous two, and not amenable to solving with a greedy
algorithm or with dynamic programming. However, one can solve it
in O(n^3) time using a network flow procedure, where n is the number
of nodes in the graph. Network flow algorithms are the subject of chapter 7.
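A full network flow implementation is beyond the scope of this overview, but the core idea can be sketched with augmenting paths, which is what a flow-based method amounts to on a bipartite graph with unit capacities. The code below is an illustrative sketch, not the procedure from the text. The name `max_bipartite_matching` and the adjacency-list input format are assumptions made for the example. X-nodes are numbered from 0 to len(adj)-1 and Y-nodes from 0 to num_y-1.

```python
def max_bipartite_matching(adj, num_y):
    """adj[x] lists the Y-nodes adjacent to X-node x. Returns sorted (x, y) pairs."""
    match_of_y = [None] * num_y  # match_of_y[y] = X-node currently matched to y

    def try_augment(x, visited):
        # Search for an augmenting path that starts at the unmatched X-node x.
        for y in adj[x]:
            if y in visited:
                continue
            visited.add(y)
            # Use y if it is free, or if its current partner can be rematched.
            if match_of_y[y] is None or try_augment(match_of_y[y], visited):
                match_of_y[y] = x
                return True
        return False

    for x in range(len(adj)):
        try_augment(x, set())
    return sorted((x, y) for y, x in enumerate(match_of_y) if x is not None)


print(max_bipartite_matching([[0, 1], [0], [1, 2]], 3))  # [(0, 1), (1, 0), (2, 2)]
```

Each search examines every edge at most once, and there is one search per X-node, so with n nodes and at most n^2 edges the total work stays within the O(n^3) bound quoted above.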
- Independent Set: A set of nodes in a graph is independent
if there are no edges joining any two of them. The independent set
problem is to find an independent set of maximum size. It's possible to
view both the interval scheduling and bipartite matching problems as
special cases of the independent set problem. The independent set
problem is an example of an NP-hard problem. The concepts
of NP problem, NP-hard problem, and NP-complete problem are closely
related. We'll study them all in chapter 8.
If we are given a list of nodes in a graph, it's easy
to efficiently check to see if they are an independent set. NP problems
all have a similar "easy to check" characteristic.
However, there is a huge number of NP-hard and
NP-complete problems, and computer
scientists don't know an efficient way to solve any of them!
(Lots of computer scientists think probably no efficient
algorithms exist for these problems,
but no one has been able to prove that.)
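The contrast between checking and solving can be sketched in a few lines. In the example below, `is_independent` is the easy check, and `max_independent_set` is a brute-force search that may try every subset of the nodes, so it takes exponential time in the worst case. Both function names and the edge-list input format are assumptions made for the example. Nodes are numbered from 0 to num_nodes-1.

```python
from itertools import combinations


def is_independent(edges, nodes):
    """Easy check: no edge may have both of its ends in the chosen set."""
    chosen = set(nodes)
    return not any(u in chosen and v in chosen for u, v in edges)


def max_independent_set(num_nodes, edges):
    """Brute force: try candidate sets from largest to smallest."""
    for size in range(num_nodes, -1, -1):
        for candidate in combinations(range(num_nodes), size):
            if is_independent(edges, candidate):
                return list(candidate)
    return []


path = [(0, 1), (1, 2), (2, 3), (3, 4)]  # a path on five nodes
print(is_independent(path, [0, 2, 4]))  # True
print(max_independent_set(5, path))     # [0, 2, 4]
```

The check runs in time proportional to the number of edges, while the search is only practical for very small graphs.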
- Competitive Facility Location: Two players alternately choose
locations. Each location has a value. No one is allowed to choose a
location adjacent to a location that has already been taken. The problem
is to determine, given a target bound B, whether one of the players has
a strategy that guarantees that player a set of
locations with total value B or greater. This problem is
believed to be even harder than the NP-complete problems. It is
conjectured that not only is there no efficient way to solve the problem,
but also there is no efficient algorithm for checking a solution.
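For very small instances the question can be answered by exhaustively searching the game tree. The sketch below is only an illustration of what the problem asks, and it takes exponential time. It assumes that locations are nodes of a graph, that adjacency is given as an edge list, that player 0 moves first, and that a player must move whenever a legal location remains. The name `can_guarantee` and its parameters are hypothetical.

```python
from functools import lru_cache


def can_guarantee(values, edges, bound, player=1):
    """True if `player` (0 moves first, 1 moves second) can force a total >= bound."""
    n = len(values)
    neighbors = [set() for _ in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    @lru_cache(maxsize=None)
    def best_total(taken, turn):
        # Value `player` can still guarantee to collect from this position on.
        legal = [v for v in range(n)
                 if v not in taken and not (neighbors[v] & taken)]
        if not legal:
            return 0  # no legal location remains, so the game is over
        outcomes = []
        for v in legal:
            gain = values[v] if turn == player else 0
            outcomes.append(gain + best_total(taken | {v}, 1 - turn))
        # `player` picks the best option; the opponent is assumed to pick
        # whatever is worst for `player`.
        return max(outcomes) if turn == player else min(outcomes)

    return best_total(frozenset(), 0) >= bound


# Three locations in a row with values 1, 5, 1.
print(can_guarantee([1, 5, 1], [(0, 1), (1, 2)], 5, player=0))  # True
print(can_guarantee([1, 5, 1], [(0, 1), (1, 2)], 1, player=1))  # False
```

In the example, the first player can take the middle location, which blocks both neighbors and leaves the second player with nothing.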