five problems

(rev. December 13, 2020)

Five Problems

In the ideal approach to problem-solving,
- we find the most efficient algorithm,
- we prove it works,
- we compute how efficient it is, and
- we prove that no other algorithm is more efficient.
For a few kinds of problems we can come pretty close to attaining the ideal goals described above. When it comes to a lot of other problems, there's much we don't understand. We don't know how close to the ideal we can ever get.
In section 1.2 of our text, the authors, Kleinberg and Tardos, consider five problems that help illustrate the range of difficulty of problems we will see in this course. Computer scientists know how to solve some of the five problems efficiently, but not all.
Interval Scheduling is the first problem: There is some resource that cannot be shared, for example a CPU. We are given N requests to reserve the resource. Each request is an ordered pair (s_i,f_i), where s_i is a start time and f_i is a finish time, so the requests are (s₁,f₁), (s₂,f₂), ..., (s_N,f_N). The goal is to maximize the number of requests accepted. Because the resource is not sharable, no two of the accepted requests are allowed to overlap. (In one version of the problem, an interval is allowed to start exactly where the previous interval ends.) Compared to what are considered really hard problems, this one turns out to be relatively easy. There is an O(N log N) greedy solution. Greedy algorithms are the subject of chapter 4.
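To make the greedy idea concrete, here is a minimal Python sketch. It assumes the earliest-finish-time rule (a standard greedy rule for this problem) and the version of the problem in which touching endpoints are allowed. The function name and the sample requests are made up for illustration.

```python
def max_compatible_requests(requests):
    """Return a largest list of pairwise non-overlapping (start, finish) requests."""
    chosen = []
    last_finish = float("-inf")
    # Sorting by finish time dominates the running time: O(N log N).
    for start, finish in sorted(requests, key=lambda r: r[1]):
        # Assumption: a request may start exactly when the previous one finishes.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen


# Hypothetical sample data.
requests = [(0, 6), (1, 4), (3, 5), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
schedule = max_compatible_requests(requests)
print(schedule)  # [(1, 4), (5, 7), (8, 11)]
```

After the sort, the loop makes a single pass, so the sort is the only step that costs more than linear time.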
Weighted Interval Scheduling: This is like the previous problem except that each request has a value, which we may think of as the reward for performing the work to be done during the requested interval. Here the goal is to choose a set of pairwise disjoint intervals with maximum total value. There's no known greedy algorithm, but we do know an O(N log N) algorithm that solves the problem. The algorithm uses a technique called dynamic programming, which builds up solutions to subproblems and keeps track of them in a table. Dynamic programming is the subject of chapter 6.
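The table-building idea can be sketched as follows. This is one possible formulation, not necessarily the one developed in chapter 6: sort the requests by finish time, and let best[j] be the largest total value obtainable from the first j requests. The names and the sample data are hypothetical, and touching endpoints are assumed to be compatible.

```python
from bisect import bisect_right


def max_total_value(requests):
    """requests: list of (start, finish, value). Return the best total value."""
    jobs = sorted(requests, key=lambda r: r[1])  # order by finish time
    finishes = [finish for _, finish, _ in jobs]
    # best[j] = best total value using only the first j jobs in sorted order.
    best = [0] * (len(jobs) + 1)
    for j, (start, _finish, value) in enumerate(jobs, start=1):
        # p = how many of the earlier jobs finish no later than this job starts.
        p = bisect_right(finishes, start, 0, j - 1)
        # Either skip job j, or take it together with the best of the first p jobs.
        best[j] = max(best[j - 1], value + best[p])
    return best[-1]


# Hypothetical sample data: (start, finish, value).
requests = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (2, 9, 7), (5, 10, 2), (8, 11, 3)]
print(max_total_value(requests))  # 9
```

The sort and the N binary searches each take O(N log N) time in total, and filling the table is linear.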
Bipartite Matching: A bipartite graph is an undirected graph, G = <N,E>, in which there is a way to partition the node set N into two disjoint subsets X and Y such that every edge e ∈ E has one end in X and one end in Y. A matching is a subset M ⊆ E of the edge set E such that each node appears in at most one of the edges in M. (In other words, the edges in M determine a one-to-one and onto mapping between a subset of X and a subset of Y.) The problem of finding a perfect matching is to produce a subset M of the edges such that every node appears in exactly one of the edges in M. More generally, the bipartite matching problem is to find a matching of maximum size. This problem seems harder than the previous two, and not amenable to solving with a greedy algorithm or with dynamic programming. However, one can solve it in O(n³) time using a network flow procedure, where n is the number of nodes in the graph. Network flow algorithms are the subject of chapter 7.
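As a rough illustration, the sketch below finds a maximum matching by repeatedly searching for an augmenting path, which is the idea underlying the network flow approach specialized to bipartite graphs. It is a simplified stand-in, not the general flow algorithm of chapter 7. The function names and the small graph are hypothetical.

```python
def maximum_matching(x_nodes, neighbors):
    """neighbors maps each node in X to a list of its neighbors in Y."""
    match_of_y = {}  # y -> the x currently matched to it

    def try_augment(x, visited):
        """Try to match x, re-matching earlier nodes of X if necessary."""
        for y in neighbors.get(x, []):
            if y in visited:
                continue
            visited.add(y)
            # Use y if it is free, or if its current partner can move elsewhere.
            if y not in match_of_y or try_augment(match_of_y[y], visited):
                match_of_y[y] = x
                return True
        return False

    for x in x_nodes:
        try_augment(x, set())
    return {(x, y) for y, x in match_of_y.items()}


# Hypothetical graph with X = {a, b, c} and Y = {1, 2, 3}.
neighbors = {"a": [1, 2], "b": [1], "c": [2, 3]}
matching = maximum_matching(["a", "b", "c"], neighbors)
print(sorted(matching))  # [('a', 2), ('b', 1), ('c', 3)]
```

Each of the searches started from a node of X looks at every edge at most once, so with n nodes and at most n² edges the total work stays within the O(n³) bound mentioned above.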
Independent Set: A set of nodes in a graph is independent if there are no edges joining any two of them. The independent set problem is to find an independent set of maximum size. It's possible to view both the interval scheduling and bipartite matching problems as special cases of the independent set problem. This independent set problem is an example of an NP-hard problem. The concepts of NP problem, NP-hard problem and NP-complete problem are very closely related. We'll study them all in chapter 8.

If we are given a list of nodes in a graph, it's easy to check efficiently whether they form an independent set. NP problems all have a similar "easy to check" characteristic. However, there is a huge number of NP-hard and NP-complete problems, and computer scientists don't know an efficient way to solve any of them! (Lots of computer scientists think that probably no efficient algorithms exist for these problems, but no one has been able to prove that.)
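The contrast between checking and solving can be seen in a small Python sketch. The check makes one pass over the edges. The search, on the other hand, is plain brute force over subsets and takes exponential time in the worst case; it is shown only to illustrate the gap, not as a recommended method. The function names and the sample graph are hypothetical.

```python
from itertools import combinations


def is_independent(nodes, edges):
    """Efficient check: no edge may have both of its ends among the chosen nodes."""
    chosen = set(nodes)
    return not any(u in chosen and v in chosen for u, v in edges)


def largest_independent_set(all_nodes, edges):
    """Brute force: try candidate subsets from largest to smallest."""
    for size in range(len(all_nodes), 0, -1):
        for candidate in combinations(all_nodes, size):
            if is_independent(candidate, edges):
                return set(candidate)
    return set()


# Hypothetical graph: a cycle on five nodes.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(is_independent([0, 2], edges))          # True
print(largest_independent_set(nodes, edges))  # {0, 2}
```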
Competitive Facility Location: Two players alternately choose locations. Each location has a value. No one is allowed to choose a location adjacent to a location that has already been taken. The problem is to determine, given a target bound B, whether there is a strategy for one of the players that guarantees that player a set of locations with total value B or greater, no matter how the other player moves. This problem is believed to be even harder than the NP-complete problems. It is conjectured that not only is there no efficient way to solve the problem, but also there is no efficient algorithm for checking a solution.
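For very small instances the question can be answered by searching the whole game tree, as in the sketch below. Several things here are assumptions made for illustration: locations are modeled as nodes of a graph with adjacency given by edges, the player whose total we care about is the one who moves second, and the opponent is assumed to play as unhelpfully as possible. The search takes exponential time, which is consistent with the problem's difficulty.

```python
from functools import lru_cache


def second_player_can_reach(values, edges, target):
    """values: dict location -> value; edges: pairs of adjacent locations."""
    adjacent = {v: set() for v in values}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    @lru_cache(maxsize=None)
    def best_guarantee(available, second_to_move):
        """Largest additional total the second player can still guarantee."""
        if not available:
            return 0
        outcomes = []
        for node in available:
            # Taking a location removes it and all of its neighbors from play.
            remaining = available - {node} - adjacent[node]
            gain = values[node] if second_to_move else 0
            outcomes.append(gain + best_guarantee(remaining, not second_to_move))
        # The second player maximizes; the first player is assumed adversarial.
        return max(outcomes) if second_to_move else min(outcomes)

    return best_guarantee(frozenset(values), False) >= target


# Hypothetical instance: four locations in a row, a - b - c - d.
values = {"a": 2, "b": 3, "c": 3, "d": 2}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(second_player_can_reach(values, edges, 2))  # True
print(second_player_can_reach(values, edges, 3))  # False
```

In this instance the first player can open with b or c, which leaves only a location of value 2 for the second player, so a bound of 3 cannot be guaranteed.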