(rev. 03/11/2012)
Algorithm for Finding Nearest Pair of Points in a Set P of Points in the Plane
- #1 Pre-processing: If necessary, rotate the set so that no two points
share an x- or y-coordinate. (I'm not sure of the complexity of
finding such a rotation. We'll just restrict ourselves to seeking a
solution to the simplified problem in which no two points have the same
x- or y-coordinate.)
- #2 Pre-processing: Also produce lists Px (the points of P,
sorted by increasing x-coordinate) and Py (the points of P,
sorted by increasing y-coordinate). "Attach to each entry in each list a
record of the position of that point in both lists." So Px
and Py are lists of entries like this: (xi,
yi, rank of xi in Px, rank of
yi in Py). For example, the entry
(3, 8, 24, 81) represents the point (3,8) in the plane; the 24 and 81
mean that this point is the 24th item in Px and the 81st
item in Py. All of this can be done in O(N*log(N)) time, where
|P|=N. As an example of a possible structure, Px could be
implemented as an array of N pointers, the ith item pointing to a record
representing the point with x-rank equal to i. Py could be
implemented similarly, pointing into the same collection of records as
Px.
- Let Q be the first ceil(N/2) elements of Px - the "left half".
Let R be the last floor(N/2) elements of Px - the "right
half".
- In O(N) time create Qx, a version of the "left half" Q sorted
by increasing x-value. This can be done by walking through Px
and selecting the first ceil(N/2) elements to build a new list. We might
use a new array of ceil(N/2) pointers for Qx and point the
pointers at duplicate records of the ones used for Px. These
records have the form (s,t,u,v), where s is the x-coord, t is the
y-coord, u is the x-rank, and v is the y-rank. The ranks u and v are
relative to Px and Py, but u is also correct
relative to Qx.
- In O(N) time create Qy - the elements of Q in order of
ascending y-value. This can be done by traversing Py from
lowest y to highest y, and selecting the points whose x-rank is <=
ceil(N/2). This information could be used to create another array of
pointers pointing into the set of records used with Qx. The
idea is this: when the ith element of Py having x-rank
rx <= ceil(N/2) is found, point the ith element of the new
Qy array at the record J=(s,t,u,v) to which the
rx-th element of Qx points, and change the value of
v to i, to indicate the y-rank of J within Qy.
- In a similar manner, create Rx and Ry in O(N) time:
lists of the points on the "right side" by increasing x-value and
y-value. (Here the x-ranks must also be reduced by ceil(N/2) so that
they are correct relative to Rx.)
- To summarize, we can construct, in O(N) time, Qx,
Qy, Rx, and Ry, so that we now have two
problems of half the size, problems that are exactly alike in structure
to the original problem represented by P, Px, and
Py. Note that this did NOT require sorting the half-sets. We
need do only O(N) work to obtain sorted half-sets alike in structure to
Px and Py.
- Next, recursively determine the closest pairs (q0*,
q1*) of points in Q and (r0*,
r1*) in R (assume that the base case is where the
set of points contains 3 or fewer elements, and the solution is computed
in that case just by examining all possible pairs of points.)
- Let δ = min( d(q0*, q1*), d(r0*, r1*) ).
- Let L be the vertical line that goes through x* - the
x-coordinate of the rightmost point in Q.
- Any pair of points of P whose distance apart is less than δ must have
one point in Q and the other in R, and both points must lie
inside the "strip" of width 2δ with center-line L.
- By scanning through Py in O(N) time, select the set of
elements of P that lie inside the strip, ordered by increasing y-value.
(We simply select points whose x-coordinates differ from x* by
δ or less.) Call that ordered list Sy.
- For each point s ∈ Sy, compute the distance from s to each of the
next 15 points in Sy. Let (σ, σ') be the pair
achieving the minimum of these distances after completing the
entire traversal of Sy. The traversal requires O(N) work,
since each point is compared with at most 15 others.
- Return the lesser of δ and d(σ,σ'), along with the pair of points
that achieves it.
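
The steps above can be sketched in Python as follows. This is only an illustrative sketch, not a definitive implementation: the function names (dist, brute_force, closest_pair, and the helper _closest) are made up for this example, and it takes one shortcut that the outline does not. Instead of storing ranks in (s,t,u,v) records, it splits Py by comparing each point's x-coordinate with x*, which is valid only under the assumption that no two points share an x-coordinate. The split is still O(N) per level and needs no re-sorting.

```python
import math
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def brute_force(points):
    """Base case: examine all pairs; returns (distance, point, point)."""
    return min(((dist(a, b), a, b) for a, b in combinations(points, 2)),
               key=lambda t: t[0])

def _closest(px, py):
    """px and py hold the same points, sorted by x and by y respectively."""
    n = len(px)
    if n <= 3:
        return brute_force(px)
    half = (n + 1) // 2                 # ceil(N/2)
    qx, rx = px[:half], px[half:]       # left half and right half by x
    x_star = qx[-1][0]                  # x-coordinate of rightmost point in Q
    # Split py in O(N) without sorting (assumes distinct x-coordinates).
    qy = [p for p in py if p[0] <= x_star]
    ry = [p for p in py if p[0] > x_star]
    best = min(_closest(qx, qy), _closest(rx, ry), key=lambda t: t[0])
    delta = best[0]
    # Points within delta of the line L, still ordered by increasing y.
    sy = [p for p in py if abs(p[0] - x_star) <= delta]
    for i, s in enumerate(sy):
        for t in sy[i + 1:i + 16]:      # the next 15 points in Sy
            d = dist(s, t)
            if d < best[0]:
                best = (d, s, t)
    return best

def closest_pair(points):
    """Needs at least two points, with distinct x- and y-coordinates."""
    px = sorted(points)                         # by increasing x
    py = sorted(points, key=lambda p: p[1])     # by increasing y
    return _closest(px, py)

if __name__ == "__main__":
    pts = [(0, 0), (5, 4), (9, 1), (2, 7), (6, 6), (1, 3), (8, 9)]
    print(closest_pair(pts))
```

The two sorts in closest_pair are the only O(N*log(N)) pre-processing; every recursive call does linear work on lists that are already sorted.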