## Sunday, August 4, 2013

### Closest Pair Algorithm

Here we will discuss about the closest pair algorithm.

Input : A set p = { p1,p2, ... pn} of n points in the plane (R2)

Notation d(pi , pj) = Euclidean distance.
d = x

Assumption - All points have distinct x-coordinates distinct y-coordinates.

Brute force method
Say you are given a set of points in the plane and you want to find the two points closest to one another. The brute force method of doing this would be to compute the distance between every possible pair of points and keep track of the minimum. This will take O(n2) because there is a quadratic number of pairs.

Can we do better?

Divide and Conquer
Divide and conquer - Initial observations
Let's assume that this problem were on a line, or 1-D. Then we can just sort all the points, O(nlogn) at best, and then run through the line to find the closest pair which must be adjacent points, O(n). So can we find a similar method that will solve the problem in 2-D?

1-D version at closest pair
1. sort the points//O (nlog n) time
2. return closest pair of adjacent pairs (O (n))

We can sort the points according to x-coordinate and y-coordinate which will just take O(nlogn). This may not work for all points.

2-D version
Now, we have to apply the algorithm on 2D.  If we wish to apply the divide and conquer method to solving this problem,  we might think to split the problem in half, recursively do work in each half, and combine the solutions of the subproblems to recover the solution to original problem.

So, here we make the copies of the points sorted by :
x-coordinate (Px) and
y-coordinate (Py)
This pre-processing takes O( n log n) for sorting.

So, now we will have various points which are :
a) very close in x axis (lets denote them by blue dots)
b) very close in y axis ( green dots)
c) not very far by both x and y axis (red dots)

So, here we apply divide and conquer algorithm twice - one for sorting (assuming we are using merge sort or quick sort) and then applying divide and conquer to the original problem.

ClosestPair (Px,Py) algorithm

Step1 : Let Q be the left half of P and R be the right half of P, which forms Qx, Qy, Rx, and Ry.
[This takes O (n) time ]
Step 2 : p1,q1 = ClosestPair(Qx,Qy)
Step 3 : p2,q2 = ClosestPair(Rx,Ry)
Step 4: p3,q3 = ClosestSplitPair(Px,Py)
Step 5 : return the best of [(p1,q1),(p2,q2) and (p3,q3)]

Time complexity:
Sorting of points = O(n log n)
Scanning = O(n)
Recursive call = 2 i.e. O (n log n)

So, overall time compexity = O(n log n)

Source