Sunday, March 30, 2014

Find a line which passes the most number of points

Problem
Given a two dimensional graph with points on it, find a line which passes the most number of points.
Solution
Method 1 - Naive solution
This is the brute force solution. We take point 1, and then point 2 and make a line. Now in the third nested loop, check if point 3 is existing on the same line or not.
Pseudocode
Line findLineWithMaxPoint(Set points){
   foreach Point p1 in points
      foreach Point p2  in (points - {p1}){
         Line l = makeLine(p1,p2);
         int count = 2 //line contains p1 and p2
         foreach(Point p3 in (points-{p1,p2})){
            if(l.contains(p3))
               count++;
         }
         if(maxCount<count){
            count = maxCount;
            resultLine = l;
         }
      }
return resultLine;
}
As we can see there are 3 nested loops, and hence time complexity : O(n^3).

Java code
public class Line {
    private double slope;
    private double intercept;
 
    private final double THRESHOLD = Math.pow(10, -8);
 
    public Line(double slope, double intercept) {
        super();
        this.slope = slope;
        this.intercept = intercept;
    }
 
    public Line(Pair<Double> p1, Pair<Double> p2) {
        this.slope = (p2.getY() - p1.getY()) / (p2.getX() - p1.getX());
        this.intercept = (p2.getX() * p1.getY() - p1.getX() * p2.getY())
                / (p2.getX() - p1.getX());
    }
 
    public boolean contains(Pair<Double> p) {
        double x = p.getX();
        double y = p.getY();
        return Math.abs(y - (slope * x + intercept)) < THRESHOLD;
    }
}
 
public static Line findLine(Set<Pair<Double>> points) {
    int maxNumber = -Integer.MAX_VALUE;
    int number = 0;
    Line result = null;
    for (Pair<Double> p1 : points) {
        for (Pair<Double> p2 : points) {
            if (!p1.equals(p2)) {
                number = 0;
                Line line = new Line(p1, p2);
                for (Pair<Double> p : points) {
                    if (line.contains(p))
                        number++;
                }
                if (number > maxNumber) {
                    maxNumber = number;
                    result = line;
                } 
            }
        }
    }
    return result;
}


Method 2 - Use hashtable
Problem with above approach is that the lines may be duplicate. And if we have points p1, and p2 already covered as part of line, which helped us find p3, when we select p1 and p3, we will again select p2.

We have a bunch of line segments, represented as a slope and y-intercept, and we want to find the most common slope and y-intercept. How can we find the most common one? This is really no different than the old "find the most common number in a list of numbers" problem. We just iterate through the lines segments and use a hash table to count the number of times we've seen each line.

This can be solved in O(n^2) time and O(m) space. where m is the distinct number of lines among all given points.

  1. Use a hash table - key is line and number of appearance of line as value.
  2. For each pair of points find the line in slope y intercept form
  3. if the line is not there in hash table, add it to the hash table with appearance value 1
  4. if the line is already in hash table, increment the appearance value
  5. line with the max number of appearance in the hash table is the result.

Java code
In java we have to implement a hashCode() method to make the class object uniquely identifiable by the hashtable.
public static Line findBestLine(GraphPoint[] points) {
    Line bestLine = null;
    HashMap<Line, Integer> line_count = new HashMap<Line, Integer>();
    for (int i = 0; i < points.length; i++) {
        for (int j = i + 1; j < points.length; j++) {
            Line line = new Line(points[i], points[j]);
            if (!line_count.containsKey(line)) {
                line_count.put(line, 0);
            }
            line_count.put(line, line_count.get(line) + 1);
            if (bestLine == null
                    || line_count.get(line) > line_count.get(bestLine)) {
                bestLine = line;
            }
        }
    }
    return bestLine;
}
 
public class Line {
    private static double epsilon = .0001;
    public double slope;
    public double intercept;
    private boolean infinite_slope = false;
 
    public Line(GraphPoint p, GraphPoint q) {
        if (Math.abs(p.x - q.x) > epsilon) { // if x’s are different
            slope = (p.y - q.y) / (p.x - q.x); // compute slope
            intercept = p.y - slope * p.x; // y intercept from y=mx+b
        } else {
            infinite_slope = true;
            intercept = p.x; // x-intercept, since slope is infinite
        }
    }
 
    public boolean isEqual(double a, double b) {
        return (Math.abs(a - b) < epsilon);
    }
 
    @Override
    public int hashCode() {
        int sl = (int) (slope * 1000);
        int in = (int) (intercept * 1000);
        return sl | in;
    }
 
    @Override
    public boolean equals(Object o) {
        Line l = (Line) o;
        if (isEqual(l.slope, slope) && isEqual(l.intercept, intercept)
                && (infinite_slope == l.infinite_slope)) {
            return true;
        }
        return false;
    }
}


  • Be careful about the calculation of the slope of a line. The line might be completely vertical. We can keep track of this in a separate flag (infinite_slope). We need to check this condition in the equals method.
  • Remember that when we perform division to calculate the slope, division is not exact. Therefore, rather than checking to see if two slopes are exactly equal, we need to check if they’re different by greater than epsilon, in our case 10-8.
Method 3 - Hough transform
Not really sure how hough transform works, but you can read more here.

References
http://tianrunhe.wordpress.com/2012/04/03/find-a-line-which-passes-the-most-number-of-points/

0 comments:

Post a Comment