Line Through Points
Line Through Points
Category
Arrays
Difficulty
Very Hard
Problem Statement
Given a set of 2D points, find the maximum number of points that lie on the same straight line. The points are given as an array of (x, y) coordinate pairs.
Intuition
For each point, we can compute the slope it forms with every other point. Points that share the same slope relative to a fixed reference point are collinear with it. By grouping slopes and finding the largest group, we find the maximum collinear count through that reference point. Repeating this for every point as the reference gives the global answer. The critical subtlety is representing slopes in a way that avoids floating-point errors — using reduced fractions (rise/run in lowest terms) instead of decimal division.
Approach
- For each point
pin the array, treat it as the "anchor" and compute the slope to every other pointq. - Represent the slope as a pair
(dy, dx)reduced to lowest terms using the GCD. Normalize sign so thatdxis always non-negative (and ifdx == 0, ensuredyis positive). This avoids hash collisions between equivalent slopes. - Use a hash map to count how many points share each slope relative to the anchor.
- Track duplicate points (same coordinates as the anchor) separately, since they are collinear with any line through the anchor.
- The best count through this anchor is
maxSlopeCount + duplicates. - The global answer is the maximum across all anchors.
Pseudocode
function lineThroughPoints(points):
if len(points) <= 2:
return len(points)
maxPoints = 1
for i from 0 to len(points) - 1:
slopeMap = {}
duplicates = 1 # count self
for j from 0 to len(points) - 1:
if i == j:
continue
dx = points[j].x - points[i].x
dy = points[j].y - points[i].y
if dx == 0 and dy == 0:
duplicates += 1
continue
g = gcd(abs(dy), abs(dx))
dy /= g
dx /= g
# Normalize sign
if dx < 0:
dx = -dx
dy = -dy
elif dx == 0:
dy = abs(dy)
slope = (dy, dx)
slopeMap[slope] = slopeMap.get(slope, 0) + 1
localMax = duplicates
for count in slopeMap.values():
localMax = max(localMax, count + duplicates)
maxPoints = max(maxPoints, localMax)
return maxPoints
Time & Space Complexity
- Time: O(n^2) where n is the number of points. For each of the n points, we iterate over all other n-1 points. GCD computation is O(log(max_coordinate)) per pair, which is negligible.
- Space: O(n) for the slope hash map, which is rebuilt for each anchor point.
Key Insights
- Representing slopes as reduced integer fractions avoids floating-point precision issues entirely. This is essential for correctness.
- Sign normalization ensures that slopes like (1, -2) and (-1, 2) are treated as the same slope.
- Vertical lines (dx = 0) and horizontal lines (dy = 0) are handled naturally by the fraction representation.
- Duplicate points must be counted separately since they have an undefined slope but lie on every line through that location.
- The O(n^2) approach is optimal because in the worst case, verifying that more than 2 points are collinear requires examining all pairs.
Edge Cases
- Single point: return 1.
- Two points: always collinear, return 2.
- All points are the same: all duplicates, return n.
- All points are collinear: the answer is n.
- Large coordinates: ensure GCD and fraction operations do not overflow (use long integers if necessary).
- Points forming a grid: many distinct slopes exist, but the answer might still be a full row or column.