Union Find

Difficulty

Medium

Problem Statement

Implement the Union-Find (Disjoint Set Union) data structure that supports two primary operations efficiently:

Find(x): Determine which set element x belongs to (return the representative/root of the set).
Union(x, y): Merge the sets containing elements x and y.

Use union by rank and path compression optimizations for near-constant amortized time per operation.

Intuition

Union-Find maintains a forest of trees, where each tree represents a set and the root of the tree is the set's representative. Initially, each element is its own set (a tree of size 1). Union merges two trees, and Find traverses up the tree to locate the root.

Without optimization, trees can become tall and degenerate into linked lists. Union by rank keeps trees balanced by always attaching the shorter tree under the taller one. Path compression flattens the tree during Find operations by making every visited node point directly to the root. Together, these yield nearly O(1) amortized time per operation.

Approach

Initialize: Each element is its own parent with rank 0.
Find(x): Follow parent pointers from x to the root. Apply path compression by setting each visited node's parent to the root.
Union(x, y): a. Find the roots of x and y. b. If they are the same, do nothing (already in the same set). c. Otherwise, attach the tree with lower rank under the tree with higher rank. If ranks are equal, choose one as the new root and increment its rank.

Pseudocode

class UnionFind:
    parent = array where parent[i] = i
    rank = array where rank[i] = 0

    function find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])  // path compression
        return parent[x]

    function union(x, y):
        rootX = find(x)
        rootY = find(y)
        if rootX == rootY:
            return
        // union by rank
        if rank[rootX] < rank[rootY]:
            parent[rootX] = rootY
        else if rank[rootX] > rank[rootY]:
            parent[rootY] = rootX
        else:
            parent[rootY] = rootX
            rank[rootX] += 1

Time & Space Complexity

Time: O(alpha(n)) amortized per Find or Union operation, where alpha is the inverse Ackermann function. For all practical purposes, this is effectively O(1).
Space: O(n) for the parent and rank arrays.

Key Insights

Path compression alone gives O(log n) amortized time. Combined with union by rank, it improves to O(alpha(n)).
The inverse Ackermann function alpha(n) grows so slowly that it is at most 4 for any physically realizable input size (n < 2^65536).
Union by rank keeps the tree height logarithmic, while path compression flattens it over time.
An alternative to union by rank is union by size, which also achieves O(alpha(n)) with path compression.
Union-Find is used extensively in Kruskal's MST algorithm, connected components, cycle detection in undirected graphs, and percolation problems.

System Design Application

Union-Find is the cycle detection mechanism in kruskals-algorithm. Kruskal's MST algorithm processes edges in ascending weight order and uses Union-Find to check whether adding an edge would create a cycle (i.e., both endpoints are already in the same component). With path compression and union by rank, each Union-Find operation is nearly O(1) amortized, making the edge sort the dominant cost.

Edge Cases

Union of an element with itself: no-op (roots are the same).
Find on an element that has never been unioned: returns itself.
Many unions creating a single large set: path compression ensures subsequent finds are fast.
Sequence of finds without any unions: each find returns the element itself.
Large number of elements with sparse unions: most sets remain singletons.

Union Find

Union Find

Category

Difficulty

Problem Statement

Intuition

Approach

Pseudocode

Time & Space Complexity

Key Insights

System Design Application

Edge Cases

Linked mentions

Union Find

Tags

Union Find

Category

Difficulty

Problem Statement

Intuition

Approach

Pseudocode

Time & Space Complexity

Key Insights

System Design Application

Edge Cases

Linked mentions