Kruskal's Algorithm

Kruskal's Algorithm

Category

Famous Algorithms

Difficulty

Hard

Problem Statement

Given a connected, undirected, weighted graph with V vertices and E edges, find the minimum spanning tree (MST) — a subset of edges that connects all vertices with the minimum total edge weight and no cycles.

Intuition

Kruskal's algorithm applies a greedy strategy: always pick the cheapest available edge that does not create a cycle. This works because of the cut property of MSTs — for any cut of the graph, the minimum-weight edge crossing the cut must be in the MST. By processing edges in ascending weight order and adding them only if they connect two previously disconnected components, we build the MST one edge at a time.

Approach

  1. Sort all edges by weight in ascending order.
  2. Initialize a Union-Find (Disjoint Set Union) data structure with V components.
  3. Iterate through the sorted edges:
    • For each edge (u, v, w), check if u and v are in different components using find.
    • If they are, add this edge to the MST and union the two components.
    • If they are already in the same component, skip the edge (it would create a cycle).
  4. Stop when the MST has V - 1 edges (or all edges have been processed).
  5. Return the MST edges and total weight.

Pseudocode

function kruskal(vertices, edges):
    sort edges by weight ascending
    uf = new UnionFind(vertices)
    mst = []
    totalWeight = 0

    for (u, v, w) in edges:
        if uf.find(u) != uf.find(v):
            uf.union(u, v)
            mst.append((u, v, w))
            totalWeight += w
            if length(mst) == length(vertices) - 1:
                break

    return mst, totalWeight

class UnionFind:
    parent = [i for i in range(n)]
    rank = [0 for i in range(n)]

    function find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])    // path compression
        return parent[x]

    function union(x, y):
        rootX = find(x)
        rootY = find(y)
        if rootX == rootY: return
        if rank[rootX] < rank[rootY]:
            parent[rootX] = rootY
        elif rank[rootX] > rank[rootY]:
            parent[rootY] = rootX
        else:
            parent[rootY] = rootX
            rank[rootX] += 1

Time & Space Complexity

MeasureComplexityJustification
TimeO(E log E)Dominated by sorting edges. Union-Find operations are nearly O(1) amortized (inverse Ackermann).
SpaceO(V + E)Union-Find arrays of size V, edge list of size E

Key Insights

  • Union-Find with path compression and union by rank makes cycle detection nearly O(1) per operation, which is why the sort dominates the runtime.
  • Kruskal's is particularly efficient for sparse graphs (E close to V) since sorting E edges is cheap. For dense graphs (E close to V^2), Prim's algorithm with a Fibonacci heap (O(E + V log V)) may be better.
  • The algorithm naturally handles disconnected graphs: it produces a minimum spanning forest (one MST per connected component).
  • If all edge weights are distinct, the MST is unique. With duplicate weights, multiple valid MSTs may exist but all have the same total weight.

Algorithm Dependencies

This algorithm depends on union-find for O(alpha(n)) amortized cycle detection. Union-Find with path compression and union by rank determines whether adding an edge would create a cycle by checking if both endpoints are in the same connected component.

Edge Cases

  • Single vertex, no edges: the MST is empty with weight 0.
  • Already a tree (E = V - 1): all edges are in the MST (assuming connected).
  • Disconnected graph: the algorithm produces a minimum spanning forest; the MST will have fewer than V - 1 edges.
  • Negative edge weights: handled correctly since the algorithm only compares weights.
  • Self-loops: always create a cycle with a single node; they are never included.
  • Parallel edges (multi-graph): only the cheapest among parallel edges can be in the MST; the sort naturally prioritizes it.