Topological Sort

Topological Sort

Category

Famous Algorithms

Difficulty

Hard

Problem Statement

Given a directed acyclic graph (DAG) with n nodes and a list of directed edges, produce a linear ordering of the nodes such that for every directed edge (u, v), node u appears before node v in the ordering. If the graph has a cycle, no valid topological ordering exists.

Intuition

A topological sort is possible if and only if the graph is a DAG. Two classic approaches exist:

  1. Kahn's Algorithm (BFS-based): Nodes with no incoming edges (in-degree 0) have no prerequisites and can appear first. After "removing" them, new nodes may become prerequisite-free. This peeling process continues until all nodes are placed or a cycle is detected.

  2. DFS-based: A node's position in the topological order is after all nodes it can reach. A post-order DFS traversal naturally discovers this: when a node's DFS finishes (all descendants explored), it is appended to the result. Reversing the result gives the topological order.

Approach

Kahn's Algorithm (BFS)

  1. Compute the in-degree of every node.
  2. Add all nodes with in-degree 0 to a queue.
  3. While the queue is not empty:
    • Dequeue a node u, append it to the result.
    • For each neighbor v of u, decrement v's in-degree. If it becomes 0, enqueue v.
  4. If the result contains all n nodes, return it. Otherwise, a cycle exists.

DFS-based

  1. Maintain a visited state for each node: unvisited, in-progress, or done.
  2. For each unvisited node, run DFS:
    • Mark the node as in-progress.
    • Recurse on all unvisited neighbors. If any neighbor is in-progress, a cycle exists.
    • Mark the node as done and push it onto a stack.
  3. Pop all nodes from the stack to get the topological order.

Pseudocode

Kahn's Algorithm

function topologicalSort(nodes, edges):
    graph = adjacency list from edges
    inDegree = array of size n, filled with 0

    for (u, v) in edges:
        inDegree[v] += 1

    queue = all nodes where inDegree[node] == 0
    result = []

    while queue is not empty:
        u = queue.dequeue()
        result.append(u)
        for v in graph[u]:
            inDegree[v] -= 1
            if inDegree[v] == 0:
                queue.enqueue(v)

    if length(result) != n:
        return []    // cycle detected
    return result

DFS-based

function topologicalSort(nodes, edges):
    graph = adjacency list from edges
    state = array of UNVISITED for each node
    stack = []

    for node in nodes:
        if state[node] == UNVISITED:
            if not dfs(node, graph, state, stack):
                return []    // cycle detected

    return reverse(stack)

function dfs(u, graph, state, stack):
    state[u] = IN_PROGRESS
    for v in graph[u]:
        if state[v] == IN_PROGRESS:
            return false    // cycle
        if state[v] == UNVISITED:
            if not dfs(v, graph, state, stack):
                return false
    state[u] = DONE
    stack.push(u)
    return true

Time & Space Complexity

MeasureComplexityJustification
TimeO(V + E)Each node and edge is processed exactly once
SpaceO(V + E)Adjacency list plus auxiliary arrays

Key Insights

  • A topological ordering exists if and only if the graph is a DAG. Both algorithms double as cycle detectors.
  • The topological order is generally not unique; many valid orderings may exist. Kahn's algorithm can produce different orderings depending on queue processing order.
  • Kahn's algorithm is iterative and naturally avoids stack overflow on deep graphs. The DFS approach is elegant but may hit recursion limits on very large graphs.
  • Common applications: build systems (Make, Gradle), course scheduling, task dependency resolution, and compilation order.

Edge Cases

  • Graph with a cycle: no valid ordering exists; return empty or signal an error.
  • Single node, no edges: the trivial ordering is just that node.
  • Disconnected graph: both algorithms handle disconnected components (Kahn's starts with all in-degree-0 nodes; DFS iterates over all nodes).
  • All nodes independent (no edges): any permutation is a valid topological order.
  • Linear chain (A -> B -> C -> ...): exactly one valid ordering.