Find Loop

Find Loop

Category

Linked Lists

Difficulty

Medium

Problem Statement

Given a singly linked list that contains a cycle (loop), find the node where the cycle begins. The head of the linked list is guaranteed to not be part of the cycle itself (i.e., there is a "tail" portion leading into the cycle). Return the node at the start of the cycle.

Intuition

This is solved using Floyd's Tortoise and Hare algorithm. A slow pointer (moving one step at a time) and a fast pointer (moving two steps at a time) will eventually meet inside the cycle. The mathematical insight is that when they meet, the distance from the head to the cycle start equals the distance from the meeting point to the cycle start (traveling forward along the cycle). So after the meeting, reset one pointer to the head, then advance both pointers one step at a time; they will meet at the cycle start.

Why it works: Let D be the distance from head to cycle start, and let M be the distance from cycle start to the meeting point. When slow has traveled D + M steps, fast has traveled 2(D + M) steps. The difference (D + M) must be a multiple of the cycle length C. This relationship means that traveling D more steps from the meeting point lands you back at the cycle start.

Approach

  1. Initialize both slow and fast pointers at the head.
  2. Phase 1 (Detect meeting point): Advance slow by 1 and fast by 2 until they meet.
  3. Phase 2 (Find cycle start): Reset slow to the head. Advance both slow and fast by 1 step at a time until they meet again.
  4. The node where they meet in Phase 2 is the start of the cycle.

Pseudocode

function findLoop(head):
    slow = head
    fast = head

    // Phase 1: find meeting point
    while true:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            break

    // Phase 2: find cycle start
    slow = head
    while slow != fast:
        slow = slow.next
        fast = fast.next

    return slow

Time & Space Complexity

  • Time: O(n), where n is the total number of nodes. Phase 1 takes at most O(n) steps, and Phase 2 takes at most O(n) steps.
  • Space: O(1). Only two pointers are used.

Key Insights

  • Floyd's algorithm is the classic O(1) space solution for cycle detection and cycle start identification.
  • The meeting point is guaranteed to exist since a cycle is guaranteed.
  • The mathematical relationship D = kC - M (where k is some positive integer) is what makes Phase 2 work.
  • An alternative using a hash set of visited nodes uses O(n) space and is simpler but less elegant.

Edge Cases

  • The cycle starts at the second node (minimal tail of length 1).
  • The entire list except the head is the cycle.
  • The cycle has length 1 (a node pointing to itself).
  • Very long tail with a small cycle.
  • Very short tail with a large cycle.