Find Loop
Find Loop
Category
Linked Lists
Difficulty
Medium
Problem Statement
Given a singly linked list that contains a cycle (loop), find the node where the cycle begins. The head of the linked list is guaranteed to not be part of the cycle itself (i.e., there is a "tail" portion leading into the cycle). Return the node at the start of the cycle.
Intuition
This is solved using Floyd's Tortoise and Hare algorithm. A slow pointer (moving one step at a time) and a fast pointer (moving two steps at a time) will eventually meet inside the cycle. The mathematical insight is that when they meet, the distance from the head to the cycle start equals the distance from the meeting point to the cycle start (traveling forward along the cycle). So after the meeting, reset one pointer to the head, then advance both pointers one step at a time; they will meet at the cycle start.
Why it works: Let D be the distance from head to cycle start, and let M be the distance from cycle start to the meeting point. When slow has traveled D + M steps, fast has traveled 2(D + M) steps. The difference (D + M) must be a multiple of the cycle length C. This relationship means that traveling D more steps from the meeting point lands you back at the cycle start.
Approach
- Initialize both slow and fast pointers at the head.
- Phase 1 (Detect meeting point): Advance slow by 1 and fast by 2 until they meet.
- Phase 2 (Find cycle start): Reset slow to the head. Advance both slow and fast by 1 step at a time until they meet again.
- The node where they meet in Phase 2 is the start of the cycle.
Pseudocode
function findLoop(head):
slow = head
fast = head
// Phase 1: find meeting point
while true:
slow = slow.next
fast = fast.next.next
if slow == fast:
break
// Phase 2: find cycle start
slow = head
while slow != fast:
slow = slow.next
fast = fast.next
return slow
Time & Space Complexity
- Time: O(n), where n is the total number of nodes. Phase 1 takes at most O(n) steps, and Phase 2 takes at most O(n) steps.
- Space: O(1). Only two pointers are used.
Key Insights
- Floyd's algorithm is the classic O(1) space solution for cycle detection and cycle start identification.
- The meeting point is guaranteed to exist since a cycle is guaranteed.
- The mathematical relationship D = kC - M (where k is some positive integer) is what makes Phase 2 work.
- An alternative using a hash set of visited nodes uses O(n) space and is simpler but less elegant.
Edge Cases
- The cycle starts at the second node (minimal tail of length 1).
- The entire list except the head is the cycle.
- The cycle has length 1 (a node pointing to itself).
- Very long tail with a small cycle.
- Very short tail with a large cycle.