Merge Sorted Arrays
Merge Sorted Arrays
Category
Heaps
Difficulty
Very Hard
Problem Statement
Given k sorted arrays, merge them into a single sorted array. The total number of elements across all arrays is n.
Intuition
A naive approach merges arrays pairwise, but this repeatedly scans already-merged elements. A min heap (priority queue) of size k provides a much better strategy: at any moment, the next smallest element across all k arrays must be the minimum of the k current front elements. The heap efficiently identifies this minimum.
Approach
- Initialize a min heap.
- Insert the first element of each non-empty array into the heap, along with metadata: which array it came from and its index within that array.
- While the heap is not empty:
- Extract the minimum element.
- Append it to the result.
- If the extracted element's array has a next element, insert that next element into the heap.
- Return the result.
Pseudocode
function mergeSortedArrays(arrays):
minHeap = new MinHeap()
result = []
// Initialize heap with the first element of each array
for i from 0 to length(arrays)-1:
if arrays[i] is not empty:
minHeap.insert((arrays[i][0], i, 0))
// value, arrayIndex, elementIndex
while minHeap is not empty:
(value, arrayIdx, elemIdx) = minHeap.extractMin()
result.append(value)
// If there's a next element in the same array, insert it
if elemIdx + 1 < length(arrays[arrayIdx]):
nextVal = arrays[arrayIdx][elemIdx + 1]
minHeap.insert((nextVal, arrayIdx, elemIdx + 1))
return result
Time & Space Complexity
| Measure | Complexity | Justification |
|---|---|---|
| Time | O(n log k) | Each of the n elements is inserted into and extracted from a heap of size at most k; each operation is O(log k) |
| Space | O(k + n) | The heap holds at most k elements; the result array holds n elements |
Key Insights
- The heap always has at most
kelements (one per array), keeping operations efficient even whennis very large. - This is the standard algorithm behind external sort (merging sorted runs from disk) and is used in many database systems.
- An alternative approach is divide and conquer: recursively merge pairs of arrays. This is also O(n log k) but uses O(n log k) space for intermediate arrays unless done in-place.
- In languages with built-in priority queues (Python's
heapq, Java'sPriorityQueue), the implementation is straightforward.
Edge Cases
- k = 0 (no arrays): return an empty array.
- Some arrays are empty: skip them during initialization.
- k = 1: the single array is already sorted; return it directly.
- All arrays have one element each: reduces to sorting k elements, which the heap does in O(k log k).
- Arrays of vastly different lengths: the algorithm handles this naturally since each array independently feeds the heap.
- Duplicate values across arrays: handled correctly; the heap may contain equal values and extracts them in any stable order.