Smallest Substring Containing

Smallest Substring Containing

Category

Strings

Difficulty

Hard

Problem Statement

Given a big string and a small string, find the smallest substring of the big string that contains all the characters of the small string. Characters can appear in any order, and if a character appears multiple times in the small string, it must appear at least that many times in the substring.

Intuition

This is the classic minimum window substring problem. We use a sliding window with two pointers. The window expands by moving the right pointer to include more characters, and contracts by moving the left pointer to try to minimize the window size. We maintain a character frequency map to track how many required characters are currently in the window.

Approach

  1. Build a frequency map of the small string characters.
  2. Initialize two pointers left = 0, right = 0, a counter matched = 0 tracking how many unique characters are fully satisfied, and variables for the best window.
  3. Expand the window by moving right: a. If the character at right is in the frequency map, decrement its required count. If the count reaches 0, increment matched.
  4. While matched equals the number of unique characters in the small string: a. Update the best window if the current window is smaller. b. Try to shrink from the left: if the character at left is in the frequency map, increment its required count. If the count becomes positive, decrement matched. c. Move left right.
  5. Return the best window substring.

Pseudocode

function smallestSubstringContaining(bigString, smallString):
    targetCounts = frequency map of smallString
    uniqueChars = number of distinct keys in targetCounts
    windowCounts = {}
    matched = 0
    bestRange = [0, infinity]
    left = 0

    for right from 0 to length(bigString) - 1:
        char = bigString[right]
        if char in targetCounts:
            windowCounts[char] = windowCounts.get(char, 0) + 1
            if windowCounts[char] == targetCounts[char]:
                matched++

        while matched == uniqueChars:
            if right - left < bestRange[1] - bestRange[0]:
                bestRange = [left, right]

            leftChar = bigString[left]
            if leftChar in targetCounts:
                windowCounts[leftChar]--
                if windowCounts[leftChar] < targetCounts[leftChar]:
                    matched--
            left++

    if bestRange[1] == infinity:
        return ""

    return bigString[bestRange[0] : bestRange[1] + 1]

Time & Space Complexity

  • Time: O(b + s) where b is the length of the big string and s is the length of the small string. Each character is visited at most twice (once by right, once by left).
  • Space: O(s + u) where s is for the target frequency map and u is the number of unique characters in the window map.

Key Insights

  • Tracking matched (the count of fully satisfied unique characters) avoids rechecking the entire frequency map at each step, making the solution efficient.
  • The window only contracts when all characters are satisfied, ensuring we do not miss valid windows.
  • This is a template for many sliding window problems involving "contain all" or "at least k" character constraints.

Edge Cases

  • Small string is empty: return an empty string.
  • Big string is shorter than small string: return an empty string.
  • No valid window exists: return an empty string.
  • Small string has duplicate characters: the frequency map naturally handles this.
  • Multiple windows of the same minimum size: return any one (typically the first found).
  • Small string equals big string: return the entire big string.