Multi String Search

Difficulty

Hard

Problem Statement

Given a "big" string and an array of "small" strings, determine for each small string whether it is contained in the big string. Return an array of booleans corresponding to each small string.

Intuition

A naive approach checks each small string individually using substring search, which can be expensive if there are many small strings. Instead, we build a trie from the small strings and then traverse the big string, checking against the trie from each starting position. Alternatively, we can build a suffix trie from the big string and look up each small string, though this uses more space.

The trie-based approach is efficient because it allows us to search for all small strings simultaneously during a single scan-like process over the big string.

Approach

Using a Trie of Small Strings

Build a trie from all small strings, marking the end of each word with its index.
For each starting index i in the big string: a. Traverse the trie character by character starting from bigString[i]. b. Whenever a word-end marker is encountered, mark that small string as found. c. Stop if the character is not in the trie or the big string is exhausted.
Return the results array.

Pseudocode

function multiStringSearch(bigString, smallStrings):
    trie = new Trie()
    for i from 0 to length(smallStrings) - 1:
        trie.insert(smallStrings[i], i)

    results = array of length(smallStrings), all false

    for i from 0 to length(bigString) - 1:
        searchTrie(bigString, i, trie, results)

    return results

function searchTrie(bigString, startIdx, trie, results):
    node = trie.root
    for i from startIdx to length(bigString) - 1:
        char = bigString[i]
        if char not in node.children:
            break
        node = node.children[char]
        if node.wordIndices is not empty:
            for idx in node.wordIndices:
                results[idx] = true

Time & Space Complexity

Time: O(b * s + n * s), where b is the length of the big string, s is the length of the longest small string, and n is the number of small strings. Building the trie takes O(n * s). Searching takes O(b * s) in the worst case.
Space: O(n * s) for the trie.

Key Insights

The trie allows efficient prefix matching against multiple patterns simultaneously.
Building the trie from small strings (rather than a suffix trie from the big string) is more space-efficient when the total length of small strings is much less than the big string length.
For very large big strings with few small strings, individual substring searches (e.g., KMP or Rabin-Karp) might be simpler.
The Aho-Corasick algorithm builds on this approach by adding failure links for even more efficient multi-pattern matching in O(b + n*s + matches).

Edge Cases

Empty small strings array: return an empty array.
A small string is empty: it is trivially contained in any string.
A small string is longer than the big string: it cannot be contained.
Duplicate small strings: each occurrence in the array should be marked independently.
Big string is empty: all small strings (except empty ones) return false.
Small strings that are prefixes of each other: the trie handles this naturally with markers at multiple levels.

Multi String Search

Multi String Search

Category

Difficulty

Problem Statement

Intuition

Approach

Using a Trie of Small Strings

Pseudocode

Time & Space Complexity

Key Insights

Edge Cases

Linked mentions

Multi String Search

Tags

Multi String Search

Category

Difficulty

Problem Statement

Intuition

Approach

Using a Trie of Small Strings

Pseudocode

Time & Space Complexity

Key Insights

Edge Cases

Linked mentions