DSA Lab | Tries

What Is a Trie?

A trie (pronounced "try") is a tree-shaped data structure designed for storing and retrieving strings efficiently. The name comes from the word "retrieval" — Edward Fredkin coined the term in 1960, though the structure itself was described by René de la Briandais in 1959.

Each node in a trie represents a single character. The path from the root (the topmost node, which is empty) to any node spells out a prefix (the beginning portion of a string). Some nodes are marked as "end of word" to indicate a complete string ends there.

Think of it like a physical dictionary organized by spelling. The word "cat" and "car" share a path for "c" → "a", then branch: one path continues to "t", the other to "r". Words with common prefixes (shared beginnings) share the same nodes in the trie, which saves both space and lookup time.

Here's another analogy. Imagine a phone tree at a company. You call in, press 1 for sales, then 3 for enterprise, then 2 for pricing. Each button press narrows your destination. That's exactly how a trie search works — each character narrows the search space.

Why Not Just Use a Hash Set?

A hash set (a data structure that stores unique items and supports O(1) lookup using a hash function) can tell you "does this exact word exist?" in O(1) average time. But it can't do these things efficiently:

Prefix search: "Give me all words starting with 'pre'" — a trie walks to the prefix node and collects everything below it. A hash set would need to iterate through every stored word and check each one. That's O(n) instead of O(prefix_length + results).
Autocomplete: Type a few letters, get all completions. Tries do this naturally because the tree structure groups words by their shared prefixes.
Lexicographic ordering: A DFS (depth-first search) on a trie visits words in alphabetical order automatically. You get free sorting without an actual sort step.
Longest common prefix: Walk down the trie from the root until the path splits (a node has more than one child). Everything up to that point is the longest prefix shared by all words.
Wildcard matching: Searching for "c.t" (where . matches any character) is natural with DFS — at the wildcard position, explore all children instead of just one.

A Brief History

Tries emerged in the late 1950s when researchers needed fast string retrieval for early computing applications. The original motivation was building efficient dictionaries — literally, storing and looking up words. The structure turned out to be far more versatile than anyone expected. Today, tries (and their compressed variants) power everything from DNS resolution to genome analysis to the autocomplete in your phone's keyboard.

Real-World Applications

Autocomplete — every search bar you've ever used. Type "pyt", get "python", "pytorch", "pythagorean". Google processes billions of autocomplete queries daily using trie-like structures.
Spell checkers — is this word in the dictionary? Tries can also efficiently find nearby words (by edit distance) for "did you mean?" suggestions.
IP routing — routers use longest prefix match on binary representations of IP addresses. The routing table is essentially a binary trie where each level represents one bit of the IP address.
T9 predictive text — the old phone keyboards (where 2=ABC, 3=DEF, etc.) used tries internally. Each key press narrows the set of possible words.
Genome sequencing — suffix tries (a trie containing all suffixes of a string) enable fast DNA pattern matching. Finding where a short sequence appears in a 3-billion-character genome would be impossibly slow without them.
File systems — directory paths form a trie structure. /home/user/docs and /home/user/pics share the /home/user/ prefix.
HTTP routers — web frameworks like Express and FastAPI use radix trees (compressed tries) to match URL paths to handler functions. /api/users/:id and /api/users/:id/posts share the /api/users/ prefix in the routing trie.

How It Works

Structure

Each trie node has two components:

A map of child nodes: character → child node. For lowercase English letters, this could be either a hash map (flexible, slightly more memory overhead) or a fixed array of 26 slots (fast lookup, wastes space for sparse nodes). The hash map approach is almost always better in practice.
A boolean flag called isEnd (or is_end): "does a complete word end at this node?" This is critical — without it, you can't distinguish between "app" being a word and "app" being just the prefix of "apple".

The root node is empty — it doesn't represent any character. Think of it as the starting point. Every word begins its path from the root.

Insert

Walk character by character from the root. For each character:

Check if a child for this character exists.
If yes, move down to that child.
If no, create a new node and link it as the child for this character. Then move to it.

After processing the last character, mark that node as isEnd = true.

Example: Inserting "app" and "apple". "app" creates nodes root→a→p→p (marking the last 'p' as end-of-word). "apple" then walks down root→a→p→p (all exist already), then creates l→e (marking 'e' as end-of-word). The two words share the first three nodes — no duplication.

Search (Exact Match)

Walk character by character from the root:

If at any point the child for the current character doesn't exist, the word isn't in the trie. Return false immediately.
If you reach the last character successfully, check isEnd.
If isEnd is true, the word exists. If false, that string is only a prefix of some longer word (e.g., searching for "app" when only "apple" was inserted, and "app" wasn't marked as a word).

Prefix Search (startsWith)

Identical to search, but skip the isEnd check at the end. If you can walk the entire prefix without hitting a missing child, at least one word with that prefix exists in the trie. This is the operation that makes tries special.

Autocomplete (Collect All Words with Prefix)

Two steps: First, walk to the node at the end of the prefix (same as prefix search). Then, run a DFS from that node to collect all complete words below it. Each time you hit a node with isEnd = true, add the current path to your results.

Delete

Deletion is trickier than it looks. You can't just remove nodes blindly — they might be part of another word's path. The algorithm:

Walk to the end of the word. If the word doesn't exist, there's nothing to delete.
Unmark isEnd on the final node.
Walk back up the path. For each node, remove it if it has no children AND is not an end-of-word for another word. Stop removing as soon as you hit a node that has children or is an end-of-word.

Example: After inserting "app" and "apple", deleting "apple" unmarks the 'e' node and removes 'e' and 'l' (they have no children and aren't end-of-word for anything else). The 'p' node stays because it's the end-of-word for "app".

Interactive Visualization

Type a word and insert it into the trie. Search for words — the matching prefix path lights up green for found, red for not found. Watch the tree grow as you add more words.

Trie Explorer

Operations & Complexity

Let m = length of the word/key being operated on. Let n = total number of words in the trie.

Operation	Time	Notes
Insert	O(m)	Walk/create m nodes — one per character
Search	O(m)	Walk m nodes, check isEnd flag at the end
startsWith (Prefix)	O(m)	Walk m nodes — no isEnd check needed
Delete	O(m)	Walk down + cleanup walk back up
Autocomplete (all words with prefix)	O(m + k)	m to reach prefix node, k = total chars in all results below it

Space: O(n × m × alphabet_size) in the worst case — if no words share any prefixes, every character of every word gets its own node. But in practice, prefix sharing reduces this dramatically. For English text, tries use roughly 3-10x less space than you'd expect from the worst case.

💡 Space Optimization: Compressed Tries (Radix Trees) If memory is tight, consider a compressed trie, also called a radix tree or Patricia trie. The idea: merge chains of single-child nodes into one node holding a substring instead of a single character. For example, if "ation" always appears as a straight chain with no branching, collapse it into a single node labeled "ation". This cuts node count significantly for sparse key sets. Production HTTP routers (like Go's httprouter) use radix trees for URL path matching.

Implementation

Python — Full Trie with All Operations

Python

class TrieNode:
    """A single node in the trie. Holds a map of children and an end-of-word flag."""
    def __init__(self):
        self.children = {}   # char → TrieNode
        self.is_end = False  # True if a complete word ends at this node
        # Optional: store a count of words passing through this node
        # Useful for autocomplete ranking or counting prefix matches
        self.prefix_count = 0

class Trie:
    def __init__(self):
        self.root = TrieNode()
        self.word_count = 0  # Total words stored

    def insert(self, word: str) -> None:
        """Insert a word into the trie. O(m) where m = len(word)."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            node.prefix_count += 1  # Track how many words pass through here
        if not node.is_end:  # Avoid double-counting duplicate inserts
            node.is_end = True
            self.word_count += 1

    def search(self, word: str) -> bool:
        """Check if an exact word exists in the trie. O(m)."""
        node = self._walk(word)
        # Must reach the end AND the node must be marked as a complete word
        return node is not None and node.is_end

    def starts_with(self, prefix: str) -> bool:
        """Check if any word in the trie starts with this prefix. O(m)."""
        # Just need to walk the prefix — no is_end check needed
        return self._walk(prefix) is not None

    def count_prefix(self, prefix: str) -> int:
        """Count how many words have this prefix. O(m)."""
        node = self._walk(prefix)
        return node.prefix_count if node else 0

    def _walk(self, s: str):
        """Follow the path for string s from root. Returns final node or None."""
        node = self.root
        for ch in s:
            if ch not in node.children:
                return None   # Path doesn't exist
            node = node.children[ch]
        return node

    def autocomplete(self, prefix: str, limit: int = 10) -> list:
        """Return all words starting with prefix (up to limit). O(m + k)."""
        node = self._walk(prefix)
        if node is None:
            return []
        results = []

        # DFS from the prefix node, collecting complete words
        def _dfs(n, path):
            if len(results) >= limit:
                return          # Early exit once we have enough
            if n.is_end:
                results.append(prefix + path)
            # Visit children in sorted order for alphabetical results
            for ch in sorted(n.children):
                _dfs(n.children[ch], path + ch)

        _dfs(node, "")
        return results

    def delete(self, word: str) -> bool:
        """Delete word from trie. Returns True if word existed.
        Cleans up nodes that are no longer part of any word's path."""
        def _delete(node, word, depth):
            if depth == len(word):
                if not node.is_end:
                    return False  # Word doesn't exist — nothing to delete
                node.is_end = False
                self.word_count -= 1
                # Safe to remove this node if it has no children
                return len(node.children) == 0

            ch = word[depth]
            if ch not in node.children:
                return False  # Word doesn't exist

            child = node.children[ch]
            child.prefix_count -= 1
            should_delete = _delete(child, word, depth + 1)

            if should_delete:
                del node.children[ch]  # Remove the child node
                # Propagate deletion: remove this node too if it's empty
                # and not an end-of-word for some other word
                return not node.is_end and len(node.children) == 0
            return False

        return _delete(self.root, word, 0)


# Usage
trie = Trie()
for word in ["apple", "app", "application", "apt", "banana", "band", "ban"]:
    trie.insert(word)

print(trie.search("app"))        # True
print(trie.search("appl"))       # False (only a prefix, not a word)
print(trie.starts_with("app"))   # True
print(trie.autocomplete("app"))  # ['app', 'apple', 'application']
print(trie.autocomplete("ban"))  # ['ban', 'banana', 'band']
print(trie.count_prefix("app"))  # 3 (app, apple, application)

trie.delete("apple")
print(trie.search("apple"))      # False (deleted)
print(trie.search("app"))        # True (still exists)

Python — Wildcard Search Variant

Python

class WildcardTrie(Trie):
    """Extends basic Trie to support '.' wildcard matching.
    This is exactly LC 211 — Design Add and Search Words Data Structure."""

    def search_with_wildcards(self, word: str) -> bool:
        """Search where '.' matches any single character. O(26^d * m) worst case
        where d is the number of dots, but usually much faster."""

        def _dfs(node, i):
            # Base case: processed all characters
            if i == len(word):
                return node.is_end

            ch = word[i]

            if ch == '.':
                # Wildcard: try ALL children
                for child in node.children.values():
                    if _dfs(child, i + 1):
                        return True
                return False
            else:
                # Normal character: follow the specific child
                if ch not in node.children:
                    return False
                return _dfs(node.children[ch], i + 1)

        return _dfs(self.root, 0)


# Usage
wt = WildcardTrie()
wt.insert("cat")
wt.insert("car")
wt.insert("bat")

print(wt.search_with_wildcards("c.t"))   # True (matches "cat")
print(wt.search_with_wildcards("c.r"))   # True (matches "car")
print(wt.search_with_wildcards(".at"))   # True (matches "cat" and "bat")
print(wt.search_with_wildcards("..."))   # True (matches any 3-letter word)
print(wt.search_with_wildcards("c..s"))  # False (no 4-letter word starting with c)

JavaScript

class TrieNode {
  constructor() {
    this.children = {};  // char → TrieNode
    this.isEnd = false;  // Does a complete word end here?
  }
}

class Trie {
  constructor() { this.root = new TrieNode(); }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children[ch]) node.children[ch] = new TrieNode();
      node = node.children[ch];
    }
    node.isEnd = true;
  }

  search(word) {
    const node = this._walk(word);
    return node !== null && node.isEnd;
  }

  startsWith(prefix) {
    return this._walk(prefix) !== null;
  }

  _walk(s) {
    let node = this.root;
    for (const ch of s) {
      if (!node.children[ch]) return null;
      node = node.children[ch];
    }
    return node;
  }

  autocomplete(prefix, limit = 10) {
    const node = this._walk(prefix);
    if (!node) return [];
    const results = [];

    const dfs = (n, path) => {
      if (results.length >= limit) return;
      if (n.isEnd) results.push(prefix + path);
      // Sort keys for alphabetical results
      for (const ch of Object.keys(n.children).sort()) {
        dfs(n.children[ch], path + ch);
      }
    };
    dfs(node, '');
    return results;
  }

  delete(word) {
    const _del = (node, depth) => {
      if (depth === word.length) {
        if (!node.isEnd) return false;
        node.isEnd = false;
        return Object.keys(node.children).length === 0;
      }
      const ch = word[depth];
      if (!node.children[ch]) return false;
      const shouldDelete = _del(node.children[ch], depth + 1);
      if (shouldDelete) {
        delete node.children[ch];
        return !node.isEnd && Object.keys(node.children).length === 0;
      }
      return false;
    };
    return _del(this.root, 0);
  }
}

Common Mistakes

⚠️ Beginner Pitfalls with Tries

Confusing search and startsWith: search("app") returns false if only "apple" is in the trie (because the 'p' node at the end of "app" has isEnd = false). startsWith("app") returns true because the prefix path exists. This is the most common trie bug in interviews — forgetting to check isEnd in search.
Deleting nodes that belong to other words: If "apple" and "app" are both in the trie, deleting "apple" should remove the 'l' and 'e' nodes but NOT the 'p' node (it's the end-of-word for "app"). You need to check both isEnd and child count before removing a node.
Using a fixed-size array when the alphabet is large: For lowercase English (26 chars), an array of 26 pointers per node is fine. But for Unicode strings, full ASCII, or mixed-case, a fixed array wastes enormous memory. Use a hash map for children unless you know the alphabet is small.
Not handling empty string: What happens if someone inserts an empty string ""? The root node itself gets marked as isEnd = true. Make sure your implementation handles this gracefully.
Forgetting that trie search is O(m), not O(1): Unlike a hash set, trie lookup scales with the length of the search string, not the number of stored strings. For very long keys, this matters. For typical dictionary words (5-15 chars), it's negligible.

Real-World Example: Autocomplete Search Engine

Here's a simplified version of what happens when you type into Google's search bar. The system maintains a trie of popular queries, with each word weighted by search frequency. As you type, it walks the trie and returns the most popular completions.

Python

import heapq

class AutocompleteNode:
    def __init__(self):
        self.children = {}
        self.is_end = False
        self.frequency = 0     # How often this query was searched
        # Top-k cache: store the best completions at each node
        # This avoids DFS on every keystroke (real systems do this)
        self.top_results = []

class AutocompleteEngine:
    """A search autocomplete system (simplified LC 642 variant).
    
    In production, this would be distributed across many servers,
    with tries sharded by first character or hash prefix. Google's
    actual system handles ~8.5 billion searches per day.
    """

    def __init__(self, k=5):
        self.root = AutocompleteNode()
        self.k = k  # Max suggestions to return

    def add_query(self, query: str, frequency: int = 1):
        """Record a search query with its frequency."""
        node = self.root
        for ch in query:
            if ch not in node.children:
                node.children[ch] = AutocompleteNode()
            node = node.children[ch]
        node.is_end = True
        node.frequency += frequency

    def suggest(self, prefix: str) -> list:
        """Return top-k suggestions for a prefix, ranked by frequency.
        
        Returns list of (query, frequency) tuples, highest frequency first.
        This is what fires on every keystroke in a search bar.
        """
        # Walk to the prefix node
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []  # No words with this prefix
            node = node.children[ch]

        # DFS to collect all completions with their frequencies
        # Use a min-heap of size k to efficiently track the top results
        heap = []  # min-heap of (frequency, query)

        def _collect(n, path):
            if n.is_end:
                if len(heap) < self.k:
                    heapq.heappush(heap, (n.frequency, prefix + path))
                elif n.frequency > heap[0][0]:
                    heapq.heapreplace(heap, (n.frequency, prefix + path))
            for ch in n.children:
                _collect(n.children[ch], path + ch)

        _collect(node, "")
        # Sort by frequency (descending), then alphabetically for ties
        return sorted(heap, key=lambda x: (-x[0], x[1]))


# Simulate a search engine
engine = AutocompleteEngine(k=5)

# Load some search queries with their popularity
queries = [
    ("python tutorial", 50000), ("python download", 45000),
    ("python list", 30000), ("python dictionary", 28000),
    ("python for loop", 25000), ("python class", 22000),
    ("pytorch install", 15000), ("pythagorean theorem", 12000),
    ("pizza near me", 80000), ("pizza delivery", 60000),
]
for query, freq in queries:
    engine.add_query(query, freq)

# User starts typing...
print("Typing 'py':")
for freq, query in engine.suggest("py"):
    print(f"  {query} ({freq:,} searches)")

print("\nTyping 'python ':")
for freq, query in engine.suggest("python "):
    print(f"  {query} ({freq:,} searches)")

print("\nTyping 'pi':")
for freq, query in engine.suggest("pi"):
    print(f"  {query} ({freq:,} searches)")

The real Google autocomplete is vastly more complex — it considers your search history, location, trending topics, and recency. But the underlying data structure is a trie (or a compressed trie variant), with caching and precomputation at every level.

Interview Patterns

💡 Pattern: When to Use a Trie Over a Hash Set If the problem involves prefix matching, autocomplete, or "find all words with this prefix" — use a trie. If it's just "does this exact word exist?" — a hash set is simpler and faster. The signal: the word "prefix" in the problem statement is almost always a trie hint. Also look for: "lexicographic order," "common prefix," or "wildcard matching."

💡 Pattern: Word Search in Grid (LC 212) Build a trie from the word list, then DFS from each cell in the grid. At each step, check if the current path exists in the trie. This lets you prune early — if no word starts with the current prefix, stop exploring that direction. Much faster than checking each word individually against the grid. This is the hardest trie problem you'll see in interviews.

💡 Pattern: Bitwise Trie for XOR Problems For "maximum XOR" problems (like LC 421), build a trie where each node represents a bit (0 or 1) instead of a character. Insert numbers as their 32-bit binary representations. To find the max XOR with a given number, walk the trie greedily, always choosing the opposite bit (to maximize the XOR). This turns an O(n²) brute force into O(n × 32) = O(n).

💡 Don't Forget: Space Matters Tries can use a lot of memory. Each node might have 26 pointers (for lowercase English). In an interview, mention this tradeoff explicitly. If the interviewer pushes back on space, suggest using a hash map for children instead of an array, or mention compressed tries (radix trees). Showing awareness of practical tradeoffs makes a strong impression.

💡 Pattern: Trie for String Matching with Constraints Problems like "Replace Words" (LC 648) ask you to find the shortest prefix of each word that matches a dictionary entry. Build a trie from the dictionary, then for each word, walk the trie character by character — the first time you hit an isEnd node, that's your shortest match. Clean and efficient.

💡 Pattern: Design Autocomplete System (LC 642) This system design / coding hybrid problem combines a trie with frequency tracking. Each node tracks the most popular completions below it. The trick: precompute and cache the top-3 suggestions at each prefix node so you don't need to DFS on every keystroke. Real autocomplete systems use exactly this approach.

Practice Problems

Problem	Difficulty	Key Technique
LC 208 — Implement Trie (Prefix Tree)	Medium	Core trie implementation — insert, search, startsWith
LC 211 — Design Add and Search Words	Medium	Trie + DFS with '.' wildcard matching
LC 14 — Longest Common Prefix	Easy	Trie or vertical scanning (trie is overkill but good practice)
LC 648 — Replace Words	Medium	Trie for shortest prefix match
LC 1268 — Search Suggestions System	Medium	Trie-based autocomplete with sorted results
LC 212 — Word Search II	Hard	Trie + grid DFS backtracking with pruning
LC 336 — Palindrome Pairs	Hard	Trie of reversed words + palindrome suffix check

Tries (Prefix Trees)

On This Page

What Is a Trie?

Why Not Just Use a Hash Set?

A Brief History

Real-World Applications

How It Works

Structure

Insert

Search (Exact Match)

Prefix Search (startsWith)

Autocomplete (Collect All Words with Prefix)

Delete

Interactive Visualization

Operations & Complexity

Implementation

Python — Full Trie with All Operations

Python — Wildcard Search Variant

JavaScript

Common Mistakes

Real-World Example: Autocomplete Search Engine

Interview Patterns

Practice Problems