DSA Lab | Trees

What Is a Tree?

A tree is a collection of nodes (data points) connected by edges (links between nodes), with no cycles (you can't follow edges in a loop back to where you started). One node is designated as the root (the top). Each node can have zero or more children (nodes directly below it). A node with no children is called a leaf. The height of a tree is the longest path from root to any leaf. The depth of a node is its distance from the root.

If you've ever looked at a file system (folders containing folders containing files), you've already understood trees. The root directory sits at the top, subdirectories branch out, and files are leaves.

A Brief History

Trees as a data structure emerged in the late 1950s. The binary search tree (BST) was described independently by several researchers around 1960. The concept of self-balancing trees came next: Georgy Adelson-Velsky and Evgenii Landis published the AVL tree in 1962 — the first self-balancing BST. Rudolf Bayer invented B-trees in 1970 at Boeing Research Labs, specifically for efficient disk-based data access (they now back every SQL database index).

The red-black tree was invented by Leo Guibas and Robert Sedgewick in 1978 as a simpler alternative to AVL trees. It trades slightly less balanced trees for fewer rotations during insertion and deletion, making it the go-to for in-memory sorted containers (Java's TreeMap, C++'s std::map, Linux's CFS scheduler all use red-black trees).

The Family

Binary Tree — each node has at most two children (called left and right). This is the most common type in interviews. No ordering rules — values can be anywhere.

Binary Search Tree (BST) — a binary tree with a rule: for every node, all values in its left subtree are smaller, and all values in its right subtree are larger. This property (called the BST invariant) makes search fast — you can throw away half the tree at each step, just like binary search on a sorted array.

AVL Tree — a self-balancing BST. For every node, the heights of the left and right subtrees differ by at most 1 (this difference is called the balance factor). When an insertion or deletion breaks this rule, the tree performs rotations (restructuring operations) to fix itself. This guarantees O(log n) height.

Red-Black Tree — another self-balancing BST. Each node is colored red or black. A set of coloring rules (no two consecutive red nodes, all paths from root to leaves have the same number of black nodes) ensures the tree never gets too unbalanced. Less strictly balanced than AVL (height ≤ 2·log(n+1) vs. ~1.44·log(n+2) for AVL), but faster insertions because fewer rotations are needed.

Why Balance Matters

A BST's operations are O(h) where h is the height. A balanced tree has h = log n, giving O(log n) operations. But if you insert sorted data into a plain BST — say, 1, 2, 3, 4, 5 — each new value goes to the right of the previous one. The tree degenerates into a linked list with h = n. That's O(n) for everything. AVL and red-black trees prevent this by guaranteeing balance after every modification.

Where You'll See Them

Databases — B-trees (a generalization of BSTs with many children per node) back every SQL index. They're optimized for disk reads.
File systems — directory structures are trees. ext4, NTFS, and APFS all use tree-based structures internally.
DOM — HTML is a tree. Every CSS selector and JS querySelector walks it.
Compilers — abstract syntax trees (ASTs) represent parsed code. When Python reads x + y * z, it builds a tree with + at the root, x on the left, and a subtree for y * z on the right.
Autocomplete — tries (prefix trees) power search suggestions. Each node represents a character, and paths from root to leaf spell out words.
Game AI — minimax trees evaluate possible moves and counter-moves.
Networking — routing tables, spanning trees (the protocol that prevents loops in Ethernet networks).

How It Works

BST Operations

Search: Start at the root. If the target equals the current node, you found it. If the target is smaller, go left. If larger, go right. Repeat until you find it or hit null (meaning it's not in the tree). Each step eliminates one entire subtree.

Insert: Search for where the value would be. When you reach null (an empty spot), that's where the new node goes. Create it there. In a BST, new nodes always become leaves.

Delete: Three cases, from easiest to hardest:

Leaf node (no children) — just remove it. Update the parent's pointer to null.
One child — replace the node with its only child. The child "moves up."
Two children — this is the tricky one. Find the in-order successor (the smallest value in the right subtree — keep going left from the right child until you hit a leaf). Copy its value into the node being deleted, then delete the successor (which has at most one child, so it falls into case 1 or 2).

Tree Traversals

There are four ways to visit every node. The first three are depth-first (go deep before going wide); the last is breadth-first (go level by level).

In-order (Left → Node → Right): visits BST nodes in sorted order. Used for: getting sorted output, validating BST property.
Pre-order (Node → Left → Right): visits the root first. Used for: copying a tree, serialization, building expression trees.
Post-order (Left → Right → Node): visits children before parent. Used for: deleting a tree (delete children before parent), evaluating expression trees, calculating directory sizes.
Level-order / BFS: visits all nodes at depth d before depth d+1. Uses a queue. Used for: finding minimum depth, level averages, zigzag traversal, right-side view.

AVL Rotations

When an AVL tree becomes unbalanced (left and right subtree heights differ by more than 1), it performs rotations. There are four cases:

Left Rotation (RR case) — right subtree of right child is too tall. Lifts the right child up to become the new root of that subtree.
Right Rotation (LL case) — mirror image. Left subtree of left child is too tall.
Left-Right (LR case) — left child's right subtree is too tall. Left-rotate the child first (to convert to LL case), then right-rotate the node.
Right-Left (RL case) — mirror. Right child's left subtree is too tall. Right-rotate the child, then left-rotate the node.

Each rotation is O(1) — just pointer reassignment. An insertion needs at most 2 rotations; a deletion might need O(log n) rotations (one per ancestor).

💡 AVL vs Red-Black: When Does It Matter?

AVL trees are more strictly balanced, so lookups are slightly faster (shorter average path). Red-black trees do fewer rotations on insert/delete, so modifications are faster. In practice, the difference is tiny — about 5-10%. If you're choosing: use AVL for read-heavy workloads, red-black for write-heavy. But for interviews, just know BSTs and how balancing works conceptually. You'll never be asked to implement AVL rotations from scratch.

Interactive Visualization

Build a BST interactively. Insert nodes, delete them, and watch traversals highlight the path in real-time.

BST Explorer

Operations & Complexity

Binary Search Tree

Operation	Average	Worst (Unbalanced)	Notes
Search	O(log n)	O(n)	Worst: degenerate to linked list
Insert	O(log n)	O(n)	Find position + create node
Delete	O(log n)	O(n)	Find + restructure
Min / Max	O(log n)	O(n)	Go left / right until null
In-order traversal	O(n)	O(n)	Must visit every node

AVL Tree / Red-Black Tree (Self-Balancing)

Operation	Time	Space	Notes
Search	O(log n)	O(log n)	Guaranteed — height is always O(log n)
Insert	O(log n)	O(log n)	AVL: at most 2 rotations. RB: at most 2 rotations + recoloring
Delete	O(log n)	O(log n)	AVL: may need O(log n) rotations. RB: at most 3 rotations

Space complexity: O(n) for all tree types — one node per element, plus two child pointers per node (and a parent pointer for some implementations).

Implementation

Binary Search Tree (Recursive)

Python

class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        self.root = None

    def insert(self, val):
        """Iterative insert — avoids recursion stack overflow on large trees."""
        node = TreeNode(val)
        if not self.root:
            self.root = node
            return

        current = self.root
        while True:
            if val < current.val:
                if current.left is None:
                    current.left = node
                    return
                current = current.left
            elif val > current.val:
                if current.right is None:
                    current.right = node
                    return
                current = current.right
            else:
                return  # Duplicate — skip (or handle however you want)

    def search(self, val):
        """Follow the BST property: smaller → left, larger → right."""
        current = self.root
        while current:
            if val == current.val:
                return current
            elif val < current.val:
                current = current.left
            else:
                current = current.right
        return None

    def inorder(self):
        """Returns values in sorted order — that's the BST guarantee.
        
        Iterative version using an explicit stack to avoid
        recursion depth issues on large trees.
        """
        result = []
        stack = []
        current = self.root
        while current or stack:
            # Go as far left as possible
            while current:
                stack.append(current)
                current = current.left
            # Process the node
            current = stack.pop()
            result.append(current.val)
            # Move to right subtree
            current = current.right
        return result

    def delete(self, val):
        self.root = self._delete_rec(self.root, val)

    def _delete_rec(self, node, val):
        if not node:
            return None

        if val < node.val:
            node.left = self._delete_rec(node.left, val)
        elif val > node.val:
            node.right = self._delete_rec(node.right, val)
        else:
            # Found the node to delete — three cases
            if not node.left:
                return node.right  # Case 1 & 2: no left child
            if not node.right:
                return node.left   # Case 2: no right child

            # Case 3: two children
            # Find in-order successor (smallest in right subtree)
            successor = node.right
            while successor.left:
                successor = successor.left
            node.val = successor.val
            node.right = self._delete_rec(node.right, successor.val)

        return node


# Usage
bst = BST()
for val in [50, 30, 70, 20, 40, 60, 80]:
    bst.insert(val)

print(bst.inorder())        # [20, 30, 40, 50, 60, 70, 80]
print(bst.search(40).val)   # 40
bst.delete(30)
print(bst.inorder())        # [20, 40, 50, 60, 70, 80]

BST in JavaScript

JavaScript

class TreeNode {
  constructor(val) {
    this.val = val;
    this.left = null;
    this.right = null;
  }
}

class BST {
  constructor() {
    this.root = null;
  }

  insert(val) {
    const node = new TreeNode(val);
    if (!this.root) { this.root = node; return; }

    let current = this.root;
    while (true) {
      if (val < current.val) {
        if (!current.left) { current.left = node; return; }
        current = current.left;
      } else if (val > current.val) {
        if (!current.right) { current.right = node; return; }
        current = current.right;
      } else {
        return; // Skip duplicates
      }
    }
  }

  search(val) {
    let current = this.root;
    while (current) {
      if (val === current.val) return current;
      current = val < current.val ? current.left : current.right;
    }
    return null;
  }

  // Recursive traversals — clean and interview-friendly
  inorder(node = this.root, result = []) {
    if (!node) return result;
    this.inorder(node.left, result);
    result.push(node.val);
    this.inorder(node.right, result);
    return result;
  }

  preorder(node = this.root, result = []) {
    if (!node) return result;
    result.push(node.val);
    this.preorder(node.left, result);
    this.preorder(node.right, result);
    return result;
  }

  postorder(node = this.root, result = []) {
    if (!node) return result;
    this.postorder(node.left, result);
    this.postorder(node.right, result);
    result.push(node.val);
    return result;
  }
}

Variation: Iterative Level-Order Traversal (BFS)

Recursive traversals are DFS. For level-order (visiting nodes top to bottom, left to right), you need a queue:

Python

from collections import deque

def level_order(root):
    """Visit tree nodes level by level using BFS.
    
    Returns a list of lists — each inner list is one level.
    Example for tree [3, 9, 20, null, null, 15, 7]:
    → [[3], [9, 20], [15, 7]]
    
    This is the foundation for:
    - Finding minimum depth (first leaf you encounter)
    - Right-side view (last element of each level)
    - Zigzag traversal (alternate left-right and right-left)
    - Level averages
    """
    if not root:
        return []

    result = []
    queue = deque([root])

    while queue:
        level_size = len(queue)  # How many nodes at this level
        level = []

        for _ in range(level_size):
            node = queue.popleft()
            level.append(node.val)

            # Enqueue children for the next level
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)

        result.append(level)

    return result


def iterative_preorder(root):
    """Pre-order traversal using an explicit stack.
    
    Useful when the tree is very deep (avoids stack overflow).
    Push right child first, then left — so left gets processed first.
    """
    if not root:
        return []

    result = []
    stack = [root]

    while stack:
        node = stack.pop()
        result.append(node.val)
        # Push right first so left is on top (processed next)
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)

    return result

Common Mistakes

⚠️ Mistakes That Trip Up Beginners

Forgetting the base case in recursion. Every recursive tree function needs if not node: return ... as the first line. Forgetting it causes a crash on empty trees or when you reach a leaf's null children. This is the most common tree bug.
Confusing "binary tree" with "BST." Many interview problems give you a plain binary tree with no ordering guarantee. Don't assume left < root < right unless the problem explicitly says "binary search tree."
Validating a BST incorrectly. Checking that node.left.val < node.val < node.right.val is NOT sufficient. The entire left subtree must be less than the node, and the entire right subtree must be greater. A node in the deep left might violate the root's constraint even though it's valid relative to its parent. Pass min/max bounds down the recursion instead.
Stack overflow with deep recursion. If the tree can have 10⁵+ nodes and might be unbalanced (height = n), recursive solutions will crash. Convert to iterative with an explicit stack. Even balanced trees with 10⁵ nodes have depth ~17, which is fine — but linked-list-shaped trees of the same size have depth 10⁵.
Not recognizing that in-order traversal of a BST gives sorted output. This is the key insight for many BST problems: "kth smallest element" = in-order traversal stopping at the kth element. "Validate BST" = check that in-order traversal is strictly increasing.

Real-World Example: File System Tree

File systems are trees. Here's a simplified version showing how du (disk usage) and find work internally — they're both tree traversals:

Python

class FileNode:
    """A node in a file system tree.
    
    Files are leaves (no children). Directories are internal
    nodes with a list of children. This is exactly how ext4,
    NTFS, and APFS represent file systems internally.
    """
    def __init__(self, name, size=0, is_dir=False):
        self.name = name
        self.size = size        # File size in bytes (0 for directories)
        self.is_dir = is_dir
        self.children = []      # Only used if is_dir is True

    def add_child(self, child):
        self.children.append(child)
        return child


def calculate_size(node, depth=0):
    """Calculate total size of a directory (post-order traversal).
    
    This is what `du -sh` does on Unix. It must visit children
    before the parent because a directory's size is the sum of
    its children's sizes.
    """
    indent = "  " * depth
    if not node.is_dir:
        print(f"{indent}{node.name}: {node.size:,} bytes")
        return node.size

    total = 0
    for child in node.children:
        total += calculate_size(child, depth + 1)

    print(f"{indent}{node.name}/: {total:,} bytes (total)")
    return total


def find_large_files(node, threshold, path=""):
    """Find files larger than threshold (pre-order traversal).
    
    This is what `find . -size +1M` does.
    """
    current_path = f"{path}/{node.name}"
    results = []

    if not node.is_dir and node.size > threshold:
        results.append((current_path, node.size))

    for child in node.children:
        results.extend(find_large_files(child, threshold, current_path))

    return results


# Build a sample file system
root = FileNode("home", is_dir=True)
docs = root.add_child(FileNode("documents", is_dir=True))
docs.add_child(FileNode("resume.pdf", size=245_000))
docs.add_child(FileNode("notes.txt", size=1_200))
pics = root.add_child(FileNode("pictures", is_dir=True))
pics.add_child(FileNode("vacation.jpg", size=4_500_000))
pics.add_child(FileNode("screenshot.png", size=890_000))
root.add_child(FileNode(".bashrc", size=350))

print("=== Directory Sizes (like `du`) ===")
total = calculate_size(root)

print("\n=== Large Files > 1MB (like `find -size +1M`) ===")
for path, size in find_large_files(root, 1_000_000):
    print(f"  {path}: {size / 1_000_000:.1f} MB")

Every time you run ls -la, du, find, or browse files in Finder/Explorer, you're running tree traversals. The file manager does a level-order traversal to show the current directory's contents. du does a post-order traversal to compute sizes bottom-up. find does a pre-order traversal to search top-down.

Interview Patterns

💡 DFS vs BFS — Choose Your Weapon

Use DFS (recursion or explicit stack) when: you need to explore all paths, compare subtrees, or the answer involves leaf-to-root information. Use BFS (queue) when: you need level-order info, the shortest path, or the answer is close to the root. Most tree interview problems use DFS because trees are naturally recursive.

💡 The "Pass Info Up" Pattern

Many tree problems follow this structure: ask each subtree a question, combine the answers. "Is this tree balanced?" → ask left subtree for its height, ask right for its height, compare. "What's the diameter?" → for each node, left height + right height is the path through that node. The answer is the maximum across all nodes. The recursion handles the traversal; you just define what info to return.

💡 Validate BST — Range Passing

Pass valid ranges (min, max) down the recursion. Each node must fall between its allowed min and max. The root can be anything. Its left child must be < root (so max = root value). Its right child must be > root (so min = root value). Propagate these constraints down.

💡 Serialize and Deserialize

To save a tree and rebuild it later, use pre-order traversal with null markers. Serialize: visit root, output value (or "null" for empty nodes), recurse left, recurse right. Deserialize: read values in order, using a queue or index pointer.

💡 Lowest Common Ancestor (LCA)

The LCA of two nodes is the deepest node that has both as descendants. Recursive approach: if the current node is null, or equals either target, return it. Recurse left and right. If both return non-null, current node is the LCA. If only one returns non-null, that side has both targets. For BSTs, it's simpler: if both targets are smaller, go left; if both larger, go right; otherwise, current node is the LCA.

💡 Path Sum Patterns

"Does any root-to-leaf path sum to target?" → DFS, subtract from target at each node, check if target == 0 at leaves. "Find all such paths?" → DFS with backtracking, collect paths. "Maximum path sum (any-to-any)?" → at each node, compute the best path through it (left + node + right) and update a global maximum. Return the best single-branch path upward.

Problems: Path Sum (#112), Path Sum II (#113), Binary Tree Maximum Path Sum (#124).

Practice Problems

Trees are one of the most common interview topics. These problems cover the essential patterns.

#	Problem	Difficulty	Pattern
104	Maximum Depth of Binary Tree	Easy	Recursive DFS
226	Invert Binary Tree	Easy	Recursive swap
100	Same Tree	Easy	Parallel DFS
98	Validate Binary Search Tree	Medium	Range passing
102	Binary Tree Level Order Traversal	Medium	BFS with queue
230	Kth Smallest Element in BST	Medium	In-order + counter
236	Lowest Common Ancestor	Medium	Recursive LCA
124	Binary Tree Maximum Path Sum	Hard	DFS + global max tracking
297	Serialize and Deserialize Binary Tree	Hard	Pre-order + null markers

Trees

On This Page

What Is a Tree?

A Brief History

The Family

Why Balance Matters

Where You'll See Them

How It Works

BST Operations

Tree Traversals

AVL Rotations

Interactive Visualization

Operations & Complexity

Binary Search Tree

AVL Tree / Red-Black Tree (Self-Balancing)

Implementation

Binary Search Tree (Recursive)

BST in JavaScript

Variation: Iterative Level-Order Traversal (BFS)

Common Mistakes

Real-World Example: File System Tree

Interview Patterns

Practice Problems