Hierarchical data done right. From file systems to databases to DOM rendering โ trees are everywhere once you start looking.
โฌก IntermediateA tree is a collection of nodes (data points) connected by edges (links between nodes), with no cycles (you can't follow edges in a loop back to where you started). One node is designated as the root (the top). Each node can have zero or more children (nodes directly below it). A node with no children is called a leaf. The height of a tree is the longest path from root to any leaf. The depth of a node is its distance from the root.
If you've ever looked at a file system (folders containing folders containing files), you've already understood trees. The root directory sits at the top, subdirectories branch out, and files are leaves.
Trees as a data structure emerged in the late 1950s. The binary search tree (BST) was described independently by several researchers around 1960. The concept of self-balancing trees came next: Georgy Adelson-Velsky and Evgenii Landis published the AVL tree in 1962 โ the first self-balancing BST. Rudolf Bayer invented B-trees in 1970 at Boeing Research Labs, specifically for efficient disk-based data access (they now back every SQL database index).
The red-black tree was invented by Leo Guibas and Robert Sedgewick in 1978 as a simpler alternative to AVL trees. It trades slightly less balanced trees for fewer rotations during insertion and deletion, making it the go-to for in-memory sorted containers (Java's TreeMap, C++'s std::map, Linux's CFS scheduler all use red-black trees).
Binary Tree โ each node has at most two children (called left and right). This is the most common type in interviews. No ordering rules โ values can be anywhere.
Binary Search Tree (BST) โ a binary tree with a rule: for every node, all values in its left subtree are smaller, and all values in its right subtree are larger. This property (called the BST invariant) makes search fast โ you can throw away half the tree at each step, just like binary search on a sorted array.
AVL Tree โ a self-balancing BST. For every node, the heights of the left and right subtrees differ by at most 1 (this difference is called the balance factor). When an insertion or deletion breaks this rule, the tree performs rotations (restructuring operations) to fix itself. This guarantees O(log n) height.
Red-Black Tree โ another self-balancing BST. Each node is colored red or black. A set of coloring rules (no two consecutive red nodes, all paths from root to leaves have the same number of black nodes) ensures the tree never gets too unbalanced. Less strictly balanced than AVL (height โค 2ยทlog(n+1) vs. ~1.44ยทlog(n+2) for AVL), but faster insertions because fewer rotations are needed.
A BST's operations are O(h) where h is the height. A balanced tree has h = log n, giving O(log n) operations. But if you insert sorted data into a plain BST โ say, 1, 2, 3, 4, 5 โ each new value goes to the right of the previous one. The tree degenerates into a linked list with h = n. That's O(n) for everything. AVL and red-black trees prevent this by guaranteeing balance after every modification.
querySelector walks it.x + y * z, it builds a tree with + at the root, x on the left, and a subtree for y * z on the right.Search: Start at the root. If the target equals the current node, you found it. If the target is smaller, go left. If larger, go right. Repeat until you find it or hit null (meaning it's not in the tree). Each step eliminates one entire subtree.
Insert: Search for where the value would be. When you reach null (an empty spot), that's where the new node goes. Create it there. In a BST, new nodes always become leaves.
Delete: Three cases, from easiest to hardest:
There are four ways to visit every node. The first three are depth-first (go deep before going wide); the last is breadth-first (go level by level).
When an AVL tree becomes unbalanced (left and right subtree heights differ by more than 1), it performs rotations. There are four cases:
Each rotation is O(1) โ just pointer reassignment. An insertion needs at most 2 rotations; a deletion might need O(log n) rotations (one per ancestor).
AVL trees are more strictly balanced, so lookups are slightly faster (shorter average path). Red-black trees do fewer rotations on insert/delete, so modifications are faster. In practice, the difference is tiny โ about 5-10%. If you're choosing: use AVL for read-heavy workloads, red-black for write-heavy. But for interviews, just know BSTs and how balancing works conceptually. You'll never be asked to implement AVL rotations from scratch.
Build a BST interactively. Insert nodes, delete them, and watch traversals highlight the path in real-time.
| Operation | Average | Worst (Unbalanced) | Notes |
|---|---|---|---|
| Search | O(log n) | O(n) | Worst: degenerate to linked list |
| Insert | O(log n) | O(n) | Find position + create node |
| Delete | O(log n) | O(n) | Find + restructure |
| Min / Max | O(log n) | O(n) | Go left / right until null |
| In-order traversal | O(n) | O(n) | Must visit every node |
| Operation | Time | Space | Notes |
|---|---|---|---|
| Search | O(log n) | O(log n) | Guaranteed โ height is always O(log n) |
| Insert | O(log n) | O(log n) | AVL: at most 2 rotations. RB: at most 2 rotations + recoloring |
| Delete | O(log n) | O(log n) | AVL: may need O(log n) rotations. RB: at most 3 rotations |
Space complexity: O(n) for all tree types โ one node per element, plus two child pointers per node (and a parent pointer for some implementations).
class TreeNode:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
class BST:
def __init__(self):
self.root = None
def insert(self, val):
"""Iterative insert โ avoids recursion stack overflow on large trees."""
node = TreeNode(val)
if not self.root:
self.root = node
return
current = self.root
while True:
if val < current.val:
if current.left is None:
current.left = node
return
current = current.left
elif val > current.val:
if current.right is None:
current.right = node
return
current = current.right
else:
return # Duplicate โ skip (or handle however you want)
def search(self, val):
"""Follow the BST property: smaller โ left, larger โ right."""
current = self.root
while current:
if val == current.val:
return current
elif val < current.val:
current = current.left
else:
current = current.right
return None
def inorder(self):
"""Returns values in sorted order โ that's the BST guarantee.
Iterative version using an explicit stack to avoid
recursion depth issues on large trees.
"""
result = []
stack = []
current = self.root
while current or stack:
# Go as far left as possible
while current:
stack.append(current)
current = current.left
# Process the node
current = stack.pop()
result.append(current.val)
# Move to right subtree
current = current.right
return result
def delete(self, val):
self.root = self._delete_rec(self.root, val)
def _delete_rec(self, node, val):
if not node:
return None
if val < node.val:
node.left = self._delete_rec(node.left, val)
elif val > node.val:
node.right = self._delete_rec(node.right, val)
else:
# Found the node to delete โ three cases
if not node.left:
return node.right # Case 1 & 2: no left child
if not node.right:
return node.left # Case 2: no right child
# Case 3: two children
# Find in-order successor (smallest in right subtree)
successor = node.right
while successor.left:
successor = successor.left
node.val = successor.val
node.right = self._delete_rec(node.right, successor.val)
return node
# Usage
bst = BST()
for val in [50, 30, 70, 20, 40, 60, 80]:
bst.insert(val)
print(bst.inorder()) # [20, 30, 40, 50, 60, 70, 80]
print(bst.search(40).val) # 40
bst.delete(30)
print(bst.inorder()) # [20, 40, 50, 60, 70, 80]
class TreeNode {
constructor(val) {
this.val = val;
this.left = null;
this.right = null;
}
}
class BST {
constructor() {
this.root = null;
}
insert(val) {
const node = new TreeNode(val);
if (!this.root) { this.root = node; return; }
let current = this.root;
while (true) {
if (val < current.val) {
if (!current.left) { current.left = node; return; }
current = current.left;
} else if (val > current.val) {
if (!current.right) { current.right = node; return; }
current = current.right;
} else {
return; // Skip duplicates
}
}
}
search(val) {
let current = this.root;
while (current) {
if (val === current.val) return current;
current = val < current.val ? current.left : current.right;
}
return null;
}
// Recursive traversals โ clean and interview-friendly
inorder(node = this.root, result = []) {
if (!node) return result;
this.inorder(node.left, result);
result.push(node.val);
this.inorder(node.right, result);
return result;
}
preorder(node = this.root, result = []) {
if (!node) return result;
result.push(node.val);
this.preorder(node.left, result);
this.preorder(node.right, result);
return result;
}
postorder(node = this.root, result = []) {
if (!node) return result;
this.postorder(node.left, result);
this.postorder(node.right, result);
result.push(node.val);
return result;
}
}
Recursive traversals are DFS. For level-order (visiting nodes top to bottom, left to right), you need a queue:
from collections import deque
def level_order(root):
"""Visit tree nodes level by level using BFS.
Returns a list of lists โ each inner list is one level.
Example for tree [3, 9, 20, null, null, 15, 7]:
โ [[3], [9, 20], [15, 7]]
This is the foundation for:
- Finding minimum depth (first leaf you encounter)
- Right-side view (last element of each level)
- Zigzag traversal (alternate left-right and right-left)
- Level averages
"""
if not root:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue) # How many nodes at this level
level = []
for _ in range(level_size):
node = queue.popleft()
level.append(node.val)
# Enqueue children for the next level
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(level)
return result
def iterative_preorder(root):
"""Pre-order traversal using an explicit stack.
Useful when the tree is very deep (avoids stack overflow).
Push right child first, then left โ so left gets processed first.
"""
if not root:
return []
result = []
stack = [root]
while stack:
node = stack.pop()
result.append(node.val)
# Push right first so left is on top (processed next)
if node.right:
stack.append(node.right)
if node.left:
stack.append(node.left)
return result
if not node: return ... as the first line. Forgetting it causes a crash on empty trees or when you reach a leaf's null children. This is the most common tree bug.node.left.val < node.val < node.right.val is NOT sufficient. The entire left subtree must be less than the node, and the entire right subtree must be greater. A node in the deep left might violate the root's constraint even though it's valid relative to its parent. Pass min/max bounds down the recursion instead.File systems are trees. Here's a simplified version showing how du (disk usage) and find work internally โ they're both tree traversals:
class FileNode:
"""A node in a file system tree.
Files are leaves (no children). Directories are internal
nodes with a list of children. This is exactly how ext4,
NTFS, and APFS represent file systems internally.
"""
def __init__(self, name, size=0, is_dir=False):
self.name = name
self.size = size # File size in bytes (0 for directories)
self.is_dir = is_dir
self.children = [] # Only used if is_dir is True
def add_child(self, child):
self.children.append(child)
return child
def calculate_size(node, depth=0):
"""Calculate total size of a directory (post-order traversal).
This is what `du -sh` does on Unix. It must visit children
before the parent because a directory's size is the sum of
its children's sizes.
"""
indent = " " * depth
if not node.is_dir:
print(f"{indent}{node.name}: {node.size:,} bytes")
return node.size
total = 0
for child in node.children:
total += calculate_size(child, depth + 1)
print(f"{indent}{node.name}/: {total:,} bytes (total)")
return total
def find_large_files(node, threshold, path=""):
"""Find files larger than threshold (pre-order traversal).
This is what `find . -size +1M` does.
"""
current_path = f"{path}/{node.name}"
results = []
if not node.is_dir and node.size > threshold:
results.append((current_path, node.size))
for child in node.children:
results.extend(find_large_files(child, threshold, current_path))
return results
# Build a sample file system
root = FileNode("home", is_dir=True)
docs = root.add_child(FileNode("documents", is_dir=True))
docs.add_child(FileNode("resume.pdf", size=245_000))
docs.add_child(FileNode("notes.txt", size=1_200))
pics = root.add_child(FileNode("pictures", is_dir=True))
pics.add_child(FileNode("vacation.jpg", size=4_500_000))
pics.add_child(FileNode("screenshot.png", size=890_000))
root.add_child(FileNode(".bashrc", size=350))
print("=== Directory Sizes (like `du`) ===")
total = calculate_size(root)
print("\n=== Large Files > 1MB (like `find -size +1M`) ===")
for path, size in find_large_files(root, 1_000_000):
print(f" {path}: {size / 1_000_000:.1f} MB")
Every time you run ls -la, du, find, or browse files in Finder/Explorer, you're running tree traversals. The file manager does a level-order traversal to show the current directory's contents. du does a post-order traversal to compute sizes bottom-up. find does a pre-order traversal to search top-down.
Use DFS (recursion or explicit stack) when: you need to explore all paths, compare subtrees, or the answer involves leaf-to-root information. Use BFS (queue) when: you need level-order info, the shortest path, or the answer is close to the root. Most tree interview problems use DFS because trees are naturally recursive.
Many tree problems follow this structure: ask each subtree a question, combine the answers. "Is this tree balanced?" โ ask left subtree for its height, ask right for its height, compare. "What's the diameter?" โ for each node, left height + right height is the path through that node. The answer is the maximum across all nodes. The recursion handles the traversal; you just define what info to return.
Pass valid ranges (min, max) down the recursion. Each node must fall between its allowed min and max. The root can be anything. Its left child must be < root (so max = root value). Its right child must be > root (so min = root value). Propagate these constraints down.
To save a tree and rebuild it later, use pre-order traversal with null markers. Serialize: visit root, output value (or "null" for empty nodes), recurse left, recurse right. Deserialize: read values in order, using a queue or index pointer.
The LCA of two nodes is the deepest node that has both as descendants. Recursive approach: if the current node is null, or equals either target, return it. Recurse left and right. If both return non-null, current node is the LCA. If only one returns non-null, that side has both targets. For BSTs, it's simpler: if both targets are smaller, go left; if both larger, go right; otherwise, current node is the LCA.
"Does any root-to-leaf path sum to target?" โ DFS, subtract from target at each node, check if target == 0 at leaves. "Find all such paths?" โ DFS with backtracking, collect paths. "Maximum path sum (any-to-any)?" โ at each node, compute the best path through it (left + node + right) and update a global maximum. Return the best single-branch path upward.
Problems: Path Sum (#112), Path Sum II (#113), Binary Tree Maximum Path Sum (#124).
Trees are one of the most common interview topics. These problems cover the essential patterns.
| # | Problem | Difficulty | Pattern |
|---|---|---|---|
| 104 | Maximum Depth of Binary Tree | Easy | Recursive DFS |
| 226 | Invert Binary Tree | Easy | Recursive swap |
| 100 | Same Tree | Easy | Parallel DFS |
| 98 | Validate Binary Search Tree | Medium | Range passing |
| 102 | Binary Tree Level Order Traversal | Medium | BFS with queue |
| 230 | Kth Smallest Element in BST | Medium | In-order + counter |
| 236 | Lowest Common Ancestor | Medium | Recursive LCA |
| 124 | Binary Tree Maximum Path Sum | Hard | DFS + global max tracking |
| 297 | Serialize and Deserialize Binary Tree | Hard | Pre-order + null markers |