On This Page

What Is BFS?

Breadth-First Search (BFS) is a graph traversal algorithm that explores nodes level by level. Starting from a source node, it visits all direct neighbors first, then their neighbors, then their neighbors, and so on โ€” like ripples spreading out from a stone dropped in water.

Origin Story

BFS was first described by Konrad Zuse in 1945 in his rejected PhD thesis (it was rejected because post-war Germany had more pressing concerns than graph theory). It was independently rediscovered by Edward F. Moore in 1959, who used it to find the shortest path through a maze โ€” the algorithm is sometimes called "Moore's algorithm" in older textbooks. The name "breadth-first" contrasts with "depth-first" (DFS): BFS goes wide before going deep, DFS goes deep before going wide.

The Key Data Structure: Queue

The data structure that makes BFS work is a queue โ€” a FIFO (First In, First Out) container. The queue ensures that nodes discovered earlier get processed before nodes discovered later, which creates the "layer by layer" behavior. If you swapped the queue for a stack (LIFO โ€” Last In, First Out), you'd get DFS instead. The only algorithmic difference between BFS and iterative DFS is the data structure: queue vs stack.

In Python, use collections.deque for the queue โ€” it has O(1) popleft(). A regular list's pop(0) is O(n) because it shifts all remaining elements. This matters when you're processing millions of nodes.

Why BFS Matters

Where BFS Shows Up in the Real World

How It Works

The Algorithm Step by Step

  1. Start: Put the source node in a queue. Mark it as visited.
  2. Dequeue: Take the front node from the queue. Process it (print it, check if it's the target, etc.).
  3. Enqueue neighbors: For each unvisited neighbor of the current node, mark it as visited and add it to the back of the queue.
  4. Repeat steps 2-3 until the queue is empty (traversed everything reachable) or you found your target (shortest path search).

The Visited Set โ€” A Critical Detail

The visited set prevents infinite loops in graphs with cycles. Without it, node A enqueues node B, which enqueues node A, which enqueues node B... forever.

A subtle but important point: mark a node as visited when it's added to the queue, not when it's removed from the queue. If you wait until removal, the same node can be added to the queue multiple times by different neighbors before being processed. This wastes time and can cause incorrect distance calculations.

BFS on Trees vs Graphs

On trees (connected acyclic graphs), you don't need a visited set โ€” trees have no cycles, so you'll never revisit a node. But most graph problems require it. When in doubt, use the visited set. The extra space is O(V) either way.

BFS vs DFS โ€” When to Choose Which

๐Ÿ’ก Multi-Source BFS Sometimes you need BFS from multiple starting points simultaneously โ€” like "distance from nearest 0 in a matrix" or "how long until all oranges rot." Add ALL source nodes to the queue before starting. The algorithm handles the rest โ€” distances spread outward from all sources at once, as if there's a virtual super-source connected to all starting nodes. This is still O(V + E), same as single-source BFS.

Interactive Visualization

Click any node to start BFS from it. Watch the traversal spread level by level โ€” each level gets a different shade. The queue contents update in real time on the right side.

BFS Traversal

Common Mistakes

โš ๏ธ Mistake #1: Marking visited on dequeue instead of enqueue If you mark nodes as visited when you remove them from the queue (instead of when you add them), the same node gets enqueued multiple times by different neighbors. This wastes time, blows up memory, and can report wrong distances. Always mark visited BEFORE adding to the queue.
โš ๏ธ Mistake #2: Using a list instead of a deque in Python list.pop(0) is O(n) because it shifts every remaining element left. On a graph with 1 million nodes, that's a million shifts per dequeue โ€” turning your O(V+E) algorithm into O(Vยฒ). Use collections.deque which has O(1) popleft().
โš ๏ธ Mistake #3: Forgetting to handle disconnected components BFS from a single source only visits nodes reachable from that source. If the graph has disconnected components, you need to loop over all nodes and run BFS from each unvisited one. For "number of islands" on a grid, this means scanning every cell and starting a new BFS whenever you hit an unvisited land cell.
โš ๏ธ Mistake #4: Using BFS for weighted shortest paths BFS finds the shortest path by number of edges, not by edge weight. If edges have different weights, BFS can give the wrong answer โ€” it might find a path with fewer edges but higher total weight. Use Dijkstra's algorithm for weighted graphs (or Bellman-Ford if weights can be negative).
โš ๏ธ Mistake #5: Not counting levels correctly For level-order problems, you need to know where one level ends and the next begins. The trick: at the start of each level, snapshot the queue size with level_size = len(queue). Process exactly that many nodes. Everything added during processing belongs to the next level. If you skip this step, you'll process nodes from different levels together.

Complexity Analysis

RepresentationTimeSpaceNotes
Adjacency ListO(V + E)O(V)V = vertices, E = edges. Preferred.
Adjacency MatrixO(Vยฒ)O(V)Must check all V neighbors per node
Grid (m ร— n)O(m ร— n)O(m ร— n)Each cell visited once
Implicit graph (states)O(|states|)O(|states|)Word ladder, lock combinations

With an adjacency list (a dictionary mapping each node to its list of neighbors), BFS visits each vertex once and examines each edge once. That gives O(V + E) time. The queue holds at most O(V) nodes (worst case: an entire level of the graph, or all nodes in a star graph).

With an adjacency matrix (a Vร—V grid where matrix[i][j] = 1 means an edge from i to j), checking all neighbors of a node takes O(V) instead of O(degree). Total time becomes O(Vยฒ) regardless of how many edges exist.

On grids, V = mร—n cells and E โ‰ˆ 4ร—mร—n (each cell has up to 4 neighbors), so time and space are both O(mร—n). In practice, grid BFS is fast because the constant factor is small.

For implicit graphs (where nodes are states and edges are transitions, like word ladder or lock combinations), the complexity depends on the size of the state space. Word ladder with 5-letter words and a 10,000-word dictionary: up to 10,000 states, each with up to 5ร—25 = 125 potential neighbors to check.

Implementation

Standard BFS

Python

from collections import deque

def bfs(graph, start):
    """Basic BFS traversal. Returns nodes in visit order.
    graph = adjacency list {node: [neighbors]}.
    Uses deque for O(1) popleft โ€” never use list.pop(0)."""
    visited = {start}           # Mark visited on ENQUEUE
    queue = deque([start])      # FIFO queue
    order = []

    while queue:
        node = queue.popleft()  # Process front of queue
        order.append(node)

        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)  # Mark BEFORE enqueueing
                queue.append(neighbor)

    return order
      

BFS with Level Tracking

Python

from collections import deque

def bfs_levels(graph, start):
    """Returns nodes grouped by level (distance from start).
    Level 0 = [start], Level 1 = start's neighbors, etc.
    The key trick: snapshot len(queue) at each level's start."""
    visited = {start}
    queue = deque([start])
    levels = []

    while queue:
        level_size = len(queue)  # How many nodes at this level
        current_level = []

        for _ in range(level_size):
            node = queue.popleft()
            current_level.append(node)

            for neighbor in graph[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)

        levels.append(current_level)

    return levels
# Result: [[A], [B, C], [D, E, F], ...]
# levels[i] contains all nodes at distance i from start
      

Shortest Path (Unweighted) with Path Reconstruction

Python

from collections import deque

def shortest_path(graph, start, end):
    """Find the shortest path (fewest edges) from start to end.
    Returns the path as a list, or None if no path exists.
    
    Instead of storing full paths (expensive), track predecessors
    and reconstruct the path at the end."""
    if start == end:
        return [start]

    visited = {start}
    queue = deque([start])
    predecessor = {start: None}  # Track how we reached each node

    while queue:
        node = queue.popleft()

        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                predecessor[neighbor] = node
                queue.append(neighbor)

                if neighbor == end:
                    # Reconstruct path by following predecessors back
                    path = []
                    current = end
                    while current is not None:
                        path.append(current)
                        current = predecessor[current]
                    return path[::-1]  # Reverse: start โ†’ end

    return None  # No path exists
      
JavaScript

function shortestPath(graph, start, end) {
  if (start === end) return [start];

  const visited = new Set([start]);
  const queue = [start];
  const predecessor = new Map([[start, null]]);

  while (queue.length > 0) {
    const node = queue.shift();

    for (const neighbor of (graph[node] || [])) {
      if (!visited.has(neighbor)) {
        visited.add(neighbor);
        predecessor.set(neighbor, node);
        queue.push(neighbor);

        if (neighbor === end) {
          // Reconstruct path
          const path = [];
          let current = end;
          while (current !== null) {
            path.push(current);
            current = predecessor.get(current);
          }
          return path.reverse();
        }
      }
    }
  }
  return null;
}
      

Grid BFS (Island/Matrix Problems)

Python

from collections import deque

def grid_bfs(grid, start_row, start_col):
    """BFS on a 2D grid. Returns set of all reachable cells.
    Useful for flood fill, connected components, shortest path.
    
    Treats each cell as a node. Neighbors are the 4 adjacent cells
    (up, down, left, right). '#' cells are walls (impassable)."""
    rows, cols = len(grid), len(grid[0])
    visited = set()
    visited.add((start_row, start_col))
    queue = deque([(start_row, start_col, 0)])  # (row, col, distance)

    # 4-directional movement: right, left, down, up
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    while queue:
        r, c, dist = queue.popleft()

        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            # Check bounds, not visited, and not a wall
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in visited
                    and grid[nr][nc] != '#'):
                visited.add((nr, nc))
                queue.append((nr, nc, dist + 1))

    return visited
      

Multi-Source BFS (Rotting Oranges Pattern)

Python

from collections import deque

def oranges_rotting(grid):
    """How many minutes until all oranges rot?
    Rotten oranges (2) spread to adjacent fresh oranges (1) each minute.
    
    Multi-source BFS: start from ALL rotten oranges simultaneously.
    Each BFS level = one minute of spreading."""
    rows, cols = len(grid), len(grid[0])
    queue = deque()
    fresh_count = 0

    # Enqueue ALL rotten oranges as starting points
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                queue.append((r, c, 0))  # (row, col, time)
            elif grid[r][c] == 1:
                fresh_count += 1

    if fresh_count == 0:
        return 0  # Nothing to rot

    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    max_time = 0

    while queue:
        r, c, time = queue.popleft()

        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 1):
                grid[nr][nc] = 2  # Mark as rotten (acts as visited)
                fresh_count -= 1
                max_time = time + 1
                queue.append((nr, nc, time + 1))

    return max_time if fresh_count == 0 else -1
      

Grid BFS โ€” JavaScript

JavaScript

function gridBFS(grid, startRow, startCol) {
  /**
   * BFS on a 2D grid. Returns distance to every reachable cell.
   * grid[r][c] === '#' means a wall. Otherwise passable.
   * Uses a queue of [row, col, distance] tuples.
   */
  const rows = grid.length, cols = grid[0].length;
  const dist = Array.from({ length: rows }, () => Array(cols).fill(-1));
  dist[startRow][startCol] = 0;

  const queue = [[startRow, startCol, 0]];
  const dirs = [[0,1],[0,-1],[1,0],[-1,0]]; // right, left, down, up
  let head = 0; // Use index instead of shift() to avoid O(n) shifts

  while (head < queue.length) {
    const [r, c, d] = queue[head++];

    for (const [dr, dc] of dirs) {
      const nr = r + dr, nc = c + dc;
      if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
          && dist[nr][nc] === -1 && grid[nr][nc] !== '#') {
        dist[nr][nc] = d + 1;
        queue.push([nr, nc, d + 1]);
      }
    }
  }
  return dist; // dist[r][c] = shortest distance, or -1 if unreachable
}

// Example: find distances in a 5x5 maze
const maze = [
  ['.','.','.','#','.'],
  ['.','.','#','.','.'],
  ['.','.','.','.','.'],
  ['#','#','.','.','#'],
  ['.','.','.','.','.'],
];
const distances = gridBFS(maze, 0, 0);
console.log(distances[4][4]); // shortest path from (0,0) to (4,4)
      

Bidirectional BFS

Python

from collections import deque

def bidirectional_bfs(graph, start, end):
    """Find shortest path using BFS from BOTH ends simultaneously.
    
    Regular BFS explores O(b^d) nodes where b=branching factor, d=distance.
    Bidirectional explores O(2 * b^(d/2)) โ€” exponentially fewer.
    
    Particularly effective for word ladder and social network distance
    where the graph is wide but the path is relatively short.
    """
    if start == end:
        return [start]

    # Forward BFS from start, backward BFS from end
    front_visited = {start: None}   # node โ†’ predecessor
    back_visited = {end: None}      # node โ†’ predecessor
    front_queue = deque([start])
    back_queue = deque([end])

    while front_queue and back_queue:
        # Expand the SMALLER frontier (optimization)
        if len(front_queue) <= len(back_queue):
            meeting = _expand(graph, front_queue, front_visited, back_visited)
        else:
            meeting = _expand(graph, back_queue, back_visited, front_visited)

        if meeting is not None:
            return _build_path(meeting, front_visited, back_visited)

    return None  # No path exists

def _expand(graph, queue, visited, other_visited):
    """Expand one level of BFS. Returns meeting node if frontiers meet."""
    level_size = len(queue)
    for _ in range(level_size):
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in other_visited:
                # Frontiers met! Record predecessor and return
                visited[neighbor] = node
                return neighbor
            if neighbor not in visited:
                visited[neighbor] = node
                queue.append(neighbor)
    return None

def _build_path(meeting, front_visited, back_visited):
    """Reconstruct path from start โ†’ meeting โ† end."""
    # Forward path: start โ†’ meeting
    path = []
    node = meeting
    while node is not None:
        path.append(node)
        node = front_visited.get(node)
    path.reverse()

    # Backward path: meeting โ†’ end
    node = back_visited.get(meeting)
    while node is not None:
        path.append(node)
        node = back_visited.get(node)

    return path
      

0-1 BFS (Deque-Based Weighted BFS)

Python

from collections import deque

def zero_one_bfs(graph, start, n):
    """Shortest path when edges have weight 0 or 1.
    graph[u] = list of (v, weight) where weight is 0 or 1.
    
    Use a deque instead of a priority queue:
    - Weight 0 edges: add to FRONT (like they're free)
    - Weight 1 edges: add to BACK (like regular BFS)
    
    This gives O(V + E) instead of Dijkstra's O(E log V)."""
    dist = [float('inf')] * n
    dist[start] = 0
    dq = deque([start])

    while dq:
        u = dq.popleft()

        for v, w in graph[u]:
            new_dist = dist[u] + w
            if new_dist < dist[v]:
                dist[v] = new_dist
                if w == 0:
                    dq.appendleft(v)  # Free edge โ†’ front of deque
                else:
                    dq.append(v)      # Cost-1 edge โ†’ back of deque

    return dist
      

Real-World Applications

1. Social Network "Degrees of Separation"

LinkedIn's "1st connection," "2nd connection," "3rd connection" labels are literally BFS levels. Starting from your profile (level 0), your direct connections are level 1, their connections are level 2, and so on. Facebook's "People You May Know" feature runs BFS from your profile to depth 2-3 and recommends highly-connected nodes at those levels. The "Six Degrees of Kevin Bacon" game is BFS on the movie-actor graph โ€” the answer is the BFS distance between two actors.

2. GPS Navigation (Unweighted)

When Google Maps finds the route with the fewest turns or the fewest stops on public transit, that's BFS on a road/transit graph. The "fewest transfers" option on subway systems is pure BFS โ€” each station is a node, each direct route between stations is an edge, and BFS finds the minimum number of transfers. For weighted shortest path (actual distance or travel time), Dijkstra's algorithm takes over โ€” but Dijkstra is essentially BFS with a priority queue.

3. Web Crawlers

Googlebot and other web crawlers discover pages using BFS. Starting from a set of seed URLs, the crawler fetches each page, extracts all links, and adds unvisited URLs to a queue. BFS ensures that pages close to the seeds (high-quality, well-linked pages) are discovered before obscure deep pages. The crawler's frontier queue can hold billions of URLs โ€” real crawlers use distributed BFS across thousands of machines.

4. Network Broadcasting and Peer-to-Peer

When a message spreads through a network (gossip protocols in distributed systems, epidemic routing in DTN networks), the spreading pattern follows BFS. Each node that receives the message forwards it to all neighbors that haven't seen it yet. BitTorrent's peer discovery uses a BFS-like protocol on its distributed hash table (DHT) to find peers that have pieces of the file you want.

Classic Example: Word Ladder

Given two words and a dictionary, find the shortest transformation sequence where you change one letter at a time and each intermediate word must be in the dictionary. This is BFS on an implicit graph โ€” words are nodes, and an edge connects two words that differ by exactly one letter.

Python

from collections import deque

def word_ladder(begin_word, end_word, word_list):
    """Find shortest transformation: begin_word โ†’ end_word,
    changing one letter at a time, each step must be in word_list.
    
    This is BFS on an implicit graph:
    - Nodes: words in the dictionary
    - Edges: words differing by exactly one letter
    - We want the shortest path (fewest transformations)
    
    Example: "hit" โ†’ "hot" โ†’ "dot" โ†’ "dog" โ†’ "cog" (4 steps)"""
    word_set = set(word_list)
    if end_word not in word_set:
        return 0  # Target not in dictionary

    queue = deque([(begin_word, 1)])  # (current_word, steps)
    visited = {begin_word}

    while queue:
        word, steps = queue.popleft()

        # Try changing each letter to every letter a-z
        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                if c == word[i]:
                    continue  # Skip same letter

                # Build the new word with one letter changed
                new_word = word[:i] + c + word[i+1:]

                if new_word == end_word:
                    return steps + 1  # Found it!

                if new_word in word_set and new_word not in visited:
                    visited.add(new_word)
                    queue.append((new_word, steps + 1))

    return 0  # No transformation sequence exists

# Example
words = ["hot", "dot", "dog", "lot", "log", "cog"]
result = word_ladder("hit", "cog", words)
print(f"Shortest transformation: {result} steps")
# hit โ†’ hot โ†’ dot โ†’ dog โ†’ cog = 5 steps (including start word)
      

Word ladder is a perfect example of BFS on an implicit graph โ€” you never build the full graph data structure. Instead, you generate neighbors on the fly by trying every possible single-letter change. The state space is the dictionary, and BFS explores it level by level to find the shortest transformation. This same pattern applies to any problem where states are complex objects (lock combinations, puzzle configurations, chemical formulas) and transitions are well-defined operations.

Interview Patterns

๐Ÿ’ก "Shortest" + "Unweighted" = BFS If the problem asks for the shortest/minimum number of steps, moves, or transformations and all steps cost the same, BFS is almost always the answer. Word ladder, minimum knight moves, shortest path in a maze, open the lock โ€” all BFS. If steps have different costs, you need Dijkstra instead.
๐Ÿ’ก The Level-Size Trick To process nodes level by level (for level-order traversal, counting BFS levels, or finding the depth where something happens), snapshot the queue size at the start of each level: level_size = len(queue). Process exactly that many nodes in a for loop. Everything added during that loop belongs to the next level. This is the most important BFS implementation detail.
๐Ÿ’ก Multi-Source BFS When the problem has multiple starting points (all rotten oranges, all gates, all 0s in a matrix), add them ALL to the queue before starting. This is equivalent to adding a virtual source node connected to all starting points. You get correct distances from the nearest source in one BFS pass. Don't run BFS from each source separately โ€” that's O(sources ร— V) instead of O(V).
๐Ÿ’ก 0-1 BFS for Binary Weights When edges have weight 0 or 1 (e.g., "move for free on roads, cost 1 to break a wall"), use a deque instead of a regular queue. Add weight-0 neighbors to the front and weight-1 neighbors to the back. This gives O(V+E) time instead of Dijkstra's O(E log V). The deque maintains the sorted-by-distance invariant that regular BFS has for unweighted graphs.
๐Ÿ’ก Bidirectional BFS For shortest path between two specific nodes, run BFS from BOTH ends simultaneously. Alternate expanding from the start and the end. When the two frontiers meet, you've found the shortest path. The time savings can be dramatic: if the shortest path has length d and each level has branching factor b, regular BFS explores O(b^d) nodes while bidirectional explores O(2 ร— b^(d/2)) โ€” an exponential improvement. Word ladder with a large dictionary is a classic application.
๐Ÿ’ก BFS for Topological Sort (Kahn's Algorithm) Track the in-degree (number of incoming edges) of each node. Enqueue all nodes with in-degree 0. Process each node by decrementing the in-degree of its neighbors. When a neighbor's in-degree hits 0, enqueue it. If you process all nodes, you have a valid topological ordering. If not, the graph has a cycle. This is BFS-based topological sort โ€” useful for course scheduling, build systems, and dependency resolution.

Practice Problems

ProblemDifficultyPatternLink
Binary Tree Level Order TraversalMediumLevel-by-level BFSLC #102
Number of IslandsMediumGrid BFS / connected componentsLC #200
Rotting OrangesMediumMulti-source BFSLC #994
Word LadderHardImplicit graph BFSLC #127
01 MatrixMediumMulti-source BFSLC #542
Shortest Path in Binary MatrixMedium8-directional grid BFSLC #1091
Open the LockMediumState-space BFSLC #752
Course Schedule IIMediumTopological sort (Kahn's)LC #210