DSA Lab | Network Flow

What Is Network Flow?

A flow network is a directed graph where each edge has a capacity — the maximum amount of "stuff" it can carry. The stuff could be water, data packets, trucks, or an abstract unit. There's a designated source node (where flow originates) and a sink node (where flow ends up). The max flow problem asks: what's the maximum total flow you can push from source to sink without exceeding any edge's capacity?

Picture a system of water pipes. Each pipe has a maximum throughput. Water enters the system at a single source and exits at a single sink. You want to maximize total water delivery. Every intermediate junction (node) must conserve water — whatever flows in must flow out. That's the whole problem.

Network flow has been studied since the 1950s, originally motivated by Cold War military logistics. The Ford-Fulkerson method (1956) was developed by L.R. Ford Jr. and D.R. Fulkerson at the RAND Corporation to analyze Soviet railway capacity. The max-flow min-cut theorem they proved became one of the most important results in combinatorial optimization.

Formal Constraints

Capacity constraint: Flow on each edge must be non-negative and can't exceed the edge's capacity. Formally: 0 ≤ f(u,v) ≤ c(u,v) for every edge (u,v).
Flow conservation: For every node except source and sink, the total flow entering equals the total flow leaving. Nothing is created, stored, or destroyed in transit. Formally: Σ f(w,v) = Σ f(v,w) for all v ∉ {s, t}.
Skew symmetry: f(u,v) = -f(v,u). This is a bookkeeping convention that simplifies the math — sending 5 units from u to v is equivalent to "negative 5 units" from v to u.

The Max-Flow Min-Cut Theorem

This is the fundamental theorem of network flow, and it's one of those rare results that's both profound and practical.

A cut is a partition of the vertices into two groups: S (containing the source) and T (containing the sink). The capacity of a cut is the sum of capacities of edges going from S to T (not T to S — direction matters). A minimum cut (min cut) is the cut with the smallest capacity.

The theorem states: max flow = min cut. The maximum amount of flow you can push through the network equals the capacity of the narrowest bottleneck (the minimum cut). This is elegant because it connects two seemingly different questions — an optimization problem (max flow) and a structural property (min cut) — and shows they have the same answer.

This means you get two results for the price of one: solving max flow also gives you the min cut (by looking at which nodes are reachable from the source in the final residual graph — the graph showing remaining capacity on each edge).

Where Flow Problems Hide

Here's the surprising part: many problems that don't look like flow at all can be reduced to max flow. This is what makes network flow so powerful.

Maximum bipartite matching: Given workers and jobs, assign the maximum number of workers to jobs (each worker to one job, each job to one worker). Model as a flow network: source connects to all workers, all jobs connect to sink, edges from workers to compatible jobs. All capacities = 1. Max flow = max matching.
Image segmentation: Classify pixels as foreground or background. Source represents "foreground," sink represents "background." Edge weights encode how much a pixel "looks like" foreground/background and how similar neighboring pixels are. Min cut divides pixels optimally.
Airline scheduling: Assign aircraft to flights such that every flight is covered and aircraft don't teleport between airports.
Network reliability: The minimum number of edges to remove to disconnect node A from node B is exactly the min cut between A and B (with all capacities = 1).
Project selection: Given projects with profits and costs, and dependencies between them, which subset maximizes net profit? Reducible to min cut.
Baseball elimination: Can team X still win the league? This is a max flow problem (if the flow saturates all edges from the source, elimination is possible).
König's theorem (bipartite graphs): Max matching = min vertex cover. This connects matching, flow, and covering in bipartite graphs.

How It Works

Ford-Fulkerson Method — The Big Picture

The Ford-Fulkerson method is more of a framework than a specific algorithm — it describes what to do but leaves how to find augmenting paths unspecified.

Initialize: Start with zero flow on every edge.
Find an augmenting path — any path from source to sink in the residual graph where every edge has remaining capacity > 0.
Augment: Push flow along this path. The amount you can push is the bottleneck — the minimum remaining capacity on any edge in the path.
Update the residual graph: For each edge on the path, decrease its remaining capacity by the bottleneck amount. Also increase the reverse edge's capacity by the same amount.
Repeat until no augmenting path exists. The total flow pushed is the max flow.

The Residual Graph — Why Reverse Edges Matter

This is the most confusing part of flow algorithms for beginners, so let's be precise.

For every original edge (u → v) with capacity c carrying flow f:

The residual graph has a forward edge u → v with remaining capacity (c − f). This represents "how much more flow can still go through this pipe."
The residual graph has a reverse edge v → u with capacity f. This represents "how much flow we can undo."

Why do we need reverse edges? Because early flow decisions might not be globally optimal. The algorithm might push flow along one path and later need to "redirect" some of that flow along a different path to achieve higher total flow. Reverse edges enable this redirection. Without them, the algorithm can get stuck at a suboptimal flow.

Concrete example: Consider a diamond graph: S→A (cap 1), S→B (cap 1), A→B (cap 1), A→T (cap 1), B→T (cap 1). Max flow is 2 (S→A→T and S→B→T). But if Ford-Fulkerson first finds the path S→A→B→T and pushes 1 unit, the only remaining augmenting path is S→B→reverse(B→A)→A→T — it "undoes" the A→B flow using the reverse edge. Without the reverse edge, it would stop at flow = 1 (suboptimal).

Edmonds-Karp: BFS-based Ford-Fulkerson

Ford-Fulkerson with DFS to find augmenting paths has a problem: the time complexity is O(E × max_flow), which can be terrible if capacities are large. Worse, with irrational capacities, DFS-based Ford-Fulkerson might not even terminate.

Edmonds-Karp (1972) fixes this by using BFS (breadth-first search) to find the shortest augmenting path (fewest edges). This guarantees:

At most O(VE) augmenting paths are found (each BFS is O(E)).
Total time: O(VE²), completely independent of capacity values.
Always terminates, even with irrational capacities.

Why does BFS help? By always finding the shortest path, Edmonds-Karp ensures that the shortest-path distance from source to sink in the residual graph can only increase over time. It can increase at most V times, and between consecutive increases, at most E augmentations happen. That's where O(VE) comes from.

Dinic's Algorithm — Faster in Practice

Dinic's algorithm (1970, by Yefim Dinitz) improves on Edmonds-Karp by finding multiple augmenting paths per BFS. The algorithm:

BFS from source to build a level graph — assign each node a level (distance from source in the residual graph). Only keep edges that go from level i to level i+1.
Find blocking flows — use DFS to find augmenting paths in the level graph. A "blocking flow" saturates at least one edge on every possible source-to-sink path in the level graph.
Repeat: rebuild the level graph and find blocking flows until the sink is unreachable.

Time: O(V²E) in general, but O(E√V) for unit-capacity graphs (which covers bipartite matching). In practice, Dinic's is much faster than Edmonds-Karp due to finding multiple augmenting paths per BFS phase.

Finding the Min Cut

After running any max-flow algorithm to completion:

Do a BFS/DFS from the source in the final residual graph (only following edges with remaining capacity > 0).
All reachable vertices form set S. All unreachable vertices form set T.
The min cut consists of all original edges going from S to T. These edges are fully saturated (residual capacity = 0 in the forward direction).

Common Mistakes

⚠️ Forgetting Reverse Edges The #1 network flow bug. If you don't add reverse edges to the residual graph, your algorithm will find a suboptimal flow and stop early. Every edge (u→v) in the original graph needs a corresponding reverse edge (v→u) in the residual graph, initially with capacity 0. When you push flow f along u→v, the reverse edge v→u gains f capacity.

⚠️ Modifying the Graph During BFS BFS finds the path, but don't update capacities while searching. Find the complete path first, compute the bottleneck, then update all edges in a second pass. Updating during BFS can corrupt the search.

⚠️ Using DFS Without Capacity-Independence Ford-Fulkerson with DFS has time complexity O(E × max_flow). If capacities are huge (like 10⁹), this is far too slow. Use Edmonds-Karp (BFS) or Dinic's for polynomial guarantees. In interviews, always mention this distinction — it shows you understand the subtlety.

⚠️ Wrong Capacity Setup for Matching When reducing bipartite matching to max flow, all capacities should be 1 (or appropriate for the problem). If you set capacities wrong, a single worker might get assigned to multiple jobs. Also remember to add the super-source and super-sink — they're not part of the original problem but are required for the flow formulation.

⚠️ Confusing Edge Capacity and Flow The residual capacity of an edge is (capacity − current_flow), not just the capacity. Some implementations track flow separately from capacity; others track only the residual capacity directly. Pick one convention and be consistent. Mixing them up causes incorrect augmentation amounts.

Interactive Visualization

Flow Network — Augmenting Paths

Max Flow: 0

Operations & Complexity

Algorithm	Time	Space	Notes
Ford-Fulkerson (DFS)	O(E × max_flow)	O(V + E)	Depends on capacity values; may not terminate
Edmonds-Karp (BFS)	O(VE²)	O(V + E)	Polynomial, capacity-independent
Dinic's Algorithm	O(V²E)	O(V + E)	O(E√V) for unit capacities
Push-Relabel (FIFO)	O(V³)	O(V + E)	Best for very dense graphs
Hopcroft-Karp (matching)	O(E√V)	O(V + E)	Bipartite matching specifically
Min Cut (via Max Flow)	Same as max flow		BFS from source in final residual graph
Min Cost Max Flow	O(V²E) to O(VE²)	O(V + E)	SPFA/Bellman-Ford based

Implementation

Edmonds-Karp (BFS-based Max Flow)

Python


from collections import defaultdict, deque

def edmonds_karp(graph, source, sink):
    """Edmonds-Karp algorithm: BFS-based max flow.
    graph: dict of dict. graph[u][v] = capacity of edge u→v.
    Use defaultdict(lambda: defaultdict(int)) for easy construction.
    Returns: maximum flow value.
    Time: O(VE²), Space: O(V + E)."""

    def bfs(source, sink, parent):
        """Find shortest augmenting path using BFS.
        Returns True if a path exists, False otherwise.
        parent dict records the path for later traversal."""
        visited = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in visited and graph[u][v] > 0:
                    visited.add(v)
                    parent[v] = u
                    if v == sink:
                        return True  # found a path
                    queue.append(v)
        return False  # sink unreachable

    max_flow = 0

    while True:
        parent = {}
        if not bfs(source, sink, parent):
            break  # no more augmenting paths — done

        # Trace the path to find the bottleneck
        path_flow = float('inf')
        v = sink
        while v != source:
            u = parent[v]
            path_flow = min(path_flow, graph[u][v])
            v = u

        # Update residual graph along the path
        v = sink
        while v != source:
            u = parent[v]
            graph[u][v] -= path_flow  # decrease forward edge
            graph[v][u] += path_flow  # increase reverse edge
            v = u

        max_flow += path_flow

    return max_flow

# Usage example:
# g = defaultdict(lambda: defaultdict(int))
# g['S']['A'] = 10; g['S']['B'] = 8
# g['A']['B'] = 5;  g['A']['T'] = 7
# g['B']['T'] = 10
# print(edmonds_karp(g, 'S', 'T'))  # max flow value

Dinic's Algorithm

Python


from collections import deque

class Dinic:
    """Dinic's max flow algorithm.
    Faster than Edmonds-Karp in practice.
    O(V²E) general, O(E√V) for unit-capacity graphs.
    
    Use add_edge to build the graph, then call max_flow."""
    
    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # adjacency list of edge indices
        self.edges = []  # list of [to, capacity, rev_index]

    def add_edge(self, u, v, cap):
        """Add directed edge u→v with capacity cap."""
        # Forward edge
        self.graph[u].append(len(self.edges))
        self.edges.append([v, cap, len(self.edges) + 1])
        # Reverse edge (capacity 0 initially)
        self.graph[v].append(len(self.edges))
        self.edges.append([u, 0, len(self.edges) - 1])

    def bfs(self, s, t):
        """Build level graph using BFS. Returns True if t is reachable."""
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for idx in self.graph[u]:
                v, cap, _ = self.edges[idx]
                if self.level[v] == -1 and cap > 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] != -1

    def dfs(self, u, t, pushed):
        """Find blocking flow using DFS with current-edge optimization."""
        if u == t:
            return pushed
        while self.iter[u] < len(self.graph[u]):
            idx = self.graph[u][self.iter[u]]
            v, cap, rev = self.edges[idx]
            if self.level[v] == self.level[u] + 1 and cap > 0:
                d = self.dfs(v, t, min(pushed, cap))
                if d > 0:
                    self.edges[idx][1] -= d      # decrease forward
                    self.edges[rev][1] += d      # increase reverse
                    return d
            self.iter[u] += 1
        return 0

    def max_flow(self, s, t):
        """Compute max flow from s to t."""
        flow = 0
        while self.bfs(s, t):
            self.iter = [0] * self.n  # current edge pointer per node
            while True:
                f = self.dfs(s, t, float('inf'))
                if f == 0:
                    break
                flow += f
        return flow

# Usage:
# d = Dinic(6)  # 6 nodes: 0=source, 5=sink
# d.add_edge(0, 1, 10)
# d.add_edge(0, 2, 8)
# d.add_edge(1, 2, 5)
# d.add_edge(1, 5, 7)
# d.add_edge(2, 5, 10)
# print(d.max_flow(0, 5))

JavaScript


function edmondsKarp(capacity, source, sink, n) {
  // capacity: 2D array n×n, capacity[u][v] = edge capacity
  // Uses residual capacity directly (no separate flow array)
  const residual = capacity.map(row => [...row]);
  let maxFlow = 0;

  while (true) {
    // BFS to find shortest augmenting path
    const parent = new Array(n).fill(-1);
    parent[source] = source;
    const queue = [source];
    let head = 0;

    while (head < queue.length && parent[sink] === -1) {
      const u = queue[head++];
      for (let v = 0; v < n; v++) {
        if (parent[v] === -1 && residual[u][v] > 0) {
          parent[v] = u;
          queue.push(v);
        }
      }
    }

    if (parent[sink] === -1) break; // no more paths

    // Find bottleneck along the path
    let aug = Infinity;
    for (let v = sink; v !== source; v = parent[v])
      aug = Math.min(aug, residual[parent[v]][v]);

    // Update residual capacities
    for (let v = sink; v !== source; v = parent[v]) {
      residual[parent[v]][v] -= aug;
      residual[v][parent[v]] += aug;
    }
    maxFlow += aug;
  }
  return maxFlow;
}

Finding the Min Cut After Max Flow

Python


def find_min_cut(graph, source):
    """After running Edmonds-Karp (which modifies graph to residual),
    find the min cut by BFS from source in the residual graph.
    
    Vertices reachable from source = set S.
    Vertices not reachable = set T.
    Cut edges: all original edges from S to T.
    
    Returns (S, T) partition."""
    visited = set()
    queue = deque([source])
    visited.add(source)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in visited and graph[u][v] > 0:
                visited.add(v)
                queue.append(v)
    
    S = visited
    T = set(graph.keys()) - S
    
    # The cut edges are original edges from S to T
    # (they have residual capacity 0 in forward direction)
    return S, T

Bipartite Matching via Max Flow

Python


def max_bipartite_matching(left_nodes, right_nodes, edges):
    """Reduce maximum bipartite matching to max flow.
    left_nodes: list of left-side node ids
    right_nodes: list of right-side node ids
    edges: list of (left, right) pairs representing valid assignments
    
    Returns the maximum number of matched pairs."""
    graph = defaultdict(lambda: defaultdict(int))
    SOURCE = '__source__'
    SINK = '__sink__'
    
    # Source → each left node (capacity 1: each worker gets one job)
    for l in left_nodes:
        graph[SOURCE][l] = 1
    
    # Each right node → sink (capacity 1: each job gets one worker)
    for r in right_nodes:
        graph[r][SINK] = 1
    
    # Left → right edges (capacity 1)
    for l, r in edges:
        graph[l][r] = 1
    
    return edmonds_karp(graph, SOURCE, SINK)

# Example: 3 workers, 3 jobs
# Worker A can do jobs 1, 2
# Worker B can do job 2
# Worker C can do jobs 2, 3
# matching = max_bipartite_matching(
#     ['A', 'B', 'C'], ['1', '2', '3'],
#     [('A','1'), ('A','2'), ('B','2'), ('C','2'), ('C','3')]
# )
# Result: 3 (A→1, B→2, C→3)

Real-World Example: Load Balancing Across Servers

Network flow models server-to-task assignment in data centers. Each server has a capacity (max concurrent requests). Each task type has a demand (how many instances need to run). Some servers can handle certain task types. How do you assign tasks to servers to maximize total throughput?

Python


def balance_load(servers, tasks, compatibility):
    """Assign tasks to servers to maximize throughput.
    
    servers: dict of server_name -> capacity (max concurrent tasks)
    tasks: dict of task_type -> demand (how many instances needed)
    compatibility: dict of server_name -> [list of task types it can run]
    
    Returns: max tasks that can be assigned, and the assignment."""
    graph = defaultdict(lambda: defaultdict(int))
    SOURCE = '__source__'
    SINK = '__sink__'
    
    # Source → each task type (capacity = demand)
    for task, demand in tasks.items():
        graph[SOURCE][f'task_{task}'] = demand
    
    # Each server → sink (capacity = server's max capacity)
    for server, cap in servers.items():
        graph[f'server_{server}'][SINK] = cap
    
    # Task → compatible servers (capacity = min of demand and server cap)
    for server, compatible_tasks in compatibility.items():
        for task in compatible_tasks:
            if task in tasks:
                # Capacity: how many of this task the server could handle
                graph[f'task_{task}'][f'server_{server}'] = min(
                    tasks[task], servers[server]
                )
    
    max_assigned = edmonds_karp(graph, SOURCE, SINK)
    total_demand = sum(tasks.values())
    
    print(f"Assigned {max_assigned}/{total_demand} tasks")
    if max_assigned < total_demand:
        print(f"⚠️  {total_demand - max_assigned} tasks unassigned (need more capacity)")
    return max_assigned

# Example: 3 servers, 3 task types
servers = {'web-1': 10, 'web-2': 8, 'gpu-1': 4}
tasks = {'api': 12, 'ml-inference': 4, 'batch': 6}
compat = {
    'web-1': ['api', 'batch'],
    'web-2': ['api', 'batch'],
    'gpu-1': ['ml-inference', 'batch']
}
balance_load(servers, tasks, compat)
# Assigned 22/22 tasks: web-1 handles 10 api, web-2 handles 2 api + 6 batch, gpu-1 handles 4 ml-inference

This same formulation is used by Kubernetes schedulers, CDN load balancers, and cloud platforms. The "servers" might be pods, edge nodes, or availability zones. The "tasks" might be containers, user requests, or data shards.

Interview Patterns

💡 Bipartite Matching = Max Flow Any maximum bipartite matching problem can be solved as a max flow problem. Add a super-source connected to all left nodes (capacity 1 each), a super-sink connected to all right nodes (capacity 1 each), and edges between compatible left-right pairs (capacity 1). Max flow = maximum matching. This is how you solve assignment problems at scale. For interviews, also know the Hungarian algorithm (O(n³)) for minimum-cost matching.

💡 Min Cut for Separation Problems When a problem asks "what's the minimum number of edges to remove to disconnect X from Y," that's a min cut problem. By the max-flow min-cut theorem, solve max flow and you get the answer. The actual cut edges are found by BFS from source in the residual graph after max flow terminates — cut edges are those crossing from reachable to unreachable vertices.

💡 Recognize Hidden Flow Problems Network flow problems are rarely presented as "find the max flow." They're disguised as: maximum matching, minimum vertex cover (König's theorem), maximum independent set (on bipartite graphs), minimum edge/vertex-disjoint paths, or minimum path cover on DAGs. The reduction usually involves: (1) identify source and sink, (2) model constraints as capacities, (3) model the objective as total flow.

💡 Vertex Capacities via Node Splitting If a vertex has a capacity (maximum flow through it), split it into two nodes: v_in and v_out, connected by an edge with the vertex's capacity. All incoming edges go to v_in, all outgoing edges leave from v_out. This converts vertex capacities to edge capacities, which standard flow algorithms handle.

💡 Min Cost Max Flow for Optimization When you need max flow but also want to minimize cost (each edge has both a capacity and a cost per unit of flow), use min cost max flow (MCMF). The algorithm repeatedly finds the shortest (cheapest) augmenting path using Bellman-Ford or SPFA. This solves assignment problems with preferences, transportation problems with costs, and many competitive programming problems.

Practice Problems

#	Problem	Difficulty	Key Concept
785	Is Graph Bipartite?	Medium	Bipartite check (flow prereq)
1494	Parallel Courses II	Hard	Flow-like scheduling
1349	Maximum Students Taking Exam	Hard	Bipartite matching / flow
1066	Campus Bikes II	Medium	Assignment / matching
2850	Minimum Moves to Spread Stones	Medium	Min cost flow / matching
1595	Min Cost to Connect Two Groups	Hard	Min cost bipartite matching
1947	Max Compatibility Score Sum	Medium	Assignment (bitmask DP or flow)

Network Flow

On This Page

What Is Network Flow?

Formal Constraints

The Max-Flow Min-Cut Theorem

Where Flow Problems Hide

How It Works

Ford-Fulkerson Method — The Big Picture

The Residual Graph — Why Reverse Edges Matter

Edmonds-Karp: BFS-based Ford-Fulkerson

Dinic's Algorithm — Faster in Practice

Finding the Min Cut

Common Mistakes

Interactive Visualization

Operations & Complexity

Implementation

Edmonds-Karp (BFS-based Max Flow)

Dinic's Algorithm

Finding the Min Cut After Max Flow

Bipartite Matching via Max Flow

Real-World Example: Load Balancing Across Servers

Interview Patterns

Practice Problems