DSA Lab | Binary Search

What Is Binary Search?

Imagine looking up a word in a physical dictionary. You don't start at page one — you flip to the middle, see if your word comes before or after, then flip to the middle of the remaining half. That's binary search. It's the same thing a librarian does, what you do when guessing a number in "higher/lower," and how git bisect finds the commit that introduced a bug.

Origin Story

The idea of binary search is ancient — people have been using divide-and-conquer lookup for centuries with sorted books and catalogs. But the first formal description of the algorithm for computers came from John Mauchly in 1946. The first published correct implementation didn't appear until 1962, despite the algorithm being "well understood" for years. That gap between "I get the concept" and "I can write it without bugs" says everything about binary search.

Jon Bentley reported in his 1986 book Programming Pearls that when he asked professional programmers to implement binary search, about 90% got it wrong. Most of the bugs were off-by-one errors. And here's the kicker: Java's own Arrays.binarySearch had an integer overflow bug in the midpoint calculation that went undetected for nine years (2006, spotted by Joshua Bloch). The code used (low + high) / 2 which overflows when both values are near Integer.MAX_VALUE.

The Core Requirement

Your data must be sorted (or at least monotonic — a function where the output consistently increases or decreases as the input increases). Given a sorted array of n elements, binary search finds any element in at most log₂(n) steps. For a million elements, that's about 20 comparisons. For a billion, about 30. For a trillion, 40. The power of halving is staggering.

The term monotonic means "always going in one direction." A function f is monotonically increasing if f(x) ≤ f(x+1) for all x. Binary search works on any monotonic predicate — it doesn't have to be a sorted array. This insight is what unlocks "binary search on the answer" problems.

How It Works

The Basic Algorithm

You maintain two pointers — left and right — marking the boundaries of the current search space. Calculate mid as the halfway point. Compare the target to the element at mid:

If they match, you're done.
If target is smaller, move right to mid - 1 (discard the upper half).
If target is larger, move left to mid + 1 (discard the lower half).

Repeat until left > right (element not found) or you find a match.

Why this terminates: Every step shrinks the search space by at least one element (because we move left past mid or right past mid). Eventually left > right, meaning the search space is empty. With n elements, halving at most log₂(n) times gets you to a single element.

The Variants

Lower bound (bisect_left) — find the first position where you could insert a value and keep things sorted. Instead of returning when you find a match, you keep searching left. The loop condition is left < right (strict), and when arr[mid] ≥ target, you set right = mid (not mid - 1). Useful for "find the first occurrence of X" or "how many elements are less than X."

Upper bound (bisect_right) — find the first position strictly greater than the target. When arr[mid] ≤ target, move left forward. The result minus 1 gives you "the last occurrence of X." The difference between upper_bound and lower_bound gives you "how many copies of X exist."

Search on answer — sometimes you're not searching an array at all. You're searching a range of possible answers and checking "is this answer feasible?" If feasibility is monotonic (all values below some threshold fail, all values above it succeed), binary search finds that threshold. This shows up constantly in competitive programming. Example: "What's the minimum speed to eat all bananas in H hours?" You binary search on speed values and check if that speed is fast enough.

Rotated array — a sorted array that's been shifted (like [4,5,6,7,0,1,2]). One half is always sorted — the half where arr[left] ≤ arr[mid] is the sorted half (for left half). Compare the target to the endpoints of the sorted half to decide which side to search. The trick is handling the boundary cases correctly (what if arr[left] == arr[mid]?).

Peak finding — the array isn't sorted at all, but you want to find a local maximum. If arr[mid] < arr[mid+1], the peak is to the right (the values are still climbing). If arr[mid] < arr[mid-1], the peak is to the left. This is binary search without sorting — the only requirement is that a peak exists (guaranteed when endpoints are -∞).

Interactive Visualization

Enter a target value (or use the random one) and watch binary search eliminate half the array each step. The dimmed elements are the ones we've ruled out.

Binary Search Visualizer

Steps: 0 | Range: [0, 15] | Status: Ready

Common Mistakes

⚠️ Mistake #1: Integer overflow in midpoint calculation Writing mid = (left + right) / 2 overflows when left + right exceeds the maximum integer value. Use mid = left + (right - left) / 2 instead. This is the bug that lived in Java's standard library for nine years. In Python, integers have arbitrary precision so it doesn't overflow, but the safe form is a good habit regardless.

⚠️ Mistake #2: Wrong loop condition (left <= right vs left < right) For exact-match search, use left <= right and set right = mid - 1 or left = mid + 1. For lower/upper bound, use left < right and set right = mid (not mid - 1). Mixing these two templates is the #1 source of infinite loops and missed elements. Pick one template and stick with it.

⚠️ Mistake #3: Forgetting the input must be sorted Binary search on unsorted data doesn't crash — it just gives wrong answers silently. If a problem hands you unsorted input and asks for a search, either sort first (adding O(n log n)) or use a hash set instead (O(n) time, O(n) space). Binary search without sorted data is just random guessing with extra steps.

⚠️ Mistake #4: Off-by-one on boundary updates When you find arr[mid] < target, the next left should be mid + 1, not mid. Setting left = mid can create an infinite loop when left == mid (which happens when the search space has two elements). Trace through a 2-element example on paper before submitting.

⚠️ Mistake #5: Not handling "not found" correctly When the target doesn't exist, a standard binary search returns -1. But a lower_bound returns the insertion point — the index where the target would go. These are different things. If you need "find the element" and you're using a lower_bound template, you still need to check whether the element at the returned index actually equals your target.

Complexity Analysis

Every step cuts the search space in half. Starting with n elements: n → n/2 → n/4 → ... → 1. That's log₂(n) steps. Each step does O(1) work (one comparison, one pointer update).

Operation	Time	Space	Notes
Standard search	O(log n)	O(1)	Iterative version
Recursive search	O(log n)	O(log n)	Call stack depth
Lower/upper bound	O(log n)	O(1)	Same idea, different termination
Rotated array search	O(log n)	O(1)	Extra comparison per step
Search on answer	O(log(range) × check)	O(1)	check = cost of feasibility test

💡 Why mid = left + (right - left) // 2? Not (left + right) // 2. When left and right are both large (like 2 billion), their sum overflows a 32-bit integer. The subtraction form avoids this. It's the bug Java had for years. Mathematically identical, computationally safer.

Implementation

Standard Binary Search

Python


def binary_search(arr, target):
    """Find target in sorted array. Returns index or -1.
    Uses left <= right template — good for exact match."""
    left, right = 0, len(arr) - 1

    while left <= right:
        # Overflow-safe midpoint calculation
        mid = left + (right - left) // 2

        if arr[mid] == target:
            return mid          # Found it
        elif arr[mid] < target:
            left = mid + 1      # Target is in the right half
        else:
            right = mid - 1     # Target is in the left half

    return -1  # Not found — left is now the insertion point

JavaScript


function binarySearch(arr, target) {
  let left = 0, right = arr.length - 1;

  while (left <= right) {
    const mid = left + Math.floor((right - left) / 2);

    if (arr[mid] === target) return mid;
    if (arr[mid] < target) left = mid + 1;
    else right = mid - 1;
  }

  return -1;
}

Lower Bound (First Occurrence / Insertion Point)

Python


def lower_bound(arr, target):
    """Find the leftmost index where target could be inserted.
    Equivalent to Python's bisect.bisect_left().
    Uses left < right template — right starts at len(arr), not len-1."""
    left, right = 0, len(arr)  # Note: right = len, not len-1

    while left < right:        # Note: strict <, not <=
        mid = left + (right - left) // 2
        if arr[mid] < target:
            left = mid + 1
        else:
            # arr[mid] >= target: this could be the answer, keep it
            right = mid        # Don't skip past mid — it might be the one

    return left  # Insertion point (or index of first occurrence)

# To check if target actually exists at the returned index:
# idx = lower_bound(arr, target)
# found = idx < len(arr) and arr[idx] == target

Upper Bound (Past the Last Occurrence)

Python


def upper_bound(arr, target):
    """Find the first index strictly greater than target.
    Equivalent to Python's bisect.bisect_right().
    Last occurrence of target is at upper_bound - 1."""
    left, right = 0, len(arr)

    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] <= target:   # Note: <= not <
            left = mid + 1
        else:
            right = mid

    return left

# Count of target in sorted array:
# count = upper_bound(arr, target) - lower_bound(arr, target)

Search in Rotated Sorted Array

Python


def search_rotated(nums, target):
    """Search in a sorted array that's been rotated.
    e.g., [4,5,6,7,0,1,2] is [0,1,2,4,5,6,7] rotated at index 3.
    Key insight: one half is ALWAYS sorted."""
    left, right = 0, len(nums) - 1

    while left <= right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid

        # Figure out which half is sorted
        if nums[left] <= nums[mid]:
            # Left half [left..mid] is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1  # Target is in the sorted left half
            else:
                left = mid + 1   # Target must be in the right half
        else:
            # Right half [mid..right] is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1   # Target is in the sorted right half
            else:
                right = mid - 1  # Target must be in the left half

    return -1

Binary Search on the Answer

Python


def min_eating_speed(piles, h):
    """Koko eats bananas. piles[i] = bananas in pile i.
    She can eat at most k bananas/hour (one pile per hour).
    Find minimum k to finish all piles in h hours.
    
    The 'answer' is k. If k works, any k' > k also works (monotonic).
    Binary search on k from 1 to max(piles)."""
    import math

    def can_finish(k):
        """Check if eating speed k lets us finish in h hours."""
        hours_needed = sum(math.ceil(pile / k) for pile in piles)
        return hours_needed <= h

    left, right = 1, max(piles)

    while left < right:
        mid = left + (right - left) // 2
        if can_finish(mid):
            right = mid        # mid works, but maybe something smaller does too
        else:
            left = mid + 1     # mid doesn't work, need to go faster

    return left  # Minimum speed that works

Real-World Example: Git Bisect

Git has a built-in binary search tool called git bisect. Say your test suite was passing on commit #100 but fails on commit #500. Somewhere in those 400 commits, someone introduced a bug. You could check each commit one by one (linear search — 400 checks), or you could use binary search:

Python


# Simulating git bisect logic
def git_bisect(commits, test_fn):
    """Find the first commit where test_fn returns False (the bug).
    commits = list of commit hashes, oldest first.
    test_fn(commit) = True if the commit is good, False if bad.
    
    This is just lower_bound where we search for the first 'bad' commit.
    Real git bisect does exactly this on the commit graph."""
    
    left, right = 0, len(commits) - 1
    
    print(f"Bisecting {right - left + 1} commits...")
    steps = 0
    
    while left < right:
        mid = left + (right - left) // 2
        steps += 1
        result = test_fn(commits[mid])
        
        if result:  # Good commit
            print(f"  Step {steps}: commit {commits[mid][:8]} is GOOD "
                  f"— bug is after this")
            left = mid + 1
        else:  # Bad commit
            print(f"  Step {steps}: commit {commits[mid][:8]} is BAD "
                  f"— bug is at or before this")
            right = mid
    
    print(f"\nFound the bad commit in {steps} steps: {commits[left]}")
    print(f"(Linear search would have needed up to {len(commits)} steps)")
    return commits[left]

# For 400 commits, binary search needs at most 9 steps.
# log₂(400) ≈ 8.6 — rounds up to 9. That's the power of halving.

This is binary search in the wild. The "sorted array" is the commit history (good commits before bad commits — a monotonic property). The "comparison" is running the test suite. Each step cuts the suspect range in half. What took 400 test runs linearly takes 9 with binary search.

Interview Patterns

💡 "Binary search on the answer" Many problems don't look like binary search at first. "What's the minimum capacity to ship packages in D days?" The capacity is the answer — and if capacity X works, any capacity > X also works. That monotonicity is your signal. Binary search the answer space and check feasibility at each midpoint. Look for keywords like "minimum maximum," "maximum minimum," or "smallest value such that..."

💡 Template confusion There are at least three common binary search templates, and mixing them causes bugs. The key differences: left <= right vs left < right, and right = mid vs right = mid - 1. Pick one and internalize it. The "left < right, right = mid" variant is cleanest for lower/upper bound. The "left <= right, right = mid - 1" variant is cleanest for exact match.

💡 Recognizing hidden binary search Anytime you see "minimum of maximum" or "maximum of minimum" in a problem statement, think binary search on the answer. Also look for: sorted arrays (obvious), rotated arrays, "find the peak", "find first bad version", matrix search where rows and columns are sorted.

💡 Binary search on a 2D matrix If a matrix has sorted rows and sorted columns (but not globally sorted), you can't treat it as a flat array. Instead, start at the top-right corner: if the value is too big, move left; if too small, move down. This isn't binary search per se — it's O(m + n) — but it uses the same elimination logic. For a matrix that IS globally sorted (each row starts where the previous ended), treat it as a flat sorted array and binary search with index math: row = mid / cols, col = mid % cols.

💡 Using the standard library Python has bisect.bisect_left and bisect.bisect_right. C++ has std::lower_bound and std::upper_bound. Java has Arrays.binarySearch and Collections.binarySearch. In interviews, it's usually fine to use these — but be ready to implement from scratch if asked, and know the exact semantics (what they return on not-found, whether the result is an index or an iterator).

Practice Problems

704. Binary Search — Easy. The classic. Make sure your implementation doesn't have off-by-one bugs.
278. First Bad Version — Easy. Lower bound in disguise. Minimize API calls.
35. Search Insert Position — Easy. Literally bisect_left. Good for internalizing the lower bound template.
33. Search in Rotated Sorted Array — Medium. The classic variant. Know which half is sorted and narrow accordingly.
34. Find First and Last Position — Medium. Two binary searches: lower bound + upper bound.
162. Find Peak Element — Medium. The array isn't sorted, but it's still binary searchable. Think about what direction the slope tells you.
1011. Capacity to Ship Packages — Medium. Binary search on the answer. Classic "min of max" problem.
4. Median of Two Sorted Arrays — Hard. Binary search on the partition point. One of the hardest binary search problems out there.

Binary Search & Variants

On This Page

What Is Binary Search?

Origin Story

The Core Requirement

How It Works

The Basic Algorithm

The Variants

Interactive Visualization

Common Mistakes

Complexity Analysis

Implementation

Standard Binary Search

Lower Bound (First Occurrence / Insertion Point)

Upper Bound (Past the Last Occurrence)

Search in Rotated Sorted Array

Binary Search on the Answer

Real-World Example: Git Bisect

Interview Patterns

Practice Problems