Cut your search space in half every step. One of the most powerful ideas in computer science โ and one of the easiest to get subtly wrong.
๐ข BeginnerImagine looking up a word in a physical dictionary. You don't start at page one โ you flip to the middle, see if your word comes before or after, then flip to the middle of the remaining half. That's binary search. It's the same thing a librarian does, what you do when guessing a number in "higher/lower," and how git bisect finds the commit that introduced a bug.
The idea of binary search is ancient โ people have been using divide-and-conquer lookup for centuries with sorted books and catalogs. But the first formal description of the algorithm for computers came from John Mauchly in 1946. The first published correct implementation didn't appear until 1962, despite the algorithm being "well understood" for years. That gap between "I get the concept" and "I can write it without bugs" says everything about binary search.
Jon Bentley reported in his 1986 book Programming Pearls that when he asked professional programmers to implement binary search, about 90% got it wrong. Most of the bugs were off-by-one errors. And here's the kicker: Java's own Arrays.binarySearch had an integer overflow bug in the midpoint calculation that went undetected for nine years (2006, spotted by Joshua Bloch). The code used (low + high) / 2 which overflows when both values are near Integer.MAX_VALUE.
Your data must be sorted (or at least monotonic โ a function where the output consistently increases or decreases as the input increases). Given a sorted array of n elements, binary search finds any element in at most logโ(n) steps. For a million elements, that's about 20 comparisons. For a billion, about 30. For a trillion, 40. The power of halving is staggering.
The term monotonic means "always going in one direction." A function f is monotonically increasing if f(x) โค f(x+1) for all x. Binary search works on any monotonic predicate โ it doesn't have to be a sorted array. This insight is what unlocks "binary search on the answer" problems.
You maintain two pointers โ left and right โ marking the boundaries of the current search space. Calculate mid as the halfway point. Compare the target to the element at mid:
Repeat until left > right (element not found) or you find a match.
Why this terminates: Every step shrinks the search space by at least one element (because we move left past mid or right past mid). Eventually left > right, meaning the search space is empty. With n elements, halving at most logโ(n) times gets you to a single element.
Lower bound (bisect_left) โ find the first position where you could insert a value and keep things sorted. Instead of returning when you find a match, you keep searching left. The loop condition is left < right (strict), and when arr[mid] โฅ target, you set right = mid (not mid - 1). Useful for "find the first occurrence of X" or "how many elements are less than X."
Upper bound (bisect_right) โ find the first position strictly greater than the target. When arr[mid] โค target, move left forward. The result minus 1 gives you "the last occurrence of X." The difference between upper_bound and lower_bound gives you "how many copies of X exist."
Search on answer โ sometimes you're not searching an array at all. You're searching a range of possible answers and checking "is this answer feasible?" If feasibility is monotonic (all values below some threshold fail, all values above it succeed), binary search finds that threshold. This shows up constantly in competitive programming. Example: "What's the minimum speed to eat all bananas in H hours?" You binary search on speed values and check if that speed is fast enough.
Rotated array โ a sorted array that's been shifted (like [4,5,6,7,0,1,2]). One half is always sorted โ the half where arr[left] โค arr[mid] is the sorted half (for left half). Compare the target to the endpoints of the sorted half to decide which side to search. The trick is handling the boundary cases correctly (what if arr[left] == arr[mid]?).
Peak finding โ the array isn't sorted at all, but you want to find a local maximum. If arr[mid] < arr[mid+1], the peak is to the right (the values are still climbing). If arr[mid] < arr[mid-1], the peak is to the left. This is binary search without sorting โ the only requirement is that a peak exists (guaranteed when endpoints are -โ).
Enter a target value (or use the random one) and watch binary search eliminate half the array each step. The dimmed elements are the ones we've ruled out.
mid = (left + right) / 2 overflows when left + right exceeds the maximum integer value. Use mid = left + (right - left) / 2 instead. This is the bug that lived in Java's standard library for nine years. In Python, integers have arbitrary precision so it doesn't overflow, but the safe form is a good habit regardless.
left <= right and set right = mid - 1 or left = mid + 1. For lower/upper bound, use left < right and set right = mid (not mid - 1). Mixing these two templates is the #1 source of infinite loops and missed elements. Pick one template and stick with it.
Every step cuts the search space in half. Starting with n elements: n โ n/2 โ n/4 โ ... โ 1. That's logโ(n) steps. Each step does O(1) work (one comparison, one pointer update).
| Operation | Time | Space | Notes |
|---|---|---|---|
| Standard search | O(log n) | O(1) | Iterative version |
| Recursive search | O(log n) | O(log n) | Call stack depth |
| Lower/upper bound | O(log n) | O(1) | Same idea, different termination |
| Rotated array search | O(log n) | O(1) | Extra comparison per step |
| Search on answer | O(log(range) ร check) | O(1) | check = cost of feasibility test |
(left + right) // 2. When left and right are both large (like 2 billion), their sum overflows a 32-bit integer. The subtraction form avoids this. It's the bug Java had for years. Mathematically identical, computationally safer.
def binary_search(arr, target):
"""Find target in sorted array. Returns index or -1.
Uses left <= right template โ good for exact match."""
left, right = 0, len(arr) - 1
while left <= right:
# Overflow-safe midpoint calculation
mid = left + (right - left) // 2
if arr[mid] == target:
return mid # Found it
elif arr[mid] < target:
left = mid + 1 # Target is in the right half
else:
right = mid - 1 # Target is in the left half
return -1 # Not found โ left is now the insertion point
function binarySearch(arr, target) {
let left = 0, right = arr.length - 1;
while (left <= right) {
const mid = left + Math.floor((right - left) / 2);
if (arr[mid] === target) return mid;
if (arr[mid] < target) left = mid + 1;
else right = mid - 1;
}
return -1;
}
def lower_bound(arr, target):
"""Find the leftmost index where target could be inserted.
Equivalent to Python's bisect.bisect_left().
Uses left < right template โ right starts at len(arr), not len-1."""
left, right = 0, len(arr) # Note: right = len, not len-1
while left < right: # Note: strict <, not <=
mid = left + (right - left) // 2
if arr[mid] < target:
left = mid + 1
else:
# arr[mid] >= target: this could be the answer, keep it
right = mid # Don't skip past mid โ it might be the one
return left # Insertion point (or index of first occurrence)
# To check if target actually exists at the returned index:
# idx = lower_bound(arr, target)
# found = idx < len(arr) and arr[idx] == target
def upper_bound(arr, target):
"""Find the first index strictly greater than target.
Equivalent to Python's bisect.bisect_right().
Last occurrence of target is at upper_bound - 1."""
left, right = 0, len(arr)
while left < right:
mid = left + (right - left) // 2
if arr[mid] <= target: # Note: <= not <
left = mid + 1
else:
right = mid
return left
# Count of target in sorted array:
# count = upper_bound(arr, target) - lower_bound(arr, target)
def search_rotated(nums, target):
"""Search in a sorted array that's been rotated.
e.g., [4,5,6,7,0,1,2] is [0,1,2,4,5,6,7] rotated at index 3.
Key insight: one half is ALWAYS sorted."""
left, right = 0, len(nums) - 1
while left <= right:
mid = left + (right - left) // 2
if nums[mid] == target:
return mid
# Figure out which half is sorted
if nums[left] <= nums[mid]:
# Left half [left..mid] is sorted
if nums[left] <= target < nums[mid]:
right = mid - 1 # Target is in the sorted left half
else:
left = mid + 1 # Target must be in the right half
else:
# Right half [mid..right] is sorted
if nums[mid] < target <= nums[right]:
left = mid + 1 # Target is in the sorted right half
else:
right = mid - 1 # Target must be in the left half
return -1
def min_eating_speed(piles, h):
"""Koko eats bananas. piles[i] = bananas in pile i.
She can eat at most k bananas/hour (one pile per hour).
Find minimum k to finish all piles in h hours.
The 'answer' is k. If k works, any k' > k also works (monotonic).
Binary search on k from 1 to max(piles)."""
import math
def can_finish(k):
"""Check if eating speed k lets us finish in h hours."""
hours_needed = sum(math.ceil(pile / k) for pile in piles)
return hours_needed <= h
left, right = 1, max(piles)
while left < right:
mid = left + (right - left) // 2
if can_finish(mid):
right = mid # mid works, but maybe something smaller does too
else:
left = mid + 1 # mid doesn't work, need to go faster
return left # Minimum speed that works
Git has a built-in binary search tool called git bisect. Say your test suite was passing on commit #100 but fails on commit #500. Somewhere in those 400 commits, someone introduced a bug. You could check each commit one by one (linear search โ 400 checks), or you could use binary search:
# Simulating git bisect logic
def git_bisect(commits, test_fn):
"""Find the first commit where test_fn returns False (the bug).
commits = list of commit hashes, oldest first.
test_fn(commit) = True if the commit is good, False if bad.
This is just lower_bound where we search for the first 'bad' commit.
Real git bisect does exactly this on the commit graph."""
left, right = 0, len(commits) - 1
print(f"Bisecting {right - left + 1} commits...")
steps = 0
while left < right:
mid = left + (right - left) // 2
steps += 1
result = test_fn(commits[mid])
if result: # Good commit
print(f" Step {steps}: commit {commits[mid][:8]} is GOOD "
f"โ bug is after this")
left = mid + 1
else: # Bad commit
print(f" Step {steps}: commit {commits[mid][:8]} is BAD "
f"โ bug is at or before this")
right = mid
print(f"\nFound the bad commit in {steps} steps: {commits[left]}")
print(f"(Linear search would have needed up to {len(commits)} steps)")
return commits[left]
# For 400 commits, binary search needs at most 9 steps.
# logโ(400) โ 8.6 โ rounds up to 9. That's the power of halving.
This is binary search in the wild. The "sorted array" is the commit history (good commits before bad commits โ a monotonic property). The "comparison" is running the test suite. Each step cuts the suspect range in half. What took 400 test runs linearly takes 9 with binary search.
left <= right vs left < right, and right = mid vs right = mid - 1. Pick one and internalize it. The "left < right, right = mid" variant is cleanest for lower/upper bound. The "left <= right, right = mid - 1" variant is cleanest for exact match.
bisect.bisect_left and bisect.bisect_right. C++ has std::lower_bound and std::upper_bound. Java has Arrays.binarySearch and Collections.binarySearch. In interviews, it's usually fine to use these โ but be ready to implement from scratch if asked, and know the exact semantics (what they return on not-found, whether the result is an index or an iterator).