2.1 Sorting & Searching
Deep Dive: Binary Search is a Decision Machine
Beginner tip: Binary search is like guessing a number between 1 and 100. Your friend says "higher" or "lower" after each guess. If you always guess the middle, you find the answer in at most 7 guesses (log2 of 100). That is binary search -- cut the possibilities in half each step. It works on anything sorted or anything with a yes/no boundary.
Classic Binary Search -- Trace It
nums = [-1, 0, 3, 5, 9, 12], target = 9:
lo=0 hi=5 mid=2 nums[2]=3 < 9 -> lo=3
lo=3 hi=5 mid=4 nums[4]=9 == 9 -> FOUND at index 4
Only 2 steps to search 6 elements. For 1,000,000 elements: at most 20 steps.
Binary Search on Rotated Array -- Which Half is Sorted?
nums = [4, 5, 6, 7, 0, 1, 2], target = 0:
lo=0 hi=6 mid=3 nums[3]=7
Is left half sorted? nums[0]=4 <= nums[3]=7 -> YES
Is target in [4..7)? No (0 < 4) -> go right. lo=4
lo=4 hi=6 mid=5 nums[5]=1
Is left half sorted? nums[4]=0 <= nums[5]=1 -> YES
Is target in [0..1)? YES -> go left. hi=4
lo=4 hi=4 nums[4]=0 == 0 -> FOUND
The trick: One half is ALWAYS sorted. Check if target is in the sorted half. If yes, search there. If no, search the other half.
Binary Search on Answer -- Koko Eating Bananas
"What is the minimum speed k such that Koko finishes all piles in h hours?"
Piles = [3, 6, 7, 11], h = 8
Can she finish at speed 4? ceil(3/4)+ceil(6/4)+ceil(7/4)+ceil(11/4) = 1+2+2+3 = 8 <= 8 -> YES
Can she finish at speed 3? 1+2+3+4 = 10 > 8 -> NO
Can she finish at speed 5? 1+2+2+3 = 8 <= 8 -> YES but 4 also works
Binary search: lo=1, hi=11. Find smallest k where hours(k) <= h.
The search space is NOT an array -- it is the range of possible answers.
Binary Search on the Answer Space -- Walked Through
The Koko trace above is the punchline. Here is the full mental model so you can apply it to any "minimum X such that..." problem.
Setup for Koko Eating Bananas: piles = [3, 6, 7, 11], h = 8. We want the minimum eating speed k such that Koko can finish all piles within h hours.
Step 1 -- define the answer space.
Lowest possible speed: 1 (must eat at least 1 banana/hour)
Highest sensible speed: max(piles) = 11 (any faster doesn't help -- one pile per hour is already enough)
Search space: [1, 11]
Notice: we are NOT searching piles. We are searching the values k could take.
Step 2 -- write the monotonic predicate.
def can_eat_all(speed, piles, h):
hours = sum((p + speed - 1) // speed for p in piles) # ceil(p/speed)
return hours <= h
The key property: can_eat_all is monotone in speed. If speed k works, every speed > k also works. That False, False, ..., False, True, True, ..., True shape is exactly what binary search needs -- we are hunting the boundary.
Step 3 -- binary search the boundary. Find the smallest k where the predicate flips to True.
lo = 1, hi = 11
mid = (1 + 11) // 2 = 6
can_eat_all(6, ...) = ceil(3/6) + ceil(6/6) + ceil(7/6) + ceil(11/6)
= 1 + 1 + 2 + 2 = 6 <= 8 -> True
True means 6 works -- try smaller. hi = mid = 6.
mid = (1 + 6) // 2 = 3
can_eat_all(3, ...) = 1 + 2 + 3 + 4 = 10 > 8 -> False
False means 3 too slow. lo = mid + 1 = 4.
mid = (4 + 6) // 2 = 5
can_eat_all(5, ...) = 1 + 2 + 2 + 3 = 8 <= 8 -> True
hi = mid = 5.
mid = (4 + 5) // 2 = 4
can_eat_all(4, ...) = 1 + 2 + 2 + 3 = 8 <= 8 -> True
hi = mid = 4.
lo == hi == 4 -> answer is 4.
The recipe. Every "binary search on the answer" problem follows the same three moves:
- Bound the answer space. What is the smallest meaningful answer? The largest?
[lo, hi]. - Write a
can_do(x)predicate that returns True/False and is monotone inx. - Binary search the boundary where False flips to True (or vice versa).
The lightbulb moment: we are not searching the input array, we are searching the output values. Once you see that, problems like Split Array Largest Sum, Capacity to Ship Packages in D Days, and Minimum Number of Days to Make M Bouquets all collapse to the same template.
Quick Reference (Cheat Sheet)
Before You Start -- How to Think About Sorting & Binary Search
Sorting: When and Which
Sorting is rarely the answer itself--it's a preprocessing step that unlocks better algorithms.
After sorting, you can:
- Use Two Pointers (O(n) instead of O(n^2))
- Use Binary Search (O(log n) instead of O(n))
- Detect duplicates in O(n)
- Merge intervals easily
Binary Search: Beyond "Find Element in Sorted Array"
Binary Search = Eliminate half the search space each step. It works whenever there's a monotonic property.
The Universal Template
def binary_search(lo, hi, condition):
while lo < hi:
mid = lo + (hi - lo) // 2 # avoid overflow
if condition(mid):
hi = mid # answer is mid or left
else:
lo = mid + 1 # answer is right
return lo
Binary Search on Answer -- The FAANG Favorite
When the problem says: "What is the MINIMUM X such that...?"
1. Define the search space: [min_possible, max_possible]
2. Write a function: can_achieve(X) -> True/False
3. Binary search for the smallest X where can_achieve(X) = True
Examples:
- "Minimum capacity to ship in D days" -> BS on capacity
- "Koko eating bananas" -> BS on speed
- "Split array largest sum" -> BS on the max sum
When You See a Search Problem, Ask Yourself
1. Is the data sorted? -> Classic binary search
2. Is there a monotonic condition? -> Binary search on answer
3. Is the sorted array ROTATED? -> Modified BS (check which half is sorted)
4. "Kth element"? -> Quick Select O(n) avg, or sort O(n log n)
Problems
| # | Problem | Pattern | Status |
|---|---|---|---|
| 1 | Binary Search | Classic | [ ] |
| 2 | Search in Rotated Sorted Array | Modified BS | [ ] |
| 3 | Find Minimum in Rotated Sorted Array | Modified BS | [ ] |
| 4 | Koko Eating Bananas | BS on Answer | [ ] |
| 5 | Split Array Largest Sum | BS on Answer | [ ] |
| 6 | Median of Two Sorted Arrays | BS partition | [ ] |
| 7 | Search a 2D Matrix | Row + Col BS | [ ] |
| 8 | Merge Sort (implementation) | Divide & Conquer | [ ] |
| 9 | Quick Select (Kth element) | Partition | [ ] |
| 10 | Sort Colors | Three Pointers | [ ] |
| 11 | Time Based Key-Value Store | BS on timestamps | [ ] |