DSA Crash Course
17 FAANG Patterns

Pattern 9: Two Heaps

Mark when done:

Deep Dive: Streaming Median in Action

Find Median from Data Stream

Add numbers one at a time: 5, 2, 8, 1, 4

Add 5: lo=[-5]     hi=[]       median=5
Add 2: lo=[-5]     hi=[2]  WRONG! 2<5, should be in lo.
       Rebalance:  lo=[-2] hi=[5]  -> no, rethink:
       
Correct flow (always push to lo first, then balance):
Add 5: lo=[-5]     hi=[]       median=5
Add 2: push to lo: [-5,-2], pop max to hi: lo=[-2] hi=[5]  median=(2+5)/2=3.5
Add 8: push to lo: [-5,-2], -> nah. Push to hi first? 

Simpler: always push to lo, then move lo-top to hi, then rebalance sizes:
Add 5: lo=[-5]        hi=[]         median=5
Add 2: lo=[-5,-2]     hi=[]         rebal: lo=[-2] hi=[5]      med=3.5
Add 8: lo=[-5,-2]     push hi=[5,8] rebal: lo=[-5,-2] hi=[5,8] WRONG sizes
       -> lo=[-5,-2,X] need: push 8 to lo->hi flow:
       
Standard flow: push to maxheap, move top to minheap, rebalance:
Add 5: lo=[-5]        hi=[]       -> rebal (hi smaller): move 5 back. lo=[] hi=[5]. 
       Actually: lo=[-5] hi=[] -> |lo|>|hi|+1? no (1>1? no) -> ok. median=-(-5)=5
Add 2: push lo: [-5,-2]. top=-(-5)=5. move to hi: lo=[-2] hi=[5]. med=(2+5)/2=3.5
Add 8: push lo: [-8,-2]. top=-(-8)=8. move to hi: lo=[-2] hi=[5,8]. |hi|>|lo|: move 5 back. lo=[-5,-2] hi=[8]. med=5
Add 1: push lo: [-5,-2,-1]. top=5. move to hi: lo=[-2,-1] hi=[5,8]. med=(2+5)/2=3.5
Add 4: push lo: [-4,-2,-1]. top=4. move to hi: lo=[-2,-1] hi=[4,5,8]. |hi|>|lo|: move 4 back. lo=[-4,-2,-1] hi=[5,8]. med=4

Key invariant: max-heap top <= min-heap top. Sizes differ by at most 1. Median always accessible from tops.


Signal: "Running median", balance two halves, "smaller half / larger half" Template: See MedianFinder in templates.py

When to Use

  • "Find median from data stream"
  • Any problem requiring balanced partition of elements into two halves
  • "Maximize minimum" or "minimize maximum" with streaming data

Key Idea

Max-heap for the LOWER half (smaller elements)
Min-heap for the UPPER half (larger elements)
Balance: sizes differ by at most 1
Median = top of larger heap, or average of both tops

Re-solve (TIMED)

ProblemSourceTarget TimeDone?
Find Median from Data StreamPhase 1.820 min[ ]

New Practice Problems

Add problems to solutions/ and enrich as needed.


First-time Recognition Signals

When you read a brand-new problem statement, this pattern is the right tool if you see:

  • "Find the median as numbers stream in" — the canonical two-heaps problem.
  • "Maintain a balanced split into a smaller half and a larger half" with cheap O(1) access to both extremes — max-heap on the low side, min-heap on the high side.
  • "Sliding window median" — two heaps plus lazy deletion (or a SortedList / multiset).
  • "IPO / project scheduling: pick K items, each next pick depends on a threshold from previous picks" — one heap for "currently affordable", another for "waiting to become affordable".

Anti-signals (looks like this pattern, isn't)

  • "Median of a static array" — just sort, or use Quickselect; two heaps are overkill.
  • "K largest / K smallest elements" — a single heap of size K (Top-K pattern), not two heaps.
  • "Range median queries on a static array" — that is a Wavelet Tree / Merge Sort Tree problem, not Two Heaps.