DSA Crash Course
Advanced Data Structures

4.3 Segment Trees & BIT

Mark when done:

Deep Dive: When Prefix Sum is Not Enough

Beginner tip: A segment tree is a special tree that answers range queries (like "sum of elements 3 to 7") in O(log n) while also supporting fast updates. Think of it as a hierarchy of pre-computed summaries. You rarely need to implement one from scratch in interviews, but knowing WHEN to reach for it is valuable.

The Problem: Range Sum WITH Updates

arr = [1, 3, 5, 7, 9, 11]

With prefix sum:
  sum(2..4) = prefix[5] - prefix[2] = O(1)  FAST
  update(3, 10): must rebuild entire prefix array = O(n)  SLOW

With segment tree:
  sum(2..4) = O(log n)  slightly slower
  update(3, 10) = O(log n)  MUCH faster

If you have Q queries and Q updates on an array of size N:

  • Prefix sum: O(N + Q + QN) = O(QN) for updates
  • Segment tree: O(N + Qlog N + Qlog N) = O(Q*log N) for both

Segment Tree Structure

graph TD
  R["[0..5] sum=36"] --> L["[0..2] sum=9"]
  R --> RR["[3..5] sum=27"]
  L --> L1["[0..1] sum=4"]
  L --> L2["[2..2] = 5"]
  L1 --> A0["arr[0]=1"]
  L1 --> A1["arr[1]=3"]
  RR --> R1["[3..4] sum=16"]
  RR --> R2["[5..5] = 11"]
  R1 --> A3["arr[3]=7"]
  R1 --> A4["arr[4]=9"]
  classDef leaf fill:#dcfce7,stroke:#16a34a,color:#166534
  class A0,A1,A3,A4,L2,R2 leaf

Above is a segment tree built over [1, 3, 5, 7, 9, 11]. Every internal node stores the sum of its leaves' range — sum(0..5) = 36, sum(0..2) = 9, etc.

Query sum(2..5) visits only the bold-bordered nodes: [2..2] = 5, [3..4] = 16, [5..5] = 11. Total = 32. Three nodes, not six — that's the O(log n) win.

Update arr[3] = 6 (was 7) touches exactly four nodes on the path from leaf to root: arr[3], [3..4], [3..5], [0..5]. Each is updated by the delta (-1).

Array: [1, 3, 5, 7]

Segment tree (stores range sums):
           [16]          sum(0..3)
          /    \
       [4]      [12]     sum(0..1), sum(2..3)
      /   \    /    \
    [1]   [3] [5]   [7]  leaves = array elements

Query sum(1..2): need sum of indices 1,2
  Node [4] covers 0..1 -> only need right child [3]
  Node [12] covers 2..3 -> only need left child [5]
  Answer: 3 + 5 = 8. Visited 4 nodes, not all 4 elements.

Before You Start -- How to Think About Segment Trees

Why They Exist

The bottleneck:

  • Prefix sum gives O(1) range queries but O(n) for updates.
  • Raw array gives O(1) updates but O(n) for range queries.
  • Segment Tree: O(log n) for BOTH updates AND range queries.

When You Need a Segment Tree

1. "Range sum + update values" -> Segment Tree or BIT
2. "Range min/max + update" -> Segment Tree
3. "Count of elements in range" -> Segment Tree with merge sort or BIT
4. "Interval scheduling / calendar" -> Segment Tree

If there are NO updates, just use prefix sum. Simpler and faster.

Mental Model

A segment tree is a binary tree where:
  - Each LEAF = one array element
  - Each INTERNAL NODE = aggregate of its children's range
  
For range sum over [2..5]:
  Don't scan 4 elements. Query 2-3 nodes in the tree. O(log n).

Complexity Comparison

OperationArrayPrefix SumSegment Tree
Point UpdateO(1)O(n)O(log n)
Range QueryO(n)O(1)O(log n)

These are rarely asked in standard FAANG interviews. Focus on understanding when to use them, not memorizing the implementation.

Visual: Segment Tree Over [1, 3, 5, 7, 9, 11]

Each internal node stores the sum of the range its children cover. Leaves are the array values.

Indices:    0   1   2   3   4   5
Values:     1   3   5   7   9  11

                     [0..5]=36
                    /         \
              [0..2]=9        [3..5]=27
              /     \         /        \
         [0..1]=4  [2..2]=5  [3..4]=16  [5..5]=11
         /    \              /     \
     [0..0]=1 [1..1]=3   [3..3]=7  [4..4]=9

Total nodes: about 2n. Height: ceil(log2 n) -- six leaves give depth 4.

Trace: sum(2, 5) -- Range Sum Query in O(log n)

We want arr[2] + arr[3] + arr[4] + arr[5] = 5 + 7 + 9 + 11 = 32.

The query starts at the root and recurses, but only into nodes whose range overlaps [2..5]. The trick: any node whose range is fully inside [2..5] returns its stored sum immediately -- no descent needed.

visit [0..5]   partial overlap  -> recurse into both children
  visit [0..2] partial overlap  -> recurse
    visit [0..1] disjoint        -> return 0  (skip whole subtree!)
    visit [2..2] fully inside    -> return 5  (DONE, no descent)
  visit [3..5] fully inside     -> return 27 (DONE, no descent)
                                  total = 0 + 5 + 27 = 32

Visited nodes: [0..5], [0..2], [0..1], [2..2], [3..5]. Four "useful" nodes plus one cheap disjoint check -- O(log n) work, not O(range length).

Trace: update(3, 6) -- Set arr[3] = 6 in O(log n)

Old arr[3] = 7, new value is 6, delta is -1. Walk from the leaf up to the root, updating every node whose range contains index 3.

[3..3]: 7  ->  6      (the leaf itself)
[3..4]: 16 -> 15      (was 7+9, now 6+9)
[3..5]: 27 -> 26      (was 16+11, now 15+11)
[0..5]: 36 -> 35      (root)

Exactly log2(n) + 1 = 4 nodes touched. Every other node is untouched -- their ranges do not contain index 3.

Where the code lives: the matching SegmentTree template (build / query / update) belongs in templates.py alongside the other reference implementations. Use this README to understand when and why; lift the template when you actually need to code one.

Problems

#ProblemStatus
1Range Sum Query - Mutable[ ]
2Count of Smaller Numbers After Self[ ]
3Range Minimum Query[ ]
4My Calendar (Interval scheduling)[ ]

Notes