How to Think -- The DSA Problem-Solving Framework

Read this BEFORE you start solving problems. Come back and re-read every week.

The #1 Mistake

Most people see a problem and immediately start coding. This is backwards.

Wrong: Read problem -> Code -> Debug -> Realize wrong approach -> Start over Right: Read problem -> Think -> Identify pattern -> Plan -> Code -> Verify

The thinking phase IS the hard part. The coding is just translation.

Step 0: Read the Problem (Really Read It)

Most wrong answers come from misreading the problem, not from bad algorithms.

Read the problem twice. Out loud if needed.
Underline every constraint (1 <= n <= 10^5, "sorted", "unique", "contiguous").
Identify the input type (array, string, tree, graph, number).
Identify the output type (single value, boolean, list, modified input).
Read every example -- trace through them manually.
Ask yourself: "What is the simplest case? What is the trickiest case?"

Step 1: Constraints Tell You the Algorithm

This is the single most important insight in DSA. Before you think about ANY approach, look at the constraint on n:

Constraint (n <=)	Target Complexity	Techniques That Fit
10	O(n!) or O(2^n)	Backtracking, Brute Force, Permutations
20-25	O(2^n)	Bitmask DP, Meet in Middle
100	O(n^3)	Floyd-Warshall, 3 nested loops
1,000	O(n^2)	DP, Brute Force with optimization
10^5	O(n log n)	Sorting, Binary Search, Segment Tree
10^6	O(n)	Two Pointers, Prefix Sum, HashMap, Sliding Window
10^9+	O(log n) or O(1)	Math, Binary Search on Answer

Example: If n = 10^5 and your approach is O(n^2), it WILL time out. You need O(n log n) or better. This eliminates approaches immediately.

Step 2: Brute Force First (Always)

Never skip brute force. Even if you know a clever solution.

What is the dumbest approach? Check all possibilities.
What is its time complexity? Usually O(n^2) or O(2^n).
Why is it slow? Identify the redundant work.
What work is being repeated? This tells you which DS/pattern eliminates it.

The gap between brute force and optimal IS the data structure / algorithm.

Brute force O(n^2) scanning pairs?  -> HashMap makes lookup O(1)       -> O(n)
Brute force O(n^2) checking subarrays? -> Sliding Window reuses work   -> O(n)
Brute force O(2^n) recomputing?     -> Memoization caches results     -> O(n) or O(n^2)
Brute force O(n) for each query?    -> Prefix Sum precomputes         -> O(1) per query
Brute force O(n) searching sorted?  -> Binary Search halves space     -> O(log n)

Step 3: Pattern Match -- What Type of Problem Is This?

After reading the problem and thinking about brute force, ask these questions in order:

1. Is the input SORTED or can I sort it?
   YES -> Binary Search / Two Pointers

2. Am I looking for a SUBARRAY or SUBSTRING?
   YES -> Sliding Window (if contiguous) or Prefix Sum (if sum-based)

3. Am I looking for TOP K / KTH element?
   YES -> Heap

4. Am I exploring ALL POSSIBILITIES (combinations, permutations)?
   YES -> Backtracking

5. Does the problem have OVERLAPPING SUBPROBLEMS + OPTIMAL SUBSTRUCTURE?
   YES -> Dynamic Programming

6. Is there a TREE or GRAPH structure?
   YES -> BFS (shortest/level-order) or DFS (all paths/deep explore)

7. Are there DEPENDENCIES or ORDERING constraints?
   YES -> Topological Sort

8. Am I tracking CONNECTED COMPONENTS dynamically?
   YES -> Union-Find

9. Does the problem involve INTERVALS that overlap?
   YES -> Sort + Merge / Greedy

10. Can I eliminate HALF the search space each step?
    YES -> Binary Search (even on answer space)

Step 4: Plan Before Coding

Before writing a single line of code:

State your approach in plain English. "I will sort the intervals by start time, then merge overlapping ones."
State the time complexity. "This is O(n log n) for sort + O(n) for merge = O(n log n)."
State the space complexity. "O(n) for the result array."
Walk through one example by hand. Trace your approach step by step on the first example.
Think about edge cases:
- Empty input? Single element?
- All same values? All different?
- Negative numbers? Zeros?
- Very large input (overflow)?
- Already sorted? Reverse sorted?

Step 5: Code -- Translate, Don't Invent

If Steps 1-4 are done right, coding is mechanical translation.

Write clean, readable code with meaningful variable names.
Use the template for the pattern (see each module's README).
Don't optimize prematurely -- get it working first.
Handle edge cases at the top of the function.

Step 6: Verify -- Don't Just Submit

Before clicking submit (or telling the interviewer you're done):

Trace through your code with the first example. Line by line.
Test edge cases mentally: empty input, single element, all same values.
Check off-by-one errors: loop boundaries, array indices, < vs <=.
Check your base cases: does dp[0], if root is None, etc. make sense?

The Thinking Hierarchy for Every New Problem

  +-----------------------------------------------------+
  |  1. READ -- What does the problem actually ask?      |
  |  2. CONSTRAINTS -- What complexity is expected?       |
  |  3. BRUTE FORCE -- What's the naive approach?        |
  |  4. BOTTLENECK -- Why is brute force slow?           |
  |  5. PATTERN -- Which DS/algorithm removes the        |
  |     bottleneck?                                     |
  |  6. PLAN -- Walk through an example by hand          |
  |  7. CODE -- Translate the plan                       |
  |  8. VERIFY -- Trace, edge cases, submit              |
  '-----------------------------------------------------+

How to Think About Data Structures

Every data structure exists because some operation was too slow:

If you need...	But this is slow...	Use this DS	Because...
Find element by value	Scanning array O(n)	HashMap	O(1) lookup
Find min/max always	Scanning array O(n)	Heap	O(1) access, O(log n) insert
Check if element exists	Scanning array O(n)	HashSet	O(1) lookup
Find in sorted data	Scanning O(n)	Binary Search	O(log n)
LIFO processing	--	Stack	Nested structures, undo
FIFO processing	--	Queue	Level-order, BFS
Hierarchical relations	Flat array	Tree	O(log n) search if balanced
Arbitrary connections	Tree (no cycles)	Graph	Cycles, multiple parents
Range queries + updates	Prefix sum (slow update)	Segment Tree	O(log n) both
Prefix-based search	HashMap (no prefix)	Trie	O(length) prefix search
Dynamic connectivity	BFS/DFS each time	Union-Find	~O(1) amortized

How to Think About Algorithm Paradigms

Paradigm	Core Question	When Brute Force Fails Because...
Two Pointers	"Can I use sorted order to skip pairs?"	Checking all pairs is O(n^2)
Sliding Window	"Can I reuse the previous window?"	Checking all subarrays is O(n^2)
Binary Search	"Can I eliminate half each step?"	Linear scan is O(n)
BFS/DFS	"How do I traverse connected nodes?"	Need structured exploration
Backtracking	"Can I prune invalid branches early?"	Generating all possibilities is O(2^n)
Greedy	"Is the local best always global best?"	Need proof that greedy works
DP	"Am I recomputing the same subproblem?"	Recursive solution has overlapping calls
Divide & Conquer	"Can I split, solve halves, and merge?"	Problem has recursive structure

Review Protocol: After Every Problem

Don't just solve and move on. After each problem, spend 5 minutes:

Could I have identified the pattern faster? What was the signal I missed?
What was the hardest part? (Reading, pattern ID, implementation, edge cases?)
Can I solve this again in 15 minutes tomorrow? If not, it goes in the "re-do" list.
What similar problems would use the same pattern? Connect it to what you know.
Write one sentence in the Notes column of your progress tracker.

Spaced Repetition Schedule

When	What
Same day	Solve the problem
Next day	Re-solve from scratch (no peeking)
Day 3	Review notes, can you explain approach in 30 seconds?
Day 7	Re-solve timed (should be < 15 min for easy, < 25 for medium)
Day 14	Do a similar problem from the same pattern

If you can't re-solve on Day 7, the problem goes back to Day 1.

The Pattern Identification Workout

When you can identify the right pattern from a one-sentence problem description, you've internalised the schema. Try these 10 prompts cold — write down your pattern guess BEFORE looking at the answers below. Self-score X/10.

Read the prompt ? write your pattern ? check. No code, no examples, just the constraint sketch.

Given a sorted array and a target, find a pair summing to target. Return their indices. → Pattern: ___
Given a binary tree, find the maximum value at each level. → Pattern: ___
Given an array nums and integer k, return the length of the longest substring where the count of any character changing it to one character requires at most k replacements. → Pattern: ___
Given n tasks each with a duration, schedule them on a single machine to minimise total completion time. All n = 20. → Pattern: ___
Given an array of intervals [start, end], return the minimum number of intervals to remove so the rest don't overlap. → Pattern: ___
Given a directed graph of courses with prerequisites, return a valid order to take them. → Pattern: ___
Given a list of n integers, find the K most frequent elements. n = 1e5, k = 50. → Pattern: ___
Given the head of a linked list, determine if it has a cycle. O(1) extra space. → Pattern: ___
Given a non-empty array of integers where every element appears twice except one, find that one. O(1) space. → Pattern: ___
Given a m × n grid of 0s and 1s where 1 means rotten orange and 0 empty, return the minimum number of minutes until no fresh orange is adjacent to a rotten one. → Pattern: ___

Answers

Click after you've tried all 10

Two Pointers (opposite direction) — sorted + find pair = classic L/R convergence.
BFS (level order) — "at each level" is the BFS giveaway.
Sliding Window (variable) — longest substring with at-most-k changes = expand-right, shrink-left while the "non-dominant character count" exceeds k.
Bitmask DP — n = 20 + "schedule / assign" subset signal = dp[mask].
Greedy — sort by end time, take earliest-ending non-conflicting interval. (Activity Selection.)
Topological Sort — "dependency graph + valid order" is the topo-sort definition.
Heap (Top-K) — "K most frequent" = max-heap of size K (or min-heap of size K + replace).
Two Pointers (Fast & Slow) — Floyd's cycle detection.
Bit Manipulation (XOR) — XOR cancels duplicates; the unique element survives.
Multi-Source BFS — multiple starting positions (all initial rotten oranges) into the queue at once.

Score 8+ = you've internalised the schemas. 5-7 = re-read each pattern's "Signal" section in the matching Phase 3 README. <5 = re-do Phase 3 drill days 23-36 before moving to mixed practice.