How to Think — Problem-Solving Framework
How to Think -- The DSA Problem-Solving Framework
Read this BEFORE you start solving problems. Come back and re-read every week.
The #1 Mistake
Most people see a problem and immediately start coding. This is backwards.
Wrong: Read problem -> Code -> Debug -> Realize wrong approach -> Start over Right: Read problem -> Think -> Identify pattern -> Plan -> Code -> Verify
The thinking phase IS the hard part. The coding is just translation.
Step 0: Read the Problem (Really Read It)
Most wrong answers come from misreading the problem, not from bad algorithms.
- Read the problem twice. Out loud if needed.
- Underline every constraint (
1 <= n <= 10^5, "sorted", "unique", "contiguous"). - Identify the input type (array, string, tree, graph, number).
- Identify the output type (single value, boolean, list, modified input).
- Read every example -- trace through them manually.
- Ask yourself: "What is the simplest case? What is the trickiest case?"
Step 1: Constraints Tell You the Algorithm
This is the single most important insight in DSA. Before you think about ANY approach, look at the constraint on n:
| Constraint (n <=) | Target Complexity | Techniques That Fit |
|---|---|---|
| 10 | O(n!) or O(2^n) | Backtracking, Brute Force, Permutations |
| 20-25 | O(2^n) | Bitmask DP, Meet in Middle |
| 100 | O(n^3) | Floyd-Warshall, 3 nested loops |
| 1,000 | O(n^2) | DP, Brute Force with optimization |
| 10^5 | O(n log n) | Sorting, Binary Search, Segment Tree |
| 10^6 | O(n) | Two Pointers, Prefix Sum, HashMap, Sliding Window |
| 10^9+ | O(log n) or O(1) | Math, Binary Search on Answer |
Example: If n = 10^5 and your approach is O(n^2), it WILL time out. You need O(n log n) or better. This eliminates approaches immediately.
Step 2: Brute Force First (Always)
Never skip brute force. Even if you know a clever solution.
- What is the dumbest approach? Check all possibilities.
- What is its time complexity? Usually O(n^2) or O(2^n).
- Why is it slow? Identify the redundant work.
- What work is being repeated? This tells you which DS/pattern eliminates it.
The gap between brute force and optimal IS the data structure / algorithm.
Brute force O(n^2) scanning pairs? -> HashMap makes lookup O(1) -> O(n)
Brute force O(n^2) checking subarrays? -> Sliding Window reuses work -> O(n)
Brute force O(2^n) recomputing? -> Memoization caches results -> O(n) or O(n^2)
Brute force O(n) for each query? -> Prefix Sum precomputes -> O(1) per query
Brute force O(n) searching sorted? -> Binary Search halves space -> O(log n)
Step 3: Pattern Match -- What Type of Problem Is This?
After reading the problem and thinking about brute force, ask these questions in order:
1. Is the input SORTED or can I sort it?
YES -> Binary Search / Two Pointers
2. Am I looking for a SUBARRAY or SUBSTRING?
YES -> Sliding Window (if contiguous) or Prefix Sum (if sum-based)
3. Am I looking for TOP K / KTH element?
YES -> Heap
4. Am I exploring ALL POSSIBILITIES (combinations, permutations)?
YES -> Backtracking
5. Does the problem have OVERLAPPING SUBPROBLEMS + OPTIMAL SUBSTRUCTURE?
YES -> Dynamic Programming
6. Is there a TREE or GRAPH structure?
YES -> BFS (shortest/level-order) or DFS (all paths/deep explore)
7. Are there DEPENDENCIES or ORDERING constraints?
YES -> Topological Sort
8. Am I tracking CONNECTED COMPONENTS dynamically?
YES -> Union-Find
9. Does the problem involve INTERVALS that overlap?
YES -> Sort + Merge / Greedy
10. Can I eliminate HALF the search space each step?
YES -> Binary Search (even on answer space)
Step 4: Plan Before Coding
Before writing a single line of code:
- State your approach in plain English. "I will sort the intervals by start time, then merge overlapping ones."
- State the time complexity. "This is O(n log n) for sort + O(n) for merge = O(n log n)."
- State the space complexity. "O(n) for the result array."
- Walk through one example by hand. Trace your approach step by step on the first example.
- Think about edge cases:
- Empty input? Single element?
- All same values? All different?
- Negative numbers? Zeros?
- Very large input (overflow)?
- Already sorted? Reverse sorted?
Step 5: Code -- Translate, Don't Invent
If Steps 1-4 are done right, coding is mechanical translation.
- Write clean, readable code with meaningful variable names.
- Use the template for the pattern (see each module's README).
- Don't optimize prematurely -- get it working first.
- Handle edge cases at the top of the function.
Step 6: Verify -- Don't Just Submit
Before clicking submit (or telling the interviewer you're done):
- Trace through your code with the first example. Line by line.
- Test edge cases mentally: empty input, single element, all same values.
- Check off-by-one errors: loop boundaries, array indices,
< vs <=. - Check your base cases: does
dp[0],if root is None, etc. make sense?
The Thinking Hierarchy for Every New Problem
+-----------------------------------------------------+
| 1. READ -- What does the problem actually ask? |
| 2. CONSTRAINTS -- What complexity is expected? |
| 3. BRUTE FORCE -- What's the naive approach? |
| 4. BOTTLENECK -- Why is brute force slow? |
| 5. PATTERN -- Which DS/algorithm removes the |
| bottleneck? |
| 6. PLAN -- Walk through an example by hand |
| 7. CODE -- Translate the plan |
| 8. VERIFY -- Trace, edge cases, submit |
'-----------------------------------------------------+
How to Think About Data Structures
Every data structure exists because some operation was too slow:
| If you need... | But this is slow... | Use this DS | Because... |
|---|---|---|---|
| Find element by value | Scanning array O(n) | HashMap | O(1) lookup |
| Find min/max always | Scanning array O(n) | Heap | O(1) access, O(log n) insert |
| Check if element exists | Scanning array O(n) | HashSet | O(1) lookup |
| Find in sorted data | Scanning O(n) | Binary Search | O(log n) |
| LIFO processing | -- | Stack | Nested structures, undo |
| FIFO processing | -- | Queue | Level-order, BFS |
| Hierarchical relations | Flat array | Tree | O(log n) search if balanced |
| Arbitrary connections | Tree (no cycles) | Graph | Cycles, multiple parents |
| Range queries + updates | Prefix sum (slow update) | Segment Tree | O(log n) both |
| Prefix-based search | HashMap (no prefix) | Trie | O(length) prefix search |
| Dynamic connectivity | BFS/DFS each time | Union-Find | ~O(1) amortized |
How to Think About Algorithm Paradigms
| Paradigm | Core Question | When Brute Force Fails Because... |
|---|---|---|
| Two Pointers | "Can I use sorted order to skip pairs?" | Checking all pairs is O(n^2) |
| Sliding Window | "Can I reuse the previous window?" | Checking all subarrays is O(n^2) |
| Binary Search | "Can I eliminate half each step?" | Linear scan is O(n) |
| BFS/DFS | "How do I traverse connected nodes?" | Need structured exploration |
| Backtracking | "Can I prune invalid branches early?" | Generating all possibilities is O(2^n) |
| Greedy | "Is the local best always global best?" | Need proof that greedy works |
| DP | "Am I recomputing the same subproblem?" | Recursive solution has overlapping calls |
| Divide & Conquer | "Can I split, solve halves, and merge?" | Problem has recursive structure |
Review Protocol: After Every Problem
Don't just solve and move on. After each problem, spend 5 minutes:
- Could I have identified the pattern faster? What was the signal I missed?
- What was the hardest part? (Reading, pattern ID, implementation, edge cases?)
- Can I solve this again in 15 minutes tomorrow? If not, it goes in the "re-do" list.
- What similar problems would use the same pattern? Connect it to what you know.
- Write one sentence in the Notes column of your progress tracker.
Spaced Repetition Schedule
| When | What |
|---|---|
| Same day | Solve the problem |
| Next day | Re-solve from scratch (no peeking) |
| Day 3 | Review notes, can you explain approach in 30 seconds? |
| Day 7 | Re-solve timed (should be < 15 min for easy, < 25 for medium) |
| Day 14 | Do a similar problem from the same pattern |
If you can't re-solve on Day 7, the problem goes back to Day 1.
The Pattern Identification Workout
When you can identify the right pattern from a one-sentence problem description, you've internalised the schema. Try these 10 prompts cold — write down your pattern guess BEFORE looking at the answers below. Self-score X/10.
Read the prompt ? write your pattern ? check. No code, no examples, just the constraint sketch.
- Given a sorted array and a target, find a pair summing to target. Return their indices. → Pattern: ___
- Given a binary tree, find the maximum value at each level. → Pattern: ___
- Given an array
numsand integerk, return the length of the longest substring where the count of any character changing it to one character requires at mostkreplacements. → Pattern: ___ - Given
ntasks each with a duration, schedule them on a single machine to minimise total completion time. Alln = 20. → Pattern: ___ - Given an array of intervals
[start, end], return the minimum number of intervals to remove so the rest don't overlap. → Pattern: ___ - Given a directed graph of courses with prerequisites, return a valid order to take them. → Pattern: ___
- Given a list of
nintegers, find the K most frequent elements. n = 1e5, k = 50. → Pattern: ___ - Given the head of a linked list, determine if it has a cycle. O(1) extra space. → Pattern: ___
- Given a non-empty array of integers where every element appears twice except one, find that one. O(1) space. → Pattern: ___
- Given a
m × ngrid of 0s and 1s where 1 means rotten orange and 0 empty, return the minimum number of minutes until no fresh orange is adjacent to a rotten one. → Pattern: ___
Answers
Click after you've tried all 10
- Two Pointers (opposite direction) — sorted + find pair = classic L/R convergence.
- BFS (level order) — "at each level" is the BFS giveaway.
- Sliding Window (variable) — longest substring with at-most-k changes = expand-right, shrink-left while the "non-dominant character count" exceeds k.
- Bitmask DP —
n = 20+ "schedule / assign" subset signal =dp[mask]. - Greedy — sort by end time, take earliest-ending non-conflicting interval. (Activity Selection.)
- Topological Sort — "dependency graph + valid order" is the topo-sort definition.
- Heap (Top-K) — "K most frequent" = max-heap of size K (or min-heap of size K + replace).
- Two Pointers (Fast & Slow) — Floyd's cycle detection.
- Bit Manipulation (XOR) — XOR cancels duplicates; the unique element survives.
- Multi-Source BFS — multiple starting positions (all initial rotten oranges) into the queue at once.
Score 8+ = you've internalised the schemas. 5-7 = re-read each pattern's "Signal" section in the matching Phase 3 README. <5 = re-do Phase 3 drill days 23-36 before moving to mixed practice.