Module 14.2: Speed Optimization
Estimated time: ~30 minutes
Prerequisite: Module 14.1 (Task Optimization)
Outcome: After this module, you will know techniques to reduce Claude Code response time, understand speed/quality tradeoffs, and optimize for different scenarios.
1. WHY — Why This Matters
You’re waiting. Claude is “thinking” for 2 minutes on what should be a simple task. Or worse, it’s been 5 minutes and you’re not sure if it’s stuck or working. Time adds up — 10 slow tasks per day × 3 extra minutes each = 30 minutes wasted daily.
Speed optimization gives you back that time. Clear prompts, clean context, right model choice — these compound into significant productivity gains.
2. CONCEPT — Core Ideas
Speed Factors
| Factor | Slow | Fast |
|---|---|---|
| Prompt | Vague, ambiguous | Clear, specific |
| Context | 100K tokens | 10K tokens |
| Task | Complex, multi-step | Focused, single |
| Model | Opus (smartest) | Haiku (fastest) |
| Output | Long explanation | Just code |
The Speed Formula
Response Time = f(Context Size, Task Complexity, Output Length, Model)
Optimize each:

- Context: /clear frequently, exclude irrelevant files
- Complexity: Break into simpler tasks (Module 14.1)
- Output: "Code only, no explanation"
- Model: Use fastest model that works
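A single print-mode request can pull all four levers at once. A minimal sketch, assuming the `claude` CLI's `-p` (print mode) and `--model` flags; the model alias, file path, and function are illustrative:

```bash
# Fresh context (print mode starts a new session), focused scope,
# trimmed output, and the fastest model that can handle the task.
claude --model haiku -p "In src/utils/strings.ts only: add padLeft(str, len). Code only, no explanation."
```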
Context Management for Speed

Heavy context:                Light context:
─────────────────────────────────────────────────
50 files loaded               3 relevant files
Full conversation history     Fresh session
All project documentation     Just what's needed
          ↓                             ↓
Result: 60 sec response       Result: 10 sec response
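Inside an interactive session, you can check how heavy the context has become before deciding to reset, using the `/cost` and `/clear` commands this module relies on:

```
> /cost     # show token usage for the current session
> /clear    # drop conversation history and start light
```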
Model Selection Strategy

Task Complexity                → Model Choice
────────────────────────────────────────────────
Simple (format, small edit)    → Haiku  (fastest)
Medium (implement feature)     → Sonnet (balanced)
Complex (architecture, debug)  → Opus   (smartest)
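On the command line, the same strategy means picking a model per request. A sketch assuming the CLI's `--model` flag accepts aliases like these (check `claude --help` for the exact names on your version); the file paths are illustrative:

```bash
claude --model haiku  -p "Format src/utils/strings.ts to match .prettierrc"   # simple
claude --model sonnet -p "Implement pagination in src/api/users.ts"          # medium
claude --model opus   -p "Find the race condition in src/queue/worker.ts"    # complex
```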
Output Optimization

- “Code only, no explanation” — saves output generation time
- “One file at a time” — faster than multiple files
- “Diff format” — faster than full file rewrite
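Combined in a single prompt, the output constraints look like this (a sketch; the file path and bug are illustrative):

```bash
claude -p "Fix the off-by-one in truncate() in src/utils/strings.ts. Diff format only, no explanation."
```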
Detailed Model Comparison by Task Type
Beyond general speed, different models excel at different task types:
| Task Category | Haiku | Sonnet | Opus | Recommendation |
|---|---|---|---|---|
| Formatting/Linting | Excellent | Overkill | Overkill | Haiku — speed matters most, quality sufficient |
| Simple CRUD | Good | Excellent | Overkill | Sonnet — needs pattern understanding |
| Complex Feature | Poor | Good | Excellent | Opus — requires architectural reasoning |
| Bug Fixing | Simple bugs | Most bugs | Complex bugs | Match model to bug complexity |
| Code Review | Style issues | Logic issues | Architecture issues | Match model to review depth needed |
| Test Generation | Basic tests | Thorough tests | Edge case tests | Sonnet for most, Opus for critical paths |
| Documentation | Good | Excellent | Overkill | Sonnet — needs context understanding |
| Refactoring | Rename/move | Restructure | Architectural | Match model to refactoring scope |
Speed vs Quality vs Cost at a Glance
         Speed               Quality             Cost
Haiku:   ██████████ 10/10    ████░░░░░░  4/10    █░░░░░░░░░ 1/10
Sonnet:  ██████░░░░  6/10    ████████░░  8/10    ████░░░░░░ 4/10
Opus:    ███░░░░░░░  3/10    ██████████ 10/10    ████████░░ 8/10

Rule of thumb: Default to Sonnet for daily work. Drop to Haiku for batch operations and simple tasks. Escalate to Opus only when Sonnet’s output quality is insufficient for the specific task. One Vietnamese startup reduced Claude costs from $1,200 to $380/month by following this approach.
3. DEMO — Step by Step
Scenario: Implement utility functions (string helpers).
Slow Approach
[Session with 50K context from previous work]
You: Create utility functions for our app. We need string helpers, date formatters, and validators. Explain each function and add comprehensive documentation.
Claude: [Thinking... 90 seconds]
[Long response with explanations, 200+ lines]
Total time: ~3 minutes

Fast Approach
$ claude
# Fresh session, clean context

You: Create src/utils/strings.ts with these functions:
- capitalize(str): Capitalize first letter
- slugify(str): Convert to URL slug
- truncate(str, len): Truncate with ellipsis
Code only, no explanation.
Claude: [Thinking... 15 seconds]

export function capitalize(str: string): string {
  return str.charAt(0).toUpperCase() + str.slice(1);
}

export function slugify(str: string): string {
  return str.toLowerCase().replace(/\s+/g, '-').replace(/[^\w-]/g, '');
}

export function truncate(str: string, len: number): string {
  return str.length > len ? str.slice(0, len) + '...' : str;
}

Total time: ~20 seconds
Speed Comparison
| Metric | Slow | Fast |
|---|---|---|
| Time | 3 minutes | 20 seconds |
| Speedup | - | 9x faster |
What changed:
- Fresh context (no 50K baggage)
- Specific scope (one file, three functions)
- “Code only” (no explanation overhead)
- Clear requirements (exact function signatures)
Parallel Execution
# Run three independent tasks simultaneously
claude -p "Create src/utils/strings.ts: capitalize, slugify" &
claude -p "Create src/utils/dates.ts: formatDate, parseDate" &
claude -p "Create src/utils/validators.ts: isEmail, isURL" &
wait

# Total: ~25 seconds (instead of 75 seconds sequential)
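For more than a handful of tasks, `xargs` keeps the pattern manageable. A sketch assuming GNU `xargs` with `-P` for parallelism; the prompts mirror the ones above:

```bash
printf '%s\n' \
  "Create src/utils/strings.ts: capitalize, slugify" \
  "Create src/utils/dates.ts: formatDate, parseDate" \
  "Create src/utils/validators.ts: isEmail, isURL" |
  xargs -P 3 -I{} claude -p "{}"
```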
4. PRACTICE — Try It Yourself

Exercise 1: Context Diet
Goal: Experience the impact of context size.
Instructions:
- Note your current context size
- Use /clear and reload only essential files
- Run the same task
- Compare response times
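One way to put numbers on it, as a sketch: print mode with `-c`/`--continue` reuses the most recent conversation (heavy context), while plain `-p` starts fresh (light context). The task prompt is illustrative; check `claude --help` for the flags on your version.

```bash
time claude -c -p "Add a capitalize() helper to src/utils/strings.ts"   # heavy: continues prior session
time claude -p "Add a capitalize() helper to src/utils/strings.ts"      # light: fresh context
```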
💡 Hint
The /cost command shows token usage. Compare before and after /clear.
✅ Solution
Typical results:
- Heavy context (50K tokens): 45-90 second responses
- Light context (5K tokens): 10-20 second responses
- Speedup: 3-5x faster with clean context
Exercise 2: Output Trimming
Goal: Measure the impact of output length.
Instructions:
- Ask Claude to implement something with full explanation
- Time it
- Ask the same thing with “code only, no explanation”
- Compare times
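A rough timing harness for this exercise (a sketch; the task and file path are illustrative):

```bash
time claude -p "Implement isEmail() in src/utils/validators.ts and explain each step"
time claude -p "Implement isEmail() in src/utils/validators.ts. Code only, no explanation."
```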
💡 Hint
Output generation takes time. Less output = faster response.
✅ Solution
Typical results:
- With explanation: 30-60 seconds, 100+ lines output
- Code only: 10-20 seconds, 20 lines output
- Speedup: 2-3x faster
Exercise 3: Model Comparison
Goal: Understand model speed/quality tradeoffs.
Instructions:
- Pick a medium-complexity task
- Try with different models if available
- Compare: time, quality, appropriateness
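If your setup exposes a `--model` flag, a simple loop makes the comparison repeatable. A sketch; the model aliases and the task are illustrative:

```bash
for m in haiku sonnet opus; do
  echo "== $m =="
  time claude --model "$m" -p "Refactor truncate() in src/utils/strings.ts to handle multi-byte strings. Code only."
done
```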
💡 Hint
Haiku is fastest but may miss nuance. Opus is smartest but slower. Sonnet balances both.
✅ Solution
- For simple formatting: Haiku (fast, sufficient quality)
- For feature implementation: Sonnet (balanced)
- For complex debugging: Opus (worth the wait)
Match model to task complexity.
5. CHEAT SHEET
Speed Techniques
# Fresh context
/clear
# Minimal output
"Code only, no explanation"
"Just the function, no tests"
"Diff format only"
# Focused scope
"Only modify [file]"
"Just the [component]"

Model Selection
| Task Type | Model | Why |
|---|---|---|
| Simple edits | Haiku | Fastest |
| Features | Sonnet | Balanced |
| Complex debug | Opus | Smartest |
Parallel Execution
claude -p "task 1" &
claude -p "task 2" &
wait

Context Management
- /clear between unrelated tasks
- Load only files you’re working on
- Exclude node_modules, build artifacts
6. PITFALLS — Common Mistakes
| ❌ Mistake | ✅ Correct Approach |
|---|---|
| Never clearing context | /clear for fresh starts |
| Always using Opus | Match model to task complexity |
| Asking for explanations you won’t read | “Code only” for speed |
| Loading entire codebase | Load only relevant files |
| Sequential tasks that could be parallel | Use multiple sessions |
| Optimizing prematurely | Get it working first, then speed up |
| Sacrificing quality for speed | Speed should maintain quality |
7. REAL CASE — Production Story
Scenario: A Vietnamese agency’s developers complained Claude was “too slow” — 2-3 minute response times made it unusable for quick tasks.
Audit Findings:
- Average context: 80K tokens (accumulated over days)
- Asking for explanations on every task
- Using Opus for simple formatting
- Never using /clear
Speed Optimization Protocol:
| Change | Before | After |
|---|---|---|
| Daily fresh session | Never | Every morning |
| Context clearing | Never | Between projects |
| Output style | With explanation | Code only (default) |
| Model matching | Always Opus | Task-appropriate |
Results:
- Average response: 2.5 min → 30 sec (5x faster)
- Developer satisfaction: “Claude feels snappy now”
- No quality reduction
Quote: “We were making Claude carry an 80K token backpack everywhere. No wonder it was slow. Traveling light made all the difference.”