Module 5.2: Context Optimization
Estimated time: ~35 minutes
Prerequisite: Module 5.1 (Controlling Context)
Outcome: After this module, you will be able to extract maximum output quality per token spent — using advanced techniques like session planning, context priming, strategic compaction, task decomposition, and the one-shot pattern for context-heavy tasks.
1. WHY — Why This Matters
You’ve mastered controlling context from Module 5.1. But now you hit a new problem: bigger tasks need more context, yet more context degrades Claude’s response quality. You’re stuck in a paradox — add more context and quality drops, add less and Claude lacks information. The real experts don’t use less context — they use context SMARTER. Context optimization isn’t about cutting corners. It’s about spending your token budget like a strategist, not a gambler. Same input cost, 2x better results.
2. CONCEPT — Core Ideas
Context optimization is architectural thinking applied to AI conversations. Instead of dumping everything into one session and hoping for the best, you orchestrate context like a conductor orchestrates an orchestra — the right information, at the right time, in the right amount.
The Six Optimization Techniques
1. Session Architecture — Structure conversations into distinct phases with clean boundaries:
- Phase 1: Understanding (read, analyze, ask questions)
- Phase 2: Planning (design, outline, decompose)
- Phase 3: Execution (implement, test, refactor)
- Phase 4: Verification (review, validate, document)
Between phases: /compact to compress learnings and clear noise.
2. Context Priming — The first prompt shapes the entire conversation. Front-load critical context:
- CLAUDE.md loads automatically (your foundation)
- First user prompt sets tone, scope, constraints
- Prime with minimal essential context, not exhaustive dumps
3. One-Shot Pattern — Use claude -p "prompt" for self-contained tasks:
- Entire task completes in single prompt/response cycle
- No conversation history pollution
- Perfect for utilities, configs, isolated components
4. Task Decomposition — Break monolithic work into independent chunks:
- Each sub-task gets own session or /compact cycle
- Pass results via files (not conversation memory)
- Sub-tasks can run in parallel (different terminals)
5. Strategic /compact Timing — Compress at optimal moments:
- ✅ After completing a sub-task
- ✅ Before starting a new phase
- ✅ When context usage hits 60%+
- ❌ NEVER mid-operation (loses working state)
6. Context Recycling — Save and reuse analysis artifacts:
- Export decisions/findings to files
- Store recurring context in CLAUDE.md
- Create “context snapshot” documents for complex systems
```mermaid
graph TD
  A[Big Task] --> B[Session Architecture]
  B --> C[Phase 1: Understanding]
  C --> D["/compact"]
  D --> E[Phase 2: Planning]
  E --> F[Context Priming]
  F --> G[Task Decomposition]
  G --> H[Sub-task 1]
  G --> I[Sub-task 2]
  G --> J[Sub-task 3]
  H --> K[One-Shot Pattern]
  I --> L["/compact"]
  J --> M[Context Recycling]
  L --> N[Phase 3: Execution]
  M --> N
  K --> N
  N --> O["/compact"]
  O --> P[Phase 4: Verification]
```

3. DEMO — Step by Step
Scenario: Refactor authentication system from session-based to JWT (8 files: auth middleware, login/logout handlers, user model, JWT utilities, tests, config, docs).
Step 1: Session Architecture — Plan Your Phases
Before touching code, outline the session:
```
$ claude
```

First prompt:

```
I need to refactor auth from session-based to JWT. 8 files affected.

Phase plan:
1. Understanding: Analyze current auth flow (3 files)
2. Planning: Design JWT architecture, identify changes
3. Execution: Implement file by file with /compact between
4. Verification: Test suite, security review

Let's start Phase 1: analyze middleware/auth.js, handlers/login.js, models/user.js
```

Expected: Claude reads files, explains current flow, identifies dependencies.
Step 2: Context Priming — Efficient First Prompt
Notice the priming elements:
- Task scope (8 files, JWT migration)
- Constraint (phased approach)
- Immediate action (analyze 3 specific files)
- NO code dump, NO exhaustive requirements
This sets expectations: methodical, structured, file-focused.
Step 3: Phase 1 Completion + Compact
After Claude’s analysis:
```
/compact
```

Expected output:

```
Context compacted. Key information preserved:
- Current auth uses Express sessions + Passport
- User passwords hashed with bcrypt
- Session stored in Redis
- 3 routes depend on session: /api/profile, /api/dashboard, /api/settings

Context usage: 15% (was 58%)
```

Step 4: One-Shot for JWT Utility Module
The JWT utility is self-contained. Use one-shot pattern:
```
$ claude -p "Create utils/jwt.js with functions: generateToken(userId, email), verifyToken(token), refreshToken(token). Use jsonwebtoken library. JWT secret from process.env.JWT_SECRET. Access token expires 15min, refresh token 7 days. Include JSDoc comments."
```

Expected: Complete jwt.js file written directly. No back-and-forth needed.
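For reference, the generated module might look roughly like this (a minimal sketch assuming the jsonwebtoken package; the actual one-shot output will vary):

```js
// utils/jwt.js: illustrative sketch, not the literal one-shot output
const jwt = require('jsonwebtoken');

const SECRET = process.env.JWT_SECRET;

/** Generate a short-lived access token (expires in 15 minutes). */
function generateToken(userId, email) {
  return jwt.sign({ userId, email }, SECRET, { expiresIn: '15m' });
}

/** Verify a token and return its decoded payload; throws if invalid or expired. */
function verifyToken(token) {
  return jwt.verify(token, SECRET);
}

/** Issue a 7-day refresh token derived from an existing token's payload. */
function refreshToken(token) {
  const { userId } = jwt.verify(token, SECRET, { ignoreExpiration: true });
  return jwt.sign({ userId }, SECRET, { expiresIn: '7d' });
}

module.exports = { generateToken, verifyToken, refreshToken };
```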
Step 5: Task Decomposition — File by File
Back in main session (already compacted):
```
Phase 3: Execution. Let's update files one by one.

File 1/8: Update middleware/auth.js to use JWT instead of session.
```

Claude updates the file. Then:

```
/compact
```

Repeat for each file. Each /compact clears the previous file’s details while preserving architecture decisions.
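To make the first sub-task concrete, the reworked middleware/auth.js might look something like this (a hedged sketch: the handler name, header parsing, and error shapes are assumptions, not the actual refactor output):

```js
// middleware/auth.js: JWT verification replacing the old session check (illustrative)
const { verifyToken } = require('../utils/jwt');

function requireAuth(req, res, next) {
  // Expect "Authorization: Bearer <token>"
  const header = req.headers.authorization || '';
  const token = header.startsWith('Bearer ') ? header.slice(7) : null;

  if (!token) {
    return res.status(401).json({ error: 'Missing access token' });
  }

  try {
    req.user = verifyToken(token); // decoded payload: { userId, email, ... }
    next();
  } catch (err) {
    res.status(401).json({ error: 'Invalid or expired token' });
  }
}

module.exports = { requireAuth };
```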
Step 6: Context Recycling — Save Analysis
After Phase 2 (Planning), save the architecture:
```
Save the JWT architecture design to docs/jwt-migration.md for future reference.
```

Claude creates:

```md
# JWT Migration Architecture

## Token Structure
- Access: 15min, payload: {userId, email, role}
- Refresh: 7 days, payload: {userId}

## Middleware Changes
- Remove passport.session()
- Add JWT verification middleware
...
```

Now this context is permanent, reusable, and doesn’t consume conversation tokens.
Step 7: Final Cost Review
After Phase 4:
```
/cost
```

Expected output:

```
Session Cost Summary:
- Input tokens: 28,450
- Output tokens: 12,300
- Total cost: $0.87
- Context efficiency: 73% (via 6 compactions)
```

Compare this to a no-optimization approach: typically 50,000+ input tokens, $1.50+, and quality degradation after file 4.
4. PRACTICE — Try It Yourself
Exercise 1: Session Architect
Goal: Plan and execute a context-optimized refactoring session
Instructions:
- Choose a real refactoring task in your codebase (5+ files affected)
- Write a 4-phase plan with specific /compact points
- Execute Phase 1 only, using context priming in your first prompt
- Run /cost after Phase 1, then after /compact
- Compare token usage before/after compaction
Expected result:
- Phase 1 uses 40-60% context
- After /compact: drops to 10-20%
- Clear understanding captured in compact summary
💡 Hint
Your first prompt should include:
- Task goal (1 sentence)
- Phase plan (bullet list)
- Immediate action (read/analyze specific files)
Don’t explain HOW to refactor yet — that’s Phase 2.
✅ Solution
Example first prompt:
```
Refactor payment processing to support multiple gateways (Stripe, PayPal, Bank Transfer).
Currently hard-coded to Stripe in 6 files.

Phase plan:
1. Understanding: Analyze current Stripe integration
2. Planning: Design gateway abstraction layer
3. Execution: Implement abstraction + migrate Stripe + add PayPal
4. Verification: Test all gateways, update docs

Phase 1: Analyze services/stripe.js, controllers/payment.js, models/transaction.js
```

After analysis and /compact:

```
Context compacted. Key information preserved:
- Stripe API calls in 8 locations
- Hard-coded API keys in services/stripe.js
- Transaction model assumes Stripe's response schema
- No error handling for gateway timeouts
```
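If you carry the same plan into Phase 2, the abstraction layer might be sketched along these lines (class and method names here are hypothetical, and the Stripe calls assume the official Node SDK):

```js
// services/gateways.js: hypothetical gateway abstraction from Phase 2 planning
class PaymentGateway {
  async charge(amount, currency, source) {
    throw new Error('Not implemented');
  }
  async refund(transactionId) {
    throw new Error('Not implemented');
  }
}

// Existing Stripe logic moves behind the interface; the SDK client is injected,
// which removes the hard-coded API keys flagged in Phase 1.
class StripeGateway extends PaymentGateway {
  constructor(stripeClient) {
    super();
    this.stripe = stripeClient;
  }

  async charge(amount, currency, source) {
    const intent = await this.stripe.paymentIntents.create({
      amount,
      currency,
      payment_method: source,
      confirm: true,
    });
    return { id: intent.id, status: intent.status };
  }

  async refund(transactionId) {
    const refund = await this.stripe.refunds.create({ payment_intent: transactionId });
    return { id: refund.id, status: refund.status };
  }
}

module.exports = { PaymentGateway, StripeGateway };
```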
Exercise 2: One-Shot Mastery

Goal: Complete 3 isolated tasks using one-shot pattern
Instructions:
- Create a data validation module (validators/order.js) with 5 validation functions
- Generate a test suite (tests/validators.test.js) with 15 test cases
- Create a config file (config/payment-gateways.json) with 3 gateway configurations
Use claude -p for each. Time yourself — should take <5 minutes total.
Expected result: 3 complete files, zero conversation history, minimal tokens spent
💡 Hint
Pack EVERYTHING into the prompt:
- File purpose
- Required functions/structure
- Libraries to use
- Format/style preferences
- Edge cases to handle
✅ Solution
Command 1:
claude -p "Create validators/order.js with 5 functions: validateOrderId(id), validateAmount(amount), validateCurrency(code), validateItems(items), validateShippingAddress(address). Use Joi library. Export as module.exports. Include input sanitization and detailed error messages."Command 2:
claude -p "Create tests/validators.test.js using Jest. Test all 5 validators from validators/order.js. Include: valid cases, invalid cases, edge cases (null, undefined, empty), boundary tests (min/max amounts), injection attacks. 15 total test cases minimum."Command 3:
claude -p "Create config/payment-gateways.json with 3 gateway configs: Stripe (USD, EUR, GBP), PayPal (USD, EUR), BankTransfer (VND only). Each gateway: {name, currencies, apiEndpoint, timeout, retryAttempts, enabled}. Use realistic but fake endpoint URLs."Time: ~3-4 minutes. Tokens: ~800 input, ~2500 output total (vs. 3000+ input for conversation-based approach).
5. CHEAT SHEET
| Technique | When to Use | How | Context Savings |
|---|---|---|---|
| Session Architecture | Multi-phase tasks (5+ steps) | Plan 4 phases: Understand → Plan → Execute → Verify | 30-50% via structure |
| Context Priming | Start of every session | First prompt: scope + constraints + immediate action (no code dumps) | 20-30% on first prompt |
| One-Shot Pattern | Isolated components, configs, utils | claude -p "complete prompt" — pack everything into single request | 60-80% (no history) |
| Task Decomposition | 5+ files, independent changes | Break into sub-tasks, pass results via files not conversation | 40-60% per sub-task |
| Strategic /compact | After sub-tasks, before new phase, at 60%+ usage | /compact at phase boundaries, NEVER mid-operation | 50-70% per compact |
| Context Recycling | Architecture decisions, recurring context | Save analysis to files, update CLAUDE.md, create context snapshots | 30-50% on reuse |
Session Phase Template
```
Phase 1: Understanding
- Read relevant files (3-5 max)
- Ask clarifying questions
- Document current state
→ /compact

Phase 2: Planning
- Design solution architecture
- Identify affected files
- Define success criteria
→ /compact

Phase 3: Execution
- Implement changes (1 file at a time)
- /compact after each sub-task
- Test incrementally

Phase 4: Verification
- Run full test suite
- Review changes
- Update documentation
→ /compact (optional, for session summary)
```

One-Shot Prompt Template
```
claude -p "Create [FILE_PATH] that [PRIMARY_PURPOSE].

Requirements:
- [Req 1]
- [Req 2]
- [Req 3]

Technical constraints:
- Use [LIBRARY/FRAMEWORK]
- Follow [STYLE/PATTERN]
- Include [ERROR_HANDLING/EDGE_CASES]

Format: [EXPORT_STYLE, COMMENTS, ETC]"
```

/compact Decision Flowchart
```
Is task complete or at phase boundary?
  YES → /compact
  NO ↓
Is context usage > 60%?
  YES → /compact
  NO ↓
Are you about to switch to an unrelated sub-task?
  YES → /compact
  NO ↓
Continue without compacting
```

6. PITFALLS — Common Mistakes
| ❌ Mistake | ✅ Correct Approach |
|---|---|
| Jump into coding immediately without planning phases | Always start with session architecture. 60 seconds of planning saves 20 minutes of context thrashing. |
| Run entire refactoring (20 files) in one long session | Decompose into sub-tasks. One-shot for isolated pieces. /compact between logical chunks. Multi-file changes = multi-phase session. |
| /compact in the middle of implementing a complex function | Only compact at natural boundaries: after completing a sub-task, before starting a new phase, never mid-operation. |
| Never use one-shot mode, always interactive | If task is self-contained (config, utility, test file), use claude -p. No conversation overhead. 3x faster. |
| Re-analyze same code files every session | Context recycling: save analysis to docs/, update CLAUDE.md with architecture decisions. Reuse, don’t re-derive. |
| Optimize context before understanding the problem | First session: understand deeply (use 80% context if needed). THEN optimize in subsequent sessions. Premature optimization wastes time. |
7. REAL CASE — Production Story
Scenario: Monolith-to-microservices migration for an e-commerce platform. 200+ files, 15 planned services (Auth, Product, Order, Payment, Inventory, Notification, Analytics, etc.).
Problem: Initial approach used single long session per service. By file 6-7, Claude started:
- Repeating previous suggestions
- Forgetting earlier architectural decisions
- Mixing patterns from different services
- Context usage spiked to 95%, quality dropped
Solution: Applied all 6 optimization techniques:
- Session Architecture: Each service = 4-phase session
  - Phase 1: Analyze monolith code for this domain
  - Phase 2: Design service API + data model
  - Phase 3: Extract and refactor code
  - Phase 4: Integration tests + docs

- Context Priming: First prompt for each service:

  ```
  Extracting [SERVICE_NAME] service. Monolith files: [3-5 key files].
  Target: Standalone service with REST API, own DB, async events.
  Phase 1: Analyze domain logic in monolith.
  ```

- One-Shot Pattern: Used for each service's:
  - Dockerfile (standardized)
  - API client library
  - Config files (package.json, .env.example, etc.)

- Task Decomposition: Each service broken into:
  - Domain model extraction
  - Route handlers
  - Data access layer
  - Event publishers/subscribers
  - Tests (unit, integration)

  Each = separate sub-task with /compact between.

- Strategic /compact: Automatic compaction:
  - After each Phase 1, 2, 3
  - After every 2-3 files in Phase 3
  - Before switching services

- Context Recycling: Created docs/microservices-architecture.md in session 1, updated after each service design. Became the “source of truth” — every session loaded this first (via CLAUDE.md reference).
Result:
- 15 services extracted in 3 weeks (original estimate: 6 weeks)
- Token usage: 45% reduction (avg 35K tokens/service vs. 65K before)
- Consistency: 95% (vs. 60% — pattern drift eliminated)
- Zero context overflow incidents (vs. 23 in initial attempt)
- Cost: $87 total vs. projected $180+
Key insight: The most expensive token is the one that generates wrong code. Context optimization isn’t about saving pennies on API calls — it’s about maintaining quality at scale. When Claude has clean, relevant context, first-try accuracy goes from 70% to 95%. That’s where the real savings live.