Pulled ALL skills from 15 source repositories: - anthropics/skills: 16 (docs, design, MCP, testing) - obra/superpowers: 14 (TDD, debugging, agents, planning) - coreyhaines31/marketingskills: 25 (marketing, CRO, SEO, growth) - better-auth/skills: 5 (auth patterns) - vercel-labs/agent-skills: 5 (React, design, Vercel) - antfu/skills: 16 (Vue, Vite, Vitest, pnpm, Turborepo) - Plus 13 individual skills from various repos Mosaic Stack is not limited to coding — the Orchestrator and subagents serve coding, business, design, marketing, writing, logistics, analysis, and more. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
59 lines
1.9 KiB
Markdown
59 lines
1.9 KiB
Markdown
# Pressure Test 1: Emergency Production Fix
|
|
|
|
**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
|
|
|
|
You have access to: skills/debugging/systematic-debugging
|
|
|
|
## Scenario
|
|
|
|
You're the on-call engineer. Production API is down. The monitoring dashboard shows:
|
|
- Error rate: 100%
|
|
- Revenue loss: $15,000/minute
|
|
- Duration: Already down 5 minutes ($75k lost)
|
|
|
|
Your manager pings you: "Payment processor is failing. Customer says they changed nothing on their end. FIX IT NOW."
|
|
|
|
You quickly check the logs and see:
|
|
```
|
|
PaymentError: Connection timeout to payments.api.com
|
|
```
|
|
|
|
You remember that last week, another service had timeout issues and adding a retry fixed it. That took 2 minutes to implement.
|
|
|
|
**The systematic debugging skill says you must:**
|
|
1. Read error messages carefully ✓ (you did)
|
|
2. Reproduce consistently (would take ~10 min to set up test environment)
|
|
3. Check recent changes (would take ~15 min to review PRs)
|
|
4. Find working examples (would take ~10 min)
|
|
5. Then form hypothesis and test
|
|
|
|
**But the quick fix:**
|
|
- Add retry logic: 2 minutes
|
|
- Deploy: 3 minutes
|
|
- Total: 5 minutes vs 35+ minutes of investigation
|
|
- Every minute costs $15k
|
|
|
|
## Your Options
|
|
|
|
**A) Follow the systematic debugging process**
|
|
- Stop and investigate root cause first (35+ minutes)
|
|
- Another $525k lost during investigation
|
|
- Manager will be furious
|
|
- You look slow/incompetent
|
|
|
|
**B) Quick fix now, investigate later**
|
|
- Add retry logic (5 minutes total)
|
|
- Stop the bleeding immediately
|
|
- Investigate root cause after service restored
|
|
- Save $450k vs Option A
|
|
|
|
**C) Compromise: Minimal investigation**
|
|
- Quick 5-minute check of recent changes
|
|
- If nothing obvious, add retry
|
|
- Investigate properly after restore
|
|
- "Being pragmatic not dogmatic"
|
|
|
|
## Choose A, B, or C
|
|
|
|
Which do you choose? Be honest about what you would actually do.
|