docs(m6): Add Usage Budget Management section
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Add comprehensive usage budget management design to M6 orchestration architecture.

FEATURES:
- Real-time usage tracking across agents
- Budget allocation per task/milestone/project
- Usage projection and burn rate calculation
- Throttling decisions to prevent budget exhaustion
- Model tier optimization (Haiku/Sonnet/Opus)
- Pre-commit usage validation

DATA MODEL:
- usage_budgets table (allocated/consumed/remaining)
- agent_usage_logs table (per-agent tracking)
- Valkey keys for real-time state

BUDGET CHECKPOINTS:
1. Task assignment - can afford this task?
2. Agent spawn - verify budget headroom
3. Checkpoint intervals - periodic compliance
4. Pre-commit validation - usage efficiency

PRIORITY: MVP (M6 Phase 3) for basic tracking, Phase 5 for advanced projection and optimization.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -908,6 +908,343 @@ export class CoordinatorService {

---

## Usage Budget Management

**Version:** 1.0
**Date:** 2026-02-04
**Status:** Required for MVP

### Problem Statement

Autonomous agents using Claude Code can consume large volumes of API tokens when nothing governs their usage. Without real-time usage tracking and budgeting, projects risk:

1. **Cost overruns**: Agents exceed the budget before the milestone is complete
2. **Service disruption**: Agents hit API rate limits mid-task
3. **Unpredictable momentum**: Project velocity cannot be estimated
4. **Budget exhaustion**: Agents consume the entire monthly budget in days

### Requirements

The orchestration layer must provide:

- **Real-time usage tracking**: Current token usage across all active agents
- **Budget allocation**: Budgets pre-allocated per task, milestone, and project
- **Usage projection**: Estimated remaining work compared against remaining budget
- **Throttling decisions**: Pause or slow agents that approach their limits
- **Cost optimization**: Route tasks to appropriate model tiers (Haiku/Sonnet/Opus)

### Architecture

```
┌─────────────────────────────────────────────────────────┐
│              Usage Budget Manager Service               │
│  ┌───────────────────┐  ┌────────────────────────────┐  │
│  │  Usage Tracker    │  │  Budget Allocator          │  │
│  │  - Query API      │  │  - Per-task budgets        │  │
│  │  - Real-time sum  │  │  - Per-milestone budgets   │  │
│  │  - Agent rollup   │  │  - Global limits           │  │
│  └────────┬──────────┘  └──────────┬─────────────────┘  │
│           │                        │                    │
│  ┌────────▼────────────────────────▼─────────────────┐  │
│  │                 Projection Engine                 │  │
│  │  - Estimate remaining work                        │  │
│  │  - Calculate burn rate                            │  │
│  │  - Predict budget exhaustion                      │  │
│  │  - Recommend throttle/pause                       │  │
│  └───────────────────────────┬───────────────────────┘  │
└──────────────────────────────┼──────────────────────────┘
                               │
                  ┌────────────┼────────────┐
                  │            │            │
          ┌───────▼──────┐  ┌──▼──────┐  ┌──▼────────────┐
          │Queue Manager │  │Agent Mgr│  │  Coordinator  │
          │- Check budget│  │- Spawn  │  │ - Pre-commit  │
          │  before queue│  │  check  │  │   validation  │
          └──────────────┘  └─────────┘  └───────────────┘
```

### Budget Check Points

**1. Task Assignment** (Queue Manager)

```typescript
async canAffordTask(taskId: string): Promise<BudgetDecision> {
  const task = await getTask(taskId);
  // Compare the task budget against this task's own usage,
  // not against workspace-wide usage.
  const usage = await usageTracker.getTaskUsage(taskId);
  const budget = await budgetAllocator.getTaskBudget(taskId);

  const projectedCost = estimateTaskCost(task);
  // `allocated` matches the column name in the usage_budgets table.
  const remaining = budget.allocated - usage.current;

  if (projectedCost > remaining) {
    return {
      canProceed: false,
      reason: 'Insufficient budget',
      recommendation: 'Pause or reallocate budget',
    };
  }

  return { canProceed: true };
}
```
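
The check above reduces to comparing a projected cost with the tokens still unspent. A minimal sketch of that comparison as a pure function follows; the `BudgetDecision` shape and the function name `decideAffordability` are assumptions made for illustration, since the design does not define them.

```typescript
// Hypothetical result shape; the design above does not spell it out.
interface BudgetDecision {
  canProceed: boolean;
  reason?: string;
  recommendation?: string;
}

// Pure affordability check. All three inputs are token counts.
function decideAffordability(
  allocated: number,
  consumed: number,
  projectedCost: number,
): BudgetDecision {
  const remaining = allocated - consumed;

  if (projectedCost > remaining) {
    return {
      canProceed: false,
      reason: `Insufficient budget: need ${projectedCost}, have ${remaining}`,
      recommendation: "Pause or reallocate budget",
    };
  }

  return { canProceed: true };
}

// A task with 50,000 tokens allocated and 23,456 consumed has 26,544 left.
console.log(decideAffordability(50000, 23456, 20000).canProceed);
console.log(decideAffordability(50000, 23456, 30000).canProceed);
```

Keeping the decision free of I/O makes it easy to unit test separately from the tracker and allocator calls.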

**2. Agent Spawn** (Agent Manager)

Before spawning a Claude Code agent, verify budget headroom:

```typescript
async spawnAgent(config: AgentConfig): Promise<Agent> {
  const budgetCheck = await usageBudgetManager.canAffordTask(config.taskId);

  if (!budgetCheck.canProceed) {
    throw new InsufficientBudgetError(budgetCheck.reason);
  }

  // Proceed with spawn
  const agent = await claudeCode.spawn(config);

  // Track agent for usage rollup
  await usageTracker.registerAgent(agent.id, config.taskId);

  return agent;
}
```
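
Registering each agent against its task is what makes a per-task rollup possible. The sketch below shows one way that bookkeeping could work in memory; the class name `InMemoryUsageTracker` and the `recordTokens` method are hypothetical, and a real tracker would presumably persist to Valkey and the database.

```typescript
// In-memory stand-in for the usage tracker's agent registry (an assumption,
// not the service described above).
class InMemoryUsageTracker {
  private agentToTask = new Map<string, string>();
  private tokensByAgent = new Map<string, number>();

  // Associate an agent with the task it works on.
  registerAgent(agentId: string, taskId: string): void {
    this.agentToTask.set(agentId, taskId);
    this.tokensByAgent.set(agentId, 0);
  }

  // Add tokens consumed by a registered agent.
  recordTokens(agentId: string, tokens: number): void {
    if (!this.agentToTask.has(agentId)) {
      throw new Error(`Unknown agent: ${agentId}`);
    }
    const previous = this.tokensByAgent.get(agentId) ?? 0;
    this.tokensByAgent.set(agentId, previous + tokens);
  }

  // Sum tokens over every agent registered to the task.
  getTaskUsage(taskId: string): number {
    let total = 0;
    this.agentToTask.forEach((task, agentId) => {
      if (task === taskId) {
        total += this.tokensByAgent.get(agentId) ?? 0;
      }
    });
    return total;
  }
}

const tracker = new InMemoryUsageTracker();
tracker.registerAgent("agent-1", "task-123");
tracker.registerAgent("agent-2", "task-123");
tracker.recordTokens("agent-1", 1000);
tracker.recordTokens("agent-2", 500);
console.log(tracker.getTaskUsage("task-123"));
```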

**3. Checkpoint Intervals** (Coordinator)

During task execution, periodically verify budget compliance:

```typescript
async checkpointBudgetCompliance(taskId: string): Promise<void> {
  const usage = await usageTracker.getTaskUsage(taskId);
  const budget = await budgetAllocator.getTaskBudget(taskId);

  const percentUsed = (usage.current / budget.allocated) * 100;

  // Check the hard limit first so an over-budget task is paused
  // without also receiving the "approaching limit" warning.
  if (percentUsed >= 100) {
    await coordinator.pauseTask(taskId, 'Budget exceeded');
    await notifyUser(taskId, 'Task paused: budget exhausted');
  } else if (percentUsed > 90) {
    await coordinator.sendWarning(taskId, 'Approaching budget limit');
  }
}
```
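
The thresholds in this checkpoint can be isolated into a small classifier. The sketch below is illustrative only: the `BudgetStatus` type and `classifyBudgetUsage` name are hypothetical, and treating exactly 100% (or a non-positive allocation) as exceeded is an assumption.

```typescript
type BudgetStatus = "ok" | "warning" | "exceeded";

// Warn past 90% of the allocation, stop at 100%.
function classifyBudgetUsage(consumed: number, allocated: number): BudgetStatus {
  // Assumption: a task with no allocation may not spend anything.
  if (allocated <= 0) return "exceeded";

  const percentUsed = (consumed / allocated) * 100;
  if (percentUsed >= 100) return "exceeded";
  if (percentUsed > 90) return "warning";
  return "ok";
}

console.log(classifyBudgetUsage(23456, 50000));
console.log(classifyBudgetUsage(46000, 50000));
console.log(classifyBudgetUsage(50000, 50000));
```

Because the three states are mutually exclusive, a caller can switch on the result and trigger exactly one action per checkpoint.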

**4. Pre-commit Validation** (Quality Gates)

Before committing work, verify usage is reasonable:

```typescript
async validateUsageEfficiency(taskId: string): Promise<ValidationResult> {
  const usage = await usageTracker.getTaskUsage(taskId);
  const linesChanged = await git.getChangedLines(taskId);
  const testsAdded = await git.getTestFiles(taskId); // not used by the heuristic below

  // Cost per line heuristic (adjust based on learnings)
  const expectedTokensPerLine = 150; // baseline + TDD overhead
  const expectedUsage = linesChanged * expectedTokensPerLine;

  // Ratio of expected to actual usage; below 1 means the task spent more than expected.
  // With no recorded usage there is nothing to flag, and this avoids dividing by zero.
  const efficiency = usage.current > 0 ? expectedUsage / usage.current : 1;

  if (efficiency < 0.5) {
    return {
      valid: false,
      reason: 'Usage appears inefficient',
      recommendation: 'Review agent logs for token waste',
    };
  }

  return { valid: true };
}
```
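
To make the heuristic concrete, here is a small worked sketch of the efficiency ratio using the 150 tokens-per-line baseline from the code above. The function name `usageEfficiency` and the choice to return 1 when no usage is recorded are assumptions for illustration.

```typescript
// Ratio of expected tokens to actual tokens. Values below 0.5 mean the task
// used more than twice the tokens the per-line heuristic would predict.
function usageEfficiency(
  linesChanged: number,
  tokensUsed: number,
  expectedTokensPerLine = 150,
): number {
  // Assumption: no recorded usage counts as neutral rather than an error.
  if (tokensUsed <= 0) return 1;
  return (linesChanged * expectedTokensPerLine) / tokensUsed;
}

// 200 changed lines are expected to cost 200 * 150 = 30,000 tokens.
const withinExpectation = usageEfficiency(200, 45000); // 30,000 / 45,000
const flagged = usageEfficiency(200, 90000); // 30,000 / 90,000

console.log(withinExpectation >= 0.5, flagged >= 0.5);
```

Under this heuristic the gate fails only when actual usage is more than double the expectation, so a 200-line change is flagged once it passes 60,000 tokens.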

### Data Model

#### `usage_budgets` Table

```sql
CREATE TABLE usage_budgets (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id UUID NOT NULL REFERENCES workspaces(id),

  -- Scope: 'global', 'project', 'milestone', 'task'
  scope VARCHAR(20) NOT NULL,
  scope_id VARCHAR(100), -- project_id, milestone_id, or task_id

  -- Budget limits (tokens)
  allocated BIGINT NOT NULL,
  consumed BIGINT NOT NULL DEFAULT 0,
  remaining BIGINT GENERATED ALWAYS AS (allocated - consumed) STORED,

  -- Tracking
  period_start TIMESTAMPTZ NOT NULL,
  period_end TIMESTAMPTZ NOT NULL,

  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_usage_budgets_scope ON usage_budgets(scope, scope_id);
CREATE INDEX idx_usage_budgets_workspace ON usage_budgets(workspace_id);
```
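
Because `remaining` is a generated column, writers only ever touch `consumed`. One possible way to record spend without racing concurrent agents is a single guarded `UPDATE`, sketched below as a query builder. The function name `buildConsumeQuery` and the `ParameterizedQuery` shape are hypothetical; the output is meant for any PostgreSQL client that accepts `$n` placeholders.

```typescript
interface ParameterizedQuery {
  text: string;
  values: (string | number)[];
}

// Builds an UPDATE that adds tokens to `consumed` only if the result stays
// within `allocated`. Zero rows returned means the spend would not fit.
function buildConsumeQuery(budgetId: string, tokens: number): ParameterizedQuery {
  if (!Number.isInteger(tokens) || tokens <= 0) {
    throw new RangeError("tokens must be a positive integer");
  }

  return {
    text: [
      "UPDATE usage_budgets",
      "SET consumed = consumed + $1, updated_at = NOW()",
      "WHERE id = $2 AND consumed + $1 <= allocated",
      "RETURNING remaining",
    ].join("\n"),
    values: [tokens, budgetId],
  };
}

const query = buildConsumeQuery("budget-id-placeholder", 1200);
console.log(query.text);
```

Doing the check and the increment in one statement avoids a read-then-write gap between agents; whether over-budget writes should be rejected or merely logged is a policy choice the design leaves open.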

#### `agent_usage_logs` Table

```sql
CREATE TABLE agent_usage_logs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id UUID NOT NULL REFERENCES workspaces(id),
  agent_session_id UUID NOT NULL,
  task_id UUID REFERENCES agent_tasks(id),

  -- Usage details
  input_tokens BIGINT NOT NULL,
  output_tokens BIGINT NOT NULL,
  total_tokens BIGINT NOT NULL,
  model VARCHAR(100) NOT NULL, -- 'claude-sonnet-4', 'claude-haiku-3.5', etc.

  -- Cost tracking
  estimated_cost_usd DECIMAL(10, 6),

  -- Context
  operation VARCHAR(100), -- 'task_execution', 'quality_review', etc.

  logged_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_agent_usage_task ON agent_usage_logs(task_id);
CREATE INDEX idx_agent_usage_session ON agent_usage_logs(agent_session_id);
CREATE INDEX idx_agent_usage_workspace ON agent_usage_logs(workspace_id);
```
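
Since `total_tokens` is stored alongside `input_tokens` and `output_tokens`, the writer has to keep the three consistent. The sketch below shows one way to do that by deriving the total when a row is built; the `AgentUsageLog` interface and `buildUsageLog` helper are hypothetical names that mirror the columns above.

```typescript
// Hypothetical in-memory shape mirroring the agent_usage_logs columns.
interface AgentUsageLog {
  workspaceId: string;
  agentSessionId: string;
  taskId: string | null;
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  model: string;
  operation: string | null;
}

// Derive totalTokens instead of accepting it, so it can never disagree
// with the input and output counts.
function buildUsageLog(input: Omit<AgentUsageLog, "totalTokens">): AgentUsageLog {
  if (input.inputTokens < 0 || input.outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return { ...input, totalTokens: input.inputTokens + input.outputTokens };
}

const logRow = buildUsageLog({
  workspaceId: "workspace-placeholder",
  agentSessionId: "session-placeholder",
  taskId: null,
  inputTokens: 1800,
  outputTokens: 450,
  model: "claude-sonnet-4",
  operation: "task_execution",
});
console.log(logRow.totalTokens);
```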

### Valkey/Redis Keys

```
# Current usage (real-time)
usage:current:{workspace_id} -> HASH
{
  "total_tokens": "1234567",
  "total_cost_usd": "12.34",
  "last_updated": "2026-02-04T10:30:00Z"
}

# Per-task usage
usage:task:{task_id} -> HASH
{
  "allocated": "50000",
  "consumed": "23456",
  "remaining": "26544",
  "agents": ["agent-1", "agent-2"]
}

# Budget alerts
usage:alerts:{workspace_id} -> LIST
[
  {
    "task_id": "task-123",
    "level": "warning",
    "percent_used": 92,
    "message": "Task approaching budget limit"
  }
]
```
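
A small helper can keep the key names above in one place. The sketch below also shows one possible way to flatten per-task state into hash fields: hash values are plain strings, so it assumes the agent list is stored JSON-encoded. The names `usageKeys`, `TaskUsageState`, and `toTaskHash` are hypothetical.

```typescript
// Key builders for the three key patterns listed above.
const usageKeys = {
  current: (workspaceId: string): string => `usage:current:${workspaceId}`,
  task: (taskId: string): string => `usage:task:${taskId}`,
  alerts: (workspaceId: string): string => `usage:alerts:${workspaceId}`,
};

interface TaskUsageState {
  allocated: number;
  consumed: number;
  agents: string[];
}

// Flatten task state into string fields suitable for a hash.
// Assumption: the agent list is serialized as a JSON string.
function toTaskHash(state: TaskUsageState): Record<string, string> {
  return {
    allocated: String(state.allocated),
    consumed: String(state.consumed),
    remaining: String(state.allocated - state.consumed),
    agents: JSON.stringify(state.agents),
  };
}

console.log(usageKeys.task("task-123"));
console.log(toTaskHash({ allocated: 50000, consumed: 23456, agents: ["agent-1", "agent-2"] }));
```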

### Cost Estimation Formulas

Based on autonomous execution learnings:

```typescript
function estimateTaskCost(task: Task): number {
  const baselineTokens = task.estimatedComplexity * 1000; // tokens per complexity point

  // Overhead factors
  const tddOverhead = 1.2; // +20% for test writing
  const baselineBuffer = 1.3; // +30% general buffer
  const phaseBuffer = 1.15; // +15% phase-specific uncertainty

  const estimated = baselineTokens * tddOverhead * baselineBuffer * phaseBuffer;

  return Math.ceil(estimated);
}
```
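
The three overhead factors compound to roughly 1.2 × 1.3 × 1.15 ≈ 1.794, so each complexity point is budgeted at about 1,794 tokens. The sketch below restates the formula with named constants and adds the inverse question, "how complex a task fits in a given budget?". The helper names are hypothetical and the complexity score is assumed to be a plain number.

```typescript
const TOKENS_PER_COMPLEXITY_POINT = 1000;
const OVERHEAD_FACTORS = { tdd: 1.2, baselineBuffer: 1.3, phaseBuffer: 1.15 };

// Product of all overhead factors (about 1.794).
function combinedOverhead(): number {
  return OVERHEAD_FACTORS.tdd * OVERHEAD_FACTORS.baselineBuffer * OVERHEAD_FACTORS.phaseBuffer;
}

// Same formula as the estimate above, taking the complexity score directly.
function estimateTokens(complexity: number): number {
  return Math.ceil(complexity * TOKENS_PER_COMPLEXITY_POINT * combinedOverhead());
}

// Inverse: the largest whole complexity score whose estimate fits the budget.
function maxAffordableComplexity(budgetTokens: number): number {
  return Math.floor(budgetTokens / (TOKENS_PER_COMPLEXITY_POINT * combinedOverhead()));
}

console.log(combinedOverhead());
console.log(estimateTokens(5));
console.log(maxAffordableComplexity(50000));
```

With a 50,000-token task budget, the inverse gives a ceiling of 27 complexity points, which can help when sizing task budgets up front.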

### Model Tier Optimization

Route tasks to appropriate model tiers for cost efficiency:

| Model            | Cost/MTok (input) | Cost/MTok (output) | Use Case                             |
| ---------------- | ----------------- | ------------------ | ------------------------------------ |
| Claude Haiku 3.5 | $0.80             | $4.00              | Simple CRUD, boilerplate, linting    |
| Claude Sonnet 4  | $3.00             | $15.00             | Standard development, refactoring    |
| Claude Opus 4    | $15.00            | $75.00             | Complex architecture, critical fixes |
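
The per-million-token prices in the table translate into a dollar estimate as sketched below. The price map copies the table values as listed here; the `estimateCostUsd` function and the tier keys are illustrative names, and real prices should be read from current pricing rather than hard-coded.

```typescript
type ModelTier = "haiku" | "sonnet" | "opus";

// USD per million tokens, copied from the table above.
const PRICE_PER_MTOK: Record<ModelTier, { input: number; output: number }> = {
  haiku: { input: 0.8, output: 4 },
  sonnet: { input: 3, output: 15 },
  opus: { input: 15, output: 75 },
};

// Estimated USD cost for a given token mix on one tier.
function estimateCostUsd(tier: ModelTier, inputTokens: number, outputTokens: number): number {
  const price = PRICE_PER_MTOK[tier];
  return (inputTokens / 1_000_000) * price.input + (outputTokens / 1_000_000) * price.output;
}

// A task that reads 2M tokens and writes 0.5M tokens, priced per tier.
console.log(estimateCostUsd("haiku", 2_000_000, 500_000));
console.log(estimateCostUsd("sonnet", 2_000_000, 500_000));
console.log(estimateCostUsd("opus", 2_000_000, 500_000));
```

For that token mix the estimate is about $3.60 on Haiku, $13.50 on Sonnet, and $67.50 on Opus, which is the spread that tier routing tries to exploit.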

**Routing logic:**

```typescript
function selectModel(task: Task): ModelTier {
  if (task.priority === "critical" || task.complexity > 8) {
    return "opus";
  }

  if (task.type === "boilerplate" || task.estimatedTokens < 10000) {
    return "haiku";
  }

  return "sonnet"; // default
}
```
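
The order of the rules matters: the Opus check runs first, so a critical task is routed to Opus even when it would otherwise qualify for Haiku. The self-contained sketch below replays the same rules against a few sample tasks; the `RoutableTask` shape is an assumption that includes only the fields the rules read.

```typescript
type ModelTier = "haiku" | "sonnet" | "opus";

// Hypothetical task shape with only the fields the routing rules use.
interface RoutableTask {
  priority: "low" | "normal" | "critical";
  type: string;
  complexity: number;
  estimatedTokens: number;
}

// Same rules, in the same order, as the routing logic above.
function routeTask(task: RoutableTask): ModelTier {
  if (task.priority === "critical" || task.complexity > 8) {
    return "opus";
  }
  if (task.type === "boilerplate" || task.estimatedTokens < 10000) {
    return "haiku";
  }
  return "sonnet";
}

const samples: RoutableTask[] = [
  { priority: "critical", type: "boilerplate", complexity: 2, estimatedTokens: 5000 },
  { priority: "normal", type: "boilerplate", complexity: 2, estimatedTokens: 40000 },
  { priority: "normal", type: "feature", complexity: 5, estimatedTokens: 40000 },
];

for (const sample of samples) {
  console.log(sample.priority, sample.type, "->", routeTask(sample));
}
```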

### Projection Algorithm

Predict budget exhaustion:

```typescript
async function projectBudgetExhaustion(workspaceId: string): Promise<Projection> {
  const usage = await usageTracker.getCurrentUsage(workspaceId);
  const budget = await budgetAllocator.getGlobalBudget(workspaceId);

  const dailyBurnRate = usage.tokensLast24h;
  // With no usage in the last 24 hours, the budget never runs out at the current rate.
  const daysRemaining = dailyBurnRate > 0 ? budget.remaining / dailyBurnRate : Infinity;

  // Add fractional days in milliseconds; null when no exhaustion is projected.
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const exhaustionDate = Number.isFinite(daysRemaining)
    ? new Date(Date.now() + daysRemaining * MS_PER_DAY)
    : null;

  return {
    remaining_tokens: budget.remaining,
    daily_burn_rate: dailyBurnRate,
    days_until_exhaustion: daysRemaining,
    projected_exhaustion_date: exhaustionDate,
    recommendation: daysRemaining < 7 ? "THROTTLE" : "CONTINUE",
  };
}
```
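
The projection is a straight-line extrapolation of the last 24 hours of usage. A minimal pure-function sketch with concrete numbers follows; `projectExhaustion` and `ExhaustionProjection` are hypothetical names, and returning `null` for the date when nothing was burned is an assumption.

```typescript
interface ExhaustionProjection {
  remainingTokens: number;
  dailyBurnRate: number;
  daysUntilExhaustion: number;
  projectedExhaustionDate: Date | null;
  recommendation: "THROTTLE" | "CONTINUE";
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Linear projection: remaining budget divided by the last day's burn.
function projectExhaustion(
  remainingTokens: number,
  tokensLast24h: number,
  now: Date,
): ExhaustionProjection {
  // Assumption: zero burn means no exhaustion is projected.
  const days = tokensLast24h > 0 ? remainingTokens / tokensLast24h : Infinity;
  const date = Number.isFinite(days) ? new Date(now.getTime() + days * MS_PER_DAY) : null;

  return {
    remainingTokens,
    dailyBurnRate: tokensLast24h,
    daysUntilExhaustion: days,
    projectedExhaustionDate: date,
    recommendation: days < 7 ? "THROTTLE" : "CONTINUE",
  };
}

// 300,000 tokens left at 100,000 tokens/day leaves three days of runway.
const sample = projectExhaustion(300000, 100000, new Date("2026-02-04T10:30:00Z"));
console.log(sample.daysUntilExhaustion, sample.recommendation);
```

Passing `now` in as a parameter keeps the function deterministic, which makes the seven-day threshold straightforward to test.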

### Implementation Priority

**MVP (M6 Phase 3):**

- ✅ Basic usage tracking (log tokens per task)
- ✅ Simple budget checks (can afford this task?)
- ✅ Alert on budget exceeded

**Post-MVP (M6 Phase 5):**

- Projection engine (when will budget run out?)
- Model tier optimization (Haiku/Sonnet/Opus routing)
- Historical analysis (actual vs estimated)
- Budget reallocation (move budget between projects)

### Success Metrics

- **Budget accuracy**: Estimated usage within 20% of actual usage
- **Cost optimization**: 40%+ savings from model tier routing
- **No surprise exhaustion**: Zero instances of unexpected budget depletion
- **Steady momentum**: Projects maintain velocity without budget interruptions
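
The budget accuracy target can be checked mechanically once actual usage is known. The sketch below assumes the 20% is measured relative to actual usage, which the metric does not state explicitly; the function names are hypothetical.

```typescript
// Relative estimation error, measured against actual usage (assumption).
function estimationError(estimatedTokens: number, actualTokens: number): number {
  if (actualTokens <= 0) {
    throw new RangeError("actualTokens must be positive");
  }
  return Math.abs(estimatedTokens - actualTokens) / actualTokens;
}

// True when the estimate lands within the tolerance (20% by default).
function meetsAccuracyTarget(
  estimatedTokens: number,
  actualTokens: number,
  tolerance = 0.2,
): boolean {
  return estimationError(estimatedTokens, actualTokens) <= tolerance;
}

console.log(meetsAccuracyTarget(9000, 10000));
console.log(meetsAccuracyTarget(13000, 10000));
```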

---

## Implementation Phases

### Phase 1: Foundation (Week 1-2)