[ORCH-135] Usage Budget Management & Cost Governance #329

Closed
opened 2026-02-04 13:53:50 +00:00 by jason.woltje · 0 comments

Milestone: M6-AgentOrchestration (0.0.6)
Priority: High
Phase: Phase 3 (MVP), Phase 5 (Advanced)
Estimated Tokens: ~150K (Sonnet)

Problem Statement

Autonomous agents using Claude Code can consume large volumes of API tokens when left ungoverned. Without real-time usage tracking and budgeting, projects risk:

  1. Cost overruns — Agents exceed budget before milestone completion
  2. Service disruption — Hit API rate limits mid-task
  3. Unpredictable momentum — Can't estimate project velocity
  4. Budget exhaustion — Agents consume entire monthly budget in days

Requirements

Implement a usage budget management system that provides:

  • Real-time usage tracking across all active agents
  • Budget allocation per task/milestone/project
  • Usage projection and burn rate calculation
  • Throttling decisions to prevent budget exhaustion
  • Model tier optimization (Haiku/Sonnet/Opus routing)
  • Pre-commit usage validation

Acceptance Criteria

MVP (M6 Phase 3)

  • Database schema: usage_budgets and agent_usage_logs tables
  • Valkey keys for real-time usage state
  • Usage tracking: Log tokens per agent/task
  • Budget checks: "Can afford this task?" at assignment
  • Alerts: Notify when 90% of the budget is consumed
  • Hard stop: Pause agents when the budget is exceeded (100% threshold)
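
As a rough illustration of the alert and hard-stop criteria, the sketch below classifies a budget by the fraction of tokens consumed. The function name `budgetStatus` and the status labels are hypothetical, and treating exactly 100% as a hard stop (rather than only strictly more than 100%) is an assumption.

```typescript
// Thresholds from the acceptance criteria: alert at 90%, hard stop at 100%.
const ALERT_THRESHOLD = 0.9;
const HARD_STOP_THRESHOLD = 1.0;

type BudgetStatus = "ok" | "alert" | "exhausted";

// Hypothetical helper: classify a budget by the fraction of tokens consumed.
function budgetStatus(allocated: number, consumed: number): BudgetStatus {
  // Assumption: a budget with nothing allocated is treated as a hard stop.
  if (allocated <= 0) return "exhausted";
  const ratio = consumed / allocated;
  if (ratio >= HARD_STOP_THRESHOLD) return "exhausted";
  if (ratio >= ALERT_THRESHOLD) return "alert";
  return "ok";
}

console.log(budgetStatus(1000000, 850000)); // below the alert threshold
console.log(budgetStatus(1000000, 900000)); // alert threshold reached
console.log(budgetStatus(1000000, 1000000)); // budget fully consumed
```

A caller would notify on "alert" and pause agents on "exhausted"; how those actions are wired up is left open here.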

Post-MVP (M6 Phase 5)

  • Projection engine: Predict budget exhaustion date
  • Model tier routing: Optimize Haiku/Sonnet/Opus selection
  • Historical analysis: Actual vs estimated accuracy
  • Budget reallocation: Move budget between projects
  • Cost reports: Per-task, per-milestone, per-project
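
One simple way the projection engine could predict an exhaustion date is linear extrapolation of the burn rate observed so far in the period. This is only a sketch under that assumption; `burnRatePerDay` and `projectExhaustion` are hypothetical names, and a real engine might weight recent usage more heavily.

```typescript
const MS_PER_DAY = 86400000;

// Average tokens consumed per day since the budget period started.
function burnRatePerDay(consumed: number, periodStart: Date, now: Date): number {
  const elapsedMs = now.getTime() - periodStart.getTime();
  return elapsedMs > 0 ? consumed / (elapsedMs / MS_PER_DAY) : 0;
}

// Linear projection of when the budget runs out.
// Returns null when there is no usage yet to extrapolate from.
function projectExhaustion(
  allocated: number,
  consumed: number,
  periodStart: Date,
  now: Date,
): Date | null {
  const elapsedMs = now.getTime() - periodStart.getTime();
  if (elapsedMs <= 0 || consumed <= 0) return null;
  const remaining = allocated - consumed;
  if (remaining <= 0) return new Date(now.getTime()); // already exhausted
  // remaining / (consumed / elapsedMs), arranged to multiply before dividing.
  const msUntilEmpty = Math.round((remaining * elapsedMs) / consumed);
  return new Date(now.getTime() + msUntilEmpty);
}

// Example: 250K of 1M tokens used in the first 5 days of the period.
const exampleStart = new Date("2026-02-01T00:00:00Z");
const exampleNow = new Date("2026-02-06T00:00:00Z");
console.log(burnRatePerDay(250000, exampleStart, exampleNow)); // 50000 tokens/day
const exhaustedAt = projectExhaustion(1000000, 250000, exampleStart, exampleNow);
console.log(exhaustedAt ? exhaustedAt.toISOString() : "no projection"); // 2026-02-21T00:00:00.000Z
```

Comparing the projected date against `period_end` would tell the coordinator whether throttling is needed before the period closes.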

Architecture

Budget Check Points

  1. Task Assignment (Queue Manager) — Verify budget before queueing
  2. Agent Spawn (Agent Manager) — Check headroom before spawning
  3. Checkpoint Intervals (Coordinator) — Periodic compliance checks
  4. Pre-commit Validation (Quality Gates) — Usage efficiency check
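
The first two check points both reduce to the same question: does the estimated task cost fit in the remaining budget? A minimal sketch of that check follows; `BudgetSnapshot`, `BudgetDecision`, and `canAfford` are hypothetical names, not part of any existing service.

```typescript
// Token counts for one budget at the moment of the check.
interface BudgetSnapshot {
  allocated: number;
  consumed: number;
}

interface BudgetDecision {
  allowed: boolean;
  remaining: number; // tokens still available before the check
  shortfall: number; // tokens missing when the task does not fit, otherwise 0
}

// Hypothetical check shared by the Queue Manager and Agent Manager hooks.
function canAfford(budget: BudgetSnapshot, estimatedTokens: number): BudgetDecision {
  const remaining = Math.max(0, budget.allocated - budget.consumed);
  const shortfall = Math.max(0, estimatedTokens - remaining);
  return { allowed: shortfall === 0, remaining, shortfall };
}

// Example: 14K tokens of headroom left.
const snapshot: BudgetSnapshot = { allocated: 100000, consumed: 86000 };
console.log(canAfford(snapshot, 10000)); // fits within the remaining 14K
console.log(canAfford(snapshot, 20000)); // 6K short, so the task is rejected
```

Returning the shortfall rather than a bare boolean lets the caller decide between rejecting the task, requesting a reallocation, or routing to a cheaper model tier.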

Data Model

usage_budgets Table:

CREATE TABLE usage_budgets (
  id UUID PRIMARY KEY,
  workspace_id UUID NOT NULL,
  scope VARCHAR(20) NOT NULL, -- 'global', 'project', 'milestone', 'task'
  scope_id VARCHAR(100),
  allocated BIGINT NOT NULL,
  consumed BIGINT NOT NULL DEFAULT 0,
  remaining BIGINT GENERATED ALWAYS AS (allocated - consumed) STORED,
  period_start TIMESTAMPTZ NOT NULL,
  period_end TIMESTAMPTZ NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
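
Because budgets exist at several scopes, a task may be covered by more than one row at once. The sketch below mirrors the table in TypeScript and assumes, hypothetically, that a task must fit within every active budget that covers it. The camelCase field mapping, the exclusive period end, and the function names are all assumptions.

```typescript
type BudgetScope = "global" | "project" | "milestone" | "task";

// Mirrors the usage_budgets columns in camelCase (an assumed mapping).
// BIGINT columns are modelled as number, which is exact up to 2^53 - 1.
interface UsageBudget {
  scope: BudgetScope;
  scopeId: string | null;
  allocated: number;
  consumed: number;
  periodStart: Date;
  periodEnd: Date;
}

// Same arithmetic as the generated `remaining` column.
function remainingTokens(budget: UsageBudget): number {
  return budget.allocated - budget.consumed;
}

// A budget applies only inside its period; the exclusive end is an assumption.
function isActive(budget: UsageBudget, at: Date): boolean {
  const t = at.getTime();
  return t >= budget.periodStart.getTime() && t < budget.periodEnd.getTime();
}

// Hypothetical rule: the task must fit in every active budget covering it.
// Note: with no active budgets this returns true; a stricter system might deny.
function fitsAllScopes(budgets: UsageBudget[], estimatedTokens: number, at: Date): boolean {
  return budgets
    .filter((b) => isActive(b, at))
    .every((b) => remainingTokens(b) >= estimatedTokens);
}

const febStart = new Date("2026-02-01T00:00:00Z");
const febEnd = new Date("2026-03-01T00:00:00Z");
const covering: UsageBudget[] = [
  { scope: "global", scopeId: null, allocated: 1000000, consumed: 500000, periodStart: febStart, periodEnd: febEnd },
  { scope: "task", scopeId: "task-1", allocated: 20000, consumed: 15000, periodStart: febStart, periodEnd: febEnd },
];
// The task-level budget (5K remaining) is the binding constraint here.
console.log(fitsAllScopes(covering, 10000, new Date("2026-02-10T00:00:00Z")));
```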

agent_usage_logs Table:

CREATE TABLE agent_usage_logs (
  id UUID PRIMARY KEY,
  workspace_id UUID NOT NULL,
  agent_session_id UUID NOT NULL,
  task_id UUID REFERENCES agent_tasks(id),
  input_tokens BIGINT NOT NULL,
  output_tokens BIGINT NOT NULL,
  total_tokens BIGINT NOT NULL,
  model VARCHAR(100) NOT NULL,
  estimated_cost_usd DECIMAL(10, 6),
  operation VARCHAR(100),
  logged_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
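
To show how `total_tokens` and `estimated_cost_usd` might be derived before a row is written, here is a sketch that prices a call using the per-MTok rates from the Model Tier Optimization table in this issue. The model keys (`haiku`, `sonnet`, `opus`) and function names are hypothetical; the `model` column would presumably hold full model identifiers.

```typescript
interface ModelPrice {
  inputPerMTok: number; // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Rates copied from the pricing table in this issue; the keys are placeholders.
const PRICES: Record<string, ModelPrice> = {
  haiku: { inputPerMTok: 0.8, outputPerMTok: 4.0 },
  sonnet: { inputPerMTok: 3.0, outputPerMTok: 15.0 },
  opus: { inputPerMTok: 15.0, outputPerMTok: 75.0 },
};

// Returns null for unknown models, matching the nullable estimated_cost_usd column.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number | null {
  const price: ModelPrice | undefined = PRICES[model];
  if (!price) return null;
  const usd = (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1000000;
  // Round to 6 decimal places to match DECIMAL(10, 6).
  return Math.round(usd * 1000000) / 1000000;
}

// Derives the two computed fields of a log row from raw token counts.
function buildUsageFields(model: string, inputTokens: number, outputTokens: number) {
  return {
    model,
    inputTokens,
    outputTokens,
    totalTokens: inputTokens + outputTokens,
    estimatedCostUsd: estimateCostUsd(model, inputTokens, outputTokens),
  };
}

// 100K input + 20K output tokens on the Sonnet tier: $0.30 + $0.30.
console.log(buildUsageFields("sonnet", 100000, 20000));
```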

Cost Estimation

Token estimates apply overhead multipliers based on learnings from earlier autonomous execution:

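// Assumed minimal shape of Task for this snippet.
interface Task {
  estimatedComplexity: number; // complexity points; each point is budgeted as 1,000 tokens
}

// Returns the estimated budget for a task in tokens (not dollars).
function estimateTaskCost(task: Task): number {
  const baselineTokens = task.estimatedComplexity * 1000;
  const tddOverhead = 1.20; // +20% for test writing
  const baselineBuffer = 1.30; // +30% general buffer
  const phaseBuffer = 1.15; // +15% phase-specific uncertainty

  // The multipliers compound: 1.20 * 1.30 * 1.15 = 1.794, about +79% over baseline.
  return Math.ceil(baselineTokens * tddOverhead * baselineBuffer * phaseBuffer);
}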
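
Because the three overheads multiply rather than add, the effective padding is about 79% over baseline, not the 65% that summing 20%, 30%, and 15% would suggest. The sketch below walks through the stages for a task with `estimatedComplexity` of 10; the `costBreakdown` helper is hypothetical and exists only to show the arithmetic.

```typescript
// The three multipliers used by estimateTaskCost, in application order.
const multipliers = [
  { label: "TDD overhead", factor: 1.2 },
  { label: "general buffer", factor: 1.3 },
  { label: "phase uncertainty", factor: 1.15 },
];

interface BreakdownStep {
  label: string;
  tokens: number;
}

// Hypothetical helper: running token total after each multiplier is applied.
function costBreakdown(estimatedComplexity: number): BreakdownStep[] {
  let tokens = estimatedComplexity * 1000;
  const steps: BreakdownStep[] = [{ label: "baseline", tokens }];
  for (const m of multipliers) {
    tokens *= m.factor;
    steps.push({ label: m.label, tokens });
  }
  return steps;
}

const combinedFactor = multipliers.reduce((acc, m) => acc * m.factor, 1);
console.log(combinedFactor.toFixed(3)); // 1.794

// Complexity 10: 10,000 -> 12,000 -> 15,600 -> roughly 17,940 tokens.
for (const step of costBreakdown(10)) {
  console.log(step.label, Math.round(step.tokens));
}
```

Since the factors are floating-point values, `Math.ceil` in the estimator can in principle land one token above the exact product; at these magnitudes that is negligible for budgeting.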

Model Tier Optimization

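| Model | Cost/MTok (input) | Cost/MTok (output) | Use Case |
|-------|-------------------|--------------------|----------|
| Claude Haiku 3.5 | $0.80 | $4.00 | Simple CRUD, boilerplate |
| Claude Sonnet 4 | $3.00 | $15.00 | Standard development |
| Claude Opus 4 | $15.00 | $75.00 | Complex architecture |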

Routing logic: Select model based on task priority and complexity.
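
The issue does not specify routing rules, so the following is only one possible shape for them. The `priority` field, its values, and both complexity thresholds are hypothetical placeholders that would need tuning against historical usage.

```typescript
type ModelTier = "haiku" | "sonnet" | "opus";
type Priority = "low" | "medium" | "high"; // hypothetical priority levels

interface RoutableTask {
  estimatedComplexity: number;
  priority: Priority;
}

// Placeholder thresholds, not values taken from the design doc.
const OPUS_MIN_COMPLEXITY = 50;
const HAIKU_MAX_COMPLEXITY = 10;

// Picks the cheapest tier that plausibly suits the task.
function selectModelTier(task: RoutableTask): ModelTier {
  // Very complex work goes to the most capable tier regardless of priority.
  if (task.estimatedComplexity >= OPUS_MIN_COMPLEXITY) return "opus";
  // Simple, non-urgent work goes to the cheapest tier.
  if (task.estimatedComplexity <= HAIKU_MAX_COMPLEXITY && task.priority !== "high") return "haiku";
  // Everything else uses the standard development tier.
  return "sonnet";
}

console.log(selectModelTier({ estimatedComplexity: 5, priority: "low" })); // haiku
console.log(selectModelTier({ estimatedComplexity: 5, priority: "high" })); // sonnet
console.log(selectModelTier({ estimatedComplexity: 80, priority: "medium" })); // opus
```

A budget-aware variant could also downgrade a tier when headroom is low, which ties routing to the throttling requirement.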

Implementation Plan

Phase 3 (MVP) - Week 1

  1. Create database migrations
  2. Implement UsageBudgetManager service
  3. Implement UsageTracker service
  4. Add budget checks to Queue Manager
  5. Add budget checks to Agent Manager
  6. Implement alert system (90% threshold)
  7. Implement hard stop (100% threshold)
  8. Unit tests (85%+ coverage)

Phase 5 (Advanced) - Week 2

  1. Implement projection engine
  2. Implement model tier router
  3. Add historical analysis
  4. Add budget reallocation API
  5. Create cost reporting dashboard
  6. Integration tests

Success Metrics

  • Budget accuracy: Estimated vs actual within 20%
  • Cost optimization: 40%+ savings from model tier routing
  • No surprise exhaustion: Zero instances of unexpected budget depletion
  • Steady momentum: Projects maintain velocity without budget interruptions
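
The budget accuracy metric can be checked mechanically. The sketch below assumes accuracy is measured as the relative error of actual against estimated usage and that the 20% bound is inclusive; both function names are hypothetical. Read that way, M4.1 (86% usage) would pass and M4.2 (130% usage) would not.

```typescript
// Signed relative error: positive means the task used more than estimated.
function estimateError(estimated: number, actual: number): number {
  return (actual - estimated) / estimated;
}

// True when actual usage is within the tolerance band around the estimate.
function withinTolerance(estimated: number, actual: number, tolerance: number = 0.2): boolean {
  if (estimated <= 0) return false; // no meaningful estimate to compare against
  return Math.abs(estimateError(estimated, actual)) <= tolerance;
}

// Usage expressed as a percentage of the estimated budget.
console.log(withinTolerance(100, 86)); // 14% under the estimate
console.log(withinTolerance(100, 130)); // 30% over the estimate
```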

References

  • Design Doc: docs/design/agent-orchestration.md (Section 8)
  • Related: #98 (Queue Manager), #99 (Coordinator), #100 (Recovery)
  • Evolution Log: jarvis-brain/EVOLUTION.md (L-XXX)

Dependencies

  • #98 (Queue Manager) - for task assignment checks
  • #99 (Coordinator Service) - for checkpoint validation
  • #102 (Gateway Integration) - for agent spawn hooks

Notes

This feature is critical for sustainable autonomous development. Without it, agents can exhaust budgets before milestone completion, disrupting project momentum.

Based on learnings from M4.1 (86% budget usage) and M4.2 (130% budget usage), proper budget governance is required for predictable velocity.

jason.woltje added the phase-3p1 labels 2026-02-04 13:53:50 +00:00
jason.woltje added this to the M6-AgentOrchestration (0.0.6) milestone 2026-02-04 13:54:03 +00:00
Reference: mosaic/stack#329