[ORCH-135] Usage Budget Management & Cost Governance #329

jason.woltje · 2026-02-04T13:53:50Z

Milestone: M6-AgentOrchestration (0.0.6)
Priority: High
Phase: Phase 3 (MVP), Phase 5 (Advanced)
Estimated Tokens: ~150K (Sonnet)

Problem Statement

Autonomous agents using Claude Code can consume significant API tokens without proper governance. Without real-time usage tracking and budgeting, projects risk:

Cost overruns — Agents exceed budget before milestone completion
Service disruption — Hit API rate limits mid-task
Unpredictable momentum — Can't estimate project velocity
Budget exhaustion — Agents consume entire monthly budget in days

Requirements

Implement a usage budget management system that provides:

✅ Real-time usage tracking across all active agents
✅ Budget allocation per task/milestone/project
✅ Usage projection and burn rate calculation
✅ Throttling decisions to prevent budget exhaustion
✅ Model tier optimization (Haiku/Sonnet/Opus routing)
✅ Pre-commit usage validation

Acceptance Criteria

MVP (M6 Phase 3)

Database schema: usage_budgets and agent_usage_logs tables
Valkey keys for real-time usage state
Usage tracking: Log tokens per agent/task
Budget checks: "Can afford this task?" at assignment
Alerts: Notify when the budget is 90% consumed
Hard stop: Pause agents when the budget is fully consumed (100%)
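
The assignment check, the 90% alert, and the hard stop can be sketched as a pair of pure functions. This is a minimal sketch, not the shipped service: `BudgetSnapshot`, `evaluateBudget`, and `canAffordTask` are hypothetical names, and it assumes the hard stop triggers once consumption reaches 100% of the allocation.

```typescript
// Hypothetical shape; the fields mirror the allocated/consumed columns of usage_budgets.
interface BudgetSnapshot {
  allocated: number; // tokens allocated for the period
  consumed: number; // tokens consumed so far
}

type BudgetDecision = "ok" | "alert" | "hard_stop";

// Classify the current budget state against the 90% and 100% thresholds.
// Integer comparison (consumed * 10 >= allocated * 9) avoids floating-point drift at 90%.
function evaluateBudget(b: BudgetSnapshot): BudgetDecision {
  if (b.consumed >= b.allocated) return "hard_stop";
  if (b.consumed * 10 >= b.allocated * 9) return "alert";
  return "ok";
}

// "Can afford this task?" check at assignment time:
// the task fits only if its estimate does not push consumption past the allocation.
function canAffordTask(b: BudgetSnapshot, estimatedTokens: number): boolean {
  return b.consumed + estimatedTokens <= b.allocated;
}
```

With an allocation of 1,000 tokens, 899 consumed is `ok`, 900 is `alert`, and 1,000 is `hard_stop`.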

Post-MVP (M6 Phase 5)

Projection engine: Predict budget exhaustion date
Model tier routing: Optimize Haiku/Sonnet/Opus selection
Historical analysis: Actual vs estimated accuracy
Budget reallocation: Move budget between projects
Cost reports: Per-task, per-milestone, per-project
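
One way the projection engine could predict an exhaustion date is a linear burn-rate extrapolation. The sketch below assumes a constant burn rate since the start of the period, which the issue does not specify; `BurnInput` and `projectExhaustion` are hypothetical names.

```typescript
// Hypothetical input shape for a linear projection.
interface BurnInput {
  allocated: number; // tokens allocated for the period
  consumed: number; // tokens consumed so far
  periodStart: Date; // start of the budget period
  now: Date; // time of the projection
}

// Returns the projected exhaustion time, or null when no burn rate can be derived yet.
function projectExhaustion(input: BurnInput): Date | null {
  const elapsedMs = input.now.getTime() - input.periodStart.getTime();
  if (elapsedMs <= 0 || input.consumed <= 0) return null; // nothing to extrapolate from

  const remaining = input.allocated - input.consumed;
  if (remaining <= 0) return new Date(input.now.getTime()); // already exhausted

  // Average time spent per token so far, applied to the remaining tokens.
  const msPerToken = elapsedMs / input.consumed;
  return new Date(input.now.getTime() + remaining * msPerToken);
}
```

For example, 250 of 1,000 tokens consumed after one day implies three more days until exhaustion under this assumption.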

Architecture

Budget Check Points

Task Assignment (Queue Manager) — Verify budget before queueing
Agent Spawn (Agent Manager) — Check headroom before spawning
Checkpoint Intervals (Coordinator) — Periodic compliance checks
Pre-commit Validation (Quality Gates) — Usage efficiency check

Data Model

usage_budgets Table:

CREATE TABLE usage_budgets (
  id UUID PRIMARY KEY,
  workspace_id UUID NOT NULL,
  scope VARCHAR(20) NOT NULL, -- 'global', 'project', 'milestone', 'task'
  scope_id VARCHAR(100),
  allocated BIGINT NOT NULL,
  consumed BIGINT NOT NULL DEFAULT 0,
  remaining BIGINT GENERATED ALWAYS AS (allocated - consumed) STORED,
  period_start TIMESTAMPTZ NOT NULL,
  period_end TIMESTAMPTZ NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

agent_usage_logs Table:

CREATE TABLE agent_usage_logs (
  id UUID PRIMARY KEY,
  workspace_id UUID NOT NULL,
  agent_session_id UUID NOT NULL,
  task_id UUID REFERENCES agent_tasks(id),
  input_tokens BIGINT NOT NULL,
  output_tokens BIGINT NOT NULL,
  total_tokens BIGINT NOT NULL,
  model VARCHAR(100) NOT NULL,
  estimated_cost_usd DECIMAL(10, 6),
  operation VARCHAR(100),
  logged_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
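
A sketch of how a tracker might build rows for `agent_usage_logs` and roll them up per task. The camelCase field names and both helper functions are hypothetical, and deriving `total_tokens` as input plus output is an assumption, since the schema stores it as a plain column.

```typescript
// Hypothetical in-memory mirror of an agent_usage_logs row (subset of columns).
interface UsageLogEntry {
  workspaceId: string;
  agentSessionId: string;
  taskId: string | null; // task_id is nullable in the schema
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  model: string;
}

// Build a log entry, deriving totalTokens so it cannot disagree with its parts.
function makeUsageLog(args: Omit<UsageLogEntry, "totalTokens">): UsageLogEntry {
  if (args.inputTokens < 0 || args.outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return { ...args, totalTokens: args.inputTokens + args.outputTokens };
}

// Sum total tokens per task; entries without a task are grouped under "unassigned".
function totalsByTask(logs: UsageLogEntry[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const log of logs) {
    const key = log.taskId === null ? "unassigned" : log.taskId;
    totals[key] = (totals[key] || 0) + log.totalTokens;
  }
  return totals;
}
```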

Cost Estimation

The estimate applies buffers based on learnings from earlier autonomous execution runs:

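// Assumed minimal shape: the issue only references estimatedComplexity.
interface Task {
  estimatedComplexity: number;
}

// Returns the estimated token budget for a task (tokens, not USD).
function estimateTaskCost(task: Task): number {
  const baselineTokens = task.estimatedComplexity * 1000;
  const tddOverhead = 1.20; // +20% for test writing
  const baselineBuffer = 1.30; // +30% general buffer
  const phaseBuffer = 1.15; // +15% phase-specific uncertainty

  // The buffers compound multiplicatively; round up to whole tokens.
  return Math.ceil(baselineTokens * tddOverhead * baselineBuffer * phaseBuffer);
}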
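
The three buffers compound to 1.20 × 1.30 × 1.15 = 1.794, so a task with complexity 10 is budgeted at 10,000 × 1.794 = 17,940 tokens. The variant below is a sketch of the same formula using integer percentages, so that floating-point rounding cannot push `Math.ceil` up by one token; `estimateTaskTokens` is a hypothetical name and not part of the issue.

```typescript
// Buffers expressed as whole percentages (same values as in the issue).
const TDD_OVERHEAD_PCT = 120; // +20% for test writing
const BASELINE_BUFFER_PCT = 130; // +30% general buffer
const PHASE_BUFFER_PCT = 115; // +15% phase-specific uncertainty

// Estimated tokens for a given complexity score: complexity * 1000 * 1.794, rounded up.
function estimateTaskTokens(estimatedComplexity: number): number {
  const baselineTokens = estimatedComplexity * 1000;
  // Multiply first, divide once: 120 * 130 * 115 / 1_000_000 = 1.794.
  const scaled = baselineTokens * TDD_OVERHEAD_PCT * BASELINE_BUFFER_PCT * PHASE_BUFFER_PCT;
  return Math.ceil(scaled / 1_000_000);
}
```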

Model Tier Optimization

Model	Cost/MTok (input)	Cost/MTok (output)	Use Case
Claude Haiku 3.5	$0.80	$4.00	Simple CRUD, boilerplate
Claude Sonnet 4	$3.00	$15.00	Standard development
Claude Opus 4	$15.00	$75.00	Complex architecture

Routing logic: Select model based on task priority and complexity.
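
A sketch of how the router could combine the price table with complexity and priority. The prices come from the table above; the complexity thresholds (3 and 7) and the one-tier bump for high-priority tasks are illustrative assumptions, and `routeModel` and `estimateCostUsd` are hypothetical names.

```typescript
type Tier = "haiku" | "sonnet" | "opus";

// USD per million tokens, taken from the table above.
const PRICING: Record<Tier, { inputPerMTok: number; outputPerMTok: number }> = {
  haiku: { inputPerMTok: 0.8, outputPerMTok: 4.0 },
  sonnet: { inputPerMTok: 3.0, outputPerMTok: 15.0 },
  opus: { inputPerMTok: 15.0, outputPerMTok: 75.0 },
};

const TIERS: Tier[] = ["haiku", "sonnet", "opus"];

// Estimated USD cost of a call on a given tier.
function estimateCostUsd(tier: Tier, inputTokens: number, outputTokens: number): number {
  const p = PRICING[tier];
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1_000_000;
}

// Pick a tier from complexity, then bump high-priority tasks up one tier.
// Thresholds are assumptions for illustration only.
function routeModel(complexity: number, highPriority: boolean): Tier {
  let index = complexity <= 3 ? 0 : complexity <= 7 ? 1 : 2;
  if (highPriority) index = Math.min(index + 1, TIERS.length - 1);
  return TIERS[index];
}
```

For instance, one million input tokens plus one million output tokens on Sonnet comes to $3.00 + $15.00 = $18.00.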

Implementation Plan

Phase 3 (MVP) - Week 1

Create database migrations
Implement UsageBudgetManager service
Implement UsageTracker service
Add budget checks to Queue Manager
Add budget checks to Agent Manager
Implement alert system (90% threshold)
Implement hard stop (100% threshold)
Unit tests (85%+ coverage)

Phase 5 (Advanced) - Week 2

Implement projection engine
Implement model tier router
Add historical analysis
Add budget reallocation API
Create cost reporting dashboard
Integration tests

Success Metrics

Budget accuracy: Estimated vs actual within 20%
Cost optimization: 40%+ savings from model tier routing
No surprise exhaustion: Zero instances of unexpected budget depletion
Steady momentum: Projects maintain velocity without budget interruptions
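
The budget accuracy metric can be checked with a relative-error test. This sketch assumes accuracy is measured as |actual − estimated| / estimated and that a milestone's budget equals its estimate; `withinTolerance` is a hypothetical helper.

```typescript
// True when actual usage is within the given relative tolerance of the estimate.
function withinTolerance(estimated: number, actual: number, tolerance: number = 0.2): boolean {
  if (estimated <= 0) throw new RangeError("estimated must be positive");
  return Math.abs(actual - estimated) / estimated <= tolerance;
}
```

Under these assumptions, the M4.1 result cited in the Notes (86% of budget) is 14% off and passes, while M4.2 (130% of budget) is 30% off and fails.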

References

Design Doc: docs/design/agent-orchestration.md (Section 8)
Related: #98 (Queue Manager), #99 (Coordinator), #100 (Recovery)
Evolution Log: jarvis-brain/EVOLUTION.md (L-XXX)

Dependencies

#98 (Queue Manager) - for task assignment checks
#99 (Coordinator Service) - for checkpoint validation
#102 (Gateway Integration) - for agent spawn hooks

Notes

This feature is critical for sustainable autonomous development. Without it, agents can exhaust budgets before milestone completion, disrupting project momentum.

Milestones M4.1 (86% budget usage) and M4.2 (130% budget usage) showed that proper budget governance is required for predictable velocity.

jason.woltje added the phase-3 p1 labels 2026-02-04 13:53:50 +00:00

jason.woltje added this to the M6-AgentOrchestration (0.0.6) milestone 2026-02-04 13:54:03 +00:00

jason.woltje referenced this issue from a commit

2026-02-05 19:00:54 +00:00

feat(#329): Add usage budget management and cost governance

jason.woltje referenced a pull request that will close this issue

2026-02-05 19:01:09 +00:00

feat(#329): Add usage budget management and cost governance #336

jason.woltje closed this issue

2026-02-05 19:01:15 +00:00

jason.woltje referenced this issue from a commit

2026-02-05 19:16:01 +00:00

fix(#329): Harden BudgetService against security review findings

jason.woltje referenced this issue from a commit

2026-02-05 20:37:54 +00:00

Merge pull request 'feat(#329): Add usage budget management and cost governance' (#336) from feature/329-usage-budget into develop

Reference: mosaic/stack#329