Implement Token Budget Tracker #138

Closed
opened 2026-01-30 23:43:14 +00:00 by jason.woltje · 0 comments

Track token usage and prevent premature "done" claims made while a significant share of the budget remains.

Objective: Detect when agents claim done with substantial token budget unused, indicating premature stopping.

Problem: Agents claim done after fixing P0 issues, leaving work incomplete despite budget remaining.

Token Budget System:

  • Track tokens used vs allocated per task
  • Flag suspicious patterns: done claimed with >20% budget remaining
  • Correlate with gate failures: done + budget remaining + gates failing = forced continue
  • Allow early completion only if gates pass AND work demonstrably complete
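
The tracking and flagging rules above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `TaskBudget`, `record`, `remaining_fraction`, and `is_suspicious_done` are hypothetical names, and only the 20% threshold comes from this issue.

```python
from dataclasses import dataclass

# From the issue: "done" with more than 20% of the budget left is suspicious.
SUSPICIOUS_REMAINING_FRACTION = 0.20


@dataclass
class TaskBudget:
    """Hypothetical per-task record of allocated vs. used tokens."""

    allocated_tokens: int
    used_tokens: int = 0

    def record(self, tokens: int) -> None:
        """Add tokens consumed by one model call to the running total."""
        if tokens < 0:
            raise ValueError("token count must be non-negative")
        self.used_tokens += tokens

    @property
    def remaining_fraction(self) -> float:
        """Share of the allocation still unused, clamped to [0, 1]."""
        if self.allocated_tokens <= 0:
            return 0.0
        used_fraction = self.used_tokens / self.allocated_tokens
        return min(1.0, max(0.0, 1.0 - used_fraction))


def is_suspicious_done(budget: TaskBudget) -> bool:
    """True when a done claim arrives with more than 20% budget unused.

    This is only a flag. Per the issue, it has to be combined with gate
    results before any forced continuation is triggered.
    """
    return budget.remaining_fraction > SUSPICIOUS_REMAINING_FRACTION
```

For example, a task allocated 1,000 tokens that claims done after 500 would be flagged, while the same claim after 900 tokens would not.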

Budget Allocation:

  • Per-task budget based on estimated complexity
  • Track input tokens, output tokens, total cost
  • Compare against similar completed tasks
  • Learn optimal budget utilization over time
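
One way the allocation rules above might look in code is sketched below. Everything here is an assumption for illustration: the complexity labels, the default budgets, the per-1k-token prices passed to `cost`, the 25% headroom, and the choice of the median over similar completed tasks are not specified in this issue.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical fallback budgets, used when no similar task history exists.
DEFAULT_BUDGET_BY_COMPLEXITY = {"low": 20_000, "medium": 60_000, "high": 150_000}


@dataclass(frozen=True)
class Usage:
    """Input and output tokens consumed by one completed task."""

    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def cost(self, input_price_per_1k: float, output_price_per_1k: float) -> float:
        """Total cost given caller-supplied prices per 1,000 tokens."""
        return (
            self.input_tokens * input_price_per_1k
            + self.output_tokens * output_price_per_1k
        ) / 1000


def allocate_budget(
    complexity: str,
    history: list[tuple[str, Usage]],
    headroom: float = 1.25,
) -> int:
    """Budget for a new task from similar completed tasks.

    Uses the median total usage of past tasks with the same complexity
    label, plus headroom; falls back to a fixed default when there is no
    matching history.
    """
    similar = [usage.total_tokens for label, usage in history if label == complexity]
    if similar:
        return int(median(similar) * headroom)
    return DEFAULT_BUDGET_BY_COMPLEXITY[complexity]
```

The median is used here only because it is less sensitive to a single runaway task than the mean; "learning optimal utilization over time" could equally be done with a different estimator.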

Anti-Gaming Detection:

  • Agents must not be able to pass the check by wasting tokens just to reach the threshold
  • Must correlate token usage with actual work progress
  • Gate results are primary signal, budget is secondary
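
A possible way to correlate token usage with progress is to measure how many tokens were spent since progress last improved. The sketch below is an assumption-heavy illustration: `Checkpoint`, the use of passing-gate counts as the progress signal, and the 30% idle limit are all hypothetical and not defined in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """Snapshot of an agent session (hypothetical structure)."""

    used_tokens: int    # cumulative tokens consumed so far
    gates_passing: int  # assumed progress signal: number of gates passing


def tokens_without_progress(checkpoints: list[Checkpoint]) -> int:
    """Tokens spent since the passing-gate count last reached a new high."""
    if not checkpoints:
        return 0
    best = checkpoints[0].gates_passing
    last_progress_tokens = checkpoints[0].used_tokens
    for cp in checkpoints[1:]:
        if cp.gates_passing > best:
            best = cp.gates_passing
            last_progress_tokens = cp.used_tokens
    return checkpoints[-1].used_tokens - last_progress_tokens


def looks_like_padding(
    checkpoints: list[Checkpoint],
    allocated_tokens: int,
    max_idle_fraction: float = 0.30,
) -> bool:
    """True when a large share of the budget was spent with no new progress."""
    if allocated_tokens <= 0:
        return False
    idle = tokens_without_progress(checkpoints)
    return idle / allocated_tokens > max_idle_fraction
```

With this kind of check, an agent that burns tokens to get under the 20% remaining mark would still stand out, because its spend is not matched by any improvement in gate results.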

Integration with Orchestrator:

  • Orchestrator checks budget before accepting done
  • Suspicious pattern + gate failures = reject done
  • Budget exhausted + gates failing = alert user, request more budget
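
The orchestrator rules above can be read as a small decision function. The following is a sketch under stated assumptions, not the orchestrator's actual logic from #134: `Decision`, `review_done_claim`, and the `NEEDS_REVIEW` fallback for combinations this issue does not cover are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT_DONE = "accept_done"
    REJECT_DONE = "reject_done"                  # forced continuation
    REQUEST_MORE_BUDGET = "request_more_budget"  # alert the user
    NEEDS_REVIEW = "needs_review"                # case not covered by the issue


def review_done_claim(
    gates_pass: bool,
    work_complete: bool,
    used_tokens: int,
    allocated_tokens: int,
    suspicious_remaining: float = 0.20,
) -> Decision:
    """Decide what to do with a done claim; gates are the primary signal."""
    if allocated_tokens > 0:
        remaining = max(0.0, 1.0 - used_tokens / allocated_tokens)
    else:
        remaining = 0.0
    exhausted = used_tokens >= allocated_tokens

    # Budget exhausted + gates failing: alert user, request more budget.
    if exhausted and not gates_pass:
        return Decision.REQUEST_MORE_BUDGET
    # Early completion is allowed only if gates pass AND work is complete.
    if gates_pass and work_complete:
        return Decision.ACCEPT_DONE
    # Suspicious pattern (>20% budget left) + gate failures: reject done.
    if not gates_pass and remaining > suspicious_remaining:
        return Decision.REJECT_DONE
    # Anything else is unspecified here, so defer rather than guess.
    return Decision.NEEDS_REVIEW
```

Checking the gate result before the budget keeps legitimate early completion possible: a task that passes its gates and is demonstrably complete is accepted no matter how much budget is left.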

Related: L-015, #134 (orchestrator), #136 (gates), #137 (forced continuation)

Acceptance Criteria:

  • Token usage tracked per agent session
  • Budget utilization calculated
  • Suspicious patterns detected
  • Integration with orchestrator decision logic
  • Does not prevent legitimate early completion
  • Alerts when the budget is exhausted before work is complete
jason.woltje added the apiapip1 labels 2026-01-30 23:43:14 +00:00
jason.woltje added this to the M4-LLM (0.0.4) milestone 2026-01-30 23:45:34 +00:00

Reference: mosaic/stack#138