stack/apps/coordinator/docs/50-percent-rule-validation.md
Jason Woltje a1b911d836 test(#143): Validate 50% rule prevents context exhaustion
Following TDD (Red-Green-Refactor):
- RED: Created comprehensive test suite with 12 test cases
- GREEN: Implemented validation logic that passes all tests
- All quality gates passed

Test Coverage:
- Oversized issue (120K) correctly rejected
- Properly sized issue (80K) correctly accepted
- Edge case at exactly 50% (100K) correctly accepted
- Sequential issues validated individually
- All agent types tested (opus, sonnet, haiku, glm, minimax)
- Edge cases covered (zero, very small, boundaries)

Implementation:
- src/validation.py: Pure validation function
- tests/test_fifty_percent_rule.py: 12 comprehensive tests
- docs/50-percent-rule-validation.md: Validation report
- 100% test coverage (14/14 statements)
- Type checking: PASS (mypy)
- Linting: PASS (ruff)

The 50% rule ensures no single issue exceeds 50% of target
agent's context limit, preventing context exhaustion while
allowing efficient capacity utilization.

Fixes #143

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-01 17:56:04 -06:00


50% Rule Validation Report

Overview

This document validates the effectiveness of the 50% rule in preventing agent context exhaustion.

Date: 2026-02-01
Issue: #143 [COORD-003]
Status: VALIDATED

The 50% Rule

Rule: No single issue assignment may exceed 50% of the target agent's context limit.

Rationale: Capping each issue at 50% of the context limit:

  • Leaves room for conversation history and tool use
  • Keeps a buffer before hard context limits are reached
  • Prevents a single issue from monopolizing agent capacity
  • Allows multiple issues to be processed without exhaustion

Agent Context Limits

| Agent | Total Limit (tokens) | 50% Threshold (tokens) | Use Case |
|---|---|---|---|
| opus | 200,000 | 100,000 | High complexity tasks |
| sonnet | 200,000 | 100,000 | Medium complexity |
| haiku | 200,000 | 100,000 | Low complexity |
| glm | 128,000 | 64,000 | Self-hosted medium |
| minimax | 128,000 | 64,000 | Self-hosted low |
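The thresholds in the table follow from halving each agent's limit. A minimal sketch of that arithmetic, assuming the limits are kept in a plain dictionary; the names `AGENT_CONTEXT_LIMITS` and `fifty_percent_threshold` are hypothetical and not taken from src/validation.py:

```python
# Context limits in tokens, copied from the table above.
# The dictionary and function names are illustrative assumptions.
AGENT_CONTEXT_LIMITS = {
    "opus": 200_000,
    "sonnet": 200_000,
    "haiku": 200_000,
    "glm": 128_000,
    "minimax": 128_000,
}


def fifty_percent_threshold(agent: str) -> int:
    """Return the largest issue estimate the 50% rule allows for an agent."""
    return AGENT_CONTEXT_LIMITS[agent] // 2


for agent in AGENT_CONTEXT_LIMITS:
    print(f"{agent}: {fifty_percent_threshold(agent):,}")
```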

Test Scenarios

1. Oversized Issue (REJECTED)

Scenario: Issue with 120K token estimate assigned to sonnet (200K limit)

Expected: Rejected (60% exceeds 50% threshold)

Result: PASS

Issue context estimate (120000 tokens) exceeds 50% rule for sonnet agent.
Maximum allowed: 100000 tokens (50% of 200000 context limit).

2. Properly Sized Issue (ACCEPTED)

Scenario: Issue with 80K token estimate assigned to sonnet

Expected: Accepted (40% is below 50% threshold)

Result: PASS - Issue accepted without warnings

3. Edge Case - Exactly 50% (ACCEPTED)

Scenario: Issue with exactly 100K token estimate for sonnet

Expected: Accepted (exactly at threshold, not exceeding)

Result: PASS - Issue accepted at boundary condition

4. Sequential Issues Without Exhaustion

Scenario: Three sequential 60K token issues for sonnet (30% each)

Expected: All accepted individually (50% rule checks individual issues, not cumulative)

Result: PASS - All three issues accepted

Note: Cumulative context tracking will be handled by runtime monitoring (COORD-002), not assignment validation.
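The gap between the per-issue check and cumulative usage can be shown with the numbers from this scenario. The sketch below assumes, purely for illustration, that one session handles all three issues with no compaction in between; the variable names are hypothetical:

```python
SONNET_LIMIT = 200_000          # tokens, from the limits table
THRESHOLD = SONNET_LIMIT // 2   # 50% rule threshold

estimates = [60_000, 60_000, 60_000]

# Assignment-time check: each issue is compared against the threshold on its own.
per_issue_ok = [estimate <= THRESHOLD for estimate in estimates]

# What the rule does not look at: the running total across the three issues.
cumulative = sum(estimates)
cumulative_share = cumulative / SONNET_LIMIT

print(per_issue_ok)
print(f"{cumulative:,} tokens = {cumulative_share:.0%} of the limit")
```

Each estimate passes individually, while the three together total 180,000 tokens, or 90% of the limit. This is the case runtime monitoring is meant to catch.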

Implementation Details

Module: src/validation.py
Function: validate_fifty_percent_rule(metadata: IssueMetadata) -> ValidationResult

Test Coverage: 100% (14/14 statements)
Test Count: 12 comprehensive test cases
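For orientation, a pure function with this signature could look roughly like the sketch below. Only the function name, the two type names, and the rejection message come from this report. The dataclass fields (`estimated_context`, `assigned_agent`, `valid`, `reason`) and the `AGENT_CONTEXT_LIMITS` constant are assumptions, so the real module may differ.

```python
from dataclasses import dataclass

# Limits from the report's table; the constant name is an assumption.
AGENT_CONTEXT_LIMITS = {
    "opus": 200_000,
    "sonnet": 200_000,
    "haiku": 200_000,
    "glm": 128_000,
    "minimax": 128_000,
}


@dataclass(frozen=True)
class IssueMetadata:
    # Hypothetical fields; the report only names the type.
    estimated_context: int
    assigned_agent: str


@dataclass(frozen=True)
class ValidationResult:
    # Hypothetical fields; the report only names the type.
    valid: bool
    reason: str = ""


def validate_fifty_percent_rule(metadata: IssueMetadata) -> ValidationResult:
    """Reject an issue whose estimate exceeds half the agent's context limit."""
    # An unknown agent raises KeyError in this sketch.
    limit = AGENT_CONTEXT_LIMITS[metadata.assigned_agent]
    max_allowed = limit // 2
    if metadata.estimated_context > max_allowed:
        return ValidationResult(
            valid=False,
            reason=(
                f"Issue context estimate ({metadata.estimated_context} tokens) "
                f"exceeds 50% rule for {metadata.assigned_agent} agent. "
                f"Maximum allowed: {max_allowed} tokens "
                f"(50% of {limit} context limit)."
            ),
        )
    return ValidationResult(valid=True)


result = validate_fifty_percent_rule(IssueMetadata(120_000, "sonnet"))
print(result.valid)
print(result.reason)
```

Keeping the function free of I/O and shared state is what makes full statement coverage with a small test suite practical.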

Edge Cases Validated

  1. Zero context estimate (accepted)
  2. Very small issues, under 1% of the context limit (accepted)
  3. Exactly at 50% threshold (accepted)
  4. Just over 50% threshold (rejected)
  5. All agent types (opus, sonnet, haiku, glm, minimax)
  6. Different context limits (200K vs 128K)
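The boundary behavior in the list above can be expressed as a small table-driven check. This is an illustrative sketch, not the project's test file: the helper `within_fifty_percent` is hypothetical, and the "just over" values (one token above the threshold) are an assumption about what the real tests use. Comparing `2 * estimate` with the limit keeps the check in integer arithmetic, so the exact-50% case involves no floating-point rounding.

```python
def within_fifty_percent(estimate: int, limit: int) -> bool:
    """True when estimate is at most half of limit, using integers only."""
    return 2 * estimate <= limit


# (estimate, limit, expected) rows mirroring the edge cases listed above.
CASES = [
    (0, 200_000, True),         # zero estimate
    (1_000, 200_000, True),     # under 1% of the limit
    (100_000, 200_000, True),   # exactly 50%
    (100_001, 200_000, False),  # one token over 50%
    (64_000, 128_000, True),    # exactly 50% of the smaller limit
    (64_001, 128_000, False),   # one token over on the smaller limit
]

for estimate, limit, expected in CASES:
    assert within_fifty_percent(estimate, limit) is expected
print("all edge cases behave as described")
```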

Effectiveness Analysis

Prevention Capability

The 50% rule successfully prevents:

  • Single issues consuming > 50% of agent capacity
  • Context exhaustion from oversized assignments
  • Agent deadlock from insufficient working memory

What It Allows

The rule permits:

  • Multiple medium-sized issues to be processed
  • Efficient use of agent capacity (up to 50% per issue)
  • Buffer space for conversation history and tool outputs
  • Clear, predictable validation at assignment time

Limitations

The 50% rule does NOT prevent:

  • Cumulative context growth over multiple issues (requires runtime monitoring)
  • Context bloat from tool outputs or conversation (requires compaction)
  • Issues that grow beyond estimate during execution (requires monitoring)

These are addressed by complementary systems:

  • Runtime monitoring (#155) - Tracks actual context usage
  • Context compaction - Triggered at 80% threshold
  • Session rotation - Triggered at 95% threshold
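How the two runtime thresholds might relate to each other is sketched below. The 80% and 95% figures come from the list above. The function name, the returned labels, and the choice to treat both thresholds as inclusive are assumptions, and the actual runtime monitor may work differently.

```python
def runtime_action(used_tokens: int, context_limit: int) -> str:
    """Map actual context usage to the complementary mechanism that would fire.

    Integer arithmetic avoids rounding surprises at the exact thresholds.
    """
    if used_tokens * 100 >= 95 * context_limit:
        return "rotate_session"   # 95% threshold
    if used_tokens * 100 >= 80 * context_limit:
        return "compact_context"  # 80% threshold
    return "continue"


for used in (150_000, 160_000, 190_000):
    print(used, runtime_action(used, 200_000))
```

Checking the higher threshold first ensures that a session at or beyond 95% is rotated rather than merely compacted.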

Validation Metrics

| Metric | Target | Actual | Status |
|---|---|---|---|
| Test coverage | ≥85% | 100% | PASS |
| Test scenarios | 4 | 12 | PASS |
| Edge cases tested | - | 6 | PASS |
| Type safety | Pass | Pass | PASS |
| Linting | Pass | Pass | PASS |

Recommendations

  1. Implemented: Agent-specific limits (200K vs 128K)
  2. Implemented: Clear rejection messages with context
  3. Implemented: Validation at assignment time
  4. 🔄 Future: Integrate with issue assignment workflow
  5. 🔄 Future: Add telemetry for validation rejection rates
  6. 🔄 Future: Consider dynamic threshold adjustment based on historical context growth

Conclusion

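The 50% rule validation is EFFECTIVE at rejecting oversized issue assignments and preventing the context exhaustion a single oversized issue would cause. All test scenarios pass, edge cases are handled correctly, and the implementation achieves 100% test coverage. Cumulative context growth remains the responsibility of runtime monitoring.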

Status: Ready for integration into coordinator workflow