Following TDD (Red-Green-Refactor):
- RED: Created comprehensive test suite with 12 test cases
- GREEN: Implemented validation logic that passes all tests
- All quality gates passed

Test Coverage:
- Oversized issue (120K) correctly rejected
- Properly sized issue (80K) correctly accepted
- Edge case at exactly 50% (100K) correctly accepted
- Sequential issues validated individually
- All agent types tested (opus, sonnet, haiku, glm, minimax)
- Edge cases covered (zero, very small, boundaries)

Implementation:
- src/validation.py: Pure validation function
- tests/test_fifty_percent_rule.py: 12 comprehensive tests
- docs/50-percent-rule-validation.md: Validation report
- 100% test coverage (14/14 statements)
- Type checking: PASS (mypy)
- Linting: PASS (ruff)

The 50% rule ensures no single issue exceeds 50% of target agent's context limit, preventing context exhaustion while allowing efficient capacity utilization.

Fixes #143

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
50% Rule Validation Report
Overview
This document validates the effectiveness of the 50% rule in preventing agent context exhaustion.
Date: 2026-02-01
Issue: #143 [COORD-003]
Status: ✅ VALIDATED
The 50% Rule
Rule: No single issue assignment may exceed 50% of the target agent's context limit.
Rationale: Capping each assignment at half of the context limit:
- Leaves room for conversation history and tool use
- Keeps a buffer before hard context limits are reached
- Prevents a single issue from monopolizing agent capacity
- Allows multiple issues to be processed without any one of them exhausting the context
Agent Context Limits
| Agent | Total Limit | 50% Threshold | Use Case |
|---|---|---|---|
| opus | 200,000 | 100,000 | High complexity tasks |
| sonnet | 200,000 | 100,000 | Medium complexity |
| haiku | 200,000 | 100,000 | Low complexity |
| glm | 128,000 | 64,000 | Self-hosted medium |
| minimax | 128,000 | 64,000 | Self-hosted low |
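The threshold column is simply half of each limit. A minimal sketch of that arithmetic, assuming a hypothetical `CONTEXT_LIMITS` mapping and `fifty_percent_threshold` helper (neither name comes from `src/validation.py`, which may store the limits differently):

```python
# Hypothetical mapping; the values are copied from the table above.
CONTEXT_LIMITS: dict[str, int] = {
    "opus": 200_000,
    "sonnet": 200_000,
    "haiku": 200_000,
    "glm": 128_000,
    "minimax": 128_000,
}


def fifty_percent_threshold(agent: str) -> int:
    """Return the largest token estimate the 50% rule allows for an agent."""
    # Integer division keeps the threshold a whole number of tokens.
    return CONTEXT_LIMITS[agent] // 2


for agent in CONTEXT_LIMITS:
    print(f"{agent}: {fifty_percent_threshold(agent):,}")
```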
Test Scenarios
1. Oversized Issue (REJECTED) ✅
Scenario: Issue with 120K token estimate assigned to sonnet (200K limit)
Expected: Rejected (60% exceeds 50% threshold)
Result: ✅ PASS - Issue rejected with the following message:
Issue context estimate (120000 tokens) exceeds 50% rule for sonnet agent.
Maximum allowed: 100000 tokens (50% of 200000 context limit).
2. Properly Sized Issue (ACCEPTED) ✅
Scenario: Issue with 80K token estimate assigned to sonnet
Expected: Accepted (40% is below 50% threshold)
Result: ✅ PASS - Issue accepted without warnings
3. Edge Case - Exactly 50% (ACCEPTED) ✅
Scenario: Issue with exactly 100K token estimate for sonnet
Expected: Accepted (exactly at threshold, not exceeding)
Result: ✅ PASS - Issue accepted at boundary condition
4. Sequential Issues Without Exhaustion ✅
Scenario: Three sequential 60K token issues for sonnet (30% each)
Expected: All accepted individually (50% rule checks individual issues, not cumulative)
Result: ✅ PASS - All three issues accepted
Note: Cumulative context tracking will be handled by runtime monitoring (COORD-002), not assignment validation.
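The gap between per-issue validation and cumulative usage can be made concrete with the numbers from this scenario. The sketch below is illustrative only: it assumes all three issues land in one session with no compaction in between, which the report does not state.

```python
SONNET_LIMIT = 200_000  # from the Agent Context Limits table
estimates = [60_000, 60_000, 60_000]

# The 50% rule looks at each issue in isolation.
per_issue_ok = [estimate <= SONNET_LIMIT // 2 for estimate in estimates]

# The cumulative share is NOT checked at assignment time.
cumulative_share = sum(estimates) / SONNET_LIMIT

print(per_issue_ok)                 # [True, True, True]
print(f"{cumulative_share:.0%}")    # 90%
```

Each issue passes on its own, yet together the estimates add up to 90% of the limit under that assumption, which is why cumulative tracking is left to runtime monitoring.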
Implementation Details
Module: src/validation.py
Function: validate_fifty_percent_rule(metadata: IssueMetadata) -> ValidationResult
Test Coverage: 100% (14/14 statements)
Test Count: 12 comprehensive test cases
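For orientation, here is one way a function with this signature could look. It is a sketch under stated assumptions, not the contents of `src/validation.py`: the fields on `IssueMetadata` and `ValidationResult` and the `CONTEXT_LIMITS` mapping are hypothetical, and the message wording is modeled on the rejection message shown in scenario 1.

```python
from dataclasses import dataclass

# Hypothetical mapping; values copied from the Agent Context Limits table.
CONTEXT_LIMITS: dict[str, int] = {
    "opus": 200_000,
    "sonnet": 200_000,
    "haiku": 200_000,
    "glm": 128_000,
    "minimax": 128_000,
}


@dataclass(frozen=True)
class IssueMetadata:
    # Field names are assumptions made for this sketch.
    estimated_context: int
    assigned_agent: str


@dataclass(frozen=True)
class ValidationResult:
    # Field names are assumptions made for this sketch.
    valid: bool
    reason: str = ""


def validate_fifty_percent_rule(metadata: IssueMetadata) -> ValidationResult:
    """Reject an issue whose estimate exceeds half of the agent's context limit."""
    limit = CONTEXT_LIMITS[metadata.assigned_agent]
    max_allowed = limit // 2
    # Strict comparison: an estimate exactly at the threshold is accepted.
    if metadata.estimated_context > max_allowed:
        return ValidationResult(
            valid=False,
            reason=(
                f"Issue context estimate ({metadata.estimated_context} tokens) "
                f"exceeds 50% rule for {metadata.assigned_agent} agent. "
                f"Maximum allowed: {max_allowed} tokens "
                f"(50% of {limit} context limit)."
            ),
        )
    return ValidationResult(valid=True)


result = validate_fifty_percent_rule(IssueMetadata(120_000, "sonnet"))
print(result.valid)
print(result.reason)
```

The function has no side effects and reads only its argument and a constant, which matches the "pure validation function" description in the commit message. In this sketch an unknown agent name raises `KeyError`; how the real module handles that case is not covered by the report.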
Edge Cases Validated
- ✅ Zero context estimate (accepted)
- ✅ Very small issues < 1% (accepted)
- ✅ Exactly at 50% threshold (accepted)
- ✅ Just over 50% threshold (rejected)
- ✅ All agent types (opus, sonnet, haiku, glm, minimax)
- ✅ Different context limits (200K vs 128K)
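The edge cases listed above lend themselves to a table-driven check. The sketch below uses a hypothetical `is_within_rule` helper and plain assertions rather than the actual tests in `tests/test_fifty_percent_rule.py`, whose contents are not shown in this report.

```python
def is_within_rule(estimate: int, limit: int) -> bool:
    """Return True when the estimate is at most half of the limit."""
    # Comparing 2 * estimate against the limit avoids any rounding.
    return 2 * estimate <= limit


# (estimate, context limit, expected outcome)
CASES = [
    (0, 200_000, True),         # zero context estimate
    (1_000, 200_000, True),     # very small issue (0.5%)
    (100_000, 200_000, True),   # exactly at the 50% threshold
    (100_001, 200_000, False),  # just over the threshold
    (64_000, 128_000, True),    # boundary for the 128K agents
    (64_001, 128_000, False),   # just over for the 128K agents
]

for estimate, limit, expected in CASES:
    assert is_within_rule(estimate, limit) is expected, (estimate, limit)
```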
Effectiveness Analysis
Prevention Capability
The 50% rule successfully prevents:
- ❌ Single issues consuming > 50% of agent capacity
- ❌ Context exhaustion from oversized assignments
- ❌ Agent deadlock from insufficient working memory
What It Allows
The rule permits:
- ✅ Multiple medium-sized issues to be processed
- ✅ Efficient use of agent capacity (up to 50% per issue)
- ✅ Buffer space for conversation history and tool outputs
- ✅ Clear, predictable validation at assignment time
Limitations
The 50% rule does NOT prevent:
- Cumulative context growth over multiple issues (requires runtime monitoring)
- Context bloat from tool outputs or conversation (requires compaction)
- Issues that grow beyond estimate during execution (requires monitoring)
These are addressed by complementary systems:
- Runtime monitoring (#155) - Tracks actual context usage
- Context compaction - Triggered at 80% threshold
- Session rotation - Triggered at 95% threshold
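How the 80% and 95% thresholds might translate into a runtime decision can be sketched as follows. Everything here is hypothetical: the `Action` enum and `runtime_action` function are illustrative names, and treating the thresholds as inclusive is an assumption, since the report only gives the percentages.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    COMPACT = "compact"
    ROTATE = "rotate"


# Percentages taken from the list above.
COMPACTION_PERCENT = 80
ROTATION_PERCENT = 95


def runtime_action(used_tokens: int, context_limit: int) -> Action:
    """Pick a runtime response based on actual context usage."""
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    # Assumption: both thresholds are inclusive.
    if used_tokens * 100 >= ROTATION_PERCENT * context_limit:
        return Action.ROTATE
    if used_tokens * 100 >= COMPACTION_PERCENT * context_limit:
        return Action.COMPACT
    return Action.CONTINUE


print(runtime_action(100_000, 200_000))  # Action.CONTINUE
print(runtime_action(180_000, 200_000))  # Action.COMPACT
print(runtime_action(195_000, 200_000))  # Action.ROTATE
```

The rotation check comes first so that usage above 95% is never classified as merely needing compaction.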
Validation Metrics
| Metric | Target | Actual | Status |
|---|---|---|---|
| Test coverage | ≥85% | 100% | ✅ PASS |
| Test scenarios | 4 | 12 | ✅ PASS |
| Edge cases tested | - | 6 | ✅ PASS |
| Type safety | Pass | Pass | ✅ PASS |
| Linting | Pass | Pass | ✅ PASS |
Recommendations
- ✅ Implemented: Agent-specific limits (200K vs 128K)
- ✅ Implemented: Clear rejection messages with context
- ✅ Implemented: Validation at assignment time
- 🔄 Future: Integrate with issue assignment workflow
- 🔄 Future: Add telemetry for validation rejection rates
- 🔄 Future: Consider dynamic threshold adjustment based on historical context growth
Conclusion
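The 50% rule validation is EFFECTIVE at rejecting oversized issue assignments before they can exhaust an agent's context. All four test scenarios pass, edge cases are handled correctly, and the implementation achieves 100% test coverage. Cumulative context growth remains the responsibility of runtime monitoring, compaction, and session rotation.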
Status: ✅ Ready for integration into coordinator workflow