# Context Estimator

Formula-based context estimation for predicting token usage before issue assignment.

## Overview

The context estimator predicts token requirements for issues based on:

- **Files to modify** - Number of files expected to change
- **Implementation complexity** - Complexity of the implementation
- **Test requirements** - Level of testing needed
- **Documentation** - Documentation requirements

It applies a 30% safety buffer to account for iteration, debugging, and unexpected complexity.

## Formula

```
base = (files × 7000) + complexity + tests + docs
total = base × 1.3   (30% safety buffer)
```

### Component Allocations

**Complexity Levels:**

- `LOW` = 10,000 tokens (simple, straightforward)
- `MEDIUM` = 20,000 tokens (moderate complexity, some edge cases)
- `HIGH` = 30,000 tokens (complex logic, many edge cases)

**Test Levels:**

- `LOW` = 5,000 tokens (basic unit tests)
- `MEDIUM` = 10,000 tokens (unit + integration tests)
- `HIGH` = 15,000 tokens (unit + integration + E2E tests)

**Documentation Levels:**

- `NONE` = 0 tokens (no documentation needed)
- `LIGHT` = 2,000 tokens (inline comments, basic docstrings)
- `MEDIUM` = 3,000 tokens (API docs, usage examples)
- `HEAVY` = 5,000 tokens (comprehensive docs, guides)

**Files Context:**

- Each file = 7,000 tokens (for reading and understanding)

**Safety Buffer:**

- 30% buffer (1.3x multiplier) for iteration and debugging

## Agent Recommendations

Based on total estimated tokens:

- **haiku** - < 30K tokens (fast, efficient for small tasks)
- **sonnet** - 30K-80K tokens (balanced for medium tasks)
- **opus** - > 80K tokens (powerful for complex tasks)

## Usage

### Quick Estimation (Convenience Function)

```python
from issue_estimator import estimate_issue

# Simple task
result = estimate_issue(
    files=1,
    complexity="low",
    tests="low",
    docs="none"
)

print(f"Estimated tokens: {result.total_estimate:,}")
print(f"Recommended agent: {result.recommended_agent}")

# Output:
# Estimated tokens: 28,600
# Recommended agent: haiku
```

### Detailed Estimation (Class-based)

```python
from issue_estimator import ContextEstimator, EstimationInput
from models import ComplexityLevel, TestLevel, DocLevel

estimator = ContextEstimator()

input_data = EstimationInput(
    files_to_modify=2,
    implementation_complexity=ComplexityLevel.MEDIUM,
    test_requirements=TestLevel.MEDIUM,
    documentation=DocLevel.LIGHT
)

result = estimator.estimate(input_data)

print(f"Files context: {result.files_context:,} tokens")
print(f"Implementation: {result.implementation_tokens:,} tokens")
print(f"Tests: {result.test_tokens:,} tokens")
print(f"Docs: {result.doc_tokens:,} tokens")
print(f"Base estimate: {result.base_estimate:,} tokens")
print(f"Safety buffer: {result.buffer_tokens:,} tokens")
print(f"Total estimate: {result.total_estimate:,} tokens")
print(f"Recommended agent: {result.recommended_agent}")

# Output:
# Files context: 14,000 tokens
# Implementation: 20,000 tokens
# Tests: 10,000 tokens
# Docs: 2,000 tokens
# Base estimate: 46,000 tokens
# Safety buffer: 13,800 tokens
# Total estimate: 59,800 tokens
# Recommended agent: sonnet
```

### Validation Against Actual Usage

```python
from issue_estimator import ContextEstimator, EstimationInput
from models import ComplexityLevel, TestLevel, DocLevel

estimator = ContextEstimator()

input_data = EstimationInput(
    files_to_modify=2,
    implementation_complexity=ComplexityLevel.MEDIUM,
    test_requirements=TestLevel.MEDIUM,
    documentation=DocLevel.LIGHT
)

# Validate against actual token usage
validation = estimator.validate_against_actual(
    input_data,
    issue_number=154,
    actual_tokens=58000
)

print(f"Issue: #{validation.issue_number}")
print(f"Estimated: {validation.estimated_tokens:,} tokens")
print(f"Actual: {validation.actual_tokens:,} tokens")
print(f"Error: {validation.percentage_error:.2%}")
print(f"Within tolerance (±20%): {validation.within_tolerance}")

# Output:
# Issue: #154
# Estimated: 59,800 tokens
# Actual: 58,000 tokens
# Error: 3.10%
# Within tolerance (±20%): True
```

### Serialization

Convert results to dictionaries for JSON serialization:

```python
import json

from issue_estimator import estimate_issue

result = estimate_issue(files=2, complexity="medium")
result_dict = result.to_dict()

print(json.dumps(result_dict, indent=2))

# Output:
# {
#   "files_context": 14000,
#   "implementation_tokens": 20000,
#   "test_tokens": 10000,
#   "doc_tokens": 2000,
#   "base_estimate": 46000,
#   "buffer_tokens": 13800,
#   "total_estimate": 59800,
#   "recommended_agent": "sonnet"
# }
```

## Examples

### Example 1: Quick Bug Fix

```python
result = estimate_issue(
    files=1,
    complexity="low",
    tests="low",
    docs="none"
)
# Total: 28,600 tokens → haiku
```

### Example 2: Feature Implementation

```python
result = estimate_issue(
    files=3,
    complexity="medium",
    tests="medium",
    docs="light"
)
# Total: 68,900 tokens → sonnet
```

### Example 3: Complex Integration

```python
result = estimate_issue(
    files=10,
    complexity="high",
    tests="high",
    docs="heavy"
)
# Total: 156,000 tokens → opus
```

### Example 4: Configuration Change

```python
result = estimate_issue(
    files=0,  # No code files, just config
    complexity="low",
    tests="low",
    docs="light"
)
# Total: 22,100 tokens → haiku
```

## Running Tests

```bash
# Install dependencies
python3 -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install pytest pytest-cov

# Run tests
pytest test_issue_estimator.py -v

# Run with coverage
pytest test_issue_estimator.py --cov=issue_estimator --cov=models --cov-report=term-missing

# Expected: 100% coverage (35 tests passing)
```

## Validation Results

The estimator has been validated against historical issues:

| Issue | Description         | Initial Estimate | Formula Result | Notes                                 |
| ----- | ------------------- | ---------------- | -------------- | ------------------------------------- |
| #156  | Create bot user     | 15,000           | 22,100         | Formula is more conservative (better) |
| #154  | Context estimator   | 46,800           | 59,800         | Accounts for iteration                |
| #141  | Integration testing | ~80,000          | 94,900         | Accounts for E2E complexity           |

The formula tends to be conservative (its results run higher than initial rough estimates), which is intentional: underestimating an issue is costlier than overestimating it.

## Integration with Coordinator

The coordinator uses the estimator to:

1. **Pre-estimate issues** - Calculate token requirements before assignment
2. **Select agents** - Recommend the appropriate agent (haiku/sonnet/opus)
3. **Plan resources** - Allocate token budgets
4. **Track accuracy** - Validate estimates against actual usage

### Coordinator Integration Example

```python
# In coordinator code
from issue_estimator import estimate_issue

# Parse issue metadata
issue_data = parse_issue_description(issue_number)

# Estimate tokens
result = estimate_issue(
    files=issue_data.get("files_to_modify", 1),
    complexity=issue_data.get("complexity", "medium"),
    tests=issue_data.get("tests", "medium"),
    docs=issue_data.get("docs", "light")
)

# Assign to appropriate agent
assign_to_agent(
    issue_number=issue_number,
    agent=result.recommended_agent,
    token_budget=result.total_estimate
)
```

## Design Decisions

### Why 7,000 tokens per file?

Based on empirical analysis:

- Average file: 200-400 lines
- With context (imports, related code): ~500-800 lines
- At ~10 tokens per line: 5,000-8,000 tokens
- 7,000 is used as a conservative middle ground

### Why a 30% safety buffer?

It accounts for:

- Iteration and refactoring (10-15%)
- Debugging and troubleshooting (5-10%)
- Unexpected edge cases (5-10%)
- Total: ~30%

### Why these complexity levels?

- **LOW (10K)** - Straightforward CRUD, simple logic
- **MEDIUM (20K)** - Business logic, state management, algorithms
- **HIGH (30K)** - Complex algorithms, distributed systems, optimization

### Why these test levels?

- **LOW (5K)** - Basic happy path tests
- **MEDIUM (10K)** - Happy + sad paths, edge cases
- **HIGH (15K)** - Comprehensive E2E, integration, performance

## API Reference

### Classes

#### `ContextEstimator`

Main estimator class.
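The formula and agent thresholds described above can be sketched as a small standalone function. This is a hypothetical reimplementation for illustration only; `estimate_tokens` and the table constants are not part of the actual API.

```python
# Hypothetical sketch of the calculation the estimator performs;
# the real ContextEstimator implementation may differ.
COMPLEXITY = {"low": 10_000, "medium": 20_000, "high": 30_000}
TESTS = {"low": 5_000, "medium": 10_000, "high": 15_000}
DOCS = {"none": 0, "light": 2_000, "medium": 3_000, "heavy": 5_000}
TOKENS_PER_FILE = 7_000


def estimate_tokens(files: int, complexity: str, tests: str, docs: str):
    """base = files * 7000 + complexity + tests + docs, then a 30% buffer."""
    base = files * TOKENS_PER_FILE + COMPLEXITY[complexity] + TESTS[tests] + DOCS[docs]
    total = base * 13 // 10  # 1.3x buffer, kept in integer arithmetic
    if total < 30_000:
        agent = "haiku"
    elif total <= 80_000:
        agent = "sonnet"
    else:
        agent = "opus"
    return total, agent


print(estimate_tokens(2, "medium", "medium", "light"))  # (59800, 'sonnet')
```

The integer `base * 13 // 10` form avoids floating-point rounding drift when applying the 1.3x buffer.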
**Methods:**

- `estimate(input_data: EstimationInput) -> EstimationResult` - Estimate tokens
- `validate_against_actual(input_data, issue_number, actual_tokens) -> ValidationResult` - Validate an estimate

#### `EstimationInput`

Input parameters for estimation.

**Fields:**

- `files_to_modify: int` - Number of files to modify
- `implementation_complexity: ComplexityLevel` - Complexity level
- `test_requirements: TestLevel` - Test level
- `documentation: DocLevel` - Documentation level

#### `EstimationResult`

Result of estimation.

**Fields:**

- `files_context: int` - Tokens for file context
- `implementation_tokens: int` - Tokens for implementation
- `test_tokens: int` - Tokens for tests
- `doc_tokens: int` - Tokens for documentation
- `base_estimate: int` - Sum before buffer
- `buffer_tokens: int` - Safety buffer tokens
- `total_estimate: int` - Final estimate with buffer
- `recommended_agent: str` - Recommended agent (haiku/sonnet/opus)

**Methods:**

- `to_dict() -> dict` - Convert to dictionary

#### `ValidationResult`

Result of validation against actual usage.

**Fields:**

- `issue_number: int` - Issue number
- `estimated_tokens: int` - Estimated tokens
- `actual_tokens: int` - Actual tokens used
- `percentage_error: float` - Error percentage
- `within_tolerance: bool` - Whether within ±20%
- `notes: str` - Optional notes

**Methods:**

- `to_dict() -> dict` - Convert to dictionary

### Enums

#### `ComplexityLevel`

Implementation complexity levels.

- `LOW = 10000`
- `MEDIUM = 20000`
- `HIGH = 30000`

#### `TestLevel`

Test requirement levels.

- `LOW = 5000`
- `MEDIUM = 10000`
- `HIGH = 15000`

#### `DocLevel`

Documentation requirement levels.

- `NONE = 0`
- `LIGHT = 2000`
- `MEDIUM = 3000`
- `HEAVY = 5000`

### Functions

#### `estimate_issue(files, complexity, tests, docs)`

Convenience function for quick estimation.
**Parameters:**

- `files: int` - Number of files to modify
- `complexity: str` - "low", "medium", or "high"
- `tests: str` - "low", "medium", or "high"
- `docs: str` - "none", "light", "medium", or "heavy"

**Returns:**

- `EstimationResult` - Estimation result

## Future Enhancements

Potential improvements for future versions:

1. **Machine learning calibration** - Learn from actual usage
2. **Language-specific multipliers** - Adjust for Python vs TypeScript
3. **Historical accuracy tracking** - Track estimator accuracy over time
4. **Confidence intervals** - Provide ranges instead of point estimates
5. **Workspace-specific tuning** - Allow per-workspace calibration

## Related Documentation

- [Coordinator Architecture](../../docs/3-architecture/non-ai-coordinator-comprehensive.md)
- [Issue #154 - Context Estimator](https://git.mosaicstack.dev/mosaic/stack/issues/154)
- [Coordinator Scripts README](README.md)

## Support

For issues or questions about the context estimator:

1. Check the examples in this document
2. Review the test cases in `test_issue_estimator.py`
3. Open an issue in the repository
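As a closing worked example, the ±20% tolerance check described under Validation Against Actual Usage reduces to a one-line calculation. The `percentage_error` helper below is hypothetical; the real `ValidationResult` may compute it differently.

```python
# Hypothetical sketch of the ±20% tolerance check; the actual
# validate_against_actual implementation may differ.
def percentage_error(estimated: int, actual: int) -> float:
    """Error of the estimate relative to actual token usage."""
    return abs(estimated - actual) / actual


err = percentage_error(estimated=59_800, actual=58_000)
within = err <= 0.20
print(f"{err:.2%}", within)  # 3.10% True
```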