Implement the main orchestration loop that coordinates all components:
- Queue processing with priority sorting (issues by number)
- Integration with ContextMonitor for tracking agent context usage
- Integration with QualityOrchestrator for running quality gates
- Integration with ForcedContinuationService for rejection prompts
- Metrics tracking (processed_count, success_count, rejection_count)
- Graceful start/stop with proper lifecycle management
- Error handling at all levels (spawn, context, quality, continuation)
The OrchestrationLoop flow:
1. Read issue queue (priority sorted by issue number)
2. Mark issue as in progress
3. Spawn agent (stub implementation for Phase 0)
4. Check context usage via ContextMonitor
5. Run quality gates via QualityOrchestrator
6. On approval: mark complete, increment success count
7. On rejection: generate continuation prompt, increment rejection count
99% test coverage for coordinator.py (183 statements, 2 missed).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive test suite for OrchestrationLoop class that integrates:
- Queue processing with priority sorting
- Agent assignment (50% rule)
- Quality gate verification on completion claims
- Rejection handling with forced continuation prompts
- Context monitoring during agent execution
- Lifecycle management (start/stop)
- Error handling for all edge cases
- Metrics tracking (processed, success, rejection counts)
33 new tests covering all acceptance criteria.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixed code review findings:
- Removed unused imports (AsyncMock, MagicMock)
- Fixed line length violation in test_forced_continuation.py
All 15 tests still passing after fixes.
Implement comprehensive test suite for four core quality gates:
- BuildGate: Tests mypy type checking enforcement
- LintGate: Tests ruff linting with warnings as failures
- TestGate: Tests pytest execution requiring 100% pass rate
- CoverageGate: Tests coverage enforcement with 85% minimum
All tests follow TDD methodology - written before implementation.
Total: 36 tests covering success, failure, and edge cases.
Related to #147
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add comprehensive cost optimization test scenarios and validation report.
Test Scenarios Added (10 new tests):
- Low difficulty assigns to MiniMax/GLM (free agents)
- Medium difficulty assigns to GLM when within capacity
- High difficulty assigns to Opus (only capable agent)
- Oversized issues rejected with actionable error
- Boundary conditions at capacity limits
- Aggregate cost optimization across all scenarios
Results:
- All 33 tests passing (23 existing + 10 new)
- 100% coverage of agent_assignment.py (36/36 statements)
- Cost savings validation: 50%+ in aggregate scenarios
- Real-world projection: 70%+ savings with typical workload
Documentation:
- Created cost-optimization-validation.md with detailed analysis
- Documents cost savings for each scenario
- Validates all acceptance criteria from COORD-006
Completes Phase 2 (M4.1-Coordinator) testing requirements.
Fixes#146
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements the Coordinator class with main orchestration loop:
- Async loop architecture with configurable poll interval
- process_queue() method gets next ready issue and spawns agent (stub)
- Graceful shutdown handling with stop() method
- Error handling that allows loop to continue after failures
- Logging for all actions (start, stop, processing, errors)
- Integration with QueueManager from #159
- Active agent tracking for future agent management
Configuration settings added:
- COORDINATOR_POLL_INTERVAL (default: 5.0s)
- COORDINATOR_MAX_CONCURRENT_AGENTS (default: 10)
- COORDINATOR_ENABLED (default: true)
Tests: 27 new tests covering all acceptance criteria
Coverage: 92% overall (100% for coordinator.py)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Capability enum (HIGH, MEDIUM, LOW) for agent difficulty levels
- Add AgentName enum for all 5 agents (opus, sonnet, haiku, glm, minimax)
- Implement AgentProfile data structure with validation
- context_limit: max tokens for context window
- cost_per_mtok: cost per million tokens (0 for self-hosted)
- capabilities: list of difficulty levels the agent handles
- best_for: description of optimal use cases
- Define profiles for all 5 agents with specifications:
- Anthropic models (opus, sonnet, haiku): 200K context, various costs
- Self-hosted models (glm, minimax): 128K context, free
- Implement get_agent_profile() function for profile lookup
- Add comprehensive test suite (37 tests, 100% coverage)
- Profile data structure validation
- All 5 predefined profiles exist and are correct
- Capability enum and AgentName enum tests
- Best_for validation and capability matching
- Consistency checks across profiles
Fixes#144
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements ContextMonitor class with real-time token usage tracking:
- COMPACT_THRESHOLD at 0.80 (80% triggers compaction)
- ROTATE_THRESHOLD at 0.95 (95% triggers rotation)
- Poll Claude API for context usage
- Return appropriate ContextAction based on thresholds
- Background monitoring loop (10-second polling)
- Log usage over time
- Error handling and recovery
Added ContextUsage model for tracking agent token consumption.
Tests:
- 25 test cases covering all functionality
- 100% coverage for context_monitor.py and models.py
- Mocked API responses for different usage levels
- Background monitoring and threshold detection
- Error handling verification
Quality gates:
- Type checking: PASS (mypy)
- Linting: PASS (ruff)
- Tests: PASS (25/25)
- Coverage: 100% for new files, 95.43% overall
Fixes#155
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>