stack

Author	SHA1	Message	Date
Jason Woltje	9f3c76d43b	test(#146): Validate assignment cost optimization Add comprehensive cost optimization test scenarios and validation report. Test Scenarios Added (10 new tests): - Low difficulty assigns to MiniMax/GLM (free agents) - Medium difficulty assigns to GLM when within capacity - High difficulty assigns to Opus (only capable agent) - Oversized issues rejected with actionable error - Boundary conditions at capacity limits - Aggregate cost optimization across all scenarios Results: - All 33 tests passing (23 existing + 10 new) - 100% coverage of agent_assignment.py (36/36 statements) - Cost savings validation: 50%+ in aggregate scenarios - Real-world projection: 70%+ savings with typical workload Documentation: - Created cost-optimization-validation.md with detailed analysis - Documents cost savings for each scenario - Validates all acceptance criteria from COORD-006 Completes Phase 2 (M4.1-Coordinator) testing requirements. Fixes #146 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 18:13:53 -06:00
Jason Woltje	10ecbd63f1	test(#161): Add comprehensive E2E integration test for coordinator Implements complete end-to-end integration test covering: - Webhook receiver → parser → queue → orchestrator flow - Signature validation in full flow - Dependency blocking and unblocking logic - Multi-issue processing with correct ordering - Error handling (malformed issues, agent failures) - Performance requirement (< 10 seconds) Test suite includes 7 test cases: 1. test_full_flow_webhook_to_orchestrator - Main critical path 2. test_full_flow_with_blocked_dependency - Dependency management 3. test_full_flow_with_multiple_issues - Queue ordering 4. test_webhook_signature_validation_in_flow - Security 5. test_parser_handles_malformed_issue_body - Error handling 6. test_orchestrator_handles_spawn_agent_failure - Resilience 7. test_performance_full_flow_under_10_seconds - Performance All tests pass (182 total including 7 new). Performance verified: Full flow completes in < 1 second. 100% of critical integration path covered. Completes #161 (COORD-005) and validates Phase 0. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 18:08:10 -06:00
Jason Woltje	9b1a1c0b8a	feat(#145): Build assignment algorithm Implement intelligent agent assignment algorithm that selects the optimal agent for each issue based on context capacity, difficulty, and cost. Algorithm: 1. Filter agents that meet context capacity (50% rule - agent needs 2x context) 2. Filter agents that can handle difficulty level 3. Sort by cost (prefer self-hosted when capable) 4. Return cheapest qualifying agent Features: - NoCapableAgentError raised when no agent can handle requirements - Difficulty mapping: easy/low->LOW, medium->MEDIUM, hard/high->HIGH - Self-hosted preference (GLM, minimax cost=0) - Comprehensive test coverage (100%, 23 tests) Test scenarios: - Assignment for low/medium/high difficulty issues - Context capacity filtering (50% rule enforcement) - Cost optimization logic (prefers self-hosted) - Error handling for impossible assignments - Edge cases (zero context, negative context, invalid difficulty) Quality gates: - All 23 tests passing - 100% code coverage (exceeds 85% requirement) - Lint: passing (ruff) - Type check: passing (mypy) Refs #145 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 18:07:58 -06:00
Jason Woltje	88953fc998	feat(#160): Implement basic orchestration loop Implements the Coordinator class with main orchestration loop: - Async loop architecture with configurable poll interval - process_queue() method gets next ready issue and spawns agent (stub) - Graceful shutdown handling with stop() method - Error handling that allows loop to continue after failures - Logging for all actions (start, stop, processing, errors) - Integration with QueueManager from #159 - Active agent tracking for future agent management Configuration settings added: - COORDINATOR_POLL_INTERVAL (default: 5.0s) - COORDINATOR_MAX_CONCURRENT_AGENTS (default: 10) - COORDINATOR_ENABLED (default: true) Tests: 27 new tests covering all acceptance criteria Coverage: 92% overall (100% for coordinator.py) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 18:03:12 -06:00
Jason Woltje	f0fd0bed41	feat(#144): Implement agent profiles - Add Capability enum (HIGH, MEDIUM, LOW) for agent difficulty levels - Add AgentName enum for all 5 agents (opus, sonnet, haiku, glm, minimax) - Implement AgentProfile data structure with validation - context_limit: max tokens for context window - cost_per_mtok: cost per million tokens (0 for self-hosted) - capabilities: list of difficulty levels the agent handles - best_for: description of optimal use cases - Define profiles for all 5 agents with specifications: - Anthropic models (opus, sonnet, haiku): 200K context, various costs - Self-hosted models (glm, minimax): 128K context, free - Implement get_agent_profile() function for profile lookup - Add comprehensive test suite (37 tests, 100% coverage) - Profile data structure validation - All 5 predefined profiles exist and are correct - Capability enum and AgentName enum tests - Best_for validation and capability matching - Consistency checks across profiles Fixes #144 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 18:00:19 -06:00
Jason Woltje	a1b911d836	test(#143): Validate 50% rule prevents context exhaustion Following TDD (Red-Green-Refactor): - RED: Created comprehensive test suite with 12 test cases - GREEN: Implemented validation logic that passes all tests - All quality gates passed Test Coverage: - Oversized issue (120K) correctly rejected - Properly sized issue (80K) correctly accepted - Edge case at exactly 50% (100K) correctly accepted - Sequential issues validated individually - All agent types tested (opus, sonnet, haiku, glm, minimax) - Edge cases covered (zero, very small, boundaries) Implementation: - src/validation.py: Pure validation function - tests/test_fifty_percent_rule.py: 12 comprehensive tests - docs/50-percent-rule-validation.md: Validation report - 100% test coverage (14/14 statements) - Type checking: PASS (mypy) - Linting: PASS (ruff) The 50% rule ensures no single issue exceeds 50% of target agent's context limit, preventing context exhaustion while allowing efficient capacity utilization. Fixes #143 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 17:56:04 -06:00
Jason Woltje	72321f5fcd	feat(#159): Implement queue manager Implements QueueManager with full dependency tracking, persistence, and status management. Key features: - QueueItem dataclass with status, metadata, and ready flag - QueueManager with enqueue, dequeue, get_next_ready, mark_complete - Dependency resolution (blocked_by → not ready) - JSON persistence with auto-save on state changes - Automatic reload on startup - Graceful handling of circular dependencies - Status transitions (pending → in_progress → completed) Test coverage: - 26 comprehensive tests covering all operations - Dependency chain resolution - Persistence and reload scenarios - Edge cases (circular deps, missing items) - 100% code coverage on queue module - 97% total project coverage Quality gates passed: ✓ All tests passing (88 total) ✓ Type checking (mypy) passing ✓ Linting (ruff) passing ✓ Coverage ≥85% (97% achieved) This unblocks #160 (orchestrator needs queue). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 17:55:48 -06:00
Jason Woltje	dad4b68f66	feat(#158): Implement issue parser agent Add AI-powered issue metadata parser using Anthropic Sonnet model. - Parse issue markdown to extract: estimated_context, difficulty, assigned_agent, blocks, blocked_by - Implement in-memory caching to avoid duplicate API calls - Graceful fallback to defaults on parse failures - Add comprehensive test suite (9 test cases) - 95% test coverage (exceeds 85% requirement) - Add ANTHROPIC_API_KEY to config - Update documentation and add .env.example Fixes #158 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 17:50:35 -06:00
Jason Woltje	d54c65360a	feat(#155): Build basic context monitor Implements ContextMonitor class with real-time token usage tracking: - COMPACT_THRESHOLD at 0.80 (80% triggers compaction) - ROTATE_THRESHOLD at 0.95 (95% triggers rotation) - Poll Claude API for context usage - Return appropriate ContextAction based on thresholds - Background monitoring loop (10-second polling) - Log usage over time - Error handling and recovery Added ContextUsage model for tracking agent token consumption. Tests: - 25 test cases covering all functionality - 100% coverage for context_monitor.py and models.py - Mocked API responses for different usage levels - Background monitoring and threshold detection - Error handling verification Quality gates: - Type checking: PASS (mypy) - Linting: PASS (ruff) - Tests: PASS (25/25) - Coverage: 100% for new files, 95.43% overall Fixes #155 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 17:49:09 -06:00
Jason Woltje	e23c09f1f2	feat(#157): Set up webhook receiver endpoint Implement FastAPI webhook receiver for Gitea issue assignment events with HMAC SHA256 signature verification and event routing. Implementation details: - FastAPI application with /webhook/gitea POST endpoint - HMAC SHA256 signature verification in security.py - Event routing for assigned, unassigned, closed actions - Comprehensive logging for all webhook events - Health check endpoint at /health - Docker containerization with health checks - 91% test coverage (exceeds 85% requirement) TDD workflow followed: - Wrote 16 tests first (RED phase) - Implemented features to pass tests (GREEN phase) - All tests passing with 91% coverage - Type checking with mypy: success - Linting with ruff: success Files created: - apps/coordinator/src/main.py - FastAPI application - apps/coordinator/src/webhook.py - Webhook handlers - apps/coordinator/src/security.py - HMAC verification - apps/coordinator/src/config.py - Configuration management - apps/coordinator/tests/ - Comprehensive test suite - apps/coordinator/Dockerfile - Production container - apps/coordinator/pyproject.toml - Python project config Configuration: - Updated .env.example with GITEA_WEBHOOK_SECRET - Updated docker-compose.yml with coordinator service Testing: - 16 unit and integration tests - Security tests for signature verification - Event handler tests for all supported actions - Health check endpoint tests - All tests passing with 91% coverage This unblocks issue #158 (issue parser). Fixes #157 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-01 17:41:46 -06:00
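
Note on e23c09f1f2: the HMAC SHA256 signature check it describes could look roughly like the sketch below. This is not the contents of security.py. The function name verify_signature and the assumption that the signature arrives as a hex digest of the raw request body are illustrative and should be checked against the Gitea webhook documentation.

```python
import hashlib
import hmac


def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Return True if `signature` is the hex HMAC-SHA256 of `payload`.

    Hypothetical helper: the name and the hex-digest format are assumptions,
    not taken from the commit itself.
    """
    expected = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The digest should be computed over the raw request bytes rather than re-serialized JSON, since any change in whitespace or key order would alter the digest.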

10 Commits
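
Note on 9b1a1c0b8a and a1b911d836: the four-step assignment algorithm and the 50% rule from those messages can be sketched as follows. This is a minimal illustration, not the code in agent_assignment.py. The context limits (200K for Anthropic models, 128K for self-hosted), the zero cost for glm and minimax, and the difficulty mapping come from the commit messages. The paid cost figures are placeholders that only encode a relative ordering. The capability sets are assumptions, apart from opus handling HIGH and glm handling MEDIUM. Raising ValueError on bad input is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Capability(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class AgentProfile:
    name: str
    context_limit: int       # max tokens in the context window
    cost_per_mtok: float     # 0 for self-hosted agents
    capabilities: frozenset  # difficulty levels the agent handles


class NoCapableAgentError(Exception):
    """No agent satisfies both the capacity and the difficulty requirement."""


# Placeholder costs (3 > 2 > 1 > 0) and assumed capability sets; see lead-in.
PROFILES = (
    AgentProfile("opus", 200_000, 3.0, frozenset(Capability)),
    AgentProfile("sonnet", 200_000, 2.0, frozenset({Capability.LOW, Capability.MEDIUM})),
    AgentProfile("haiku", 200_000, 1.0, frozenset({Capability.LOW})),
    AgentProfile("glm", 128_000, 0.0, frozenset({Capability.LOW, Capability.MEDIUM})),
    AgentProfile("minimax", 128_000, 0.0, frozenset({Capability.LOW})),
)

DIFFICULTY_MAP = {
    "easy": Capability.LOW, "low": Capability.LOW,
    "medium": Capability.MEDIUM,
    "hard": Capability.HIGH, "high": Capability.HIGH,
}


def assign_agent(estimated_context: int, difficulty: str) -> AgentProfile:
    if estimated_context < 0:
        raise ValueError("estimated_context must be non-negative")  # assumed behavior
    level = DIFFICULTY_MAP.get(difficulty.lower())
    if level is None:
        raise ValueError(f"unknown difficulty: {difficulty!r}")  # assumed behavior

    candidates = [
        p for p in PROFILES
        # Step 1: 50% rule, the agent needs at least 2x the issue's context.
        if estimated_context * 2 <= p.context_limit
        # Step 2: the agent must handle this difficulty level.
        and level in p.capabilities
    ]
    if not candidates:
        raise NoCapableAgentError(
            f"no agent can take {estimated_context} tokens at {level.name} difficulty"
        )
    # Steps 3-4: cheapest qualifying agent; zero-cost self-hosted agents win ties on price.
    return min(candidates, key=lambda p: p.cost_per_mtok)
```

With these placeholder profiles, an 80K high-difficulty issue goes to opus, a 50K medium issue goes to glm, and a 120K issue is rejected because no agent has a 240K context window. The 100K boundary case is accepted for the 200K agents, matching the "exactly 50%" test described in a1b911d836.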