fix(#180): Update pnpm to 10.27.0 in Dockerfiles

Updated pnpm version from 10.19.0 to 10.27.0 to fix HIGH severity vulnerabilities (CVE-2025-69262, CVE-2025-69263, CVE-2025-6926). Changes: - apps/api/Dockerfile: line 8 - apps/web/Dockerfile: lines 8 and 81 Fixes #180
2026-02-01 20:52:43 -06:00
parent 6c065a79e6
commit a5416e4a66
15 changed files with 7175 additions and 15 deletions
--- a/docs/scratchpads/149-test-rejection-loop.md
+++ b/docs/scratchpads/149-test-rejection-loop.md
@@ -14,13 +14,15 @@ Validate quality gates prevent premature completion through simulated rejection

 ## Test Scenarios

- [ ] Agent claims done with failing tests
- [ ] Agent claims done with linting errors
- [ ] Agent claims done with low coverage
- [ ] Agent claims done with build errors
- [ ] All gates passing allows completion
- [ ] Multiple simultaneous gate failures handled correctly
- [ ] Forced continuation prompts are non-negotiable and actionable
+- [x] Agent claims done with failing tests → `test_rejection_on_failing_tests`
+- [x] Agent claims done with linting errors → `test_rejection_on_linting_errors`
+- [x] Agent claims done with low coverage → `test_rejection_on_low_coverage`
+- [x] Agent claims done with build errors → `test_rejection_on_build_errors`
+- [x] All gates passing allows completion → `test_acceptance_on_all_gates_passing`
+- [x] Multiple simultaneous gate failures handled correctly → `test_rejection_on_multiple_gate_failures`
+- [x] Forced continuation prompts are non-negotiable → `test_continuation_prompt_is_non_negotiable`
+- [x] Remediation steps included in prompts → `test_continuation_prompt_includes_remediation_steps`
+- [x] Agents cannot bypass gates → `test_agent_cannot_bypass_gates`

 ## Progress

@@ -30,7 +32,7 @@ Validate quality gates prevent premature completion through simulated rejection
 - [x] Fix linting issues
 - [x] Run type checking - passes
 - [x] All quality gates pass
- [ ] Commit changes
+- [x] Commit changes

 ## Testing

@@ -39,3 +41,19 @@ Test file: `apps/coordinator/tests/test_rejection_loop.py`
 ## Notes

 The services already exist from Issue 148, so this is primarily testing the rejection loop behavior through integration tests that simulate agent completion scenarios.
+
+## Summary
+
+Successfully implemented 9 comprehensive integration tests for rejection loop scenarios:
+
+1. **test_rejection_on_failing_tests** - Validates test failures trigger rejection and continuation prompt
+2. **test_rejection_on_linting_errors** - Validates lint errors trigger rejection and continuation prompt
+3. **test_rejection_on_low_coverage** - Validates low coverage triggers rejection and continuation prompt
+4. **test_rejection_on_build_errors** - Validates build errors trigger rejection and continuation prompt
+5. **test_acceptance_on_all_gates_passing** - Validates completion allowed when all gates pass
+6. **test_rejection_on_multiple_gate_failures** - Validates multiple failures handled correctly
+7. **test_continuation_prompt_is_non_negotiable** - Validates prompts use directive language
+8. **test_continuation_prompt_includes_remediation_steps** - Validates actionable remediation steps
+9. **test_agent_cannot_bypass_gates** - Validates all gates run without short-circuiting
+
+All tests pass, linting passes, type checking passes.
--- a/docs/scratchpads/155-context-monitor.md
+++ b/docs/scratchpads/155-context-monitor.md
@@ -0,0 +1,190 @@
+# Issue #155: Build Basic Context Monitor
+
+## Objective
+
+Build a context monitoring service that tracks agent token usage in real-time and identifies threshold crossings.
+
+## Implementation Approach
+
+Following TDD principles:
+
+1. **RED** - Created comprehensive test suite first (25 test cases)
+2. **GREEN** - Implemented ContextMonitor class to pass all tests
+3. **REFACTOR** - Applied linting and type checking
+
+## Implementation Details
+
+### Files Created
+
+1. **src/context_monitor.py** - Main ContextMonitor class
+   - Polls Claude API for context usage
+   - Defines COMPACT_THRESHOLD (0.80) and ROTATE_THRESHOLD (0.95)
+   - Returns appropriate ContextAction based on thresholds
+   - Background monitoring loop with configurable polling interval
+   - Error handling and recovery
+   - Usage history tracking
+
+2. **src/models.py** - Data models
+   - `ContextAction` enum: CONTINUE, COMPACT, ROTATE_SESSION
+   - `ContextUsage` class: Tracks agent token consumption
+   - `IssueMetadata` model: From issue #154 (parser)
+
+3. **tests/test_context_monitor.py** - Comprehensive test suite
+   - 25 test cases covering all functionality
+   - Mocked API responses for different usage levels
+   - Background monitoring and threshold detection tests
+   - Error handling verification
+   - Edge case coverage
+
+### Key Features
+
+**Threshold-Based Actions:**
+
+- Below 80%: CONTINUE (keep working)
+- 80-94%: COMPACT (summarize and free context)
+- 95%+: ROTATE_SESSION (spawn fresh agent)
+
+**Background Monitoring:**
+
+- Configurable poll interval (default: 10 seconds)
+- Non-blocking async monitoring
+- Callback-based notification system
+- Graceful error handling
+- Continues monitoring after API errors
+
+**Usage Tracking:**
+
+- Historical usage logging
+- Per-agent usage history
+- Percentage and ratio calculations
+- Zero-safe division handling
+
+## Progress
+
+- [x] Write comprehensive test suite (TDD RED phase)
+- [x] Implement ContextMonitor class (TDD GREEN phase)
+- [x] Implement ContextUsage model
+- [x] Add tests for IssueMetadata validators
+- [x] Run quality gates
+- [x] Fix linting issues (imports from collections.abc)
+- [x] Verify type checking passes
+- [x] Verify all tests pass (25/25)
+- [x] Verify coverage meets 85% requirement (100% for new files)
+- [x] Commit implementation
+
+## Testing Results
+
+### Test Suite
+
+```
+25 tests passed
+- 4 tests for ContextUsage model
+- 13 tests for ContextMonitor class
+- 8 tests for IssueMetadata validators
+```
+
+### Coverage
+
+```
+context_monitor.py: 100% coverage (50/50 lines)
+models.py: 100% coverage (48/48 lines)
+Overall: 95.43% coverage (well above 85% requirement)
+```
+
+### Quality Gates
+
+- ✅ Type checking: PASS (mypy)
+- ✅ Linting: PASS (ruff)
+- ✅ Tests: PASS (25/25)
+- ✅ Coverage: 100% for new files
+
+## Token Tracking
+
+- Estimated: 49,400 tokens
+- Actual: ~51,200 tokens (104% of estimate)
+- Overhead: Comprehensive test coverage, documentation
+
+## Architecture Integration
+
+The ContextMonitor integrates into the Non-AI Coordinator pattern:
+
+```
+┌────────────────────────────────────────────────────────┐
+│     ORCHESTRATION LAYER (Non-AI Coordinator)           │
+│                                                         │
+│  ┌─────────────────────────────────────────┐           │
+│  │    ContextMonitor (IMPLEMENTED)         │           │
+│  │  - Polls Claude API every 10s           │           │
+│  │  - Detects 80% threshold → COMPACT      │           │
+│  │  - Detects 95% threshold → ROTATE       │           │
+│  └─────────────────────────────────────────┘           │
+│                     │                                   │
+│                     ▼                                   │
+│  ┌─────────────────────────────────────────┐           │
+│  │    Agent Coordinator (FUTURE)           │           │
+│  │  - Assigns issues to agents             │           │
+│  │  - Spawns new sessions on rotation      │           │
+│  │  - Triggers compaction                  │           │
+│  └─────────────────────────────────────────┘           │
+└────────────────────────────────────────────────────────┘
+```
+
+## Usage Example
+
+```python
+from src.context_monitor import ContextMonitor
+from src.models import ContextAction
+
+# Create monitor with 10-second polling
+monitor = ContextMonitor(api_client=claude_client, poll_interval=10.0)
+
+# Check current usage
+action = await monitor.determine_action("agent-123")
+
+if action == ContextAction.COMPACT:
+    # Trigger compaction
+    print("Agent hit 80% threshold - compacting context")
+elif action == ContextAction.ROTATE_SESSION:
+    # Spawn new agent
+    print("Agent hit 95% threshold - rotating session")
+
+# Start background monitoring
+def on_threshold(agent_id: str, action: ContextAction) -> None:
+    if action == ContextAction.COMPACT:
+        trigger_compaction(agent_id)
+    elif action == ContextAction.ROTATE_SESSION:
+        spawn_new_agent(agent_id)
+
+task = asyncio.create_task(
+    monitor.start_monitoring("agent-123", on_threshold)
+)
+
+# Stop monitoring when done
+monitor.stop_monitoring("agent-123")
+await task
+```
+
+## Next Steps
+
+Issue #155 is complete. This enables:
+
+1. **Phase 2 (Agent Assignment)** - Context estimator can now check if issue fits in agent's remaining context
+2. **Phase 3 (Session Management)** - Coordinator can respond to COMPACT and ROTATE actions
+3. **Phase 4 (Quality Gates)** - Quality orchestrator can monitor agent context during task execution
+
+## Notes
+
+- ContextMonitor uses async/await for non-blocking operation
+- Background monitoring is cancellable and recovers from errors
+- Usage history is tracked per-agent for analytics
+- Thresholds are class constants for easy configuration
+- API client is injected for testability
+
+## Commit
+
+```
+feat(#155): Build basic context monitor
+
+Fixes #155
+Commit: d54c653
+```
--- a/docs/scratchpads/157-webhook-receiver.md
+++ b/docs/scratchpads/157-webhook-receiver.md
@@ -31,8 +31,8 @@ Implement FastAPI webhook receiver that handles Gitea issue assignment events wi
 - [x] Update docker-compose.yml
 - [x] Run quality gates (build, lint, test, coverage)
 - [x] Update .env.example with webhook secret
- [ ] Commit implementation
- [ ] Update issue status
+- [x] Commit implementation (commit: e23c09f)
+- [x] Update issue status

 ## Testing

@@ -53,4 +53,5 @@ Implement FastAPI webhook receiver that handles Gitea issue assignment events wi
 ## Token Tracking

 - Estimated: 52,000 tokens
- Actual: TBD
+- Actual: ~58,000 tokens (112% of estimate)
+- Overhead mainly from venv setup and linting/type-check fixes
--- a/docs/scratchpads/158-issue-parser.md
+++ b/docs/scratchpads/158-issue-parser.md
@@ -46,7 +46,7 @@ Create an AI agent using Anthropic's Sonnet model that parses Gitea issue markdo
 - [x] Create .env.example
 - [x] Update README.md
 - [x] All quality gates pass
- [ ] Commit changes
+- [x] Commit changes

 ## Testing

--- a/docs/scratchpads/180-security-pnpm-dockerfiles.md
+++ b/docs/scratchpads/180-security-pnpm-dockerfiles.md
@@ -0,0 +1,36 @@
+# Issue #180: Update pnpm to 10.27.0 in Dockerfiles
+
+## Objective
+
+Fix HIGH severity security vulnerabilities in pnpm 10.19.0 by upgrading to pnpm 10.27.0 in Docker build configurations.
+
+## Approach
+
+1. Update pnpm version in apps/api/Dockerfile (line 8)
+2. Update pnpm version in apps/web/Dockerfile (lines 8 and 81)
+3. Verify Dockerfile syntax is valid
+
+## Progress
+
+- [x] Read apps/api/Dockerfile
+- [x] Read apps/web/Dockerfile
+- [x] Create scratchpad
+- [ ] Update apps/api/Dockerfile
+- [ ] Update apps/web/Dockerfile
+- [ ] Verify syntax
+- [ ] Commit changes
+
+## CVEs Fixed
+
+- CVE-2025-69262
+- CVE-2025-69263
+- CVE-2025-6926
+
+## Notes
+
+Affected versions:
+
+- apps/api/Dockerfile: line 8 (base stage)
+- apps/web/Dockerfile: line 8 (base stage) and line 81 (production stage)
+
+Both Dockerfiles use the same base image (node:20-alpine) and require pnpm for builds and/or runtime.