feat(#93): implement agent spawn via federation

Implements FED-010: Agent Spawn via Federation feature that enables
spawning and managing Claude agents on remote federated Mosaic Stack
instances via COMMAND message type.

Features:
- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)

Architecture:
- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService →
  CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification

Files created:
- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts

Files modified:
- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)

Testing:
- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed

Refs #93

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 14:37:06 -06:00
parent a8c8af21e5
commit 12abdfe81d
405 changed files with 13545 additions and 2153 deletions
--- a/docs/scratchpads/184-add-coordinator-auth.md
+++ b/docs/scratchpads/184-add-coordinator-auth.md
@@ -1,9 +1,11 @@
 # Issue #184: [BLOCKER] Add authentication to coordinator integration endpoints

 ## Objective
+
 Add authentication to coordinator integration endpoints to prevent unauthorized access. This is a critical security vulnerability that must be fixed before deployment.

 ## Approach
+
 1. Identify all coordinator integration endpoints without authentication
 2. Write security tests first (TDD - RED phase)
 3. Implement authentication mechanism (JWT/bearer token or API key)
@@ -11,6 +13,7 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
 5. Refactor if needed while maintaining test coverage

 ## Progress
+
 - [x] Create scratchpad
 - [x] Investigate coordinator endpoints
 - [x] Investigate stitcher endpoints
@@ -22,7 +25,9 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
 - [ ] Update issue status

 ## Findings
+
 ### Unauthenticated Endpoints
+
 1. **CoordinatorIntegrationController** (`/coordinator/*`)
   - POST /coordinator/jobs - Create job from coordinator
   - PATCH /coordinator/jobs/:id/status - Update job status
@@ -37,15 +42,18 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
   - POST /stitcher/dispatch - Manual job dispatch

 ### Authentication Mechanism
+
 **Decision: API Key Authentication**

 Reasons:
+
 - Service-to-service communication (coordinator Python app → NestJS API)
 - No user context needed
 - Simpler than JWT for this use case
 - Consistent with MOSAIC_API_TOKEN pattern already in use

 Implementation:
+
 - Create ApiKeyGuard that checks X-API-Key header
 - Add COORDINATOR_API_KEY to .env.example
 - Coordinator will send this key in X-API-Key header
@@ -54,9 +62,11 @@ Implementation:
 ## Security Review Notes

 ### Authentication Mechanism: API Key Guard
+
 **Implementation:** `/apps/api/src/common/guards/api-key.guard.ts`

 **Security Features:**
+
 1. **Constant-time comparison** - Uses `crypto.timingSafeEqual` to prevent timing attacks
 2. **Header case-insensitivity** - Accepts X-API-Key, x-api-key, X-Api-Key variations
 3. **Empty string validation** - Rejects empty API keys
@@ -64,33 +74,41 @@ Implementation:
 5. **Clear error messages** - Differentiates between missing, invalid, and unconfigured keys

 **Protected Endpoints:**
+
 - All CoordinatorIntegrationController endpoints (`/coordinator/*`)
 - All StitcherController endpoints (`/stitcher/*`)

 **Environment Variable:**
+
 - `COORDINATOR_API_KEY` - Must be at least 32 characters (recommended: `openssl rand -base64 32`)

 **Testing:**
+
 - 8 tests for ApiKeyGuard (95.65% coverage)
 - 10 tests for coordinator security
 - 7 tests for stitcher security
 - Total: 25 new security tests

 **Attack Prevention:**
+
 - Timing attacks: Prevented via constant-time comparison
 - Unauthorized access: All endpoints require valid API key
 - Empty/null keys: Explicitly rejected
 - Configuration errors: Server fails to start if misconfigured

 ## Testing
+
 ### Test Plan
+
 1. Security tests to verify authentication is required
 2. Tests to verify valid credentials are accepted
 3. Tests to verify invalid credentials are rejected
 4. Integration tests for end-to-end flows

 ### Test Results
+
 **ApiKeyGuard Tests:** 8/8 passing (95.65% coverage)
+
 - ✅ Valid API key accepted
 - ✅ Missing API key rejected
 - ✅ Invalid API key rejected
@@ -100,11 +118,13 @@ Implementation:
 - ✅ Timing attack prevention

 **Coordinator Security Tests:** 10/10 passing
+
 - ✅ All endpoints require authentication
 - ✅ Valid API key allows access
 - ✅ Invalid API key blocks access

 **Stitcher Security Tests:** 7/7 passing
+
 - ✅ All endpoints require authentication
 - ✅ Valid API key allows access
 - ✅ Invalid/empty API keys blocked
@@ -113,6 +133,7 @@ Implementation:
 **Existing Tests:** No regressions introduced (1420 tests still passing)

 ## Notes
+
 - Priority: CRITICAL SECURITY
 - Impact: Prevents unauthorized access to coordinator integration
 - Coverage requirement: Minimum 85%
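
The security features listed in the scratchpad (constant-time comparison, case-insensitive header lookup, empty-key rejection, and distinct outcomes for missing, invalid, and unconfigured keys) can be sketched as follows. This is a minimal illustration only, not the contents of `api-key.guard.ts`: the function name `checkApiKey`, the `ApiKeyCheck` result type, and the choice to hash both values before comparing are assumptions made for this example. Hashing is used here because `crypto.timingSafeEqual` throws when its two inputs differ in length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical result type; the real guard may throw exceptions instead.
type ApiKeyCheck = "ok" | "missing" | "invalid" | "unconfigured";

// Hash to a fixed 32-byte digest so timingSafeEqual always receives
// equal-length inputs, regardless of the length of the provided key.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hypothetical helper: checks the X-API-Key header against the configured key.
function checkApiKey(
  headers: Record<string, string | undefined>,
  configuredKey: string | undefined,
): ApiKeyCheck {
  // An absent or empty server-side key is a configuration error.
  if (!configuredKey) return "unconfigured";

  // Match the header name case-insensitively (X-API-Key, x-api-key, ...).
  const entry = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === "x-api-key",
  );
  const provided = entry?.[1];

  // Treat an absent header and an empty string the same way.
  if (!provided) return "missing";

  // Constant-time comparison of the two digests.
  return timingSafeEqual(sha256(provided), sha256(configuredKey))
    ? "ok"
    : "invalid";
}
```

In a NestJS guard, a result other than `"ok"` would typically be mapped to an `UnauthorizedException`, with the configured key read from `COORDINATOR_API_KEY`; that wiring is omitted here because the diff does not show it.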