feat(#93): implement agent spawn via federation

Implements FED-010 (Agent Spawn via Federation), which enables
spawning and managing Claude agents on remote federated Mosaic Stack
instances via the COMMAND message type.

Features:
- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)

Architecture:
- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService →
  CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification
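
As a rough illustration of the three agent command types and how a spoke
might route them, here is a minimal sketch. The `AgentCommand` shape, the
`AgentHandlers` interface, and the field names `taskId` and `agentId` are
hypothetical; the actual definitions live in federation-agent.types.ts and
may differ.

```typescript
// Hypothetical command shapes: only the names spawn/status/kill come from
// this commit; the payload fields are assumptions for illustration.
export type AgentCommand =
  | { kind: "spawn"; taskId: string }
  | { kind: "status"; agentId: string }
  | { kind: "kill"; agentId: string };

// Hypothetical stand-in for the orchestrator's spawner/lifecycle services.
export interface AgentHandlers {
  spawn(taskId: string): string; // returns the new agent's id
  status(agentId: string): string; // returns a status label
  kill(agentId: string): string; // returns the final status label
}

// Dispatch a decoded COMMAND payload to the matching handler.
export function routeAgentCommand(cmd: AgentCommand, handlers: AgentHandlers): string {
  switch (cmd.kind) {
    case "spawn":
      return handlers.spawn(cmd.taskId);
    case "status":
      return handlers.status(cmd.agentId);
    case "kill":
      return handlers.kill(cmd.agentId);
    default: {
      // Exhaustiveness check: compilation fails if a new kind is not handled.
      const unreachable: never = cmd;
      throw new Error(`Unknown agent command: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The discriminated union keeps the routing exhaustive at compile time, so
adding a fourth command type forces the router to be updated.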

Files created:
- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts

Files modified:
- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)

Testing:
- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed

Refs #93

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Jason Woltje
2026-02-03 14:37:06 -06:00
parent a8c8af21e5
commit 12abdfe81d
405 changed files with 13545 additions and 2153 deletions

@@ -1,9 +1,11 @@
# Issue #184: [BLOCKER] Add authentication to coordinator integration endpoints
## Objective
Add authentication to coordinator integration endpoints to prevent unauthorized access. The missing authentication is a critical security vulnerability that must be fixed before deployment.
## Approach
1. Identify all coordinator integration endpoints without authentication
2. Write security tests first (TDD - RED phase)
3. Implement authentication mechanism (JWT/bearer token or API key)
@@ -11,6 +13,7 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
5. Refactor if needed while maintaining test coverage
## Progress
- [x] Create scratchpad
- [x] Investigate coordinator endpoints
- [x] Investigate stitcher endpoints
@@ -22,7 +25,9 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
- [ ] Update issue status
## Findings
### Unauthenticated Endpoints
1. **CoordinatorIntegrationController** (`/coordinator/*`)
- POST /coordinator/jobs - Create job from coordinator
- PATCH /coordinator/jobs/:id/status - Update job status
@@ -37,15 +42,18 @@ Add authentication to coordinator integration endpoints to prevent unauthorized
- POST /stitcher/dispatch - Manual job dispatch
### Authentication Mechanism
**Decision: API Key Authentication**
Reasons:
- Service-to-service communication (coordinator Python app → NestJS API)
- No user context needed
- Simpler than JWT for this use case
- Consistent with MOSAIC_API_TOKEN pattern already in use
Implementation:
- Create ApiKeyGuard that checks X-API-Key header
- Add COORDINATOR_API_KEY to .env.example
- Coordinator will send this key in X-API-Key header
@@ -54,9 +62,11 @@ Implementation:
## Security Review Notes
### Authentication Mechanism: API Key Guard
**Implementation:** `/apps/api/src/common/guards/api-key.guard.ts`
**Security Features:**
1. **Constant-time comparison** - Uses `crypto.timingSafeEqual` to prevent timing attacks
2. **Header case-insensitivity** - Accepts X-API-Key, x-api-key, X-Api-Key variations
3. **Empty string validation** - Rejects empty API keys
@@ -64,33 +74,41 @@ Implementation:
5. **Clear error messages** - Differentiates between missing, invalid, and unconfigured keys
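
The checks listed above can be sketched as plain functions. This is a minimal illustration, not the contents of `api-key.guard.ts`: the names `extractApiKey`, `apiKeysMatch`, and `checkApiKey` are hypothetical, and hashing both values before calling `crypto.timingSafeEqual` is one common way to handle its equal-length requirement, not necessarily the approach the guard takes.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

type HeaderValue = string | string[] | undefined;

// Hypothetical helper: find the API key regardless of header-name casing.
// Node's HTTP layer already lowercases header names; normalising here also
// covers hand-built header objects (for example in tests).
export function extractApiKey(headers: Record<string, HeaderValue>): string | undefined {
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "x-api-key") {
      return Array.isArray(value) ? value[0] : value;
    }
  }
  return undefined;
}

// Constant-time comparison. timingSafeEqual throws when the inputs differ in
// length, so compare fixed-length SHA-256 digests instead of the raw strings.
export function apiKeysMatch(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}

export type ApiKeyCheck = "ok" | "missing" | "invalid" | "unconfigured";

// Distinguish the three failure modes so callers can report a clear error.
// Assumption: an empty key is reported as "missing" rather than "invalid".
export function checkApiKey(
  headers: Record<string, HeaderValue>,
  configuredKey: string | undefined,
): ApiKeyCheck {
  if (!configuredKey) return "unconfigured";
  const provided = extractApiKey(headers);
  if (provided === undefined || provided === "") return "missing";
  return apiKeysMatch(provided, configuredKey) ? "ok" : "invalid";
}
```

In a NestJS guard, `canActivate` would presumably call something like `checkApiKey(request.headers, configuredKey)` and throw `UnauthorizedException` for any result other than `"ok"`.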
**Protected Endpoints:**
- All CoordinatorIntegrationController endpoints (`/coordinator/*`)
- All StitcherController endpoints (`/stitcher/*`)
**Environment Variable:**
- `COORDINATOR_API_KEY` - Must be at least 32 characters (recommended: `openssl rand -base64 32`)
**Testing:**
- 8 tests for ApiKeyGuard (95.65% coverage)
- 10 tests for coordinator security
- 7 tests for stitcher security
- Total: 25 new security tests
**Attack Prevention:**
- Timing attacks: Prevented via constant-time comparison
- Unauthorized access: All endpoints require valid API key
- Empty/null keys: Explicitly rejected
- Configuration errors: Server fails to start if misconfigured
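
The fail-fast behaviour for configuration errors could look roughly like the sketch below. The function name `requireCoordinatorApiKey` and the error messages are hypothetical; only the variable name `COORDINATOR_API_KEY` and the 32-character minimum come from the notes above.

```typescript
// Minimum length taken from the "Environment Variable" note above.
const MIN_KEY_LENGTH = 32;

// Hypothetical startup check: throw instead of starting with a weak or
// missing key, so a misconfigured server never serves requests.
export function requireCoordinatorApiKey(env: Record<string, string | undefined>): string {
  const key = env.COORDINATOR_API_KEY;
  if (key === undefined || key.trim() === "") {
    throw new Error("COORDINATOR_API_KEY is not set");
  }
  if (key.length < MIN_KEY_LENGTH) {
    throw new Error(`COORDINATOR_API_KEY must be at least ${MIN_KEY_LENGTH} characters`);
  }
  return key;
}
```

Calling `requireCoordinatorApiKey(process.env)` during application bootstrap would surface the problem before any endpoint is reachable.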
## Testing
### Test Plan
1. Security tests to verify authentication is required
2. Tests to verify valid credentials are accepted
3. Tests to verify invalid credentials are rejected
4. Integration tests for end-to-end flows
### Test Results
**ApiKeyGuard Tests:** 8/8 passing (95.65% coverage)
- ✅ Valid API key accepted
- ✅ Missing API key rejected
- ✅ Invalid API key rejected
@@ -100,11 +118,13 @@ Implementation:
- ✅ Timing attack prevention
**Coordinator Security Tests:** 10/10 passing
- ✅ All endpoints require authentication
- ✅ Valid API key allows access
- ✅ Invalid API key blocks access
**Stitcher Security Tests:** 7/7 passing
- ✅ All endpoints require authentication
- ✅ Valid API key allows access
- ✅ Invalid/empty API keys blocked
@@ -113,6 +133,7 @@ Implementation:
**Existing Tests:** No regressions introduced (1420 tests still passing)
## Notes
- Priority: CRITICAL SECURITY
- Impact: Prevents unauthorized access to coordinator integration
- Coverage requirement: Minimum 85%