Implements FED-010: Agent Spawn via Federation feature that enables spawning and managing Claude agents on remote federated Mosaic Stack instances via COMMAND message type.

Features:

- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)

Architecture:

- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService → CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification

Files created:

- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts

Files modified:

- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)

Testing:

- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed

Refs #93

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
226 lines
8.4 KiB
Markdown
# Issue #85: [FED-002] CONNECT/DISCONNECT Protocol

## Objective

Implement the connection handshake protocol for federation, building on the Instance Identity Model from issue #84. This includes:

- Connection request/accept/reject handshake
- Message signing and verification using instance keypairs
- Connection state management (PENDING → ACTIVE, DISCONNECTED)
- API endpoints for initiating and managing connections
- Proper error handling and validation

## Context

Issue #84 provides the foundation:

- `Instance` model with keypair for signing
- `FederationConnection` model with status enum (PENDING, ACTIVE, SUSPENDED, DISCONNECTED)
- `FederationService` with identity management
- `CryptoService` for encryption/decryption
- Database schema is already in place

## Approach

### 1. Create Types for Connection Protocol

Define TypeScript interfaces in `/apps/api/src/federation/types/connection.types.ts`:

```typescript
// Connection request payload
interface ConnectionRequest {
  instanceId: string;
  instanceUrl: string;
  publicKey: string;
  capabilities: FederationCapabilities;
  timestamp: number;
  signature: string; // Sign entire payload with private key
}

// Connection response
interface ConnectionResponse {
  accepted: boolean;
  instanceId: string;
  publicKey: string;
  capabilities: FederationCapabilities;
  reason?: string; // If rejected
  timestamp: number;
  signature: string;
}

// Disconnect request
interface DisconnectRequest {
  instanceId: string;
  reason?: string;
  timestamp: number;
  signature: string;
}
```

### 2. Add Signature Service

Create `/apps/api/src/federation/signature.service.ts` for message signing:

- `sign(message: object, privateKey: string): string` - Sign a message
- `verify(message: object, signature: string, publicKey: string): boolean` - Verify signature
- `signConnectionRequest(...)` - Sign connection request
- `verifyConnectionRequest(...)` - Verify connection request

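As a rough illustration of the `sign` and `verify` signatures listed for the Signature Service, the following is a minimal sketch built on Node's `node:crypto` module with RSA SHA-256. It is not the actual `SignatureService`: the `canonicalize` helper, the base64 signature encoding, and the PEM key format are all assumptions made for this example.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Hypothetical helper: serialize with sorted keys so that signer and verifier
// hash identical bytes regardless of property order. The canonical form is an
// assumption; the document does not specify one.
function canonicalize(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .filter((key) => record[key] !== undefined)
      .map((key) => `${JSON.stringify(key)}:${canonicalize(record[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Sign the canonical form of a message with a PEM-encoded RSA private key.
function sign(message: object, privateKey: string): string {
  const signer = createSign("RSA-SHA256");
  signer.update(canonicalize(message));
  signer.end();
  return signer.sign(privateKey, "base64");
}

// Verify a base64 signature; malformed keys or signatures count as invalid.
function verify(message: object, signature: string, publicKey: string): boolean {
  try {
    const verifier = createVerify("RSA-SHA256");
    verifier.update(canonicalize(message));
    verifier.end();
    return verifier.verify(publicKey, signature, "base64");
  } catch {
    return false;
  }
}

// Usage with a throwaway keypair (a real instance would load its stored keys).
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});
const payload = { instanceId: "instance-a", timestamp: 1700000000000 };
const signature = sign(payload, privateKey);
```

Sorting keys before signing is one way to keep verification independent of how the receiving side rebuilds the object; any deterministic serialization agreed on by both instances would serve the same purpose.
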
### 3. Create Connection Service

Create `/apps/api/src/federation/connection.service.ts`:

- `initiateConnection(workspaceId, remoteUrl)` - Start connection handshake
- `acceptConnection(workspaceId, connectionId)` - Accept pending connection
- `rejectConnection(workspaceId, connectionId, reason)` - Reject connection
- `disconnect(workspaceId, connectionId, reason)` - Disconnect active connection
- `getConnections(workspaceId, status?)` - List connections
- `getConnection(workspaceId, connectionId)` - Get single connection

### 4. Add API Endpoints

Extend `FederationController` with:

- `POST /api/v1/federation/connections/initiate` - Initiate connection to remote instance
- `POST /api/v1/federation/connections/:id/accept` - Accept incoming connection
- `POST /api/v1/federation/connections/:id/reject` - Reject incoming connection
- `POST /api/v1/federation/connections/:id/disconnect` - Disconnect active connection
- `GET /api/v1/federation/connections` - List workspace connections
- `GET /api/v1/federation/connections/:id` - Get connection details
- `POST /api/v1/federation/incoming/connect` - Public endpoint for receiving connection requests

### 5. Connection Handshake Flow

**Initiator (Instance A) → Target (Instance B)**

1. Instance A calls `POST /api/v1/federation/connections/initiate` with `remoteUrl`
2. Service creates connection record with status=PENDING
3. Service fetches remote instance identity from `GET {remoteUrl}/api/v1/federation/instance`
4. Service creates signed ConnectionRequest
5. Service sends request to `POST {remoteUrl}/api/v1/federation/incoming/connect`
6. Instance B receives request, validates signature
7. Instance B creates connection record with status=PENDING
8. Instance B can accept (status=ACTIVE) or reject (status=DISCONNECTED)
9. Instance B sends signed ConnectionResponse back to Instance A
10. Instance A updates connection status based on response

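The initiator side of the flow above (steps 2 to 5 and step 10) can be sketched as follows. This is an illustration under several assumptions rather than the actual `ConnectionService`: `HandshakeDeps` and its members are hypothetical injected dependencies, the response is treated as arriving in a single round trip, `capabilities` and response-signature verification are omitted for brevity, and mapping a network failure to DISCONNECTED is a choice made only for this example.

```typescript
type ConnectionStatus = "PENDING" | "ACTIVE" | "DISCONNECTED";

interface LocalIdentity {
  instanceId: string;
  instanceUrl: string;
  publicKey: string;
}

interface RemoteIdentity {
  instanceId: string;
  publicKey: string;
}

interface SignedRequest extends LocalIdentity {
  timestamp: number;
  signature: string;
}

interface HandshakeResponse {
  accepted: boolean;
  reason?: string;
}

interface ConnectionRecord {
  remoteUrl: string;
  remoteInstanceId?: string;
  status: ConnectionStatus;
  reason?: string;
}

// Hypothetical dependencies, injected so the flow can run without a network.
interface HandshakeDeps {
  fetchIdentity(remoteUrl: string): Promise<RemoteIdentity>;
  sendRequest(remoteUrl: string, request: SignedRequest): Promise<HandshakeResponse>;
  signPayload(payload: object): string;
  now(): number;
}

async function initiateHandshake(
  local: LocalIdentity,
  remoteUrl: string,
  deps: HandshakeDeps,
): Promise<ConnectionRecord> {
  // Step 2: the record starts as PENDING.
  const record: ConnectionRecord = { remoteUrl, status: "PENDING" };
  try {
    // Step 3: fetch the remote identity before sending anything.
    const identity = await deps.fetchIdentity(remoteUrl);
    record.remoteInstanceId = identity.instanceId;

    // Step 4: build the payload, then sign it.
    const payload = { ...local, timestamp: deps.now() };
    const request: SignedRequest = { ...payload, signature: deps.signPayload(payload) };

    // Step 5: deliver the request; steps 6 to 9 run on Instance B.
    const response = await deps.sendRequest(remoteUrl, request);

    // Step 10: update local status from the response.
    record.status = response.accepted ? "ACTIVE" : "DISCONNECTED";
    if (!response.accepted) record.reason = response.reason;
  } catch (error) {
    // Assumption: an unreachable remote ends the attempt as DISCONNECTED.
    record.status = "DISCONNECTED";
    record.reason = error instanceof Error ? error.message : "handshake failed";
  }
  return record;
}
```

Passing the network and signing functions in as parameters keeps the ordering of the steps testable with stubs, which fits the TDD approach described in the testing strategy.
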
### 6. Security Considerations

- All connection requests must be signed with instance private key
- All responses must be verified using remote instance public key
- Timestamps must be within 5 minutes to prevent replay attacks
- Connection requests must come from authenticated workspace members
- Public key must match the one fetched from remote instance identity endpoint

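The 5-minute timestamp rule can be illustrated with a small check. This sketch assumes timestamps are milliseconds since the Unix epoch and applies the window symmetrically, so timestamps too far in the future are rejected as well; `isTimestampFresh` and `MAX_CLOCK_SKEW_MS` are names invented for this example.

```typescript
// Assumed window: 5 minutes, expressed in milliseconds.
const MAX_CLOCK_SKEW_MS = 5 * 60 * 1000;

// Returns true when the timestamp lies within the allowed window around `now`.
// `now` is a parameter so tests can pin the clock.
function isTimestampFresh(timestamp: number, now: number = Date.now()): boolean {
  if (!Number.isFinite(timestamp)) return false;
  return Math.abs(now - timestamp) <= MAX_CLOCK_SKEW_MS;
}
```

A timestamp check narrows the replay window but does not close it: a captured request can still be replayed inside the 5 minutes unless the receiver also tracks which requests it has already seen.
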
### 7. Testing Strategy

**Unit Tests** (TDD approach):

- SignatureService.sign() creates valid signatures
- SignatureService.verify() validates signatures correctly
- SignatureService.verify() rejects invalid signatures
- ConnectionService.initiateConnection() creates PENDING connection
- ConnectionService.acceptConnection() updates to ACTIVE
- ConnectionService.rejectConnection() marks as DISCONNECTED
- ConnectionService.disconnect() updates active connection to DISCONNECTED
- Timestamp validation rejects old requests (>5 min)

**Integration Tests**:

- POST /connections/initiate creates connection and calls remote
- POST /incoming/connect validates signature and creates connection
- POST /connections/:id/accept updates status correctly
- POST /connections/:id/reject marks connection as rejected
- POST /connections/:id/disconnect disconnects active connection
- GET /connections returns workspace connections
- Workspace isolation (can't access other workspace connections)

## Progress

- [x] Create scratchpad
- [x] Create connection.types.ts with protocol types
- [x] Write tests for SignatureService (18 tests)
- [x] Implement SignatureService (sign, verify, validateTimestamp)
- [x] Write tests for ConnectionService (20 tests)
- [x] Implement ConnectionService (all 8 methods)
- [x] Write tests for connection API endpoints (13 tests)
- [x] Implement connection API endpoints (7 endpoints)
- [x] Update FederationModule with new providers
- [x] Verify all tests pass (70/70 passing)
- [x] Verify type checking passes
- [x] Verify test coverage ≥85% (100% coverage on new code)
- [x] Commit changes (commit fc39190)

## Testing Plan

### Unit Tests

1. **SignatureService**:
   - Should create RSA SHA-256 signatures
   - Should verify valid signatures
   - Should reject invalid signatures
   - Should reject tampered messages
   - Should reject expired timestamps

2. **ConnectionService**:
   - Should initiate connection with PENDING status
   - Should fetch remote instance identity before connecting
   - Should create signed connection request
   - Should accept connection and update to ACTIVE
   - Should reject connection with reason
   - Should disconnect active connection
   - Should list connections for workspace
   - Should enforce workspace isolation

### Integration Tests

1. **POST /api/v1/federation/connections/initiate**:
   - Should require authentication
   - Should create connection record
   - Should fetch remote instance identity
   - Should return connection details

2. **POST /api/v1/federation/incoming/connect**:
   - Should validate connection request signature
   - Should reject requests with invalid signatures
   - Should reject requests with old timestamps
   - Should create pending connection

3. **POST /api/v1/federation/connections/:id/accept**:
   - Should require authentication
   - Should update connection to ACTIVE
   - Should set connectedAt timestamp
   - Should enforce workspace ownership

4. **POST /api/v1/federation/connections/:id/reject**:
   - Should require authentication
   - Should update connection to DISCONNECTED
   - Should store rejection reason

5. **POST /api/v1/federation/connections/:id/disconnect**:
   - Should require authentication
   - Should disconnect active connection
   - Should set disconnectedAt timestamp

6. **GET /api/v1/federation/connections**:
   - Should list workspace connections
   - Should filter by status if provided
   - Should enforce workspace isolation

## Design Decisions

1. **RSA Signatures**: Use RSA SHA-256 for signing (matches existing keypair format)
2. **Timestamp Validation**: 5-minute window to prevent replay attacks
3. **Workspace Scoping**: All connections belong to a workspace for RLS
4. **Stateless Protocol**: Each request is independently signed and verified
5. **Public Connection Endpoint**: `/incoming/connect` is public (no auth) but requires valid signature
6. **State Transitions**: PENDING → ACTIVE, PENDING → DISCONNECTED, ACTIVE → DISCONNECTED

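The state transitions in decision 6 can be encoded as a lookup table so that invalid moves fail early. This is a sketch only: `ALLOWED_TRANSITIONS`, `canTransition`, and `transition` are hypothetical names, and SUSPENDED is given no outgoing transitions here simply because the list above does not mention any.

```typescript
type ConnectionStatus = "PENDING" | "ACTIVE" | "SUSPENDED" | "DISCONNECTED";

// Transitions listed in the design decisions; everything else is rejected.
const ALLOWED_TRANSITIONS: Record<ConnectionStatus, readonly ConnectionStatus[]> = {
  PENDING: ["ACTIVE", "DISCONNECTED"],
  ACTIVE: ["DISCONNECTED"],
  SUSPENDED: [], // Assumption: not covered by the listed transitions.
  DISCONNECTED: [],
};

function canTransition(from: ConnectionStatus, to: ConnectionStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Returns the new status, or throws when the move is not allowed.
function transition(from: ConnectionStatus, to: ConnectionStatus): ConnectionStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid connection state transition: ${from} -> ${to}`);
  }
  return to;
}
```

For example, `transition("PENDING", "ACTIVE")` returns `"ACTIVE"`, while `transition("DISCONNECTED", "ACTIVE")` throws.
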
## Notes

- Connection requests are workspace-scoped (authenticated users only)
- Incoming connection endpoint is public but cryptographically verified
- Need to handle network errors gracefully when calling remote instances
- Should validate remote instance URL format before attempting connection
- Consider rate limiting for incoming connection requests (future enhancement)

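For the note on validating the remote instance URL, one possible check uses the standard `URL` class. The function name `normalizeRemoteUrl` and the specific rules (only `http:` and `https:`, no embedded credentials, query and fragment dropped) are assumptions for illustration; the document does not state which URLs should be accepted.

```typescript
// Returns a normalized base URL (origin plus path, no trailing slash),
// or null when the input should not be used for a federation connection.
function normalizeRemoteUrl(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // Not parseable as an absolute URL.
  }
  // Assumption: only HTTP(S) endpoints are valid federation targets.
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;
  // Assumption: credentials embedded in the URL are never wanted.
  if (url.username !== "" || url.password !== "") return null;
  // Drop trailing slashes so endpoint paths can be appended uniformly.
  const path = url.pathname.replace(/\/+$/, "");
  return `${url.origin}${path}`;
}
```

Normalizing before storing the URL keeps later requests such as `GET {remoteUrl}/api/v1/federation/instance` from ending up with doubled slashes.
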