stack/docs/scratchpads/85-connect-disconnect-protocol.md
Jason Woltje 12abdfe81d feat(#93): implement agent spawn via federation
2026-02-03 14:37:06 -06:00


Issue #85: [FED-002] CONNECT/DISCONNECT Protocol

Objective

Implement the connection handshake protocol for federation, building on the Instance Identity Model from issue #84. This includes:

  • Connection request/accept/reject handshake
  • Message signing and verification using instance keypairs
  • Connection state management (PENDING → ACTIVE or DISCONNECTED, ACTIVE → DISCONNECTED)
  • API endpoints for initiating and managing connections
  • Proper error handling and validation

Context

Issue #84 provides the foundation:

  • Instance model with keypair for signing
  • FederationConnection model with status enum (PENDING, ACTIVE, SUSPENDED, DISCONNECTED)
  • FederationService with identity management
  • CryptoService for encryption/decryption
  • Database schema is already in place

Approach

1. Create Types for Connection Protocol

Define TypeScript interfaces in /apps/api/src/federation/types/connection.types.ts:

// Connection request payload
interface ConnectionRequest {
  instanceId: string;
  instanceUrl: string;
  publicKey: string;
  capabilities: FederationCapabilities;
  timestamp: number;
  signature: string; // Signature over the rest of the payload, created with the sender's private key
}

// Connection response
interface ConnectionResponse {
  accepted: boolean;
  instanceId: string;
  publicKey: string;
  capabilities: FederationCapabilities;
  reason?: string; // If rejected
  timestamp: number;
  signature: string;
}

// Disconnect request
interface DisconnectRequest {
  instanceId: string;
  reason?: string;
  timestamp: number;
  signature: string;
}
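
Because a signature field cannot cover itself, one reading of the types above is that the signature is computed over every field except `signature`. A minimal sketch of that split follows; `Unsigned` and `splitSignature` are hypothetical names used for illustration, not identifiers from the codebase, and the message values are made up.

```typescript
// Hypothetical helper type: a protocol message without its signature field.
type Unsigned<T extends { signature: string }> = Omit<T, "signature">;

// Separate the signed payload from the signature before signing or verifying.
function splitSignature<T extends { signature: string }>(
  message: T,
): { payload: Unsigned<T>; signature: string } {
  const { signature, ...payload } = message;
  return { payload, signature };
}

// Example with a DisconnectRequest-shaped message (illustrative values only).
const disconnect = {
  instanceId: "instance-a",
  reason: "maintenance",
  timestamp: 1_700_000_000_000,
  signature: "base64-signature",
};
const parts = splitSignature(disconnect);
```

The receiver would pass `parts.payload` and `parts.signature` to whatever verification routine it uses.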

2. Add Signature Service

Create /apps/api/src/federation/signature.service.ts for message signing:

  • sign(message: object, privateKey: string): string - Sign a message
  • verify(message: object, signature: string, publicKey: string): boolean - Verify signature
  • signConnectionRequest(...) - Sign connection request
  • verifyConnectionRequest(...) - Verify connection request
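
As an illustration of the first two methods, here is a minimal sketch using Node's built-in `crypto` module with RSA SHA-256. It assumes PEM-encoded keys and base64-encoded signatures, neither of which this document specifies. The `canonicalize` helper is hypothetical: it sorts object keys so both instances sign the same bytes regardless of property order. The actual SignatureService may serialize messages differently.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Hypothetical helper: deterministic JSON with sorted keys.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// Sign the canonical form of a message; returns a base64 signature (assumed encoding).
function sign(message: object, privateKey: string): string {
  const signer = createSign("RSA-SHA256");
  signer.update(canonicalize(message));
  signer.end();
  return signer.sign(privateKey, "base64");
}

// Verify a base64 signature; malformed keys or signatures count as invalid.
function verify(message: object, signature: string, publicKey: string): boolean {
  try {
    const verifier = createVerify("RSA-SHA256");
    verifier.update(canonicalize(message));
    verifier.end();
    return verifier.verify(publicKey, signature, "base64");
  } catch {
    return false;
  }
}

// Usage with a throwaway keypair (a real instance would load its stored keys).
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});
const payload = { instanceId: "instance-a", timestamp: 1_700_000_000_000 };
const signature = sign(payload, privateKey);
```

Sorting keys before signing is a design choice in this sketch: plain `JSON.stringify` depends on property insertion order, which can differ between sender and receiver.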

3. Create Connection Service

Create /apps/api/src/federation/connection.service.ts:

  • initiateConnection(workspaceId, remoteUrl) - Start connection handshake
  • acceptConnection(workspaceId, connectionId) - Accept pending connection
  • rejectConnection(workspaceId, connectionId, reason) - Reject connection
  • disconnect(workspaceId, connectionId, reason) - Disconnect active connection
  • getConnections(workspaceId, status?) - List connections
  • getConnection(workspaceId, connectionId) - Get single connection
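
Every method above takes a workspaceId, which suggests lookups are always scoped to the caller's workspace. A minimal in-memory sketch of the two read methods under that assumption follows; `InMemoryConnections`, `ConnectionRecord`, and the `add` method are hypothetical stand-ins for the database-backed service.

```typescript
type ConnectionStatus = "PENDING" | "ACTIVE" | "SUSPENDED" | "DISCONNECTED";

interface ConnectionRecord {
  id: string;
  workspaceId: string;
  remoteUrl: string;
  status: ConnectionStatus;
}

// Hypothetical in-memory stand-in for the persistence layer.
class InMemoryConnections {
  private readonly records = new Map<string, ConnectionRecord>();

  add(record: ConnectionRecord): void {
    this.records.set(record.id, record);
  }

  // Returns undefined both for unknown ids and for ids owned by another
  // workspace, so a caller cannot probe for other workspaces' connections.
  getConnection(workspaceId: string, connectionId: string): ConnectionRecord | undefined {
    const record = this.records.get(connectionId);
    return record !== undefined && record.workspaceId === workspaceId ? record : undefined;
  }

  // Lists the workspace's connections, optionally filtered by status.
  getConnections(workspaceId: string, status?: ConnectionStatus): ConnectionRecord[] {
    return Array.from(this.records.values()).filter(
      (record) =>
        record.workspaceId === workspaceId &&
        (status === undefined || record.status === status),
    );
  }
}

// Usage with illustrative data.
const store = new InMemoryConnections();
store.add({ id: "c1", workspaceId: "ws-1", remoteUrl: "https://b.example.com", status: "PENDING" });
store.add({ id: "c2", workspaceId: "ws-1", remoteUrl: "https://c.example.com", status: "ACTIVE" });
store.add({ id: "c3", workspaceId: "ws-2", remoteUrl: "https://d.example.com", status: "ACTIVE" });
```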

4. Add API Endpoints

Extend FederationController with:

  • POST /api/v1/federation/connections/initiate - Initiate connection to remote instance
  • POST /api/v1/federation/connections/:id/accept - Accept incoming connection
  • POST /api/v1/federation/connections/:id/reject - Reject incoming connection
  • POST /api/v1/federation/connections/:id/disconnect - Disconnect active connection
  • GET /api/v1/federation/connections - List workspace connections
  • GET /api/v1/federation/connections/:id - Get connection details
  • POST /api/v1/federation/incoming/connect - Public endpoint for receiving connection requests

5. Connection Handshake Flow

Initiator (Instance A) → Target (Instance B)

  1. Instance A calls POST /api/v1/federation/connections/initiate with remoteUrl
  2. Service creates connection record with status=PENDING
  3. Service fetches remote instance identity from GET {remoteUrl}/api/v1/federation/instance
  4. Service creates signed ConnectionRequest
  5. Service sends request to POST {remoteUrl}/api/v1/federation/incoming/connect
  6. Instance B receives request, validates signature
  7. Instance B creates connection record with status=PENDING
  8. Instance B can accept (status=ACTIVE) or reject (status=DISCONNECTED)
  9. Instance B sends signed ConnectionResponse back to Instance A
  10. Instance A updates connection status based on response
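
The initiator's side of this flow (steps 3 to 5, 9, and 10) can be sketched as follows. This is an illustration under several assumptions: the remote instance answers synchronously with a signed response, persistence of the PENDING record is left out, capabilities are omitted, and `HandshakeDeps` with its methods is a hypothetical set of ports standing in for the HTTP client and SignatureService.

```typescript
type HandshakeOutcome =
  | { status: "ACTIVE" }
  | { status: "DISCONNECTED"; reason?: string };

interface RemoteIdentity { instanceId: string; publicKey: string }
interface LocalInstance { instanceId: string; instanceUrl: string; publicKey: string }

interface SignedResponse {
  accepted: boolean;
  instanceId: string;
  publicKey: string;
  reason?: string;
  timestamp: number;
  signature: string;
}

// Hypothetical ports; names are illustrative only.
interface HandshakeDeps {
  fetchIdentity(remoteUrl: string): Promise<RemoteIdentity>; // step 3
  sign(payload: object): string; // step 4
  postConnect(remoteUrl: string, request: object): Promise<SignedResponse>; // step 5
  verify(payload: object, signature: string, publicKey: string): boolean; // step 9
  now(): number;
}

async function initiateHandshake(
  local: LocalInstance,
  remoteUrl: string,
  deps: HandshakeDeps,
): Promise<HandshakeOutcome> {
  const identity = await deps.fetchIdentity(remoteUrl);

  // Sign everything except the signature field itself.
  const unsigned = { ...local, timestamp: deps.now() };
  const request = { ...unsigned, signature: deps.sign(unsigned) };
  const response = await deps.postConnect(remoteUrl, request);

  // The response must be signed by the key published at the identity endpoint.
  const { signature, ...unsignedResponse } = response;
  const trusted =
    response.publicKey === identity.publicKey &&
    deps.verify(unsignedResponse, signature, identity.publicKey);
  if (!trusted) throw new Error("Connection response failed verification");

  // Step 10: map the response onto the connection status.
  return response.accepted
    ? { status: "ACTIVE" }
    : { status: "DISCONNECTED", reason: response.reason };
}
```

Passing the network and signing functions in as dependencies keeps the flow testable with fakes, which fits the TDD approach described in this scratchpad.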

6. Security Considerations

  • All connection requests must be signed with instance private key
  • All responses must be verified using remote instance public key
  • Timestamps must be within 5 minutes of the receiver's clock to limit replay attacks
  • Outgoing connection requests must be initiated by authenticated workspace members
  • Public key must match the one fetched from remote instance identity endpoint
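
The checks above could be combined on the receiving side roughly as follows. This is a sketch, not the implemented service: it assumes timestamps are milliseconds since the Unix epoch (the unit is not stated in this document), treats the 5-minute window as symmetric so future-dated requests are rejected too, and takes the signature check as an injected `Verifier` function. `checkIncomingRequest` and `IncomingRequest` are hypothetical names.

```typescript
// 5-minute window from the security considerations, in milliseconds (assumed unit).
const MAX_AGE_MS = 5 * 60 * 1000;

interface IncomingRequest {
  instanceId: string;
  instanceUrl: string;
  publicKey: string;
  timestamp: number;
  signature: string;
}

type Verifier = (payload: object, signature: string, publicKey: string) => boolean;

// True when the timestamp is a finite number within the window around `now`.
function validateTimestamp(timestamp: number, now: number = Date.now()): boolean {
  return Number.isFinite(timestamp) && Math.abs(now - timestamp) <= MAX_AGE_MS;
}

// Returns null when the request passes every check, otherwise a rejection reason.
function checkIncomingRequest(
  request: IncomingRequest,
  identityPublicKey: string, // key fetched from the sender's identity endpoint
  verify: Verifier,
  now: number = Date.now(),
): string | null {
  if (!validateTimestamp(request.timestamp, now)) {
    return "timestamp outside allowed window";
  }
  if (request.publicKey !== identityPublicKey) {
    return "public key does not match identity endpoint";
  }
  const { signature, ...payload } = request;
  if (!verify(payload, signature, identityPublicKey)) {
    return "invalid signature";
  }
  return null;
}
```

Running the cheap timestamp and key comparisons before the signature check avoids spending RSA verification on requests that would be rejected anyway.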

7. Testing Strategy

Unit Tests (TDD approach):

  • SignatureService.sign() creates valid signatures
  • SignatureService.verify() validates signatures correctly
  • SignatureService.verify() rejects invalid signatures
  • ConnectionService.initiateConnection() creates PENDING connection
  • ConnectionService.acceptConnection() updates to ACTIVE
  • ConnectionService.rejectConnection() marks as DISCONNECTED
  • ConnectionService.disconnect() updates active connection to DISCONNECTED
  • Timestamp validation rejects old requests (>5 min)

Integration Tests:

  • POST /connections/initiate creates connection and calls remote
  • POST /incoming/connect validates signature and creates connection
  • POST /connections/:id/accept updates status correctly
  • POST /connections/:id/reject marks connection as DISCONNECTED
  • POST /connections/:id/disconnect disconnects active connection
  • GET /connections returns workspace connections
  • Workspace isolation (can't access other workspace connections)

Progress

  • Create scratchpad
  • Create connection.types.ts with protocol types
  • Write tests for SignatureService (18 tests)
  • Implement SignatureService (sign, verify, validateTimestamp)
  • Write tests for ConnectionService (20 tests)
  • Implement ConnectionService (all 8 methods)
  • Write tests for connection API endpoints (13 tests)
  • Implement connection API endpoints (7 endpoints)
  • Update FederationModule with new providers
  • Verify all tests pass (70/70 passing)
  • Verify type checking passes
  • Verify test coverage ≥85% (100% coverage on new code)
  • Commit changes (commit fc39190)

Testing Plan

Unit Tests

  1. SignatureService:

    • Should create RSA SHA-256 signatures
    • Should verify valid signatures
    • Should reject invalid signatures
    • Should reject tampered messages
    • Should reject expired timestamps
  2. ConnectionService:

    • Should initiate connection with PENDING status
    • Should fetch remote instance identity before connecting
    • Should create signed connection request
    • Should accept connection and update to ACTIVE
    • Should reject connection with reason
    • Should disconnect active connection
    • Should list connections for workspace
    • Should enforce workspace isolation

Integration Tests

  1. POST /api/v1/federation/connections/initiate:

    • Should require authentication
    • Should create connection record
    • Should fetch remote instance identity
    • Should return connection details
  2. POST /api/v1/federation/incoming/connect:

    • Should validate connection request signature
    • Should reject requests with invalid signatures
    • Should reject requests with old timestamps
    • Should create pending connection
  3. POST /api/v1/federation/connections/:id/accept:

    • Should require authentication
    • Should update connection to ACTIVE
    • Should set connectedAt timestamp
    • Should enforce workspace ownership
  4. POST /api/v1/federation/connections/:id/reject:

    • Should require authentication
    • Should update connection to DISCONNECTED
    • Should store rejection reason
  5. POST /api/v1/federation/connections/:id/disconnect:

    • Should require authentication
    • Should disconnect active connection
    • Should set disconnectedAt timestamp
  6. GET /api/v1/federation/connections:

    • Should list workspace connections
    • Should filter by status if provided
    • Should enforce workspace isolation

Design Decisions

  1. RSA Signatures: Use RSA SHA-256 for signing (matches existing keypair format)
  2. Timestamp Validation: 5-minute window to prevent replay attacks
  3. Workspace Scoping: All connections belong to a workspace so row-level security (RLS) can isolate them
  4. Stateless Protocol: Each request is independently signed and verified
  5. Public Connection Endpoint: /incoming/connect is public (no auth) but requires valid signature
  6. State Transitions: PENDING → ACTIVE, PENDING → DISCONNECTED, ACTIVE → DISCONNECTED
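
The state transitions in decision 6 can be encoded as a lookup table. A minimal sketch follows, where `ALLOWED_TRANSITIONS` and `canTransition` are hypothetical names. SUSPENDED exists in the status enum from issue #84 but has no transitions defined here, so the table leaves it empty.

```typescript
type ConnectionStatus = "PENDING" | "ACTIVE" | "SUSPENDED" | "DISCONNECTED";

// Allowed target states for each current state, per the design decisions.
const ALLOWED_TRANSITIONS: Record<ConnectionStatus, readonly ConnectionStatus[]> = {
  PENDING: ["ACTIVE", "DISCONNECTED"],
  ACTIVE: ["DISCONNECTED"],
  SUSPENDED: [], // present in the enum, not covered by this issue
  DISCONNECTED: [],
};

// True when moving a connection from `from` to `to` is permitted.
function canTransition(from: ConnectionStatus, to: ConnectionStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

A service method such as acceptConnection could call `canTransition(current, "ACTIVE")` before updating the record and reject the request otherwise.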

Notes

  • Outgoing connection requests are workspace-scoped (authenticated users only)
  • Incoming connection endpoint is public but cryptographically verified
  • Need to handle network errors gracefully when calling remote instances
  • Should validate remote instance URL format before attempting connection
  • Consider rate limiting for incoming connection requests (future enhancement)
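
For the URL validation note, one possible approach is to parse the input with the standard `URL` class and normalize it to a base URL before building endpoint paths. The sketch below is an assumption-laden illustration: `normalizeRemoteUrl` is a hypothetical name, and accepting plain `http:` alongside `https:` is a guess at development needs rather than a documented requirement.

```typescript
// Returns a normalized base URL without a trailing slash, or null when the
// input is not an acceptable remote instance URL.
function normalizeRemoteUrl(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not parseable as an absolute URL
  }
  // Assumption: only HTTP(S) transports are supported.
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;
  // Embedded credentials have no place in a federation base URL.
  if (url.username !== "" || url.password !== "") return null;
  // Drop query, fragment, and trailing slashes so paths can be appended safely.
  return url.origin + url.pathname.replace(/\/+$/, "");
}

// Usage: build the identity endpoint from a user-supplied URL.
const base = normalizeRemoteUrl(" https://b.example.com/ ");
const identityEndpoint = base === null ? null : `${base}/api/v1/federation/instance`;
```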