stack/docs/scratchpads/orch-117-killswitch.md
Jason Woltje 12abdfe81d feat(#93): implement agent spawn via federation
2026-02-03 14:37:06 -06:00

Issue ORCH-117: Killswitch Implementation

Objective

Implement emergency stop functionality that immediately kills a single agent or all agents, then cleans up Docker containers and git worktrees and updates agent state.

Approach

  1. Create KillswitchService with methods:
    • killAgent(agentId) - Kill single agent
    • killAllAgents() - Kill all active agents
  2. Implement cleanup orchestration:
    • Immediate termination (SIGKILL)
    • Cleanup Docker containers (via DockerSandboxService)
    • Cleanup git worktrees (via WorktreeManagerService)
    • Update agent state to 'killed' (via AgentLifecycleService)
    • Audit trail logging
  3. Add API endpoints to AgentsController:
    • POST /agents/:agentId/kill
    • POST /agents/kill-all
  4. Follow TDD: write tests first, then implementation
  5. Ensure test coverage >= 85%
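
The cleanup orchestration in step 2 can be sketched as below. This is a minimal illustration, not the actual KillswitchService: the `KillDeps` function fields are hypothetical stand-ins for AgentLifecycleService, DockerSandboxService, and WorktreeManagerService, whose real signatures are not shown in this scratchpad. The ordering (state transition first, then cleanup) is also an assumption, and the SIGKILL step itself is omitted.

```typescript
// Hypothetical shapes: the real service interfaces are not shown in this scratchpad.
interface AuditEntry {
  agentId: string;
  action: "kill";
  containerCleaned: boolean;
  worktreeCleaned: boolean;
  errors: string[];
}

interface KillDeps {
  transitionToKilled: (agentId: string) => Promise<void>; // stand-in for AgentLifecycleService
  cleanupContainer: (agentId: string) => Promise<void>; // stand-in for DockerSandboxService
  cleanupWorktree: (agentId: string) => Promise<void>; // stand-in for WorktreeManagerService
  audit: (entry: AuditEntry) => void; // stand-in for audit trail logging
}

// Runs one cleanup step; records the failure instead of throwing (best-effort).
async function attempt(
  step: () => Promise<void>,
  label: string,
  errors: string[],
): Promise<boolean> {
  try {
    await step();
    return true;
  } catch (err) {
    errors.push(`${label}: ${err instanceof Error ? err.message : String(err)}`);
    return false;
  }
}

async function killAgent(agentId: string, deps: KillDeps): Promise<AuditEntry> {
  // A failed state transition is not swallowed: it rejects and skips cleanup.
  await deps.transitionToKilled(agentId);

  // Each cleanup step runs even if the previous one failed.
  const errors: string[] = [];
  const containerCleaned = await attempt(() => deps.cleanupContainer(agentId), "docker", errors);
  const worktreeCleaned = await attempt(() => deps.cleanupWorktree(agentId), "worktree", errors);

  const entry: AuditEntry = { agentId, action: "kill", containerCleaned, worktreeCleaned, errors };
  deps.audit(entry);
  return entry;
}
```

Passing the collaborators in as plain functions keeps the sketch free of framework imports; in the real module they would presumably be injected services.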

Progress

  • Read ORCH-117 requirements
  • Understand existing service interfaces
  • Create scratchpad
  • Write killswitch.service.spec.ts tests (13 tests)
  • Implement killswitch.service.ts
  • Add controller endpoints (POST /agents/:agentId/kill, POST /agents/kill-all)
  • Write controller tests (7 tests)
  • Update killswitch.module.ts
  • Verify test coverage (100% statements, 85% branches, 100% functions)
  • Create Gitea issue
  • Close Gitea issue

Testing

Following TDD (Red-Green-Refactor):

  1. RED: Write failing tests for killswitch functionality
  2. GREEN: Implement minimal code to pass tests
  3. REFACTOR: Clean up implementation

Test coverage areas:

  • Single agent kill with successful cleanup
  • Kill all agents
  • Error handling for non-existent agents
  • Partial cleanup failures (Docker but not worktree)
  • Audit logging verification

Notes

  • Killswitch bypasses all queues - must respond within seconds
  • Cleanup should be best-effort (log failures but continue)
  • State transition to 'killed' enforced by AgentLifecycleService
  • Need to handle agents in different states (spawning, running)
  • Docker containers may not exist if sandbox is disabled
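
One way to handle the last note is to decide up front which cleanup steps apply to an agent. The sketch below is only an illustration under assumptions: the `AgentResources` fields (`containerId`, `worktreePath`) and the `sandboxEnabled` flag are hypothetical names, not taken from the orchestrator code.

```typescript
// Hypothetical record of what an agent owns; field names are assumptions.
interface AgentResources {
  agentId: string;
  containerId?: string; // absent when no container was ever created
  worktreePath?: string; // absent when no worktree was ever created
}

type CleanupStep =
  | { kind: "docker"; containerId: string }
  | { kind: "worktree"; worktreePath: string };

// Builds the list of cleanup steps that actually apply to this agent.
function planCleanup(agent: AgentResources, sandboxEnabled: boolean): CleanupStep[] {
  const steps: CleanupStep[] = [];
  // Skip Docker cleanup when the sandbox is disabled or no container is recorded.
  if (sandboxEnabled && agent.containerId !== undefined) {
    steps.push({ kind: "docker", containerId: agent.containerId });
  }
  if (agent.worktreePath !== undefined) {
    steps.push({ kind: "worktree", worktreePath: agent.worktreePath });
  }
  return steps;
}
```

An agent killed while still spawning may have neither resource, in which case the plan is empty and only the state update and audit entry remain.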

Implementation Summary

Files Created

  1. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.ts

    • killAgent(agentId) - Kill single agent with full cleanup
    • killAllAgents() - Kill all active agents
    • Best-effort cleanup: Docker containers, git worktrees
    • Audit trail logging for all killswitch operations
  2. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.spec.ts

    • 13 comprehensive tests covering all scenarios
    • 100% code coverage (statements, functions, lines)
    • 85% branch coverage
  3. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents-killswitch.controller.spec.ts

    • 7 controller tests for killswitch endpoints
    • Full coverage of success and error paths

Files Modified

  1. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.module.ts

    • Added KillswitchService provider
    • Imported SpawnerModule, GitModule, ValkeyModule
    • Exported KillswitchService for use in controllers
  2. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents.controller.ts

    • Added POST /agents/:agentId/kill endpoint
    • Added POST /agents/kill-all endpoint
    • Integrated KillswitchService
  3. /home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents.module.ts

    • Imported KillswitchModule

Test Results

  • All 20 tests passing (13 service + 7 controller)
  • Killswitch service: 100% statement, function, and line coverage; 85% branch coverage
  • Error handling: Properly propagates errors from state transitions
  • Resilience: Continues cleanup even if Docker or worktree cleanup fails
  • Filtering: Only kills active agents (spawning/running states)
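
The filtering behavior for kill-all can be sketched as follows. Only the `spawning`, `running`, and `killed` states appear in this scratchpad; the other status values, the `AgentRecord` shape, and the `KillAllSummary` return type are assumptions for illustration. Continuing past a failed per-agent kill is also an assumption rather than confirmed behavior.

```typescript
// "completed" and "failed" are hypothetical; the scratchpad names only spawning, running, killed.
type AgentStatus = "spawning" | "running" | "completed" | "failed" | "killed";

interface AgentRecord {
  agentId: string;
  status: AgentStatus;
}

interface KillAllSummary {
  total: number; // active agents targeted
  killed: number;
  failed: number;
  errors: string[];
}

const ACTIVE_STATES: ReadonlySet<AgentStatus> = new Set<AgentStatus>(["spawning", "running"]);

// Only agents that are still spawning or running are eligible for a kill.
function selectActiveAgents(agents: readonly AgentRecord[]): AgentRecord[] {
  return agents.filter((agent) => ACTIVE_STATES.has(agent.status));
}

// Kills every active agent; one failure does not prevent the remaining kills.
async function killAllAgents(
  agents: readonly AgentRecord[],
  killOne: (agentId: string) => Promise<void>,
): Promise<KillAllSummary> {
  const active = selectActiveAgents(agents);
  const outcomes = await Promise.all(
    active.map(async (agent): Promise<string | null> => {
      try {
        await killOne(agent.agentId);
        return null;
      } catch (err) {
        return `${agent.agentId}: ${err instanceof Error ? err.message : String(err)}`;
      }
    }),
  );
  const errors = outcomes.filter((o): o is string => o !== null);
  return {
    total: active.length,
    killed: active.length - errors.length,
    failed: errors.length,
    errors,
  };
}
```

Here `killOne` would be the single-agent kill path, so the summary reports how many active agents were stopped and which ones failed.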