Implements FED-010: Agent Spawn via Federation feature that enables spawning and managing Claude agents on remote federated Mosaic Stack instances via COMMAND message type.

Features:
- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)

Architecture:
- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService → CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification

Files created:
- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts

Files modified:
- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)

Testing:
- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed

Refs #93

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
103 lines
3.7 KiB
Markdown
# Issue ORCH-117: Killswitch Implementation

## Objective

Implement emergency stop functionality that kills a single agent or all agents immediately, cleaning up Docker containers and git worktrees and updating agent state.

## Approach

1. Create KillswitchService with methods:
   - `killAgent(agentId)` - Kill single agent
   - `killAllAgents()` - Kill all active agents
2. Implement cleanup orchestration:
   - Immediate termination (SIGKILL)
   - Cleanup Docker containers (via DockerSandboxService)
   - Cleanup git worktrees (via WorktreeManagerService)
   - Update agent state to 'killed' (via AgentLifecycleService)
   - Audit trail logging
3. Add API endpoints to AgentsController:
   - POST /agents/:agentId/kill
   - POST /agents/kill-all
4. Follow TDD: write tests first, then implementation
5. Ensure test coverage >= 85%

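The cleanup orchestration in step 2 can be sketched as follows. This is a minimal illustration, not the actual KillswitchService: `KillswitchDeps`, `KillResult`, and the step names are hypothetical stand-ins for DockerSandboxService, WorktreeManagerService, and AgentLifecycleService, whose real signatures are not shown here. The step order simply follows the list above and may differ from the real implementation.

```typescript
// Hypothetical, simplified collaborators. The real services are injected
// NestJS providers and their method names may differ.
type Step = (agentId: string) => Promise<void>;

interface KillswitchDeps {
  terminate: Step;        // immediate termination (SIGKILL)
  cleanupContainer: Step; // stands in for DockerSandboxService
  cleanupWorktree: Step;  // stands in for WorktreeManagerService
  markKilled: Step;       // stands in for AgentLifecycleService
  audit: (message: string) => void;
}

interface KillResult {
  agentId: string;
  cleanupFailures: string[]; // names of cleanup steps that threw
}

async function killAgent(deps: KillswitchDeps, agentId: string): Promise<KillResult> {
  deps.audit(`kill requested: ${agentId}`);
  await deps.terminate(agentId);

  // Cleanup is best-effort: record and log a failure, then keep going.
  const cleanups: Array<[string, Step]> = [
    ["container", deps.cleanupContainer],
    ["worktree", deps.cleanupWorktree],
  ];
  const cleanupFailures: string[] = [];
  for (const [name, step] of cleanups) {
    try {
      await step(agentId);
    } catch (err) {
      cleanupFailures.push(name);
      deps.audit(`cleanup failed (${name}) for ${agentId}: ${String(err)}`);
    }
  }

  // State transition errors are not swallowed; they propagate to the caller.
  await deps.markKilled(agentId);
  deps.audit(`killed: ${agentId}`);
  return { agentId, cleanupFailures };
}

// Kills every agent in the given list concurrently. If any state transition
// rejects, the returned promise rejects as well.
async function killAllAgents(deps: KillswitchDeps, agentIds: string[]): Promise<KillResult[]> {
  return Promise.all(agentIds.map((id) => killAgent(deps, id)));
}
```

Wrapping only the cleanup steps in `try`/`catch` keeps a missing container or worktree from blocking the kill, while a rejected state transition still surfaces as an error.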
## Progress

- [x] Read ORCH-117 requirements
- [x] Understand existing service interfaces
- [x] Create scratchpad
- [x] Write killswitch.service.spec.ts tests (13 tests)
- [x] Implement killswitch.service.ts
- [x] Add controller endpoints (POST /agents/:agentId/kill, POST /agents/kill-all)
- [x] Write controller tests (7 tests)
- [x] Update killswitch.module.ts
- [x] Verify test coverage (100% statements, 85% branches, 100% functions)
- [x] Create Gitea issue
- [x] Close Gitea issue

## Testing

Following TDD (Red-Green-Refactor):

1. RED: Write failing tests for killswitch functionality
2. GREEN: Implement minimal code to pass tests
3. REFACTOR: Clean up implementation

Test coverage areas:

- Single agent kill with successful cleanup
- Kill all agents
- Error handling for non-existent agents
- Partial cleanup failures (Docker but not worktree)
- Audit logging verification

## Notes

- Killswitch bypasses all queues - must respond within seconds
- Cleanup should be best-effort (log failures but continue)
- State transition to 'killed' enforced by AgentLifecycleService
- Need to handle agents in different states (spawning, running)
- Docker containers may not exist if sandbox is disabled

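The state handling in these notes can be illustrated with a small filter. This is a sketch under assumptions: `AgentRecord`, `selectKillable`, and `needsContainerCleanup` are hypothetical names, and only the 'spawning', 'running', and 'killed' states come from this document; the other states are placeholders.

```typescript
// Hypothetical agent record. Only "spawning", "running", and "killed" are
// named in the notes; "completed" and "failed" are assumed for illustration.
type AgentState = "spawning" | "running" | "completed" | "failed" | "killed";

interface AgentRecord {
  agentId: string;
  state: AgentState;
  containerId?: string; // assumed to be absent when the sandbox is disabled
}

const ACTIVE_STATES: ReadonlySet<AgentState> = new Set<AgentState>(["spawning", "running"]);

// Returns only the agents a kill-all should target.
function selectKillable(agents: readonly AgentRecord[]): AgentRecord[] {
  return agents.filter((agent) => ACTIVE_STATES.has(agent.state));
}

// Container cleanup is skipped when no container was ever created.
function needsContainerCleanup(agent: AgentRecord): boolean {
  return agent.containerId !== undefined;
}
```

Keeping the active states in one set means the kill-all filter and any single-agent validation can share the same definition.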
## Implementation Summary

### Files Created

1. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.ts`
   - `killAgent(agentId)` - Kill single agent with full cleanup
   - `killAllAgents()` - Kill all active agents
   - Best-effort cleanup: Docker containers, git worktrees
   - Audit trail logging for all killswitch operations

2. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.spec.ts`
   - 13 comprehensive tests covering all scenarios
   - 100% code coverage (statements, functions, lines)
   - 85% branch coverage

3. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents-killswitch.controller.spec.ts`
   - 7 controller tests for killswitch endpoints
   - Full coverage of success and error paths

### Files Modified

1. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.module.ts`
   - Added KillswitchService provider
   - Imported SpawnerModule, GitModule, ValkeyModule
   - Exported KillswitchService for use in controllers

2. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents.controller.ts`
   - Added POST /agents/:agentId/kill endpoint
   - Added POST /agents/kill-all endpoint
   - Integrated KillswitchService

3. `/home/localadmin/src/mosaic-stack/apps/orchestrator/src/api/agents/agents.module.ts`
   - Imported KillswitchModule

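For callers, the two endpoints added to AgentsController can be addressed as in the sketch below. The helper names, the `KillRequest` shape, and the base URL are hypothetical; only the HTTP method and the two paths come from this document, and request bodies and response shapes are not specified here.

```typescript
// Hypothetical request descriptor for the killswitch endpoints.
interface KillRequest {
  method: "POST";
  url: string;
}

// Strips trailing slashes so paths can be appended safely.
function normalizeBase(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// POST /agents/:agentId/kill
function killAgentRequest(baseUrl: string, agentId: string): KillRequest {
  return {
    method: "POST",
    url: `${normalizeBase(baseUrl)}/agents/${encodeURIComponent(agentId)}/kill`,
  };
}

// POST /agents/kill-all
function killAllRequest(baseUrl: string): KillRequest {
  return { method: "POST", url: `${normalizeBase(baseUrl)}/agents/kill-all` };
}
```

A descriptor can then be passed to any HTTP client, for example `fetch(req.url, { method: req.method })`, with the orchestrator's actual base URL substituted for the placeholder.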
### Test Results

- All 20 tests passing (13 service + 7 controller)
- Killswitch service: 100% statement, function, and line coverage; 85% branch coverage
- Error handling: Properly propagates errors from state transitions
- Resilience: Continues cleanup even if Docker or worktree cleanup fails
- Filtering: Only kills active agents (spawning/running states)