Issue ORCH-118: Resource cleanup
Objective
Create a dedicated CleanupService that handles resource cleanup when agents terminate (completion, failure, or killswitch). Extract cleanup logic from KillswitchService into a reusable service with proper event emission.
Approach
- Create CleanupService in src/killswitch/cleanup.service.ts
- Extract cleanup logic from KillswitchService.performCleanup()
- Add event emission for cleanup operations
- Integrate with existing services (DockerSandboxService, WorktreeManagerService, ValkeyService)
- Update KillswitchService to use CleanupService
- Write comprehensive unit tests following TDD
Acceptance Criteria
- src/killswitch/cleanup.service.ts implemented
- Stop Docker container
- Remove Docker container
- Remove git worktree
- Clear Valkey state
- Emit cleanup event
- Run cleanup on: agent completion, agent failure, killswitch
- NestJS service with proper dependency injection
- Comprehensive unit tests with ≥85% coverage
Progress
- Read ORCH-118 requirements
- Analyze existing KillswitchService implementation
- Understand event system (Valkey pub/sub)
- Create scratchpad
- Write tests for CleanupService (TDD - RED)
- Implement CleanupService (TDD - GREEN)
- Refactor KillswitchService to use CleanupService
- Update KillswitchModule with CleanupService
- Run tests - all 25 tests pass (10 cleanup, 8 killswitch, 7 controller)
- Add agent.cleanup event type to events.types.ts
- Create Gitea issue #253
- Close Gitea issue with completion notes
Testing
Test Scenarios
- Successful cleanup: All resources cleaned up successfully
- Docker cleanup failure: Continue to other cleanup steps
- Worktree cleanup failure: Continue to other cleanup steps
- Missing containerId: Skip Docker cleanup
- Missing repository: Skip worktree cleanup
- Docker disabled: Skip Docker cleanup
- Event emission: Verify cleanup event published
- Valkey state clearing: Verify agent state deleted
Technical Notes
- CleanupService should be reusable by KillswitchService, lifecycle service, etc.
- Best-effort cleanup: log errors but continue with other cleanup steps
- Event emission: Use agent.cleanup event type (need to add to EventType)
- Valkey state: Use deleteAgentState() to clear state after cleanup
- Integration: Service should be injectable and testable
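The best-effort rule in these notes can be sketched as a small helper that wraps each cleanup step, logs any error, and reports success as a boolean. This is a minimal sketch under assumptions, not the actual CleanupService code: the CleanupStep type and the runStep and bestEffortCleanup functions are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch of best-effort cleanup; all names here are illustrative.
type CleanupStep = { name: string; run: () => Promise<void> };

// Runs one step; returns true on success, false (after logging) on failure.
async function runStep(
  step: CleanupStep,
  log: (msg: string) => void,
): Promise<boolean> {
  try {
    await step.run();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`cleanup step "${step.name}" failed: ${reason}`);
    return false;
  }
}

// Runs every step in order; a failure never prevents later steps from running.
async function bestEffortCleanup(
  steps: CleanupStep[],
  log: (msg: string) => void = console.error,
): Promise<Record<string, boolean>> {
  const results: Record<string, boolean> = {};
  for (const step of steps) {
    results[step.name] = await runStep(step, log);
  }
  return results;
}
```

Returning a per-step boolean map, rather than throwing, keeps the caller free to publish the outcome of every step even when some of them failed.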
Dependencies
- DockerSandboxService (container cleanup)
- WorktreeManagerService (git worktree cleanup)
- ValkeyService (state management + event emission)
Event Structure
{
  type: 'agent.cleanup',
  agentId: string,
  taskId: string,
  timestamp: string,
  cleanup: {
    docker: boolean,
    worktree: boolean,
    state: boolean
  }
}
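For reference, the structure above could be written as a TypeScript interface with a small builder. This is only a sketch: the AgentCleanupEvent and buildCleanupEvent names are hypothetical, and the ISO 8601 timestamp format is an assumption, since the notes only specify that timestamp is a string.

```typescript
// Shape taken from the Event Structure notes; the names are hypothetical.
interface AgentCleanupEvent {
  type: 'agent.cleanup';
  agentId: string;
  taskId: string;
  timestamp: string; // assumed ISO 8601; the notes only say "string"
  cleanup: {
    docker: boolean;
    worktree: boolean;
    state: boolean;
  };
}

// Builds an event from per-step results; `now` is injectable for testing.
function buildCleanupEvent(
  agentId: string,
  taskId: string,
  cleanup: AgentCleanupEvent['cleanup'],
  now: Date = new Date(),
): AgentCleanupEvent {
  return {
    type: 'agent.cleanup',
    agentId,
    taskId,
    timestamp: now.toISOString(),
    cleanup: { ...cleanup }, // copy so later mutation does not alter the event
  };
}
```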
Completion Summary
Issue: #253 [ORCH-118] Resource cleanup
Status: CLOSED ✓
Implementation Details
Created a dedicated CleanupService that provides reusable agent resource cleanup with the following features:
- Best-effort cleanup strategy - Continues even if individual steps fail
- Comprehensive logging - Logs each step and any errors
- Event emission - Publishes cleanup events with detailed status
- Service integration - Properly integrated via NestJS dependency injection
- Reusability - Can be used by KillswitchService, lifecycle service, or any other service
Files Created
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/cleanup.service.ts (135 lines)
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/cleanup.service.spec.ts (386 lines, 10 tests)
Files Modified
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.ts - Refactored to use CleanupService
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.service.spec.ts - Updated tests
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/killswitch/killswitch.module.ts - Added CleanupService provider/export
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/valkey/types/events.types.ts - Added agent.cleanup event type
Test Results
✓ All 25 tests pass
- 10 CleanupService tests (comprehensive coverage)
- 8 KillswitchService tests (refactored)
- 7 Controller tests (API endpoints)
Cleanup Flow
- Docker container (stop and remove) - skipped if no containerId or sandbox disabled
- Git worktree (remove) - skipped if no repository
- Valkey state (delete agent state) - always attempted
- Event emission (agent.cleanup with results) - always attempted
Each step is independent and runs even if previous steps fail.
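The skip rules in this flow can be sketched as a pure function that decides which steps would be attempted. This is an illustrative sketch, not the service's real code: AgentResources, StepName, and planCleanup are hypothetical names, and the field names only mirror the wording of these notes.

```typescript
// Hypothetical inputs; field names mirror the notes, not the real types.
interface AgentResources {
  containerId?: string;
  repository?: string;
}

type StepName = 'docker' | 'worktree' | 'state' | 'event';

// Returns the steps that would be attempted, in order, applying the skip rules.
function planCleanup(agent: AgentResources, sandboxEnabled: boolean): StepName[] {
  const steps: StepName[] = [];
  // Docker: skipped if there is no containerId or the sandbox is disabled.
  if (agent.containerId && sandboxEnabled) steps.push('docker');
  // Worktree: skipped if there is no repository.
  if (agent.repository) steps.push('worktree');
  // State deletion and event emission are always attempted.
  steps.push('state', 'event');
  return steps;
}
```

Keeping the decision separate from the execution makes the skip conditions easy to unit test without mocking Docker, git, or Valkey.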