Implements FED-010: Agent Spawn via Federation feature that enables spawning and managing Claude agents on remote federated Mosaic Stack instances via COMMAND message type.
Features:
- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)
Architecture:
- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService → CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification
Files created:
- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts
Files modified:
- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)
Testing:
- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed
Refs #93
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4.2 KiB
Issue ORCH-109: Agent lifecycle management
Objective
Implement an agent lifecycle management service that manages state transitions through the agent lifecycle (spawning → running → completed/failed/killed).
Approach
Following TDD principles:
- Write failing tests first for all state transition scenarios
- Implement minimal code to make tests pass
- Refactor while keeping tests green
The service will:
- Enforce valid state transitions using a state machine
- Persist agent state changes to Valkey
- Emit pub/sub events on state changes
- Track agent metadata (startedAt, completedAt, error)
- Integrate with ValkeyService and AgentSpawnerService
Acceptance Criteria
- src/spawner/agent-lifecycle.service.ts implemented
- State transitions: spawning → running → completed/failed/killed
- State persisted in Valkey
- Events emitted on state changes (pub/sub)
- Agent metadata tracked (startedAt, completedAt, error)
- State machine enforces valid transitions only
- Comprehensive unit tests with ≥85% coverage
- Tests follow TDD (written first)
Implementation Details
State Machine
Valid transitions (from state.types.ts):
- spawning → running, failed, killed
- running → completed, failed, killed
- completed → (terminal state)
- failed → (terminal state)
- killed → (terminal state)
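The transitions listed above can be written as a lookup table. The following is a minimal sketch, not the contents of state.types.ts: the names VALID_TRANSITIONS, isValidTransition, and assertValidTransition are assumptions chosen for illustration, and the real file may structure this differently.

```typescript
// Lifecycle states named in this scratchpad.
type AgentState = "spawning" | "running" | "completed" | "failed" | "killed";

// Hypothetical transition table mirroring the list above.
// Terminal states map to an empty array, so nothing can follow them.
const VALID_TRANSITIONS: Record<AgentState, readonly AgentState[]> = {
  spawning: ["running", "failed", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [],
  failed: [],
  killed: [],
};

// Returns true only if `to` is listed as a successor of `from`.
function isValidTransition(from: AgentState, to: AgentState): boolean {
  return VALID_TRANSITIONS[from].includes(to);
}

// Throwing variant, for callers that treat an invalid transition as an error.
function assertValidTransition(from: AgentState, to: AgentState): void {
  if (!isValidTransition(from, to)) {
    throw new Error(`Invalid state transition: ${from} -> ${to}`);
  }
}
```

Keeping the table as data means that adding a state is a one-line change, and the Record type makes the compiler flag any state that has no entry.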
Key Methods
- transitionToRunning(agentId) - Move agent from spawning to running
- transitionToCompleted(agentId) - Mark agent as completed
- transitionToFailed(agentId, error) - Mark agent as failed with error
- transitionToKilled(agentId) - Mark agent as killed
- getAgentLifecycleState(agentId) - Get current lifecycle state
Events Emitted
- agent.running - When transitioning to running
- agent.completed - When transitioning to completed
- agent.failed - When transitioning to failed
- agent.killed - When transitioning to killed
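To show how the key methods, metadata tracking, and events fit together, here is a rough sketch under stated assumptions. InMemoryStore is a hypothetical stand-in for ValkeyService (a Map plus an array of published events), and AgentRecord and LifecycleSketch are illustrative names. The methods are synchronous for brevity, whereas a service backed by Valkey would presumably be async.

```typescript
type AgentState = "spawning" | "running" | "completed" | "failed" | "killed";

// Illustrative record shape; the field names follow the metadata listed in this scratchpad.
interface AgentRecord {
  agentId: string;
  status: AgentState;
  startedAt?: string;
  completedAt?: string;
  error?: string;
}

// Hypothetical stand-in for ValkeyService: state in a Map, events in an array.
class InMemoryStore {
  readonly records = new Map<string, AgentRecord>();
  readonly published: { type: string; agentId: string }[] = [];
}

const ALLOWED: Record<AgentState, readonly AgentState[]> = {
  spawning: ["running", "failed", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [],
  failed: [],
  killed: [],
};

class LifecycleSketch {
  private readonly store: InMemoryStore;

  constructor(store: InMemoryStore) {
    this.store = store;
  }

  // Shared path: look up, validate, stamp metadata, persist, publish.
  private transition(agentId: string, to: AgentState, error?: string): AgentRecord {
    const current = this.store.records.get(agentId);
    if (!current) throw new Error(`Agent ${agentId} not found`);
    if (!ALLOWED[current.status].includes(to)) {
      throw new Error(`Invalid state transition: ${current.status} -> ${to}`);
    }
    const now = new Date().toISOString();
    const next: AgentRecord = { ...current, status: to };
    // Timestamps are set only if absent (conditional timestamp setting).
    if (to === "running" && !next.startedAt) next.startedAt = now;
    if (to !== "running" && !next.completedAt) next.completedAt = now;
    if (error !== undefined) next.error = error;
    this.store.records.set(agentId, next);
    this.store.published.push({ type: `agent.${to}`, agentId });
    return next;
  }

  transitionToRunning(agentId: string): AgentRecord {
    return this.transition(agentId, "running");
  }
  transitionToCompleted(agentId: string): AgentRecord {
    return this.transition(agentId, "completed");
  }
  transitionToFailed(agentId: string, error: string): AgentRecord {
    return this.transition(agentId, "failed", error);
  }
  transitionToKilled(agentId: string): AgentRecord {
    return this.transition(agentId, "killed");
  }
}
```

Routing every public method through one private transition function keeps validation, persistence, and event emission in a single place, so a new terminal state would not need its own copy of that logic.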
Progress
- Read issue requirements
- Create scratchpad
- Write unit tests (TDD - RED phase)
- Implement service (TDD - GREEN phase)
- Refactor and add edge case tests
- Verify test coverage = 100%
- Add service to module exports
- Verify build passes
- Create Gitea issue
- Close Gitea issue with completion notes
Testing
Test coverage: 100% (28 tests)
Coverage areas:
- Valid state transitions (spawning→running→completed)
- Valid state transitions (spawning→failed, running→failed)
- Valid state transitions (spawning→killed, running→killed)
- Invalid state transitions (should throw errors)
- Event emission on state changes
- State persistence in Valkey
- Metadata tracking (timestamps, errors)
- Conditional timestamp setting (startedAt, completedAt)
- Agent not found error handling
- List operations
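One way to cover the valid and invalid transition areas exhaustively is to generate every (from, to) pair and label each as allowed or rejected. The sketch below illustrates that idea and is not taken from the actual spec file. EXPECTED and buildTransitionCases are hypothetical names, and self-transitions such as running → running are treated as rejected because the state machine does not list them.

```typescript
type AgentState = "spawning" | "running" | "completed" | "failed" | "killed";

const STATES: readonly AgentState[] = ["spawning", "running", "completed", "failed", "killed"];

// Expected successors per state, copied from the State Machine section.
const EXPECTED: Record<AgentState, readonly AgentState[]> = {
  spawning: ["running", "failed", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [],
  failed: [],
  killed: [],
};

interface TransitionCase {
  from: AgentState;
  to: AgentState;
  allowed: boolean;
}

// Builds the full 5 x 5 matrix of transition cases.
function buildTransitionCases(): TransitionCase[] {
  const cases: TransitionCase[] = [];
  for (const from of STATES) {
    for (const to of STATES) {
      cases.push({ from, to, allowed: EXPECTED[from].includes(to) });
    }
  }
  return cases;
}

const transitionCases = buildTransitionCases();
const allowedCases = transitionCases.filter((c) => c.allowed);
const rejectedCases = transitionCases.filter((c) => !c.allowed);
```

A test runner could then loop over allowedCases and expect each transition to succeed, and loop over rejectedCases and expect each to throw.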
Notes
- State transition validation logic already exists in state.types.ts
- ValkeyService provides state persistence and pub/sub
- AgentSpawnerService manages agent sessions in memory
- This service bridges the two by managing lifecycle + persistence
Completion Summary
Successfully implemented ORCH-109 following TDD principles:
Files Created
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/spawner/agent-lifecycle.service.ts - Main service implementation
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/spawner/agent-lifecycle.service.spec.ts - Comprehensive tests (28 tests, 100% coverage)
Files Modified
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/spawner/spawner.module.ts - Added service to module
- /home/localadmin/src/mosaic-stack/apps/orchestrator/src/spawner/index.ts - Exported service
Key Features Implemented
- State transition enforcement via state machine
- State persistence in Valkey
- Pub/sub event emission on state changes
- Metadata tracking (startedAt, completedAt, error)
- Comprehensive error handling
- 100% test coverage (28 tests)
Gitea Issue
- Created: #244
- Status: Closed
- URL: #244
Next Steps
This service is now ready for integration with:
- ORCH-117: Killswitch implementation (depends on this)
- ORCH-127: E2E test for concurrent agents (depends on this)