stack/docs/scratchpads/orch-107-valkey.md
Jason Woltje 12abdfe81d feat(#93): implement agent spawn via federation
2026-02-03 14:37:06 -06:00


Issue ORCH-107: Valkey client and state management

Objective

Implement a Valkey client and state management system for the orchestrator service, built on ioredis, that provides:

  • Connection management
  • State persistence for tasks and agents
  • Pub/sub for events (agent spawned, completed, failed)
  • Task and agent state machines

Acceptance Criteria

  • Create scratchpad document
  • src/valkey/client.ts with ioredis connection (implemented as src/valkey/valkey.client.ts, see File Structure)
  • State schema implemented (tasks, agents, queue)
  • Pub/sub for events (agent spawned, completed, failed)
  • Task state: pending, assigned, executing, completed, failed
  • Agent state: spawning, running, completed, failed, killed
  • Unit tests with ≥85% coverage (TDD approach); achieved 96.96% branch coverage
  • Configuration from environment variables

Approach

TDD Implementation Plan (Red-Green-Refactor)

  1. Phase 1: Valkey Client Foundation

    • Write tests for ValkeyClient connection management
    • Implement ValkeyClient with ioredis
    • Write tests for basic get/set/delete operations
    • Implement basic operations
  2. Phase 2: State Schema & Persistence

    • Write tests for task state persistence
    • Implement task state operations
    • Write tests for agent state persistence
    • Implement agent state operations
  3. Phase 3: Pub/Sub Events

    • Write tests for event publishing
    • Implement event publishing
    • Write tests for event subscription
    • Implement event subscription
  4. Phase 4: NestJS Service Integration

    • Write tests for ValkeyService
    • Implement ValkeyService with dependency injection
    • Update ValkeyModule with providers

State Schema Design

Task State:

interface TaskState {
  taskId: string;
  status: "pending" | "assigned" | "executing" | "completed" | "failed";
  agentId?: string;
  context: TaskContext; // TaskContext is not defined in this scratchpad
  createdAt: string; // ISO 8601 timestamp
  updatedAt: string; // ISO 8601 timestamp
  metadata?: Record<string, unknown>;
}

Agent State:

interface AgentState {
  agentId: string;
  status: "spawning" | "running" | "completed" | "failed" | "killed";
  taskId: string;
  startedAt?: string; // ISO 8601 timestamp
  completedAt?: string; // ISO 8601 timestamp
  error?: string;
  metadata?: Record<string, unknown>;
}
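
The status unions above imply a state machine, but this scratchpad does not list which transitions are allowed. A minimal sketch of how such a check could look follows. The transition tables and the helper names (canTaskTransition, canAgentTransition) are assumptions made for illustration, not the actual implementation.

```typescript
// Status unions copied from the TaskState and AgentState interfaces.
type TaskStatus = "pending" | "assigned" | "executing" | "completed" | "failed";
type AgentStatus = "spawning" | "running" | "completed" | "failed" | "killed";

// ASSUMPTION: illustrative transition tables. The scratchpad only says that
// transitions are enforced, not which ones are valid.
const TASK_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  pending: ["assigned", "failed"],
  assigned: ["executing", "failed"],
  executing: ["completed", "failed"],
  completed: [], // terminal
  failed: [], // terminal
};

const AGENT_TRANSITIONS: Record<AgentStatus, readonly AgentStatus[]> = {
  spawning: ["running", "failed", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [], // terminal
  failed: [], // terminal
  killed: [], // terminal
};

// Hypothetical helper: true when a task may move from `from` to `to` in one step.
function canTaskTransition(from: TaskStatus, to: TaskStatus): boolean {
  return TASK_TRANSITIONS[from].includes(to);
}

// Hypothetical helper: same check for agents.
function canAgentTransition(from: AgentStatus, to: AgentStatus): boolean {
  return AGENT_TRANSITIONS[from].includes(to);
}
```

Keeping the tables as plain data means a unit test can iterate over every (from, to) pair, which is one way to cover all transitions.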

Event Types:

type EventType =
  | "agent.spawned"
  | "agent.running"
  | "agent.completed"
  | "agent.failed"
  | "agent.killed"
  | "task.assigned"
  | "task.executing"
  | "task.completed"
  | "task.failed";
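
To make the event types concrete, here is a rough sketch of publishing one event as JSON on the orchestrator:events channel. The envelope shape (OrchestratorEvent), the EventPublisher interface, and the publishEvent/parseEvent helpers are hypothetical. The only thing assumed about the real client is a publish(channel, message) method returning a promise of the receiver count, which ioredis clients provide.

```typescript
type EventType =
  | "agent.spawned"
  | "agent.running"
  | "agent.completed"
  | "agent.failed"
  | "agent.killed"
  | "task.assigned"
  | "task.executing"
  | "task.completed"
  | "task.failed";

// ASSUMPTION: hypothetical event envelope; the real event interfaces live in
// types/events.types.ts and are not shown in this scratchpad.
interface OrchestratorEvent {
  type: EventType;
  timestamp: string; // ISO 8601
  agentId?: string;
  taskId?: string;
  error?: string;
}

// Minimal slice of a Redis-compatible client needed for publishing.
interface EventPublisher {
  publish(channel: string, message: string): Promise<number>;
}

const EVENTS_CHANNEL = "orchestrator:events";

// Stamps the event with the current time and publishes it as a JSON string.
async function publishEvent(
  publisher: EventPublisher,
  event: Omit<OrchestratorEvent, "timestamp">,
): Promise<OrchestratorEvent> {
  const full: OrchestratorEvent = { ...event, timestamp: new Date().toISOString() };
  await publisher.publish(EVENTS_CHANNEL, JSON.stringify(full));
  return full;
}

// Subscriber side: decode a message received on the channel.
// No schema validation is done here; a real handler would likely validate.
function parseEvent(message: string): OrchestratorEvent {
  return JSON.parse(message) as OrchestratorEvent;
}
```

Depending only on the small EventPublisher interface lets tests pass in a plain object instead of a live connection.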

File Structure

apps/orchestrator/src/valkey/
├── valkey.module.ts              # NestJS module (exists, needs update)
├── valkey.client.ts              # ioredis client wrapper (new)
├── valkey.client.spec.ts         # Client tests (new)
├── valkey.service.ts             # NestJS service (new)
├── valkey.service.spec.ts        # Service tests (new)
├── types/
│   ├── index.ts                  # Type exports (new)
│   ├── state.types.ts            # State interfaces (new)
│   └── events.types.ts           # Event interfaces (new)
└── index.ts                      # Public API exports (new)

Progress

Phase 1: Types and Interfaces

  • Create state.types.ts with TaskState and AgentState
  • Create events.types.ts with event interfaces
  • Create index.ts for type exports

Phase 2: Valkey Client (TDD)

  • Write ValkeyClient tests (connection, basic ops)
  • Implement ValkeyClient
  • Write state persistence tests
  • Implement state persistence methods

Phase 3: Pub/Sub (TDD)

  • Write pub/sub tests
  • Implement pub/sub methods

Phase 4: NestJS Service (TDD)

  • Write ValkeyService tests
  • Implement ValkeyService
  • Update ValkeyModule
  • Add configuration support for VALKEY_PASSWORD
  • Update .env.example with VALKEY_HOST and VALKEY_PASSWORD

Testing

  • Using vitest for unit tests
  • Mock ioredis using ioredis-mock or manual mocks
  • Target: ≥85% coverage
  • Run: pnpm test in apps/orchestrator
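
As an illustration of the manual-mock option, the sketch below uses a hand-written in-memory fake in place of a live connection. KeyValueClient, FakeValkey, saveJson, and loadJson are hypothetical names. The interface covers only a get/set/del slice that is meant to be compatible with the corresponding ioredis methods, and it is written without vitest so it stands alone.

```typescript
// Minimal slice of the key-value commands a client wrapper might need.
// ASSUMPTION: the real ValkeyClient may use more commands than these three.
interface KeyValueClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<"OK">;
  del(key: string): Promise<number>;
}

// Hand-written fake backed by a Map: no network and no Valkey server.
class FakeValkey implements KeyValueClient {
  private readonly store = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }

  async set(key: string, value: string): Promise<"OK"> {
    this.store.set(key, value);
    return "OK";
  }

  // Returns the number of keys removed, like the DEL command.
  async del(key: string): Promise<number> {
    return this.store.delete(key) ? 1 : 0;
  }
}

// Hypothetical code under test: JSON round trip through the client.
async function saveJson(client: KeyValueClient, key: string, value: unknown): Promise<void> {
  await client.set(key, JSON.stringify(value));
}

async function loadJson<T>(client: KeyValueClient, key: string): Promise<T | null> {
  const raw = await client.get(key);
  return raw === null ? null : (JSON.parse(raw) as T);
}
```

In a vitest suite, each test would construct a fresh FakeValkey so that state does not leak between cases.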

Summary

Implementation of ORCH-107 is complete, and all acceptance criteria have been met.

What Was Built

  1. State Management Types (types/state.types.ts, types/events.types.ts)

    • TaskState and AgentState interfaces
    • State transition validation
    • Event types for pub/sub
    • Full TypeScript type safety
  2. Valkey Client (valkey.client.ts)

    • ioredis connection management
    • Task state CRUD operations
    • Agent state CRUD operations
    • Pub/sub event system
    • State transition enforcement
    • Error handling
  3. NestJS Service (valkey.service.ts)

    • Dependency injection integration
    • Configuration management via ConfigService
    • Lifecycle management (onModuleDestroy)
    • Convenience methods for common operations
  4. Module Integration (valkey.module.ts)

    • Proper NestJS module setup
    • Service provider configuration
    • ConfigModule import
  5. Comprehensive Tests (45 tests, 96.96% branch coverage)

    • ValkeyClient unit tests (27 tests)
    • ValkeyService unit tests (18 tests)
    • All state transitions tested
    • Error handling tested
    • Pub/sub functionality tested
    • Edge cases covered

Configuration

Added environment variable support:

  • VALKEY_HOST - Valkey server host (default: localhost)
  • VALKEY_PORT - Valkey server port (default: 6379)
  • VALKEY_PASSWORD - Optional password for authentication
  • VALKEY_URL - Alternative connection string format
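
One possible way to turn these variables into connection options is sketched below. The function name resolveValkeyOptions and the precedence rule (VALKEY_URL wins when set) are assumptions. The actual service reads its values through ConfigService, and this scratchpad does not say which variable takes priority.

```typescript
interface ValkeyConnectionOptions {
  host: string;
  port: number;
  password?: string;
}

// ASSUMPTION: VALKEY_URL takes precedence over VALKEY_HOST / VALKEY_PORT.
function resolveValkeyOptions(
  env: Record<string, string | undefined>,
): ValkeyConnectionOptions {
  // Treat an empty password the same as an unset one.
  const password = env.VALKEY_PASSWORD || undefined;

  if (env.VALKEY_URL) {
    const url = new URL(env.VALKEY_URL); // e.g. redis://localhost:6379
    return {
      host: url.hostname || "localhost",
      port: url.port ? Number(url.port) : 6379,
      // A password embedded in the URL wins over VALKEY_PASSWORD.
      password: url.password ? decodeURIComponent(url.password) : password,
    };
  }

  const port = Number(env.VALKEY_PORT ?? "6379");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid VALKEY_PORT: ${env.VALKEY_PORT}`);
  }
  return { host: env.VALKEY_HOST ?? "localhost", port, password };
}

const defaults = resolveValkeyOptions({});
console.log(defaults.host, defaults.port); // localhost 6379
```

Taking the environment as a parameter rather than reading process.env directly keeps the function easy to test.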

Key Features

  • State Machines: Enforces valid state transitions for tasks and agents
  • Type Safety: Full TypeScript types with validation
  • Pub/Sub Events: Real-time event notifications for state changes
  • Modularity: Clean separation of concerns (client, service, module)
  • Testability: Fully mocked tests, no actual Valkey connection required
  • Configuration: Environment-based configuration via NestJS ConfigService

Next Steps

This implementation provides the foundation for:

  • ORCH-108: BullMQ task queue (uses Valkey for state persistence)
  • ORCH-109: Agent lifecycle management (uses state management)
  • Future orchestrator features that need state persistence

Notes

Environment Variables

From orchestrator.config.ts:

  • VALKEY_HOST (default: localhost)
  • VALKEY_PORT (default: 6379)
  • VALKEY_URL (default: redis://localhost:6379)
  • VALKEY_PASSWORD (optional, from .env.example)

Dependencies

  • ioredis: Already installed in package.json (^5.9.2)
  • @nestjs/config: Already installed
  • Configuration already set up in src/config/orchestrator.config.ts

Key Design Decisions

  1. Use ioredis for Valkey client (Redis-compatible)
  2. State keys pattern: orchestrator:{type}:{id}
    • Tasks: orchestrator:task:{taskId}
    • Agents: orchestrator:agent:{agentId}
  3. Pub/sub channel pattern: orchestrator:events
  4. All timestamps in ISO 8601 format
  5. State transitions enforced by state machine logic
  6. Mock ioredis in tests (no actual Valkey connection needed)
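
The key pattern and timestamp convention from decisions 2 and 4 can be captured in a few small helpers. The names below (taskKey, agentKey, nowIso, touch) are hypothetical and chosen for illustration only.

```typescript
const KEY_PREFIX = "orchestrator";

// Builds keys following the orchestrator:{type}:{id} pattern.
function taskKey(taskId: string): string {
  return `${KEY_PREFIX}:task:${taskId}`;
}

function agentKey(agentId: string): string {
  return `${KEY_PREFIX}:agent:${agentId}`;
}

// Date.prototype.toISOString always yields an ISO 8601 string in UTC.
function nowIso(): string {
  return new Date().toISOString();
}

// Returns a copy of the state with updatedAt refreshed; the input is not mutated.
function touch<T extends { updatedAt: string }>(state: T): T {
  return { ...state, updatedAt: nowIso() };
}

console.log(taskKey("t-1")); // orchestrator:task:t-1
console.log(agentKey("a-1")); // orchestrator:agent:a-1
```

Centralizing key construction avoids scattering the prefix through the client code, which matters if the pattern ever changes.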