Mosaic Orchestrator
Agent orchestration service for Mosaic Stack built with NestJS.
Overview
The Orchestrator is the execution plane of Mosaic Stack, responsible for:
- Spawning and managing Claude agents (worker, reviewer, tester)
- Task queue management via BullMQ with Valkey backend
- Agent lifecycle state machine (spawning → running → completed/failed/killed)
- Git workflow automation with worktree isolation per agent
- Quality gate enforcement via Coordinator integration
- Killswitch emergency stop with cleanup
- Docker sandbox isolation (optional)
- Secret scanning on agent commits
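The lifecycle listed above can be pictured as a small transition table. This is only a sketch: the status names come from this README, but the exact set of allowed edges (for example, whether an agent may go straight from spawning to failed or killed) is an assumption, and `canTransition` and `transition` are hypothetical helper names rather than the service's actual API.

```typescript
type AgentStatus = "spawning" | "running" | "completed" | "failed" | "killed";

// Assumed transition table: terminal states have no outgoing edges.
// The real AgentLifecycleService may allow a different set of edges.
const TRANSITIONS: Record<AgentStatus, readonly AgentStatus[]> = {
  spawning: ["running", "failed", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [],
  failed: [],
  killed: [],
};

// Returns true when moving from `from` to `to` is allowed by the table.
function canTransition(from: AgentStatus, to: AgentStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns the new status, or throws when the move is not allowed.
function transition(from: AgentStatus, to: AgentStatus): AgentStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the edges in one table makes it easy to reject illegal moves (such as completed back to running) in a single place.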
Architecture
AppModule
├── HealthModule → GET /health, GET /health/ready
├── AgentsModule → POST /agents/spawn, GET /agents/:agentId/status, kill endpoints
│ ├── QueueModule → BullMQ task queue (priority 1-10, retry with backoff)
│ ├── SpawnerModule → Agent session management, Docker sandbox, lifecycle FSM
│ ├── KillswitchModule → Emergency kill + cleanup (Docker, worktree, Valkey state)
│ └── ValkeyModule → Distributed state persistence and pub/sub events
├── CoordinatorModule → Quality gate checks (typecheck, lint, tests, coverage, AI review)
├── GitModule → Clone, branch, commit, push, conflict detection, secret scanning
└── MonitorModule → Agent health monitoring (placeholder)
Part of the Mosaic Stack monorepo at apps/orchestrator/.
Controlled by apps/coordinator/ (Quality Coordinator).
Monitored via apps/web/ (Agent Dashboard).
API Reference
Health
| Method | Path | Description |
|---|---|---|
| GET | /health | Uptime and status |
| GET | /health/ready | Readiness check |
Agents
| Method | Path | Description |
|---|---|---|
| POST | /agents/spawn | Spawn a new agent |
| GET | /agents/:agentId/status | Get agent status |
| POST | /agents/:agentId/kill | Kill a single agent |
| POST | /agents/kill-all | Kill all active agents |
POST /agents/spawn
{
"taskId": "string (required)",
"agentType": "worker | reviewer | tester",
"gateProfile": "strict | standard | minimal | custom (optional)",
"context": {
"repository": "https://git.example.com/repo.git",
"branch": "main",
"workItems": ["US-001"],
"skills": ["typescript"]
}
}
Response:
{
"agentId": "uuid",
"status": "spawning"
}
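A client might want to type and sanity-check the spawn payload before sending it. The sketch below mirrors the request shape shown above. It assumes that agentType is required (the README only marks taskId as required and gateProfile as optional) and leaves context unchecked; `validateSpawnRequest` is a hypothetical helper, not part of the service.

```typescript
type AgentType = "worker" | "reviewer" | "tester";
type GateProfile = "strict" | "standard" | "minimal" | "custom";

// Shape of the POST /agents/spawn body as documented above.
interface SpawnAgentRequest {
  taskId: string;
  agentType: AgentType;
  gateProfile?: GateProfile;
  context: {
    repository: string;
    branch: string;
    workItems: string[];
    skills: string[];
  };
}

const AGENT_TYPES: readonly AgentType[] = ["worker", "reviewer", "tester"];
const GATE_PROFILES: readonly GateProfile[] = ["strict", "standard", "minimal", "custom"];

// Returns a list of problems; an empty list means the top-level fields look valid.
function validateSpawnRequest(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return ["body must be a JSON object"];
  }
  const req = body as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof req.taskId !== "string" || req.taskId.length === 0) {
    errors.push("taskId is required");
  }
  // Assumption: agentType is required and must be one of the three known types.
  if (!AGENT_TYPES.includes(req.agentType as AgentType)) {
    errors.push("agentType must be worker, reviewer, or tester");
  }
  if (req.gateProfile !== undefined && !GATE_PROFILES.includes(req.gateProfile as GateProfile)) {
    errors.push("gateProfile must be strict, standard, minimal, or custom");
  }
  return errors;
}
```

A payload that passes this check can be cast to `SpawnAgentRequest` and posted to /agents/spawn with any HTTP client.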
GET /agents/:agentId/status
Response:
{
"agentId": "uuid",
"taskId": "string",
"status": "spawning | running | completed | failed | killed",
"spawnedAt": "ISO timestamp",
"startedAt": "ISO timestamp (optional)",
"completedAt": "ISO timestamp (optional)",
"error": "string (optional)"
}
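Since spawning is asynchronous, callers will usually poll the status endpoint until the agent reaches a terminal state. A minimal sketch follows, assuming completed, failed, and killed are the terminal states. `waitForAgent` is a hypothetical helper; the `getStatus` callback is expected to wrap a GET /agents/:agentId/status call made with whatever HTTP client you use, and the interval and attempt limits are arbitrary defaults.

```typescript
type AgentStatus = "spawning" | "running" | "completed" | "failed" | "killed";

// Response shape of GET /agents/:agentId/status as documented above.
interface AgentStatusResponse {
  agentId: string;
  taskId: string;
  status: AgentStatus;
  spawnedAt: string;
  startedAt?: string;
  completedAt?: string;
  error?: string;
}

// Assumption: these three states are final and never change again.
const TERMINAL_STATES: ReadonlySet<AgentStatus> = new Set<AgentStatus>([
  "completed",
  "failed",
  "killed",
]);

function isTerminal(status: AgentStatus): boolean {
  return TERMINAL_STATES.has(status);
}

// Polls `getStatus` until the agent is in a terminal state or attempts run out.
async function waitForAgent(
  getStatus: () => Promise<AgentStatusResponse>,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<AgentStatusResponse> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await getStatus();
    if (isTerminal(current.status)) {
      return current;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Agent did not reach a terminal state after ${maxAttempts} polls`);
}
```

Injecting `getStatus` keeps the polling loop independent of the HTTP layer, which also makes it easy to unit test with a fake.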
POST /agents/kill-all
Response:
{
"message": "Kill all completed: 3 killed, 0 failed",
"total": 3,
"killed": 3,
"failed": 0,
"errors": []
}
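The kill-all response is an aggregate over per-agent kill attempts. The sketch below shows one way such a summary could be assembled. The `KillResult` input type and the `summarizeKillAll` helper are hypothetical, and the element type of errors (plain strings here) is an assumption, since the example response above only shows an empty array.

```typescript
// Response shape of POST /agents/kill-all as documented above.
interface KillAllResponse {
  message: string;
  total: number;
  killed: number;
  failed: number;
  errors: string[]; // assumption: one message per failed kill
}

// Hypothetical per-agent outcome used as input to the summary.
interface KillResult {
  agentId: string;
  ok: boolean;
  error?: string;
}

// Folds individual kill outcomes into the documented response shape.
function summarizeKillAll(results: KillResult[]): KillAllResponse {
  const failures = results.filter((result) => !result.ok);
  const killed = results.length - failures.length;
  return {
    message: `Kill all completed: ${killed} killed, ${failures.length} failed`,
    total: results.length,
    killed,
    failed: failures.length,
    errors: failures.map((result) => result.error ?? `Failed to kill agent ${result.agentId}`),
  };
}
```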
Services
| Service | Module | Responsibility |
|---|---|---|
| AgentSpawnerService | Spawner | Create agent sessions, generate UUIDs, track state |
| AgentLifecycleService | Spawner | State machine transitions with Valkey pub/sub events |
| DockerSandboxService | Spawner | Container creation with memory/CPU limits |
| QueueService | Queue | BullMQ priority queue with exponential backoff retry |
| KillswitchService | Killswitch | Emergency agent termination with audit logging |
| CleanupService | Killswitch | Multi-step cleanup (Docker, worktree, Valkey state) |
| GitOperationsService | Git | Clone, branch, commit, push operations |
| WorktreeManagerService | Git | Per-agent worktree isolation |
| ConflictDetectionService | Git | Merge conflict detection before push |
| SecretScannerService | Git | Detect hardcoded secrets (AWS, API keys, JWTs, etc.) |
| ValkeyService | Valkey | Distributed state and event pub/sub |
| CoordinatorClientService | Coordinator | HTTP client for quality gate API with retry |
| QualityGatesService | Coordinator | Pre-commit and post-commit gate evaluation |
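To make the "priority 1-10, retry with exponential backoff" behaviour of the queue more concrete, here is a rough sketch of how job options could be derived. The field names (priority, attempts, backoff) mirror BullMQ's job options, but the attempt count, the base delay, and the doubling schedule are assumptions, and the helper names are hypothetical; check QueueService for the real values.

```typescript
// Plain object in the shape of BullMQ job options; nothing is imported from bullmq here.
interface RetryJobOptions {
  priority: number;
  attempts: number;
  backoff: { type: "exponential"; delay: number };
}

// Clamps an arbitrary number into the documented 1-10 priority range.
function clampPriority(priority: number): number {
  return Math.min(10, Math.max(1, Math.round(priority)));
}

// Illustrative doubling schedule: base, 2x base, 4x base, ... for attempt 1, 2, 3, ...
// The delay the queue library actually applies may differ.
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Assumed defaults: 3 attempts with a 1000 ms base delay.
function buildJobOptions(priority: number, attempts = 3, baseDelayMs = 1000): RetryJobOptions {
  return {
    priority: clampPriority(priority),
    attempts,
    backoff: { type: "exponential", delay: baseDelayMs },
  };
}
```

With a 1000 ms base delay this schedule waits 1000 ms, 2000 ms, then 4000 ms between retries.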
Valkey State Keys
orchestrator:task:{taskId} → TaskState (status, agentId, context, timestamps)
orchestrator:agent:{agentId} → AgentState (status, taskId, timestamps, error)
orchestrator:events → Pub/sub channel for lifecycle events
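Centralising the key layout in small helpers avoids typos when reading or writing state. The key formats below are taken from the list above; the helper names themselves are hypothetical.

```typescript
const KEY_PREFIX = "orchestrator";

// Pub/sub channel for lifecycle events.
const EVENTS_CHANNEL = `${KEY_PREFIX}:events`;

// Key under which the TaskState for a task is stored.
function taskKey(taskId: string): string {
  return `${KEY_PREFIX}:task:${taskId}`;
}

// Key under which the AgentState for an agent is stored.
function agentKey(agentId: string): string {
  return `${KEY_PREFIX}:agent:${agentId}`;
}
```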
Quality Gate Profiles
| Profile | Default For | Gates |
|---|---|---|
| strict | reviewer | typecheck, lint, tests, coverage (85%), build, integration, AI review |
| standard | worker | typecheck, lint, tests, coverage (85%) |
| minimal | tester | tests only |
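The table above can be read as two lookups: agent type to default profile, then profile to gates. A sketch under some assumptions: the gate identifiers (for example "ai-review") are made-up labels, `resolveGates` is a hypothetical helper, and because this README does not describe how a custom profile is configured, the custom gates are simply passed in by the caller.

```typescript
type AgentType = "worker" | "reviewer" | "tester";
type GateProfile = "strict" | "standard" | "minimal" | "custom";
type Gate = "typecheck" | "lint" | "tests" | "coverage" | "build" | "integration" | "ai-review";

// Default profile per agent type, as listed in the table above.
const DEFAULT_PROFILE: Record<AgentType, Exclude<GateProfile, "custom">> = {
  reviewer: "strict",
  worker: "standard",
  tester: "minimal",
};

// Gates per built-in profile, as listed in the table above.
const PROFILE_GATES: Record<Exclude<GateProfile, "custom">, readonly Gate[]> = {
  strict: ["typecheck", "lint", "tests", "coverage", "build", "integration", "ai-review"],
  standard: ["typecheck", "lint", "tests", "coverage"],
  minimal: ["tests"],
};

// Picks the explicit profile if given, otherwise the default for the agent type.
// Assumption: a custom profile is resolved from caller-supplied gates.
function resolveGates(
  agentType: AgentType,
  gateProfile?: GateProfile,
  customGates: readonly Gate[] = [],
): readonly Gate[] {
  const profile = gateProfile ?? DEFAULT_PROFILE[agentType];
  if (profile === "custom") {
    return customGates;
  }
  return PROFILE_GATES[profile];
}
```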
Development
# Install dependencies (from monorepo root)
pnpm install
# Run in dev mode
pnpm --filter @mosaic/orchestrator dev
# Build
pnpm --filter @mosaic/orchestrator build
# Run unit tests
pnpm --filter @mosaic/orchestrator test
# Run E2E/integration tests
pnpm --filter @mosaic/orchestrator test:e2e
# Type check
pnpm --filter @mosaic/orchestrator typecheck
# Lint
pnpm --filter @mosaic/orchestrator lint
Testing
- Unit tests: Co-located *.spec.ts files (19 test files, 447+ tests)
- Integration tests: tests/integration/*.e2e-spec.ts (17 E2E tests)
- Coverage threshold: 85% (lines, functions, branches, statements)
Configuration
Environment variables loaded via @nestjs/config. Key variables:
| Variable | Description |
|---|---|
| ORCHESTRATOR_PORT | HTTP port (default: 3001) |
| CLAUDE_API_KEY | Claude API key for agents |
| VALKEY_HOST | Valkey/Redis host (default: localhost) |
| VALKEY_PORT | Valkey/Redis port (default: 6379) |
| COORDINATOR_URL | Quality Coordinator base URL |
| SANDBOX_ENABLED | Enable Docker sandbox (true/false) |
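For illustration, the variables above could be mapped into a typed config object roughly as follows. This is a sketch rather than the service's actual configuration code: `loadConfig` and `parsePort` are hypothetical helpers, the defaults for port, host, and Valkey port come from the table, and treating an unset SANDBOX_ENABLED as false is an assumption.

```typescript
interface OrchestratorConfig {
  port: number;
  claudeApiKey: string | undefined;
  valkeyHost: string;
  valkeyPort: number;
  coordinatorUrl: string | undefined;
  sandboxEnabled: boolean;
}

// Parses a port from an env value, falling back to the documented default when unset.
function parsePort(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === "") {
    return fallback;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be a valid port, got "${raw}"`);
  }
  return port;
}

// Builds the config from an env-like map (for example process.env).
function loadConfig(env: Record<string, string | undefined>): OrchestratorConfig {
  return {
    port: parsePort(env["ORCHESTRATOR_PORT"], 3001, "ORCHESTRATOR_PORT"),
    claudeApiKey: env["CLAUDE_API_KEY"],
    valkeyHost: env["VALKEY_HOST"] || "localhost",
    valkeyPort: parsePort(env["VALKEY_PORT"], 6379, "VALKEY_PORT"),
    coordinatorUrl: env["COORDINATOR_URL"],
    // Assumption: the sandbox is off unless explicitly set to "true".
    sandboxEnabled: env["SANDBOX_ENABLED"] === "true",
  };
}
```

In a Node process this would typically be called as `loadConfig(process.env)`; within NestJS the same values are available through @nestjs/config.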
Related Documentation
- Design: docs/design/agent-orchestration.md
- Setup: docs/ORCHESTRATOR-MONOREPO-SETUP.md
- Milestone: M6-AgentOrchestration (0.0.6)