# PRD: MS22 — Fleet Evolution (DB-Centric Agent Architecture)

## Metadata

- Owner: Jason Woltje
- Date: 2026-03-01
- Status: in-progress
- Design Doc: `docs/design/MS22-DB-CENTRIC-ARCHITECTURE.md`

## Problem Statement

Mosaic Stack needs a multi-user agent fleet where each user gets their own isolated OpenClaw instance with their own LLM provider credentials and agent config. The system must be Docker-first with minimal environment variables and all configuration managed through the WebUI.

## Objectives

1. **Minimal bootstrap** — 2 env vars (`DATABASE_URL`, `MOSAIC_SECRET_KEY`) to start the entire stack
2. **DB-centric config** — All runtime config in Postgres, managed via WebUI
3. **Per-user isolation** — Each user gets their own OpenClaw container with own API keys, memory, sessions
4. **Onboarding wizard** — First-boot experience: breakglass admin → OIDC → LLM provider → agent config
5. **Settings UI** — Runtime management of providers, agents, and auth config
6. **Mosaic as gatekeeper** — Users never talk to OpenClaw directly; Mosaic proxies all requests
7. **Zero cross-user access** — Full container, volume, and DB isolation between users

## Security Requirements

- User A cannot access User B's API keys, chat history, or agent memory
- All API keys stored encrypted (AES-256-GCM) in database
- Breakglass admin always works as OIDC fallback
- OIDC config stored in DB (not env vars) — configured via settings UI
- Container-to-container communication blocked by default
- Admin cannot decrypt other users' API keys

## Phase 0: Knowledge Layer — COMPLETE

- Findings API (pgvector, CRUD, similarity search)
- AgentMemory API (key/value store)
- ConversationArchive API (pgvector, ingest, search)
- OpenClaw mosaic skill
- Session log ingestion pipeline

## Phase 1: DB-Centric Agent Fleet

### Phase 1a: DB Schema — COMPLETE

- SystemConfig, BreakglassUser, LlmProvider, UserContainer, SystemContainer, UserAgentConfig tables

### Phase 1b: Encryption Service — COMPLETE

- CryptoService (AES-256-GCM using `MOSAIC_SECRET_KEY`)

### Phase 1c: Internal Config API

- `GET /api/internal/agent-config/:id` — assembles `openclaw.json` from DB
- Auth: bearer token (container's own gateway token)
- Returns complete `openclaw.json` with decrypted provider credentials

### Phase 1d: Container Lifecycle Manager

- Docker API integration via `dockerode` npm package
- Start/stop/health-check/reap user containers
- Auto-generate gateway tokens, assign ports
- Docker socket access required (`/var/run/docker.sock`)

### Phase 1e: Onboarding API

- First-boot detection (`SystemConfig.onboarding.completed`)
- `POST /api/onboarding/breakglass` — create admin user
- `POST /api/onboarding/oidc` — save OIDC provider config
- `POST /api/onboarding/provider` — add LLM provider + test connection
- `POST /api/onboarding/complete` — mark done

### Phase 1f: Onboarding Wizard UI

- Multi-step wizard component
- Skip-able OIDC step
- LLM provider connection test

### Phase 1g: Settings API

- CRUD: LLM providers (per-user scoped)
- CRUD: Agent config (model assignments, personalities)
- CRUD: OIDC config (admin only)
- Breakglass password reset (admin only)

### Phase 1h: Settings UI

- Settings/Providers page
- Settings/Agent Config page
- Settings/Auth page (OIDC + breakglass)

### Phase 1i: Chat Proxy

- Route WebUI chat to user's OpenClaw container
- SSE streaming pass-through
- Ensure container is running before proxying (auto-start)

### Phase 1j: Docker Compose + Entrypoint

- Simplified compose (core services only — user containers are dynamic)
- Entrypoint: fetch config from API, write `openclaw.json`, start gateway
- Health check integration

### Phase 1k: Idle Reaper

- Cron job to stop inactive user containers
- Configurable idle timeout (default 30min)
- Preserve state volumes

## Future Phases (out of scope)

- Phase 2: Agent fleet standup (predefined agent roles)
- Phase 3: WebUI chat + task management integration
- Phase 4: Multi-LLM provider management UI (advanced)
- Team workspaces (shared agent contexts) — explicitly out of scope
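
## Appendix: Encryption Sketch (Phase 1b)

The Security Requirements and Phase 1b call for API keys to be stored encrypted with AES-256-GCM using a key tied to `MOSAIC_SECRET_KEY`. The sketch below illustrates one way such a round trip could look with Node's built-in `crypto` module. It is not the actual CryptoService: the function names, the scrypt-based key derivation, the fixed salt string, and the `iv.tag.ciphertext` storage format are all assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Assumption: derive a 32-byte key from the secret with scrypt and a fixed,
// hypothetical salt. The PRD only states that MOSAIC_SECRET_KEY is used.
function deriveKey(secret: string): Buffer {
  return scryptSync(secret, "mosaic-crypto-sketch", 32);
}

// Encrypts a UTF-8 string and returns "iv.tag.ciphertext", each part base64.
// The storage format is an assumption, not the documented one.
function encrypt(plaintext: string, secret: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the size recommended for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // only available after final()
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encrypt(). Throws if the payload is malformed, was tampered with,
// or was encrypted under a different secret.
function decrypt(payload: string, secret: string): string {
  const parts = payload.split(".");
  if (parts.length !== 3) {
    throw new Error("malformed payload");
  }
  const [iv, tag, ciphertext] = parts.map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Under these assumptions, `decrypt(encrypt(apiKey, secret), secret)` returns the original key, while decrypting with a different secret fails the GCM authentication check and throws. A single shared secret does not by itself satisfy "Admin cannot decrypt other users' API keys"; that requirement would need additional per-user key material, which this sketch does not attempt.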