From a3a0d7afcaacaa4db104552f7204a4777f5410e3 Mon Sep 17 00:00:00 2001
From: Jason Woltje
Date: Sun, 1 Mar 2026 15:05:35 +0000
Subject: [PATCH] chore(orchestrator): add MS22 PRD, mark P1a+P1b done (#608)

Co-authored-by: Jason Woltje
Co-committed-by: Jason Woltje
---
 docs/PRD-MS22.md | 114 +++++++++++++++++++++++++++++++++++++++++++++++
 docs/TASKS.md    |   4 +-
 2 files changed, 116 insertions(+), 2 deletions(-)
 create mode 100644 docs/PRD-MS22.md

diff --git a/docs/PRD-MS22.md b/docs/PRD-MS22.md
new file mode 100644
index 0000000..ad57375
--- /dev/null
+++ b/docs/PRD-MS22.md
@@ -0,0 +1,114 @@
+# PRD: MS22 — Fleet Evolution (DB-Centric Agent Architecture)
+
+## Metadata
+
+- Owner: Jason Woltje
+- Date: 2026-03-01
+- Status: in-progress
+- Design Doc: `docs/design/MS22-DB-CENTRIC-ARCHITECTURE.md`
+
+## Problem Statement
+
+Mosaic Stack needs a multi-user agent fleet where each user gets their own isolated OpenClaw instance with their own LLM provider credentials and agent config. The system must be Docker-first with minimal environment variables and all configuration managed through the WebUI.
+
+## Objectives
+
+1. **Minimal bootstrap** — 2 env vars (`DATABASE_URL`, `MOSAIC_SECRET_KEY`) to start the entire stack
+2. **DB-centric config** — All runtime config in Postgres, managed via WebUI
+3. **Per-user isolation** — Each user gets their own OpenClaw container with own API keys, memory, sessions
+4. **Onboarding wizard** — First-boot experience: breakglass admin → OIDC → LLM provider → agent config
+5. **Settings UI** — Runtime management of providers, agents, and auth config
+6. **Mosaic as gatekeeper** — Users never talk to OpenClaw directly; Mosaic proxies all requests
+7. **Zero cross-user access** — Full container, volume, and DB isolation between users
+
+## Security Requirements
+
+- User A cannot access User B's API keys, chat history, or agent memory
+- All API keys stored encrypted (AES-256-GCM) in database
+- Breakglass admin always works as OIDC fallback
+- OIDC config stored in DB (not env vars) — configured via settings UI
+- Container-to-container communication blocked by default
+- Admin cannot decrypt other users' API keys
+
+## Phase 0: Knowledge Layer — COMPLETE
+
+- Findings API (pgvector, CRUD, similarity search)
+- AgentMemory API (key/value store)
+- ConversationArchive API (pgvector, ingest, search)
+- OpenClaw mosaic skill
+- Session log ingestion pipeline
+
+## Phase 1: DB-Centric Agent Fleet
+
+### Phase 1a: DB Schema — COMPLETE
+
+- SystemConfig, BreakglassUser, LlmProvider, UserContainer, SystemContainer, UserAgentConfig tables
+
+### Phase 1b: Encryption Service — COMPLETE
+
+- CryptoService (AES-256-GCM using MOSAIC_SECRET_KEY)
+
+### Phase 1c: Internal Config API
+
+- `GET /api/internal/agent-config/:id` — assembles openclaw.json from DB
+- Auth: bearer token (container's own gateway token)
+- Returns complete openclaw.json with decrypted provider credentials
+
+### Phase 1d: Container Lifecycle Manager
+
+- Docker API integration via `dockerode` npm package
+- Start/stop/health-check/reap user containers
+- Auto-generate gateway tokens, assign ports
+- Docker socket access required (`/var/run/docker.sock`)
+
+### Phase 1e: Onboarding API
+
+- First-boot detection (`SystemConfig.onboarding.completed`)
+- `POST /api/onboarding/breakglass` — create admin user
+- `POST /api/onboarding/oidc` — save OIDC provider config
+- `POST /api/onboarding/provider` — add LLM provider + test connection
+- `POST /api/onboarding/complete` — mark done
+
+### Phase 1f: Onboarding Wizard UI
+
+- Multi-step wizard component
+- Skip-able OIDC step
+- LLM provider connection test
+
+### Phase 1g: Settings API
+
+- CRUD: LLM providers (per-user scoped)
+- CRUD: Agent config (model assignments, personalities)
+- CRUD: OIDC config (admin only)
+- Breakglass password reset (admin only)
+
+### Phase 1h: Settings UI
+
+- Settings/Providers page
+- Settings/Agent Config page
+- Settings/Auth page (OIDC + breakglass)
+
+### Phase 1i: Chat Proxy
+
+- Route WebUI chat to user's OpenClaw container
+- SSE streaming pass-through
+- Ensure container is running before proxying (auto-start)
+
+### Phase 1j: Docker Compose + Entrypoint
+
+- Simplified compose (core services only — user containers are dynamic)
+- Entrypoint: fetch config from API, write openclaw.json, start gateway
+- Health check integration
+
+### Phase 1k: Idle Reaper
+
+- Cron job to stop inactive user containers
+- Configurable idle timeout (default 30min)
+- Preserve state volumes
+
+## Future Phases (out of scope)
+
+- Phase 2: Agent fleet standup (predefined agent roles)
+- Phase 3: WebUI chat + task management integration
+- Phase 4: Multi-LLM provider management UI (advanced)
+- Team workspaces (shared agent contexts) — explicitly out of scope
diff --git a/docs/TASKS.md b/docs/TASKS.md
index 82870ed..d462577 100644
--- a/docs/TASKS.md
+++ b/docs/TASKS.md
@@ -78,8 +78,8 @@ Design doc: `docs/design/MS22-DB-CENTRIC-ARCHITECTURE.md`
 
 | Task ID | Status | Phase | Description | Issue | Scope | Branch | Depends On | Blocks | Assigned Worker | Started | Completed | Est Tokens | Act Tokens | Notes |
 | -------- | ----------- | -------- | --------------------------------------------------------------------------------------------------------------------- | ----- | ------- | ---------------------------- | ---------- | --------------- | --------------- | ------- | --------- | ---------- | ---------- | ----- |
-| MS22-P1a | not-started | phase-1a | Prisma schema: SystemConfig, BreakglassUser, LlmProvider, UserContainer, SystemContainer, UserAgentConfig + migration | — | api | feat/ms22-p1a-schema | — | P1b,P1c,P1d,P1e | — | — | — | 20K | — | |
-| MS22-P1b | not-started | phase-1b | Encryption service (AES-256-GCM) for API keys and tokens | — | api | feat/ms22-p1b-crypto | — | P1c,P1e,P1g | — | — | — | 15K | — | |
+| MS22-P1a | done | phase-1a | Prisma schema: SystemConfig, BreakglassUser, LlmProvider, UserContainer, SystemContainer, UserAgentConfig + migration | — | api | feat/ms22-p1a-schema | — | P1b,P1c,P1d,P1e | — | — | — | 20K | — | |
+| MS22-P1b | done | phase-1b | Encryption service (AES-256-GCM) for API keys and tokens | — | api | feat/ms22-p1b-crypto | — | P1c,P1e,P1g | — | — | — | 15K | — | |
 | MS22-P1c | not-started | phase-1c | Internal config endpoint: assemble openclaw.json from DB | — | api | feat/ms22-p1c-config-api | P1a,P1b | P1i,P1j | — | — | — | 20K | — | |
 | MS22-P1d | not-started | phase-1d | ContainerLifecycleService: Docker API (dockerode) start/stop/health/reap | — | api | feat/ms22-p1d-container-mgr | P1a | P1i,P1k | — | — | — | 25K | — | |
 | MS22-P1e | not-started | phase-1e | Onboarding API: breakglass, OIDC, provider, agents, complete | — | api | feat/ms22-p1e-onboarding-api | P1a,P1b | P1f | — | — | — | 20K | — | |
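
Reviewer note on Phase 1b of the PRD in this patch: the PRD only states that API keys are encrypted with AES-256-GCM using `MOSAIC_SECRET_KEY`. The actual `CryptoService` is not part of this diff, so the following is a minimal sketch of what such a service might look like, not the project's implementation. The function names (`deriveKey`, `encrypt`, `decrypt`), the SHA-256 key derivation, and the `iv.tag.ciphertext` payload layout are all assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes } from "node:crypto";

// Hypothetical helper: turn the secret string into a 32-byte AES-256 key.
// Assumption: plain SHA-256; a real service might prefer a KDF such as scrypt or HKDF.
function deriveKey(secret: string): Buffer {
  return createHash("sha256").update(secret, "utf8").digest();
}

// Encrypt a value (e.g. an LLM provider API key) for storage in the database.
// Output layout (an assumption): base64(iv).base64(authTag).base64(ciphertext)
function encrypt(plaintext: string, secret: string): string {
  const iv = randomBytes(12); // fresh 96-bit nonce per message, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // authentication tag, available after final()
  return [iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

// Decrypt a stored value; throws if the key is wrong or the payload was tampered with.
function decrypt(payload: string, secret: string): string {
  const parts = payload.split(".");
  if (parts.length !== 3) throw new Error("malformed payload");
  const [iv, tag, ciphertext] = parts.map((p) => Buffer.from(p, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
  decipher.setAuthTag(tag); // must be set before final() so the tag is verified
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

In this sketch the secret would come from `process.env.MOSAIC_SECRET_KEY`. Note that a single shared key on its own would not satisfy the PRD's "Admin cannot decrypt other users' API keys" requirement; how the real service scopes keys per user is not described in this patch.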