Files
stack/docs/fleet/PRD-fleet-suite.md
Jason Woltje 9f6da92a4b
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
chore(format): prettier docs/fleet/PRD-fleet-suite.md (pre-existing main violation)
Pre-existing repo-wide format:check failure on origin/main (markdown table
alignment only — no content change). Picked up to unblock the pre-push gate;
flagged to the Lead. Not part of the P5 composer change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EsgTQzV5YUGk1JtCLP4B83
2026-06-21 20:56:48 -05:00

110 lines
9.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# PRD — Mosaic Fleet Suite (init, configure, operate)
> **Workstream:** W-FLEET (Fleet) under mission `mvp-20260312` · **Phase:** 3→4 productization
> **North star:** [docs/fleet/north-star.md](./north-star.md) · prior: Phase-2 observability (#579), durable launch (#581), real-agent enablement (#583/#584/#586), releases 0.0.350.0.37
> **Lead:** Jarvis @ `w-jarvis`. **Collaborator:** coder agent @ `dragon-lin` (jwoltje@10.1.10.37:coder0-0).
> Owner of this file: Fleet workstream lead. Does not modify MVP single-writer control-plane files.
## Mission
Turn the proven fleet primitives into a **user-installable, AI-free-configurable fleet product**:
a user runs `mosaic fleet init`, answers a few questions (general / coding / research / hybrid),
gets a recommended set of agents plus one always-on orchestrator wired for chat-ops, and can
operate, mutate, re-create, and observe the fleet — over tmux today and Matrix tomorrow — from
CLI/TUI and (designed-for) the webUI.
**Immediate tangible goal:** the **"Mos"** orchestrator agent running on `w-jarvis`, reachable
in **Discord channel `1517622518662434996`** (server `1112631390438166618`). Once the fleet is
functional, we use the fleet itself to continue the work.
## Requirements
### A. Configure-without-AI CLI
| ID | Requirement |
| --- | ------------------------------------------------------------------------------------------------------------- |
| R1 | `mosaic fleet` command set is functional end-to-end (init/install/start/stop/status/ps/verify + agent verbs). |
| R2 | `mosaic fleet init` is an interactive, **AI-free** CLI wizard. |
| R3 | Init asks the **configuration type**: `general`, `coding`, `research`, `hybrid`, … (extensible). |
| R4 | Based on the answer, the fleet is populated with a **recommended set of agents** (a preset). |
| R5 | **Exactly one main orchestrator agent** is always configured, regardless of type. |
| R10 | A set of **recommended configurations (presets)** ships for easy duplication. |
| R8 | User can **re-create** the fleet when config needs change (idempotent re-init / reconfigure). |
| R17 | Fleet controls are **simple and intuitive**. |
### B. Comms & orchestrator chat-ops
| ID | Requirement |
| --- | --------------------------------------------------------------------------------------------------------------------------------- |
| R6 | Init can wire the orchestrator to a chat connector — **Telegram / Discord / Matrix / Slack** — for command + comms. |
| R7 | Designed with the end-goal of **Matrix comms on a locally-controlled server**. |
| R16 | Fleet supports **tmux AND Matrix** comms, **user-configurable** at init or any time. Not all users want Matrix. |
| R19 | **"Mos" orchestrator on Discord** (`chan 1517622518662434996` / `srv 1112631390438166618`) on `w-jarvis` — the first live target. |
### C. Runtime, health, lifecycle
| ID | Requirement |
| --- | ---------------------------------------------------------------------------------- |
| R9 | Fleet is **mutable by the orchestrator agent** — add/remove agents per need. |
| R13 | Fleet **gracefully handles Pi + Claude harness updates** — keep harnesses current. |
| R14 | The **Pi harness is customized** for proper tool usage, etc. |
| R15 | **Agent heartbeat** properly configured for **Claude AND GPT/Pi** agents. |
### D. Surfaces, testing, docs
| ID | Requirement |
| --- | ----------------------------------------------------------------------------------- |
| R18 | Fleet built so the **webUI can view / monitor / terminate / butt-in** on a session. |
| R11 | Installed and **tested on both `w-jarvis` and `dragon-lin`**. |
| R12 | **Documentation**: how to install, configure, and use the fleet. |
## Architecture / approach
- **Config model:** `roster.yaml` is the source of truth (already exists). Add **presets** (`general`/`coding`/`research`/`hybrid`) as shipped example rosters; `init` selects a preset, always injects the orchestrator, and writes the roster. Re-init = regenerate roster (preserve user/site overrides — mirrors install env-merge from #567).
- **Orchestrator agent:** always present; carries the chat connector config (connector type + target IDs) so it can be commanded over chat. tmux is the substrate; the connector bridges chat ↔ the orchestrator session.
- **Comms layers (R16):** (1) **tmux** inter-agent (`agent-send`, proven) — default, always available. (2) **chat connector** for human↔orchestrator (Discord now; Matrix the strategic target). (3) **Matrix** as the locally-controlled cross-agent bus (future). Connector is pluggable + reconfigurable.
- **Heartbeat (R15):** runtime-agnostic launcher sidecar already covers pi/claude/codex (#584). Refine per-runtime (native HB) with the **custom Pi harness** (R14) + a Claude path.
- **Updates (R13):** `mosaic update` (CLI) + a fleet-aware harness-update step that refreshes pi/claude/codex and re-launches agents safely (drain → update → relaunch via the durable launcher).
- **webUI (R18):** the fleet exposes machine-readable state (`fleet ps --json` already carries tenant/host/heartbeat/managed) + control verbs (start/stop/watch/send); webUI consumes these (control plane rides federation per north star). Ensure a stable JSON contract + a terminate/attach(butt-in) path.
## Phases (incremental, each shippable)
| Phase | Deliverable | Notes |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------- |
| **F1 Presets + init wizard** | preset rosters (general/coding/research/hybrid) + always-orchestrator + AI-free `fleet init` selecting a preset; re-init idempotent | R1R5, R8, R10, R17 |
| **F2 Connector + Mos-on-Discord** | orchestrator chat-connector config (Discord first) + **Mos live on Discord `1517…`/`1112…`** on w-jarvis | R6, R19, partial R16 |
| **F3 Heartbeat + harness** | HB confirmed for claude + pi/gpt; **custom Pi harness** (tool usage, native HB, model self-report); graceful harness updates | R13, R14, R15 |
| **F4 Matrix + comms toggle** | Matrix connector (local server) + user toggle tmux/Matrix at init/anytime | R7, R16 |
| **F5 Orchestrator-mutable fleet** | orchestrator can add/remove agents at runtime | R9 |
| **F6 webUI hooks** | stable JSON contract + terminate/attach surface for webUI view/monitor/terminate/butt-in | R18 |
| **F7 Test + docs** | install+test on w-jarvis AND dragon-lin; user docs (install/configure/use) | R11, R12 (runs alongside every phase) |
## Work division (proposed — confirm with dragon-lin)
- **Jarvis @ w-jarvis (Lead):** F1 presets+wizard, F2 connector+Mos-on-Discord, F5 mutability, F6 webUI hooks; merge authority + dual-engine reviews; co-testing on w-jarvis.
- **coder @ dragon-lin:** F3 custom Pi harness + harness-update flow (pi/codex-savvy); plus its in-flight constitution P4P6 (P4 installer rework underpins `fleet init`/updates — coordinate the install path). Co-testing on dragon-lin (R11).
- **Shared:** F4 Matrix (whoever has bandwidth); F7 testing/docs continuous.
## Immediate target: Mos on Discord (F2 first slice)
The discord plugin is available (`~/.claude.json`). Path: configure the **orchestrator** as a durable
fleet session running Claude Code with the discord plugin bridged to channel `1517622518662434996`
(server `1112631390438166618`) on w-jarvis, with the existing Discord Bridge Protocol (ack within
~3s, reply via `mcp__discord__reply`, no `AskUserQuestion`). Heartbeat via the launcher sidecar.
## Success criteria
- A non-AI user can `mosaic fleet init`, pick a type, and get a working fleet + orchestrator.
- **Mos answers in Discord `1517…`** on w-jarvis.
- Fleet runs + is observable (`fleet ps`) on **both** w-jarvis and dragon-lin.
- Harness updates handled gracefully; HB healthy for claude + pi/gpt agents.
- Docs let a new operator install/configure/use the fleet.
- Re-init + orchestrator mutation work.
## Assumptions (veto-able)
- `ASSUMPTION:` presets ship as example rosters under the framework (`fleet/examples/*.yaml`), selected by `init`.
- `ASSUMPTION:` chat connectors are pluggable; Discord first (target exists), Matrix is the strategic default later.
- `ASSUMPTION:` "Mos" = a Claude Code orchestrator session with the discord plugin (reuses the documented Discord Bridge Protocol).
- `ASSUMPTION:` per north star, runtimes default to Codex/pi-on-Codex for workers; the orchestrator "Mos" runs Claude Code (in Claude Code, which is allowed).