Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>
# Mosaic Fleet — North Star

> **Workstream:** W-FLEET (Fleet) under mission `mvp-20260312`
> **Umbrella:** [docs/MISSION-MANIFEST.md](../MISSION-MANIFEST.md) · [docs/PRD.md](../PRD.md) (Mosaic Stack v0.1.0)
> **Status:** doctrine — authored 2026-06-20. Owner of this file: Fleet workstream lead.
> This document does **not** modify the MVP rollup; a rollup row is proposed, not written here.

## Vision

A **customizable, multi-tenant fleet of always-on AI agents** — each defined by role,
materialized as a durable, joinable runtime session, coordinated by the proven
orchestrator/worker model, and observable end-to-end across hosts. Coding today;
finance, analytics, research as roster entries tomorrow — same primitives, different
roster. The fleet is the **agent-session execution layer** of the Mosaic Stack MVP:
the thing federation makes reachable across hosts and the webUI/TUI/CLI make visible.

The USC tmux PoC (durable sessions + `agent-send` comms) proved the model. This
workstream makes it an official, observable, multi-tenant Mosaic Stack capability.

## The Fleet as means of production (bootstrapping)

The Fleet has a **dual role**, and that is the point:

- **As product** — a multi-tenant agent-fleet capability of Mosaic Stack (this workstream).
- **As means of production** — the orchestrator/worker fleet that _actually builds the
  entire MVP_ (federation W1, webUI, TUI, CLI, and the Fleet itself).

We are **building the system that builds the system.** Every other MVP workstream is
delivered _by_ the fleet, so fleet observability and control are not merely product
features — they are the **operational floor of the whole delivery effort**. If we cannot
see and steer the agents, we cannot trust what they ship. This is why Phase 2
(observability) leads: it is the instrument panel for the factory, dogfooded on the live
fleet that is, recursively, building Mosaic Stack.

The discipline that makes great power safe is the same gate chain the fleet enforces:
independent review before merge, green CI, honest completion, decide-and-inform cadence,
and no irreversible action without authority. The bootstrap is only as trustworthy as
those gates.

## Alignment with MVP cross-cutting requirements

The Fleet inherits — does not re-invent — the MVP's hard requirements:

| MVP req | What it means for the Fleet |
| --- | --- |
| MVP-X1 three-surface parity | fleet observability/control reachable via **CLI + TUI + webUI** (CLI first; webUI is required for parity, not optional) |
| MVP-X2 multi-tenant isolation | one tenant = one **Linux uid** (own `systemd --user`, socket, `~/.config/mosaic`); no cross-tenant leakage |
| MVP-X3 auth (BetterAuth/SSO) | operator→fleet and cross-host views are auth-gated through the platform's existing auth |
| MVP-X4 quality gates | `pnpm typecheck`/`lint`/`format:check` green before any push |
| MVP-X5 federated topology | cross-host fleet visibility rides the **federation** boundary (W1), not a bespoke broker |
| MVP-X6 OTEL tracing | heartbeats, sends, and lifecycle events emit spans; `traceparent` crosses the federation boundary |
| MVP-X7 trunk merge | branch from `main`, squash-merge via PR, never push to `main` |

## The stack — where every concern lives

One **definition** is the source of truth; the **session** is how it runs.

| Layer | Owner | Phase-2 reality | Destination |
| --- | --- | --- | --- |
| **Definition + identity + auth** | gateway / `mosaic-as` (scoped tokens, #541) | `roster.yaml` (tenant-tagged) | one definition; `mosaic agent --new` materializes it |
| **Tenancy boundary** | **Linux uid per tenant** (linger, own `systemd --user`, own socket, own `~/.config/mosaic`) | one tenant: `jarvis` = tenant zero | uid-per-tenant; federation aggregates across hosts |
| **Runtime** | per-tenant tmux session on isolated socket | dogfood stub sessions (live now on `mosaic-factory`) | claude/codex/pi/opencode TUIs |
| **Liveness** | **heartbeat protocol** every runtime answers | protocol defined + dogfood stub answers it | all runtimes answer; "healthy" ≠ "pane alive" |
| **Observation** | read-only `watch` (native tmux) + `pipe-pane` stream | CLI `watch`/`ps`; explicit opt-in `attach` for control | + auth-gated webUI streams |
| **Control plane** | **federation** across hosts × tenants | records already carry `tenant_id` + `host` | federated gateways expose fleet state; webUI in Phase 5 |
| **Central register** | Postgres `fleet` schema (gateway instance); access via gateway API only | _none in PoC_ (files + `roster.yaml`) | agents, missions, tasks, heartbeats, spend — single network-accessible SSOT; docs = generated projections |
| **Budget / spend governance** | **per-tenant budget policy** ingested by the orchestrator + routing layer | none today (spend is unmetered) | usage-vs-limit feedback ingested; spend auto-paced to the limit window; per-provider/per-account/concurrency/API-$ budgets enforced |

> **PoC socket hygiene:** the PoC fleet runs on the **default tmux socket** (no `-L`).
> The named production-isolation socket is **`mosaic-fleet`** (matches the product brand);
> an absent roster `socket_name` means the default socket everywhere (spawn, `fleet ps`,
> onboarding cheat-sheet). The legacy dogfood canary still runs on the old `mosaic-factory`
> socket pending migration.

## Operating model (inherited, not reinvented)

The AI-guide law stands: one accountable **orchestrator**, isolated **workers** that
stop at PR-open, the serialized **gate chain** (independent review → green CI →
diff-sanity → squash-merge → verify), **decide-and-inform** cadence, and a durable
**board** so missions survive session death. The Fleet is the infrastructure _under_
this model. See `mosaicstack-aiguide` whitepapers 01 (inter-agent comms) and 03
(orchestration model) for the rationale.

## Fleet roster — the two-agent floor and the role library

A fleet is **never a single agent**. The minimum viable fleet is **two**:

| Role | Mandate | Boundaries |
| --- | --- | --- |
| **Orchestrator** | The user's **single point of contact**. Owns the general flow, keeps agentic actions on-target, and **adds/removes agents from the fleet at will** to meet goals and user needs. Exactly **one** per fleet (the existing R5 invariant). | Delegates source work; never the sole worker. |
| **Enhancer** | The fleet's **continuous-improvement loop**. Monitors fleet activity, analyzes for enhancements/optimizations, builds a **plan of remediation**, and — **with the orchestrator** — upgrades fleet capability: tool creation/repair, skills, harness improvements, and **bug reports filed to Mosaic Stack** for proper remediation. Recommends which agents are needed. | **Does not code, review code, or perform delivery tasks.** Improvement and diagnosis only. |

> **Why two, not one:** the orchestrator drives delivery; the enhancer makes the fleet
> _get better at delivering_ over time. The enhancer is how the fleet self-heals its tools,
> skills, and harnesses, and how real defects flow back to Mosaic Stack as bug reports.
> Together they are the irreducible core — every other role is added on demand.

A **general** fleet starts at this floor: the orchestrator (advised by the enhancer)
materializes whatever roles prove necessary over the mission's life. Specialized presets
(coding, research, etc.) seed additional roles up front, but all reduce to the same two-agent
spine plus an on-demand **role library**:

| Role profile | Purpose |
| --- | --- |
| **orchestrator** | point of contact, flow control, fleet composition (1 per fleet) |
| **enhancer** | fleet monitoring, optimization, tool/skill/harness upgrades, upstream bug reports |
| **coder** | implementation (worker; stops at PR-open) |
| **code review** | independent code review gate |
| **security review** | security/auth/secret review gate |
| **research** | investigation, synthesis, options analysis |
| **board** | deliberation panel — moonshot, contrarian, technical, business, financial lenses |
| **operations** | infra, deploy, health, incident response |
| _…extensible_ | new profiles added as missions demand (orchestrator + enhancer decide) |

## Invariants — "maximal vision, incremental delivery, zero foreclosure"

Every artifact, starting Phase 2, MUST:

1. Carry **`tenant_id` + `host`** in schema and message addressing — even with one of each today.
2. Treat **isolation socket ≠ invisibility** — anything isolated is surfaced by one command.
3. Define **healthy = answered a heartbeat within N seconds**, never just "pane alive".
4. Make **observation read-only by default**; control is an explicit, separate, opt-in verb.

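
As an illustration of invariants 1 and 3, the sketch below shows one way a heartbeat record and a health check might look in TypeScript. Only `tenant_id` and `host` come from this document; `HeartbeatRecord`, `agent`, `lastAnsweredAtMs`, `isHealthy`, and the sample values are hypothetical names chosen for the example, and the heartbeat protocol itself is not specified here.

```typescript
// Hypothetical record shape: only tenant_id and host are mandated by the invariants.
interface HeartbeatRecord {
  tenant_id: string;
  host: string;
  agent: string;
  // Epoch milliseconds of the last heartbeat the agent answered, or null if it never answered.
  lastAnsweredAtMs: number | null;
}

// Invariant 3: healthy means a heartbeat was answered within the last maxAgeSeconds.
function isHealthy(record: HeartbeatRecord, nowMs: number, maxAgeSeconds: number): boolean {
  if (record.lastAnsweredAtMs === null) {
    return false; // a live pane with no answered heartbeat is not healthy
  }
  const ageMs = nowMs - record.lastAnsweredAtMs;
  return ageMs >= 0 && ageMs <= maxAgeSeconds * 1000;
}

// Illustrative values only.
const sampleHeartbeat: HeartbeatRecord = {
  tenant_id: "jarvis",
  host: "w-jarvis",
  agent: "orchestrator",
  lastAnsweredAtMs: 1000000,
};
```

Keeping `tenant_id` and `host` on the record even with one tenant and one host is what lets a later federated view aggregate these records without a schema change.
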
> **OPS INVARIANT — runtime agents need a real TTY.** Claude/Codex/pi/opencode agents
> cannot be bare-launched from a systemd `ExecStart`; a durable harness with a real PTY is
> required. This is **why `start-agent-session.sh` launches into tmux** and uses a
> `MOSAIC_AGENT_COMMAND` override rather than running the runtime directly under systemd.

## Budget & token governance (first-class fleet concern)

Spend is a fleet-level resource, not a per-agent afterthought. The fleet treats token
and API-dollar budget the way it treats liveness: a signal every runtime exposes and the
control plane is accountable for. This rides the same primitives as everything else —
`tenant_id` + `host` on every spend record, **read-only metering by default**, and the
**federation** layer as the cross-host aggregation point (W1) — so budgeting is zero-foreclosure
from day one even while one tenant exists.

**Two spend regimes, one policy surface:**

| Regime | Feedback signal | Fleet obligation |
| --- | --- | --- |
| **OAuth-subscription runtimes** (Claude sub, Codex sub) | runtime exposes **current-usage-vs-limit** within a rolling limit window | **ingest** the signal per sub-account; **auto-pace** agentic spend so the window is not exhausted early |
| **API-token runtimes** (metered per token) | provider billing / token counts | enforce **hard $-spend ceilings**; on breach, **downgrade → queue → refuse** (below) |

**Auto-pacing law (OAuth subs) — EVEN-SPREAD default (Jason override, 2026-06-22):** the fleet
paces agentic token spend to consume the limit window **evenly over remaining time**:
target rate = _(remaining usage available)_ ÷ _(remaining time in the window)_. Example: 100% of
a 7-day window = **~14.29%/day**; the system tracks current usage and continuously re-splits the
remainder evenly to hold pace. **Anticipated token-spend-per-task is the budgeting informant** —
tasks are scheduled against the daily pace, not run until the quota is gone. Rationale: spreading
delivery evenly beats rapidly exhausting usage and losing **multiple days of momentum**.
**Rapid pacing / overspend requires EXPLICIT user authorization;** absent it, even-spread holds.
Pacing is a control-plane decision, surfaced read-only before it throttles a lane.

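
The even-spread formula can be sketched as a small pure function. This is a minimal illustration, assuming usage is expressed as a fraction of the window's limit and time is measured in days; `PacingInput` and `evenSpreadRatePerDay` are hypothetical names, not part of any existing Mosaic API.

```typescript
// Hypothetical input shape for the pacing calculation.
interface PacingInput {
  usedFraction: number; // share of the window's limit already consumed, 0..1
  elapsedDays: number; // time already spent in the current window
  windowDays: number; // total length of the limit window
}

// target rate = (remaining usage available) / (remaining time in the window)
function evenSpreadRatePerDay(input: PacingInput): number {
  const remainingUsage = Math.max(0, 1 - input.usedFraction);
  const remainingDays = input.windowDays - input.elapsedDays;
  if (remainingDays <= 0) {
    return 0; // the window is over; nothing left to pace in this window
  }
  return remainingUsage / remainingDays;
}

// Fresh 7-day window: 1 / 7 of the limit per day.
const freshWindowRate = evenSpreadRatePerDay({ usedFraction: 0, elapsedDays: 0, windowDays: 7 });

// Two days in with half the limit consumed: 0.5 / 5 of the limit per day.
const overspentWindowRate = evenSpreadRatePerDay({ usedFraction: 0.5, elapsedDays: 2, windowDays: 7 });
```

Because the rate is recomputed from current usage each time, an early overspend lowers the pace for the rest of the window instead of exhausting it, which is the re-splitting behavior described above.
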
**Hard-cap breach behavior (ladder):** when a budget ceiling is hit mid-work, the fleet
**downgrades first** (opus → sonnet → haiku, then Claude → Codex), **queues** the lane at the
cheapest floor until the window resets, and **refuses** only as a last resort. Refusal is never
the first response to a breach.

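
One way to read the ladder is as a function from the current tier to the next action. The sketch below is an assumption-laden illustration: the tier labels are plain strings taken from the text, not real model identifiers, and `canQueueUntilReset` is a hypothetical stand-in for whatever condition decides between queueing and refusing, which this document does not define.

```typescript
// Possible responses to a budget breach, in the order the ladder prefers them.
type BreachAction =
  | { kind: "downgrade"; to: string }
  | { kind: "queue" }
  | { kind: "refuse" };

// Downgrade order from the text: opus -> sonnet -> haiku, then Claude -> Codex.
const DOWNGRADE_LADDER: readonly string[] = ["opus", "sonnet", "haiku", "codex"];

function onBudgetBreach(currentTier: string, canQueueUntilReset: boolean): BreachAction {
  const index = DOWNGRADE_LADDER.indexOf(currentTier);
  const next = index >= 0 ? DOWNGRADE_LADDER[index + 1] : undefined;
  if (next !== undefined) {
    return { kind: "downgrade", to: next }; // a cheaper tier exists, so downgrade first
  }
  // At the cheapest floor (or an unknown tier): queue, and refuse only as a last resort.
  return canQueueUntilReset ? { kind: "queue" } : { kind: "refuse" };
}
```

Encoding the ladder as data keeps the "refusal is never first" rule structural: `refuse` is only reachable once no cheaper tier remains and queueing is not possible.
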
**Spend accounting, learning & telemetry:**

- **Multi-subscription auto-routing:** a tenant with multiple subscriptions may let the fleet
  **auto-route work to the account with the most available usage** (within budget policy).
- **Historical spend learning:** every task's token spend is **recorded**; historical data
  continuously updates known **spend-per-task**, **typical daily spend**, and projections — so
  estimates self-correct and pacing stays on target.
- **Projected + actual spend on artifacts (Mosaic Stack mandate):** PRDs, missions, and task
  decomposition **MUST note projected AND actual token spend** — a Mosaic Stack process standard
  (template-level), tracked separately as **#622**.
- **Anonymized telemetry → mosaicstack.dev:** spend data is reported (anonymous) to the
  mosaicstack.dev telemetry endpoint so other agents/fleets budget and optimize from real,
  anonymized data. Product workstream, tracked separately as **#623**.

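
The multi-subscription routing rule in the list above can be sketched as a selection over accounts. This is a minimal illustration under stated assumptions: `SubscriptionAccount`, `remainingFraction`, `allowedByPolicy`, and `pickAccount` are hypothetical names, and "most available usage" is modeled as the largest remaining fraction of each account's limit.

```typescript
// Hypothetical view of one subscription account as seen by the routing layer.
interface SubscriptionAccount {
  id: string;
  remainingFraction: number; // share of the limit window still available, 0..1
  allowedByPolicy: boolean; // whether budget policy permits routing work here
}

// Pick the policy-allowed account with the most available usage, or null if none qualifies.
function pickAccount(accounts: readonly SubscriptionAccount[]): SubscriptionAccount | null {
  let best: SubscriptionAccount | null = null;
  for (const account of accounts) {
    if (!account.allowedByPolicy) {
      continue; // budget policy always wins over available usage
    }
    if (best === null || account.remainingFraction > best.remainingFraction) {
      best = account;
    }
  }
  return best;
}
```

Returning `null` instead of throwing leaves the decision about what to do when no account qualifies to the caller.
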
**User-settable budgets (the policy surface).** A tenant operator can set budgets for every
configured **provider** (per-provider ceilings), the **account-to-task mapping**, the **agentic
routing flow**, **concurrency** (the spend multiplier), and **hard API-token $-limits**. Budgets
are enforced at the orchestrator + routing boundary, not inside individual workers (a worker never
decides its own budget — see delegation discipline).

**Budget CLI UX (#558):** `mosaic budget set --reset-at` sets the window reset; reset-datetimes
carry **confidence tags** (`user` / `provider` / `estimated` / `unknown`); and **urgency/criticality
is a dispatch-gate modifier** — high-urgency work may override even-spread pacing **within
authorization**. (Also feeds the budgeting workstream, not only this doc.)

## Observation model

| Verb | Behavior |
| --- | --- |
| `mosaic fleet ps` | one table joining systemd + tmux + process + idle + last-heartbeat, with drift + boot-enable flags |
| `mosaic agent watch <name>` | **read-only** join (grouped session / `-r`), no resize tyranny, no keystrokes |
| `mosaic agent attach <name>` | explicit interactive takeover (the only path that can type) |
| `mosaic agent send <name> --verify` | confirms message **accepted**, not merely keystroke-injected |

> Why the current PoC blocks observation: sessions live on the isolated `mosaic-factory`
> socket (invisible to default `tmux ls`), the only sanctioned read is `capture-pane`
> (blank for full-screen TUIs), and `attach` is read-write + resizes the session. The
> verbs above restore "join and observe" safely.

## Control plane & central register

### Why the register must be Postgres

The fleet is multi-host (w-jarvis + dragon-lin + future). A SQLite file is a local
file — it is not a network service and cannot be shared across hosts. Beyond topology,
Postgres MVCC eliminates the concurrent-writer corruption class Hermes hit with SQLite
under multi-agent access.

Access is exclusively through the **gateway API** (`apps/gateway` — typed, auth-gated,
scoped tokens). No agent or dispatcher pane ever holds a raw DB credential; a
compromised pane cannot corrupt or exfiltrate the register.

### Architecture (layers)

| Layer | Responsibility | Implementation |
| --- | --- | --- |
| **Register** | Source of truth: agents, missions, tasks, heartbeats, spend | Postgres `fleet` schema — existing stack instance (`@mosaicstack/db`) |
| **Access** | Typed, auth-gated API | Gateway `fleet/*` routes |
| **Dispatcher** | Brief classification, BOD review, planning/coding/review/test/deploy sequencing + gates → fleet task dispatch | **forge pipeline engine** (`runPipeline`/`resumePipeline`, brief classifier, BOD) **+ thin `forge-exec` adapter → `agent-send.sh`**; NOT a new daemon — forge is reused, only stage→agent dispatch is new |
| **Orchestrator (Mos)** | Goals, missions, judgment, user/PA interface | Context-light; sets intent → re-engages only for decisions |

### Dispatcher = forge (reuse, do not rebuild)

The dispatcher is **not new work**: it is `@mosaicstack/forge`, a fully-implemented
software-factory pipeline engine (brief → Board-of-Directors review → 3 planning stages →
coding → review/remediation → testing → deploy). Forge already provides
`runPipeline`/`resumePipeline`, a brief classifier, and a BOD persona loader, so the fleet
does **not** re-implement sequencing, gate logic, or brief classification. The only new
fleet-owned code is a thin **`forge-exec` TaskExecutor adapter** (`ForgeTask` →
`agent-send.sh` to a named agent) — forge's single missing piece — tracked as a Gitea
issue and built post-PoC. The Postgres register backs forge's pipeline state (durable
`resumePipeline`, cross-host) in addition to cross-project missions/tasks/Kanban. The
north-star **'board' role IS forge's Board-of-Directors** — reused from forge, not a new
role implementation.

### Docs as projections

`docs/TASKS.md` and `MISSION-MANIFEST.md` are **generated projections** of the DB,
not hand-maintained. The dispatcher (or a scheduled job) renders Markdown from
`fleet.*` tables and commits the output. DB is authoritative; docs are for human
reference.

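
To make the projection idea concrete, the sketch below renders a Markdown table from task rows. It is an illustration only: `FleetTaskRow`, its columns, and `renderTasksMarkdown` are hypothetical, since the actual `fleet.*` table layout and the `docs/TASKS.md` format are not defined in this document.

```typescript
// Hypothetical row shape; the real fleet.* columns are not specified here.
interface FleetTaskRow {
  id: string;
  title: string;
  status: string;
}

// Escape pipe characters so free text cannot break the table layout.
function escapeCell(text: string): string {
  return text.replace(/\|/g, "\\|");
}

// Render rows as a Markdown table; the output is derived data and is never edited by hand.
function renderTasksMarkdown(rows: readonly FleetTaskRow[]): string {
  const header = ["| ID | Title | Status |", "| --- | --- | --- |"];
  const body = rows.map(
    (row) => `| ${escapeCell(row.id)} | ${escapeCell(row.title)} | ${escapeCell(row.status)} |`,
  );
  return [...header, ...body].join("\n") + "\n";
}
```

Because the function is pure, regenerating the file from the same rows yields identical output, so a scheduled job would only produce a commit when the underlying data has changed.
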
### Spend

`fleet.spend_ledger` records projected and actual token spend per agent/mission/task
(ties to issue #622). The dispatcher enforces budget caps before dispatching. Mos reads
the roll-up via API — no raw DB access, no context-bloating dumps.

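
A pre-dispatch cap check over ledger entries might look like the following. This is a sketch under assumptions: the entry fields beyond `tenant_id` and `host`, the token-based cap, and the `canDispatch` function are hypothetical, and the real `fleet.spend_ledger` schema is not specified here.

```typescript
// Hypothetical ledger entry; only tenant_id and host are mandated elsewhere in this document.
interface SpendLedgerEntry {
  tenant_id: string;
  host: string;
  agent: string;
  mission: string;
  task: string;
  projectedTokens: number;
  actualTokens: number | null; // null until the task has reported its actual spend
}

// Sum committed spend, preferring actuals and falling back to projections for unfinished tasks.
function committedTokens(ledger: readonly SpendLedgerEntry[]): number {
  return ledger.reduce((sum, entry) => sum + (entry.actualTokens ?? entry.projectedTokens), 0);
}

// Allow dispatch only if the new task's projection still fits under the cap.
function canDispatch(
  ledger: readonly SpendLedgerEntry[],
  projectedTokens: number,
  capTokens: number,
): boolean {
  return committedTokens(ledger) + projectedTokens <= capTokens;
}
```

Counting projections for in-flight tasks is a deliberate choice in this sketch: it keeps several concurrent dispatches from each passing the check and overshooting the cap together.
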
### Federation

Cross-host fleet state flows through federated gateway queries (existing
`federation_peers` / `federation_grants` machinery). This is the existing north-star
invariant: **control plane rides federation (W1), not a bespoke broker.** No new
broker introduced.

### Scope

This is Phase 4–5 of this roadmap, materialized. It MUST NOT block the PoC (which
runs correctly on files + `roster.yaml`). Begin when Phase 2 heartbeat protocol is
stable and concurrent-agent count makes file coordination the bottleneck.

### Open sub-decision

Dedicated Postgres **instance** vs. dedicated **schema** in the existing instance.
Recommendation: dedicated schema, existing instance (a migration file, not new infra);
re-evaluate if isolation or write-volume demands it.

## Phased roadmap

| Phase | Outcome | Status |
| --- | --- | --- |
| 0–1 | tmux PoC, hardening, published CLI v0.0.34 (#565–#568) | ✅ done |
| **2 — Observability** | `fleet ps` (host+tenant aware join), heartbeat protocol + dogfood stub answers it, `agent watch` (read-only), `agent send --verify` receipts | ▶ now |
| 3 — Real runtimes | claude/codex/pi/opencode answer heartbeat; **hybrid lifecycle** (core always-on: **orchestrator + enhancer**; ephemeral workers per lane) | planned |
| 4 — Unified definition | one agent schema in gateway; `mosaic agent --new` → materialized per-tenant session; uid-tenant provisioning; **`fleet` schema migration + `forge-exec` TaskExecutor adapter (forge → `agent-send.sh`)** | planned |
| 5 — Control plane | federation-backed cross-host × cross-tenant fleet view; **webUI** (surface chosen then) for MVP-X1 parity; **central register live (spend ledger, docs-as-projections, multi-host Kanban)** | planned |

## Decisions of record (2026-06-20, with Jason)

- Agent model: **config defines, session runs** (gateway = definition/identity/auth; tmux = runtime).
- Tenancy: **multi-tenant from the start**; isolation = **per-tenant Linux uid**.
- Health: **heartbeat required** (dogfood stub implements the protocol now).
- Lifecycle: **hybrid** — core always-on + ephemeral workers per lane.
- Observation: **read-only default, opt-in takeover**.
- Multi-host: **designed-for from day one**; control plane **rides federation (W1)**.
- Delivery: **CLI-first now**, dogfood against the live stub fleet; webUI deferred to Phase 5.
- Runtimes: fleet agents default to **Codex / pi-on-Codex**; **Claude is reserved for Claude
  Code only** (avoid alternate-harness API pricing). Validated durable recipe:
  `mosaic yolo pi --model openai-codex/gpt-5.5:high`. Durable detached launch requires the
  runtime-bin on PATH (baked into the pane command) + boot-survival (`enable` + linger),
  which `fleet init` should automate.

## Decisions of record (2026-06-22, with Jason)

- **Two-agent floor:** every fleet has, at minimum, an **orchestrator** and an **enhancer**.
  The orchestrator is the user's point of contact and composes the fleet; the enhancer runs the
  continuous-improvement loop (monitor → analyze → remediate → upgrade tools/skills/harness →
  file Mosaic Stack bug reports) and **does not code or review**.
- **Role library:** orchestrator, enhancer, coder, code review, security review, research,
  board (moonshot/contrarian/technical/business/financial), operations — extensible; the
  orchestrator (advised by the enhancer) adds roles as missions demand.
- **Orchestrator chat connector:** the orchestrator is reachable over a user-chosen connector
  (tmux now; Telegram/Discord/Matrix/Slack configurable). Validated live: **"Mos" orchestrator
  on Discord** via the Claude Code discord channel plugin (w-jarvis).
- **Session context cap = 200k tokens (GLOBAL to all Claude sessions):** Claude Code sessions are
  capped at a **max 200k-token context window**. Long-running sessions extended toward 1M tokens
  have proven **worse in practice** (degraded steering, off-plan divergence); 200k is the standard.
  **Enforcement split:** the _window_ lives in **`~/.claude/settings.json`** (host-global) as
  `"autoCompactWindow": 200000` + `"autoCompactEnabled": true`; the _1M-disable_ lives in **launch
  ENV** (`CLAUDE_CODE_DISABLE_1M_CONTEXT=1`, plus `CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000`) wherever
  a `[1m]` model can be selected (`mos-claude.service` + the fleet Claude launcher), so every Claude
  agent is capped at spawn. (settings = window; env = 1M-disable.)
- **Worker context bound (#8):** workers are kept context-bounded via the **ephemeral-per-lane
  lifecycle + native compaction**, not via the 200k knob. The explicit `autoCompactWindow` 200k knob
  **stays Claude-specific** — the _principle_ (bounded context) extends to workers, the _knob_ does not.
- **Orchestrator delegation discipline:** the orchestrator **delegates all delivery work** to
  subagents / workflows / ultracode / coder agents and confines its own context to
  **orchestration + the personal-assistant lane**. Keeping delivery out of the orchestrator's
  window keeps its context unpolluted and measurably reduces off-plan divergence. The orchestrator
  coordinates and decides; it does not implement.
- **Budget governance is fleet doctrine:** token/API-dollar budgeting is a first-class fleet concern
  (see "Budget & token governance"). OAuth-sub usage-vs-limit feedback is ingested per account, spend
  is **auto-paced EVEN-SPREAD over remaining time** (rapid/overspend only on explicit authorization),
  spend is **tracked historically** to self-correct per-task/daily estimates, multi-sub tenants may
  **auto-route by available usage**, and operators set budgets per provider, per account-to-task
  mapping, per routing flow, per concurrency level, and as hard API-$ ceilings.
- **Spend accounting is a Mosaic Stack process mandate:** PRDs, missions, and task decomposition
  **MUST carry projected + actual token spend**; used locally for pacing and reported as **anonymized
  telemetry to mosaicstack.dev**. The template standard (#622) and telemetry product (#623) are
  tracked separately.
- **Unified identity = "Fleet" (Jason, 2026-06-22):** the product is **Mosaic Fleet** — one unified
  user-facing identity and CLI surface. **forge** is the Fleet's **internal** delivery/orchestration
  engine (not a separate product); the control-plane **Postgres register is the Fleet's register**;
  workers/runtime are the **Fleet substrate**. **"factory" is RETIRED as a product term** — it was
  only ever the software-factory concept (which forge implements) and the old `mosaic-factory` tmux
  socket name. The production-isolation socket is now **`mosaic-fleet`** (matches the product brand);
  the legacy dogfood canary remains on the old `mosaic-factory` socket pending migration. **Code stays
  layered** (forge + fleet + control-plane as internal layers); only the **identity + CLI surface
  unify under Fleet.**
- **Role-based session naming (Jason, 2026-06-22):** agent tmux sessions are named by **role**
  (`orchestrator`, `enhancer`, `research`, `coder0-0`, …), not by persona. **Persona lives in
  `SOUL.md`**; the front-end / Discord presents a **friendly alias** (e.g. "Mos" = the orchestrator's
  alias). The session name is the stable addressing handle; the alias is presentation.

### Control plane & central register

- **Store:** Postgres (existing stack instance, dedicated `fleet` schema via `@mosaicstack/db`). SQLite rejected: (1) it is a local file — structurally incompatible with a multi-host fleet; (2) concurrent multi-agent writes caused repeated corruption in Hermes. "SQLite + access service" rejected as reinventing a DB server badly; "LLM agent gating DB access" rejected as slow, expensive, and a single point of failure.
- **Access:** gateway API only (`apps/gateway`, `fleet/*` routes). No raw DB credentials in any agent/dispatcher pane — directly mitigates the tmux attack-surface concern.
- **Dispatcher = forge (reuse, not a new build):** the dispatcher IS `@mosaicstack/forge`'s pipeline engine (`runPipeline`/`resumePipeline` + brief classifier + BOD persona loader), a fully-implemented software-factory pipeline (brief → BOD review → 3 planning stages → coding → review/remediation → testing → deploy). We do **not** design/build a new dispatcher and do **not** re-implement sequencing, gate logic, or brief classification. The only new fleet-owned piece is a thin **`forge-exec` TaskExecutor adapter** (suggested package `packages/forge-exec`) mapping a `ForgeTask` → `agent-send.sh` dispatch to a named fleet agent — forge's single missing piece. It is tracked as a Gitea issue and built **post-PoC** (not now).
- **Register backs forge:** the Postgres `fleet` register is genuinely new (neither forge nor the fleet has cross-project state). It BACKS forge's pipeline state (durable `resumePipeline`, cross-host) plus cross-project missions/tasks/Kanban.
- **'board' role = forge BOD:** the north-star role-library 'board' role IS forge's Board-of-Directors — reused, not reinvented.
- **Orchestration vs. dispatch:** Orchestrator (Mos) sets intent and handles judgment; forge works the mechanical pipeline (sequencing, gates, status transitions, spend ledger). LLM escalation reserved for judgment: mission decomposition, re-planning on failure.
- **Spend in the register:** `fleet.spend_ledger` tracks projected vs. actual tokens per agent/mission/task; ties to issue #622.
- **Docs as projections:** `docs/TASKS.md` and `MISSION-MANIFEST.md` become generated exports of the DB, not hand-maintained.
- **Sub-decision pending:** dedicated schema in existing PG instance (recommended) vs. dedicated PG instance. Revisit if isolation or write-volume demands it.

## Future enhancements (north-star, post-MVP — not on the MVP track)

- **Mosaic Claude Discord Plugin** — a first-party Mosaic Discord connector that properly
  implements the basic Discord functions **and native Discord threads**. Threads let a user
  separate conversation topics with the orchestrator (the pattern proven by the Hermes agent).
  A major enhancement over the current third-party channel plugin; **not required for the MVP**,
  but a committed north-star target. `ASSUMPTION:` ships as a Mosaic-owned plugin so the fleet
  controls Discord UX (threads, reactions, attachments, per-thread context) end-to-end.
- **Matrix on a local homeserver — strategic future transport.** **F4 (in progress) IS the Matrix
  connector**: an orchestrator chat connector speaking the Matrix client-server API against a
  self-hosted homeserver (Conduit default, Synapse alt). Matrix is named here as the strategic
  future transport — peer to tmux/Discord, not superseded by them.
- **tmux fleet attack-surface hardening.** Many always-on tmux sessions are an attack surface;
  `tmux send-keys` / socket access could enable malicious action against agents directly.
  Mitigations to build toward: socket ownership/perms, per-tenant socket isolation (already an
  invariant), authenticated `agent-send`, and an audit of who can write to any pane. **Post-MVP
  unless a P0 surfaces.** The control-plane register reinforces this (gateway-API access = no raw
  DB creds in panes). A not-started risk-assessment + mitigation-plan task rides the Fleet `TASKS.md`.

## Assumptions (veto-able)

- `ASSUMPTION:` first-class runtimes = claude, codex, pi, opencode; a "role" (analyst,
  finance, researcher) = persona + skills + tools on top of a runtime, shipped as a
  starter role library in the framework.
- `ASSUMPTION:` the cross-host control plane is the **federation** layer (W1), not a
  separate `fleetd` daemon.
- `ASSUMPTION:` Fleet is workstream **W-FLEET** under `mvp-20260312`; a rollup row in
  `docs/TASKS.md` and a workstream declaration in `MISSION-MANIFEST.md` are proposed to
  the MVP orchestrator, not written by this workstream.
- `ASSUMPTION:` OAuth-subscription runtimes (Claude sub, Codex sub) expose a machine-readable
  current-usage-vs-limit signal the fleet can poll/ingest; if a provider exposes no such signal,
  that provider's accounts fall back to API-style hard-ceiling budgeting only (no auto-pacing).
- `ASSUMPTION:` budget policy lives at the orchestrator + routing layer and is surfaced through the
  same CLI→TUI→webUI parity (MVP-X1) as the rest of fleet state — not a separate budgeting daemon.
- `ASSUMPTION:` the 200k session cap is enforced by Claude Code settings/env composition (model
  variant + `autoCompactWindow`), not by a Mosaic wrapper; a wrapper is the fallback only if the
  harness later removes those knobs.
- `ASSUMPTION:` The central register (Postgres `fleet` schema + gateway API + forge as dispatcher) is
  the Phase 4–5 control plane, begun after Phase 2 observability is proven. It is a dedicated
  **W-FLEET** sub-workstream entry, not a separate mission. The dispatcher is `@mosaicstack/forge`
  (reused, not a new daemon); the only new fleet-owned code is the thin **`forge-exec` TaskExecutor
  adapter** (suggested package `packages/forge-exec`, `ForgeTask` → `agent-send.sh`), tracked as a
  Gitea issue and built post-PoC.

---

> **Release procedure (drift re-capture, 2026-06-22):** `mosaic update` only propagates new fleet
> commands when the **CLI version is bumped** — without a version bump, fleet command changes never
> reach installed hosts. The release/version-bump procedure (bump → publish → `mosaic update`
> [→ `--relaunch`]) must be documented so fleet changes actually land. (Also feeds the budgeting
> workstream.)
>
> **Tracked separately (not in scope for this doc PR):** **#622** PRD/mission/task projected+actual
> spend template standard · **#623** anonymized spend telemetry → mosaicstack.dev (product) ·
> **#625** `tenant_id` roster-schema field (multi-tenant; invariant #1 home) · **#628** `forge-exec`
> TaskExecutor adapter (post-PoC). This PR records **doctrine only** — no implementation.