mosaicstack/stack

Jason Woltje 3f69d45334

docs(fleet): consolidate north-star doctrine (budget + control plane + identity) (#629)

Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>

2026-06-22 21:08:41 +00:00

Mosaic Fleet — North Star

Workstream: W-FLEET (Fleet) under mission mvp-20260312
Umbrella: docs/MISSION-MANIFEST.md · docs/PRD.md (Mosaic Stack v0.1.0)
Status: doctrine, authored 2026-06-20.
Owner of this file: Fleet workstream lead.
This document does not modify the MVP rollup; a rollup row is proposed, not written here.

Vision

A customizable, multi-tenant fleet of always-on AI agents — each defined by role, materialized as a durable, joinable runtime session, coordinated by the proven orchestrator/worker model, and observable end-to-end across hosts. Coding today; finance, analytics, research as roster entries tomorrow — same primitives, different roster. The fleet is the agent-session execution layer of the Mosaic Stack MVP: the thing federation makes reachable across hosts and the webUI/TUI/CLI make visible.

The USC tmux PoC (durable sessions + agent-send comms) proved the model. This workstream makes it an official, observable, multi-tenant Mosaic Stack capability.

The Fleet as means of production (bootstrapping)

The Fleet has a dual role, and that is the point:

As product — a multi-tenant agent-fleet capability of Mosaic Stack (this workstream).
As means of production — the orchestrator/worker fleet that actually builds the entire MVP (federation W1, webUI, TUI, CLI, and the Fleet itself).

We are building the system that builds the system. Every other MVP workstream is delivered by the fleet, so fleet observability and control are not merely product features — they are the operational floor of the whole delivery effort. If we cannot see and steer the agents, we cannot trust what they ship. This is why Phase 2 (observability) leads: it is the instrument panel for the factory, dogfooded on the live fleet that is, recursively, building Mosaic Stack.

The discipline that makes great power safe is the same gate chain the fleet enforces: independent review before merge, green CI, honest completion, decide-and-inform cadence, and no irreversible action without authority. The bootstrap is only as trustworthy as those gates.

Alignment with MVP cross-cutting requirements

The Fleet inherits — does not re-invent — the MVP's hard requirements:

MVP req	What it means for the Fleet
MVP-X1 three-surface parity	fleet observability/control reachable via CLI + TUI + webUI (CLI first; webUI is required for parity, not optional)
MVP-X2 multi-tenant isolation	one tenant = one Linux uid (own `systemd --user`, socket, `~/.config/mosaic`); no cross-tenant leakage
MVP-X3 auth (BetterAuth/SSO)	operator→fleet and cross-host views are auth-gated through the platform's existing auth
MVP-X4 quality gates	`pnpm typecheck`/`lint`/`format:check` green before any push
MVP-X5 federated topology	cross-host fleet visibility rides the federation boundary (W1), not a bespoke broker
MVP-X6 OTEL tracing	heartbeats, sends, and lifecycle events emit spans; `traceparent` crosses the federation boundary
MVP-X7 trunk merge	branch from `main`, squash-merge via PR, never push to `main`
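
For MVP-X6, the `traceparent` header that crosses the federation boundary follows the W3C Trace Context layout (`version-traceid-parentid-flags`). The sketch below is illustrative only: the function names are hypothetical and say nothing about how the gateway or its OTEL SDK actually builds the header.

```python
import re
import secrets
from typing import Optional, Tuple

# W3C Trace Context, version 00: four lowercase-hex fields joined by "-".
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Build a fresh traceparent value (hypothetical helper)."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex chars
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(header: str) -> Optional[Tuple[str, str, str]]:
    """Return (trace_id, parent_id, flags), or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero trace or parent ids are invalid per the spec.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return trace_id, parent_id, flags
```

A receiving host would keep the parsed `trace_id` and mint a new parent id for its own span, so heartbeats and sends stay on one trace across hosts.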

The stack — where every concern lives

One definition is the source of truth; the session is how it runs.

Layer	Owner	Phase-2 reality	Destination
Definition + identity + auth	gateway / `mosaic-as` (scoped tokens, #541)	`roster.yaml` (tenant-tagged)	one definition; `mosaic agent --new` materializes it
Tenancy boundary	Linux uid per tenant (linger, own `systemd --user`, own socket, own `~/.config/mosaic`)	one tenant: `jarvis` = tenant zero	uid-per-tenant; federation aggregates across hosts
Runtime	per-tenant tmux session on isolated socket	dogfood stub sessions (live now on `mosaic-factory`)	claude/codex/pi/opencode TUIs
Liveness	heartbeat protocol every runtime answers	protocol defined + dogfood stub answers it	all runtimes answer; "healthy" ≠ "pane alive"
Observation	read-only `watch` (native tmux) + `pipe-pane` stream	CLI `watch`/`ps`; explicit opt-in `attach` for control	+ auth-gated webUI streams
Control plane	federation across hosts × tenants	records already carry `tenant_id` + `host`	federated gateways expose fleet state; webUI in Phase 5
Central register	Postgres `fleet` schema (gateway instance); access via gateway API only	none in PoC (files + `roster.yaml`)	agents, missions, tasks, heartbeats, spend — single network-accessible SSOT; docs = generated projections
Budget / spend governance	per-tenant budget policy ingested by the orchestrator + routing layer	none today (spend is unmetered)	usage-vs-limit feedback ingested; spend auto-paced to the limit window; per-provider/per-account/concurrency/API-$ budgets enforced

PoC socket hygiene: the PoC fleet runs on the default tmux socket (no -L). The named production-isolation socket is mosaic-fleet (matches the product brand); an absent roster socket_name means the default socket everywhere (spawn, fleet ps, onboarding cheat-sheet). The legacy dogfood canary still runs on the old mosaic-factory socket pending migration.
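
The socket rule can be sketched as one helper that every caller (spawn, `fleet ps`, the onboarding cheat-sheet) would share. `tmux -L <name>` is the real tmux flag for a named socket; the function name and the idea of a shared helper are assumptions for illustration, not the CLI's actual code.

```python
from typing import List, Optional


def tmux_base_argv(socket_name: Optional[str]) -> List[str]:
    """Return the tmux command prefix for a roster entry (hypothetical helper).

    An absent (None or empty) socket_name means the default tmux socket,
    so no -L flag is emitted.
    """
    if not socket_name:
        return ["tmux"]
    return ["tmux", "-L", socket_name]


print(tmux_base_argv(None))            # ['tmux']
print(tmux_base_argv("mosaic-fleet"))  # ['tmux', '-L', 'mosaic-fleet']
```

Routing every call site through one function like this is what keeps "absent means default" consistent everywhere.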

Operating model (inherited, not reinvented)

The AI-guide law stands: one accountable orchestrator, isolated workers that stop at PR-open, the serialized gate chain (independent review → green CI → diff-sanity → squash-merge → verify), decide-and-inform cadence, and a durable board so missions survive session death. The Fleet is the infrastructure under this model. See mosaicstack-aiguide whitepapers 01 (inter-agent comms) and 03 (orchestration model) for the rationale.

Fleet roster — the two-agent floor and the role library

A fleet is never a single agent. The minimum viable fleet is two:

Role	Mandate	Boundaries
Orchestrator	The user's single point of contact. Owns the general flow, keeps agentic actions on-target, and adds/removes agents from the fleet at will to meet goals and user needs. Exactly one per fleet (the existing R5 invariant).	Delegates source work; never the sole worker.
Enhancer	The fleet's continuous-improvement loop. Monitors fleet activity, analyzes for enhancements/optimizations, builds a plan of remediation, and — with the orchestrator — upgrades fleet capability: tool creation/repair, skills, harness improvements, and bug reports filed to Mosaic Stack for proper remediation. Recommends which agents are needed.	Does not code, review code, or perform delivery tasks. Improvement and diagnosis only.

Why two, not one: the orchestrator drives delivery; the enhancer makes the fleet get better at delivering over time. The enhancer is how the fleet self-heals its tools, skills, and harnesses, and how real defects flow back to Mosaic Stack as bug reports. Together they are the irreducible core — every other role is added on demand.

A general fleet starts at this floor: the orchestrator (advised by the enhancer) materializes whatever roles prove necessary over the mission's life. Specialized presets (coding, research, etc.) seed additional roles up front, but all reduce to the same two-agent spine plus an on-demand role library:

Role profile	Purpose
orchestrator	point of contact, flow control, fleet composition (1 per fleet)
enhancer	fleet monitoring, optimization, tool/skill/harness upgrades, upstream bug reports
coder	implementation (worker; stops at PR-open)
code review	independent code review gate
security review	security/auth/secret review gate
research	investigation, synthesis, options analysis
board	deliberation panel — moonshot, contrarian, technical, business, financial lenses
operations	infra, deploy, health, incident response
…extensible	new profiles added as missions demand (orchestrator + enhancer decide)

Invariants — "maximal vision, incremental delivery, zero foreclosure"

Every artifact, starting Phase 2, MUST:

Carry tenant_id + host in schema and message addressing — even with one of each today.
Treat socket isolation ≠ invisibility: anything isolated is surfaced by one command.
Define healthy = answered a heartbeat within N seconds, never just "pane alive".
Make observation read-only by default; control is an explicit, separate, opt-in verb.
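
A minimal sketch of the first and third invariants together, assuming a heartbeat record shaped as below. The field names other than `tenant_id` and `host`, and the `max_age_s` parameter standing in for "N seconds", are illustrative and not the protocol's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Heartbeat:
    tenant_id: str      # carried even while only one tenant exists
    host: str           # carried even while only one host exists
    agent: str          # hypothetical field: the agent/session name
    answered_at: float  # hypothetical field: epoch seconds of the last answer


def is_healthy(last: Optional[Heartbeat], now: float, max_age_s: float) -> bool:
    """Healthy means a heartbeat was answered within max_age_s seconds.

    A live pane with no answered heartbeat (last is None) is not healthy.
    """
    if last is None:
        return False
    age = now - last.answered_at
    return 0 <= age <= max_age_s
```

Note that nothing here inspects the tmux pane: liveness of the pane never enters the health decision.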

OPS INVARIANT — runtime agents need a real TTY. Claude/Codex/pi/opencode agents cannot be bare-launched from a systemd ExecStart; a durable harness with a real PTY is required. This is why start-agent-session.sh launches into tmux and uses a MOSAIC_AGENT_COMMAND override rather than running the runtime directly under systemd.

Budget & token governance (first-class fleet concern)

Spend is a fleet-level resource, not a per-agent afterthought. The fleet treats token and API-dollar budget the way it treats liveness: a signal every runtime exposes and the control plane is accountable for. This rides the same primitives as everything else — tenant_id + host on every spend record, read-only metering by default, and the federation layer as the cross-host aggregation point (W1) — so budgeting is zero-foreclosure from day one even while one tenant exists.

Two spend regimes, one policy surface:

Regime	Feedback signal	Fleet obligation
OAuth-subscription runtimes (Claude sub, Codex sub)	runtime exposes current-usage-vs-limit within a rolling limit window	ingest the signal per sub-account; auto-pace agentic spend so the window is not exhausted early
API-token runtimes (metered per token)	provider billing / token counts	enforce hard $-spend ceilings; on breach, downgrade → queue → refuse (below)

Auto-pacing law (OAuth subs), EVEN-SPREAD default (Jason override, 2026-06-22): the fleet paces agentic token spend to consume the limit window evenly over remaining time: target rate = (remaining usage available) ÷ (remaining time in the window). Example: 100% of a 7-day window ≈ 14.29%/day (100 ÷ 7); the system tracks current usage and continuously re-splits the remainder evenly to hold pace, so a tenant that has used 40% after two days is re-paced to 60% ÷ 5 days = 12%/day. Anticipated token-spend-per-task is the budgeting informant: tasks are scheduled against the daily pace, not run until the quota is gone. Rationale: spreading delivery evenly beats rapidly exhausting usage and losing multiple days of momentum. Rapid pacing / overspend requires EXPLICIT user authorization; absent it, even-spread holds. Pacing is a control-plane decision, surfaced read-only before it throttles a lane.
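
The even-spread rule is a single division, re-evaluated as usage and time advance. A minimal sketch, with a hypothetical function name and percent-per-day units chosen only for illustration:

```python
def even_spread_rate(remaining_usage_pct: float, remaining_days: float) -> float:
    """Target spend rate in percent of the limit window per day.

    target rate = remaining usage available / remaining time in the window
    """
    if remaining_days <= 0:
        raise ValueError("window has no remaining time; wait for the reset")
    # Usage already exhausted means a zero target, never a negative one.
    return max(remaining_usage_pct, 0.0) / remaining_days


# Fresh 7-day window: 100 / 7
print(round(even_spread_rate(100.0, 7.0), 3))  # 14.286
# After 2 days with 40% used, the remainder is re-split over 5 days.
print(round(even_spread_rate(60.0, 5.0), 3))   # 12.0
```

Because the remainder is re-split on every evaluation, an early overspend lowers the later daily target automatically instead of requiring a separate correction step.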

Hard-cap breach behavior (ladder): when a budget ceiling is hit mid-work, the fleet downgrades first (opus → sonnet → haiku, then Claude → Codex), queues the lane at the cheapest floor until the window resets, and refuses only as a last resort. Refusal is never the first response to a breach.
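
The ladder can be read as a small decision function. In the sketch below the tier labels are hypothetical stand-ins for the models named above, and the condition for the final "refuse" step (no window reset to wait for) is an assumption, since the text only says refusal is the last resort.

```python
from typing import List, Optional, Tuple

# Hypothetical tier labels, ordered from most to least expensive:
# opus -> sonnet -> haiku, then Claude -> Codex.
LADDER: List[str] = ["claude-opus", "claude-sonnet", "claude-haiku", "codex"]


def on_breach(current_tier: str, window_resets: bool) -> Tuple[str, Optional[str]]:
    """Decide the response when a budget ceiling is hit mid-work.

    Returns (action, tier): downgrade first, queue at the cheapest floor
    second, refuse only when neither is possible.
    """
    idx = LADDER.index(current_tier)
    if idx + 1 < len(LADDER):
        return ("downgrade", LADDER[idx + 1])
    if window_resets:
        # Already at the cheapest floor: hold the lane until the reset.
        return ("queue", current_tier)
    # Assumption: refusal applies only when there is no reset to wait for.
    return ("refuse", None)
```

For example, `on_breach("claude-opus", True)` yields `("downgrade", "claude-sonnet")`; refusal is only reachable from the bottom rung.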

Spend accounting, learning & telemetry:

Multi-subscription auto-routing: a tenant with multiple subscriptions may let the fleet auto-route work to the account with the most available usage (within budget policy).
Historical spend learning: every task's token spend is recorded; historical data continuously updates known spend-per-task, typical daily spend, and projections — so estimates self-correct and pacing stays on target.
Projected + actual spend on artifacts (Mosaic Stack mandate): PRDs, missions, and task decomposition MUST note projected AND actual token spend — a Mosaic Stack process standard (template-level), tracked separately as #622.
Anonymized telemetry → mosaicstack.dev: spend data is reported (anonymous) to the mosaicstack.dev telemetry endpoint so other agents/fleets budget and optimize from real, anonymized data. Product workstream, tracked separately as #623.
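
Two of the items above reduce to small, testable functions: routing to the account with the most available usage, and updating a spend-per-task estimate from recorded history. This is a sketch under stated assumptions: the account labels, the units, and the choice of a plain running mean as the learning rule are illustrative, not the fleet's specified algorithm.

```python
from typing import Dict, Optional, Tuple


def pick_account(available_usage: Dict[str, float]) -> Optional[str]:
    """Route to the account with the most available usage.

    Accounts with nothing left are skipped; None means no account can take work.
    """
    candidates = {name: left for name, left in available_usage.items() if left > 0}
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name])


def update_estimate(prev_mean: float, samples: int, actual: float) -> Tuple[float, int]:
    """Fold one task's actual token spend into a running mean (assumed rule)."""
    samples += 1
    return prev_mean + (actual - prev_mean) / samples, samples


print(pick_account({"sub-a": 12.5, "sub-b": 40.0, "sub-c": 0.0}))  # sub-b
print(update_estimate(1000.0, 4, 1500.0))                          # (1100.0, 5)
```

Budget policy would still sit in front of `pick_account`: the function only ranks accounts the policy already allows.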

User-settable budgets (the policy surface). A tenant operator can set budgets for every configured provider (per-provider ceilings), the account-to-task mapping, the agentic routing flow, concurrency (the spend multiplier), and hard API-token $-limits. Budgets are enforced at the orchestrator + routing boundary, not inside individual workers (a worker never decides its own budget — see delegation discipline).

Budget CLI UX (#558): mosaic budget set --reset-at sets the window reset; reset-datetimes carry confidence tags (user / provider / estimated / unknown); and urgency/criticality is a dispatch-gate modifier — high-urgency work may override even-spread pacing within authorization. (Also feeds the budgeting workstream, not only this doc.)

Observation model

Verb	Behavior
`mosaic fleet ps`	one table joining systemd + tmux + process + idle + last-heartbeat, with drift + boot-enable flags
`mosaic agent watch <name>`	read-only join (grouped session / `-r`), no resize tyranny, no keystrokes
`mosaic agent attach <name>`	explicit interactive takeover (the only path that can type)
`mosaic agent send <name> --verify`	confirms message accepted, not merely keystroke-injected

Why the current PoC blocks observation: sessions live on the isolated mosaic-factory socket (invisible to default tmux ls), the only sanctioned read is capture-pane (blank for full-screen TUIs), and attach is read-write + resizes the session. The verbs above restore "join and observe" safely.
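
The join behind `mosaic fleet ps` can be sketched as a merge of three independent views of the same agents. Everything below is an assumption for illustration: the row fields, the input shapes, and in particular the definition of "drift" as systemd and tmux disagreeing about whether an agent exists.

```python
from typing import Dict, List, Optional, Set


def fleet_ps(
    systemd_active: Set[str],
    tmux_sessions: Set[str],
    last_heartbeat: Dict[str, float],
    now: float,
) -> List[dict]:
    """Join systemd units, tmux sessions, and heartbeats into one table."""
    rows = []
    for name in sorted(systemd_active | tmux_sessions):
        in_systemd = name in systemd_active
        in_tmux = name in tmux_sessions
        hb: Optional[float] = last_heartbeat.get(name)
        rows.append({
            "agent": name,
            "systemd": in_systemd,
            "tmux": in_tmux,
            "hb_age_s": None if hb is None else now - hb,
            # Assumed meaning of drift: the two sources disagree.
            "drift": in_systemd != in_tmux,
        })
    return rows
```

The function only reads its inputs, which matches the read-only default: producing the table never touches a session.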

Control plane & central register

Why the register must be Postgres

The fleet is multi-host (w-jarvis + dragon-lin + future). A SQLite file is a local file — it is not a network service and cannot be shared across hosts. Beyond topology, Postgres MVCC eliminates the concurrent-writer corruption class Hermes hit with SQLite under multi-agent access.

Access is exclusively through the gateway API (apps/gateway — typed, auth-gated, scoped tokens). No agent or dispatcher pane ever holds a raw DB credential; a compromised pane cannot corrupt or exfiltrate the register.

Architecture (layers)

Layer	Responsibility	Implementation
Register	Source of truth: agents, missions, tasks, heartbeats, spend	Postgres `fleet` schema — existing stack instance (`@mosaicstack/db`)
Access	Typed, auth-gated API	Gateway `fleet/*` routes
Dispatcher	Brief classification, BOD review, planning/coding/review/test/deploy sequencing + gates → fleet task dispatch	forge pipeline engine (`runPipeline`/`resumePipeline`, brief classifier, BOD) + thin `forge-exec` adapter → `agent-send.sh`; NOT a new daemon — forge is reused, only stage→agent dispatch is new
Orchestrator (Mos)	Goals, missions, judgment, user/PA interface	Context-light; sets intent → re-engages only for decisions

Dispatcher = forge (reuse, do not rebuild)

The dispatcher is not new work: it is @mosaicstack/forge, a fully-implemented software-factory pipeline engine (brief → Board-of-Directors review → 3 planning stages → coding → review/remediation → testing → deploy). Forge already provides runPipeline/resumePipeline, a brief classifier, and a BOD persona loader, so the fleet does not re-implement sequencing, gate logic, or brief classification. The only new fleet-owned code is a thin forge-exec TaskExecutor adapter (ForgeTask → agent-send.sh to a named agent) — forge's single missing piece — tracked as a Gitea issue and built post-PoC. The Postgres register backs forge's pipeline state (durable resumePipeline, cross-host) in addition to cross-project missions/tasks/Kanban. The north-star 'board' role IS forge's Board-of-Directors — reused from forge, not a new role implementation.

Docs as projections

docs/TASKS.md and MISSION-MANIFEST.md are generated projections of the DB, not hand-maintained. The dispatcher (or a scheduled job) renders Markdown from fleet.* tables and commits the output. DB is authoritative; docs are for human reference.
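
A projection is a pure function from register rows to Markdown. The sketch below assumes rows arrive as dictionaries with hypothetical `id`, `title`, and `status` keys; the real `fleet.*` columns and the TASKS.md layout are not specified here.

```python
from typing import Dict, Iterable, List


def render_tasks_md(rows: Iterable[Dict[str, str]]) -> str:
    """Render task rows as a Markdown table (hypothetical column set)."""
    lines: List[str] = ["| id | title | status |", "| --- | --- | --- |"]
    # Sort by id so the same DB state always yields the same file.
    for row in sorted(rows, key=lambda r: r["id"]):
        lines.append(f"| {row['id']} | {row['title']} | {row['status']} |")
    return "\n".join(lines) + "\n"


print(render_tasks_md([
    {"id": "T-2", "title": "agent watch", "status": "planned"},
    {"id": "T-1", "title": "fleet ps", "status": "in-progress"},
]))
```

Deterministic ordering matters because the output is committed: an unchanged DB should produce an empty diff.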

Spend

fleet.spend_ledger records projected and actual token spend per agent/mission/task (ties to issue #622). The dispatcher enforces budget caps before dispatching. Mos reads the roll-up via API — no raw DB access, no context-bloating dumps.
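
The pre-dispatch cap check can be sketched as follows. The record fields mirror the per-agent/mission/task granularity described above, but the field names, the token units, and the rule "count actual spend when known, otherwise the projection" are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LedgerEntry:
    agent: str
    mission: str
    task: str
    projected_tokens: int
    actual_tokens: Optional[int] = None  # None until the task has finished


def committed_tokens(entries: Iterable[LedgerEntry]) -> int:
    """Spend already committed: actual when known, otherwise the projection."""
    return sum(
        e.actual_tokens if e.actual_tokens is not None else e.projected_tokens
        for e in entries
    )


def may_dispatch(entries: Iterable[LedgerEntry], cap_tokens: int, projected_new: int) -> bool:
    """Allow dispatch only if the new task's projection fits under the cap."""
    return committed_tokens(entries) + projected_new <= cap_tokens
```

Counting in-flight projections keeps two concurrent lanes from each claiming the same remaining headroom.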

Federation

Cross-host fleet state flows through federated gateway queries (existing federation_peers / federation_grants machinery). This is the existing north-star invariant: control plane rides federation (W1), not a bespoke broker. No new broker introduced.

Scope

This is Phase 4–5 of this roadmap, materialized. It MUST NOT block the PoC (which runs correctly on files + roster.yaml). Begin when Phase 2 heartbeat protocol is stable and concurrent-agent count makes file coordination the bottleneck.

Open sub-decision

Dedicated Postgres instance vs. dedicated schema in the existing instance. Recommendation: dedicated schema, existing instance (a migration file, not new infra); re-evaluate if isolation or write-volume demands it.

Phased roadmap

Phase	Outcome	Status
0–1	tmux PoC, hardening, published CLI v0.0.34 (#565–#568)	✅ done
2 — Observability	`fleet ps` (host+tenant aware join), heartbeat protocol + dogfood stub answers it, `agent watch` (read-only), `agent send --verify` receipts	▶ now
3 — Real runtimes	claude/codex/pi/opencode answer heartbeat; hybrid lifecycle (core always-on: orchestrator + enhancer; ephemeral workers per lane)	planned
4 — Unified definition	one agent schema in gateway; `mosaic agent --new` → materialized per-tenant session; uid-tenant provisioning; `fleet` schema migration + `forge-exec` TaskExecutor adapter (forge → `agent-send.sh`)	planned
5 — Control plane	federation-backed cross-host × cross-tenant fleet view; webUI (surface chosen then) for MVP-X1 parity; central register live (spend ledger, docs-as-projections, multi-host Kanban)	planned

Decisions of record (2026-06-20, with Jason)

Agent model: config defines, session runs (gateway = definition/identity/auth; tmux = runtime).
Tenancy: multi-tenant from the start; isolation = per-tenant Linux uid.
Health: heartbeat required (dogfood stub implements the protocol now).
Lifecycle: hybrid — core always-on + ephemeral workers per lane.
Observation: read-only default, opt-in takeover.
Multi-host: designed-for from day one; control plane rides federation (W1).
Delivery: CLI-first now, dogfood against the live stub fleet; webUI deferred to Phase 5.
Runtimes: fleet agents default to Codex / pi-on-Codex; Claude is reserved for Claude Code only (avoid alternate-harness API pricing). Validated durable recipe: mosaic yolo pi --model openai-codex/gpt-5.5:high. Durable detached launch requires the runtime-bin on PATH (baked into the pane command) + boot-survival (enable + linger), which fleet init should automate.

Decisions of record (2026-06-22, with Jason)

Two-agent floor: every fleet has, at minimum, an orchestrator and an enhancer. The orchestrator is the user's point of contact and composes the fleet; the enhancer runs the continuous-improvement loop (monitor → analyze → remediate → upgrade tools/skills/harness → file Mosaic Stack bug reports) and does not code or review.
Role library: orchestrator, enhancer, coder, code review, security review, research, board (moonshot/contrarian/technical/business/financial), operations — extensible; the orchestrator (advised by the enhancer) adds roles as missions demand.
Orchestrator chat connector: the orchestrator is reachable over a user-chosen connector (tmux now; Telegram/Discord/Matrix/Slack configurable). Validated live: "Mos" orchestrator on Discord via the Claude Code discord channel plugin (w-jarvis).
Session context cap = 200k tokens (GLOBAL to all Claude sessions): Claude Code sessions are capped at a max 200k-token context window. Long-running sessions extended toward 1M tokens have proven worse in practice (degraded steering, off-plan divergence); 200k is the standard. Enforcement split: the window lives in ~/.claude/settings.json (host-global) as "autoCompactWindow": 200000 + "autoCompactEnabled": true; the 1M-disable lives in launch ENV (CLAUDE_CODE_DISABLE_1M_CONTEXT=1, plus CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000) wherever a [1m] model can be selected (mos-claude.service + the fleet Claude launcher), so every Claude agent is capped at spawn. (settings = window; env = 1M-disable.)
Worker context bound (#8): workers are kept context-bounded via the ephemeral-per-lane lifecycle + native compaction, not via the 200k knob. The explicit autoCompactWindow 200k knob stays Claude-specific — the principle (bounded context) extends to workers, the knob does not.
Orchestrator delegation discipline: the orchestrator delegates all delivery work to subagents / workflows / ultracode / coder agents and confines its own context to orchestration + the personal-assistant lane. Keeping delivery out of the orchestrator's window keeps its context unpolluted and measurably reduces off-plan divergence. The orchestrator coordinates and decides; it does not implement.
Budget governance is fleet doctrine: token/API-dollar budgeting is a first-class fleet concern (see "Budget & token governance"). OAuth-sub usage-vs-limit feedback is ingested per account, spend is auto-paced EVEN-SPREAD over remaining time (rapid/overspend only on explicit authorization), spend is tracked historically to self-correct per-task/daily estimates, multi-sub tenants may auto-route by available usage, and operators set budgets per provider, per account-to-task mapping, per routing flow, per concurrency level, and as hard API-$ ceilings.
Spend accounting is a Mosaic Stack process mandate: PRDs, missions, and task decomposition MUST carry projected + actual token spend; used locally for pacing and reported as anonymized telemetry to mosaicstack.dev. The template standard (#622) and telemetry product (#623) are tracked separately.
Unified identity = "Fleet" (Jason, 2026-06-22): the product is Mosaic Fleet — one unified user-facing identity and CLI surface. forge is the Fleet's internal delivery/orchestration engine (not a separate product); the control-plane Postgres register is the Fleet's register; workers/runtime are the Fleet substrate. "factory" is RETIRED as a product term — it was only ever the software-factory concept (which forge implements) and the old mosaic-factory tmux socket name. The production-isolation socket is now mosaic-fleet (matches the product brand); the legacy dogfood canary remains on the old mosaic-factory socket pending migration. Code stays layered (forge + fleet + control-plane as internal layers); only the identity + CLI surface unify under Fleet.
Role-based session naming (Jason, 2026-06-22): agent tmux sessions are named by role (orchestrator, enhancer, research, coder0-0, …), not by persona. Persona lives in SOUL.md; the front-end / Discord presents a friendly alias (e.g. "Mos" = the orchestrator's alias). The session name is the stable addressing handle; the alias is presentation.

Control plane & central register

Store: Postgres (existing stack instance, dedicated fleet schema via @mosaicstack/db). SQLite rejected: (1) it is a local file — structurally incompatible with a multi-host fleet; (2) concurrent multi-agent writes caused repeated corruption in Hermes. "SQLite + access service" rejected as reinventing a DB server badly; "LLM agent gating DB access" rejected as slow, expensive, and a single point of failure.
Access: gateway API only (apps/gateway, fleet/* routes). No raw DB credentials in any agent/dispatcher pane — directly mitigates the tmux attack-surface concern.
Dispatcher = forge (reuse, not a new build): the dispatcher IS @mosaicstack/forge's pipeline engine (runPipeline/resumePipeline + brief classifier + BOD persona loader), a fully-implemented software-factory pipeline (brief → BOD review → 3 planning stages → coding → review/remediation → testing → deploy). We do not design/build a new dispatcher and do not re-implement sequencing, gate logic, or brief classification. The only new fleet-owned piece is a thin forge-exec TaskExecutor adapter (suggested package packages/forge-exec) mapping a ForgeTask → agent-send.sh dispatch to a named fleet agent — forge's single missing piece. It is tracked as a Gitea issue and built post-PoC (not now).
Register backs forge: the Postgres fleet register is genuinely new (neither forge nor the fleet has cross-project state). It BACKS forge's pipeline state (durable resumePipeline, cross-host) plus cross-project missions/tasks/Kanban.
'board' role = forge BOD: the north-star role-library 'board' role IS forge's Board-of-Directors — reused, not reinvented.
Orchestration vs. dispatch: Orchestrator (Mos) sets intent and handles judgment; forge works the mechanical pipeline (sequencing, gates, status transitions, spend ledger). LLM escalation reserved for judgment: mission decomposition, re-planning on failure.
Spend in the register: fleet.spend_ledger tracks projected vs. actual tokens per agent/mission/task; ties to issue #622.
Docs as projections: docs/TASKS.md and MISSION-MANIFEST.md become generated exports of the DB, not hand-maintained.
Sub-decision pending: dedicated schema in existing PG instance (recommended) vs. dedicated PG instance. Revisit if isolation or write-volume demands it.

Future enhancements (north-star, post-MVP — not on the MVP track)

Mosaic Claude Discord Plugin — a first-party Mosaic Discord connector that properly implements the basic Discord functions and native Discord threads. Threads let a user separate conversation topics with the orchestrator (the pattern proven by the Hermes agent). A major enhancement over the current third-party channel plugin; not required for the MVP, but a committed north-star target. ASSUMPTION: ships as a Mosaic-owned plugin so the fleet controls Discord UX (threads, reactions, attachments, per-thread context) end-to-end.
Matrix on a local homeserver — strategic future transport. F4 (in progress) IS the Matrix connector: an orchestrator chat connector speaking the Matrix client-server API against a self-hosted homeserver (Conduit default, Synapse alt). Matrix is named here as the strategic future transport — peer to tmux/Discord, not superseded by them.
tmux fleet attack-surface hardening. Many always-on tmux sessions are an attack surface; tmux send-keys / socket access could enable malicious action against agents directly. Mitigations to build toward: socket ownership/perms, per-tenant socket isolation (already an invariant), authenticated agent-send, and an audit of who can write to any pane. Post-MVP unless a P0 surfaces. The control-plane register reinforces this (gateway-API access = no raw DB creds in panes). A not-started risk-assessment + mitigation-plan task rides the Fleet TASKS.md.

Assumptions (veto-able)

ASSUMPTION: first-class runtimes = claude, codex, pi, opencode; a "role" (analyst, finance, researcher) = persona + skills + tools on top of a runtime, shipped as a starter role library in the framework.
ASSUMPTION: the cross-host control plane is the federation layer (W1), not a separate fleetd daemon.
ASSUMPTION: Fleet is workstream W-FLEET under mvp-20260312; a rollup row in docs/TASKS.md and a workstream declaration in MISSION-MANIFEST.md are proposed to the MVP orchestrator, not written by this workstream.
ASSUMPTION: OAuth-subscription runtimes (Claude sub, Codex sub) expose a machine-readable current-usage-vs-limit signal the fleet can poll/ingest; if a provider exposes no such signal, that provider's accounts fall back to API-style hard-ceiling budgeting only (no auto-pacing).
ASSUMPTION: budget policy lives at the orchestrator + routing layer and is surfaced through the same CLI→TUI→webUI parity (MVP-X1) as the rest of fleet state — not a separate budgeting daemon.
ASSUMPTION: the 200k session cap is enforced by Claude Code settings/env composition (model variant + autoCompactWindow), not by a Mosaic wrapper; a wrapper is the fallback only if the harness later removes those knobs.
ASSUMPTION: The central register (Postgres fleet schema + gateway API + forge as dispatcher) is the Phase 4–5 control plane, begun after Phase 2 observability is proven. It is a dedicated W-FLEET sub-workstream entry, not a separate mission. The dispatcher is @mosaicstack/forge (reused, not a new daemon); the only new fleet-owned code is the thin forge-exec TaskExecutor adapter (suggested package packages/forge-exec, ForgeTask → agent-send.sh), tracked as a Gitea issue and built post-PoC.

Release procedure (drift re-capture, 2026-06-22): mosaic update only propagates new fleet commands when the CLI version is bumped — without a version bump, fleet command changes never reach installed hosts. The release/version-bump procedure (bump → publish → mosaic update [→ --relaunch]) must be documented so fleet changes actually land. (Also feeds the budgeting workstream.)

Tracked separately (not in scope for this doc PR): #622 PRD/mission/task projected+actual spend template standard · #623 anonymized spend telemetry → mosaicstack.dev (product) · #625 tenant_id roster-schema field (multi-tenant; invariant #1 home) · #628 forge-exec TaskExecutor adapter (post-PoC). This PR records doctrine only — no implementation.
