feat(fleet): provision roster from system-type profile (H3) #665

Merged
jason.woltje merged 3 commits from feat/h3-fleet-provision into main 2026-06-24 19:48:55 +00:00
Owner

H3 — Fleet provisioning

Turns a declared system type (a profile) into a concrete, parser-valid roster.yaml. DRY-RUN-FIRST and reviewable.

Command surface

mosaic fleet provision --profile <id> [--full] [--write] [--force]
  • Default (no --write): DRY RUN. Prints the roster.yaml it WOULD generate to stdout + a topology summary (name / class / runtime / reports_to / persona-resolution layer). Writes nothing, exits 0.
  • --full: materialize the ENTIRE profile roster (all seats, multiplicity expanded). Without --full: only the floor seats (always-staffed minimum).
  • --write: write to <mosaicHome>/fleet/roster.yaml. Refuses to overwrite an existing roster unless --force is also given.
  • Validates the profile first (reuses validateProfile, override-aware class set). Fails with a class-naming message if any class doesn't resolve — never emits a roster referencing a nonexistent persona.

mosaicHome = process.env['MOSAIC_HOME'] ?? join(homedir(), '.config', 'mosaic').
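The dry-run, --write, and --force rules above can be sketched as a small guard around the file write. This is a minimal illustration, not the code in fleet-provision.ts: `persistRoster`, `ProvisionWriteOptions`, and `PersistResult` are hypothetical names, and the roster text is a placeholder string rather than real generated YAML.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join } from 'node:path';

// Hypothetical option bag; the real command's option names may differ.
interface ProvisionWriteOptions {
  write?: boolean;
  force?: boolean;
}

type PersistResult =
  | { written: false; reason: 'dry-run' }
  | { written: true; path: string };

// Hypothetical helper: applies the dry-run / --write / --force rules
// to roster text that has already been rendered.
function persistRoster(
  mosaicHome: string,
  yamlText: string,
  opts: ProvisionWriteOptions,
): PersistResult {
  if (!opts.write) {
    // Default path: nothing touches the filesystem.
    return { written: false, reason: 'dry-run' };
  }
  const target = join(mosaicHome, 'fleet', 'roster.yaml');
  if (existsSync(target) && !opts.force) {
    throw new Error(`Refusing to overwrite ${target}; pass --force to replace it.`);
  }
  mkdirSync(dirname(target), { recursive: true });
  writeFileSync(target, yamlText, 'utf8');
  return { written: true, path: target };
}

// Usage against a throwaway directory standing in for <mosaicHome>.
const demoHome = mkdtempSync(join(tmpdir(), 'mosaic-home-'));
const rosterPath = join(demoHome, 'fleet', 'roster.yaml');

const dryRun = persistRoster(demoHome, '# placeholder roster\n', {});
const existsAfterDryRun = existsSync(rosterPath);

const firstWrite = persistRoster(demoHome, '# placeholder roster\n', { write: true });

let clobberRefused = false;
try {
  persistRoster(demoHome, '# changed\n', { write: true });
} catch {
  clobberRefused = true;
}

persistRoster(demoHome, '# forced\n', { write: true, force: true });
const finalText = readFileSync(rosterPath, 'utf8');
```

The four calls correspond to the four --write paths listed under Tests: dry-run writes nothing, a fresh home gets the file, a second write is refused, and --force overwrites.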

Reuse (no core changes to fleet-profiles.ts / fleet-personas.ts)

  • Profile parse/validate: loadProfile + validateProfile + listPersonaClassesWithOverrides (fleet-profiles.ts).
  • Persona resolution: resolvePersona (fleet-personas.ts), override-aware, used both to confirm each class resolves and to report its baseline/override layer.
  • Wiring mirrors registerFleetProfileCommand / registerFleetPersonaCommand exactly (same mosaicHomeFor closure on the parent --mosaic-home).

Generation-rule defaults (each documented inline + reviewable)

  1. Seat names: multiplicity 1 → class; N>1 → class0,class1,… (e.g. code ×2 → code0/code1). Deterministic, in profile roster order.
  2. persistent_persona: true for floor classes + the lead; omitted otherwise.
  3. reset_between_tasks: true for non-floor, non-lead execution seats; floor/lead omit it (mirrors today's coders resetting while orchestrator/enhancer don't).
  4. Runtime policy: every seat → runtime: claude via a single centralized resolveSeatRuntime(). (see OPEN QUESTION below).
  5. Scaffold: emits the same generic non-personal scaffold as the committed example presets (socket_name: mosaic-fleet, holder_session: _holder, working_directory: ~, claude + pi runtimes) so output is a drop-in valid roster. No operator-personal data copied.
  6. reports_to: DECISION: OMITTED from the written roster. The fleet.ts agent parser (normalizeAgent → assertKnownKeys) rejects unknown agent keys, and reports_to is NOT in its allow-list, so emitting it would break round-trip. reports_to is therefore shown in the dry-run topology summary only, not written. Confirmed against the parser's allow-list.
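Rules 1 to 3 can be sketched as a single expansion pass over the profile's roster entries. This is an illustration under stated assumptions, not the shipped generator: the `ProfileEntry` shape (`multiplicity`, `floor`, `lead`) and the `expandSeats` name are hypothetical, and every non-floor, non-lead seat is treated as an execution seat.

```typescript
// Hypothetical profile entry shape; the real schema lives in fleet-profiles.ts.
interface ProfileEntry {
  class: string;
  multiplicity: number;
  floor?: boolean;
  lead?: boolean;
}

interface Seat {
  name: string;
  class: string;
  runtime: string;
  persistent_persona?: boolean;
  reset_between_tasks?: boolean;
}

// Expands entries into seats, in profile roster order.
function expandSeats(entries: ProfileEntry[], full: boolean): Seat[] {
  const seats: Seat[] = [];
  for (const entry of entries) {
    // Without --full, only floor seats are materialized.
    if (!full && !entry.floor) continue;
    for (let i = 0; i < entry.multiplicity; i++) {
      // Rule 1: multiplicity 1 keeps the bare class name; N>1 appends an index.
      const name = entry.multiplicity === 1 ? entry.class : `${entry.class}${i}`;
      // Rule 4 (current default): every seat runs on claude.
      const seat: Seat = { name, class: entry.class, runtime: 'claude' };
      if (entry.floor || entry.lead) {
        seat.persistent_persona = true; // Rule 2
      } else {
        seat.reset_between_tasks = true; // Rule 3
      }
      seats.push(seat);
    }
  }
  return seats;
}

// Example profile (illustrative values only).
const exampleEntries: ProfileEntry[] = [
  { class: 'orchestrator', multiplicity: 1, floor: true, lead: true },
  { class: 'enhancer', multiplicity: 1, floor: true },
  { class: 'code', multiplicity: 2 },
  { class: 'review', multiplicity: 1 },
];

const floorSeats = expandSeats(exampleEntries, false);
const fullSeats = expandSeats(exampleEntries, true);
```

With this example input, `floorSeats` holds only `orchestrator` and `enhancer`, while `fullSeats` adds `code0`, `code1`, and `review` in that order.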

⚠️ OPEN QUESTION for ratification (rule 4) — runtime-per-class policy

Provisioning currently defaults every seat to runtime: claude (the safe universal default — claude runs every persona, so the roster is always launchable). But today's live roster runs coders on pi + model_hint: openai-codex/gpt-5.5:high.

Should provisioning encode a class→runtime/model map, and if so, where?

  • (a) in the profile schema (per-roster-entry runtime/model_hint), or
  • (b) a separate runtime-policy file consumed by resolveSeatRuntime?

The policy is centralized in resolveSeatRuntime(klass, isFloor, isLead) so encoding either answer is a one-function edit; structural correctness holds regardless.
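To make the "one-function edit" claim concrete, here is a hedged sketch of how a class→runtime map could plug into the resolver. Only the function name and its first three parameters come from this PR; the `SeatRuntime` return shape, the `RuntimePolicy` type, and the optional fourth `policy` parameter are assumptions for illustration and do not settle whether the map should live in the profile schema (a) or a separate file (b).

```typescript
// Assumed return shape; the real resolveSeatRuntime may return something different.
interface SeatRuntime {
  runtime: 'claude' | 'pi';
  model_hint?: string;
}

// Hypothetical policy map keyed by persona class. Under option (a) it would be
// built from profile entries, under option (b) loaded from a policy file.
type RuntimePolicy = Partial<Record<string, SeatRuntime>>;

function resolveSeatRuntime(
  klass: string,
  _isFloor: boolean, // unused in this sketch
  _isLead: boolean, // unused in this sketch
  policy: RuntimePolicy = {},
): SeatRuntime {
  // With an empty policy this reproduces the current default: every seat on claude.
  return policy[klass] ?? { runtime: 'claude' };
}

// Current behaviour: no policy supplied.
const defaultCoder = resolveSeatRuntime('code', false, false);

// Illustrative policy mirroring the live roster described above.
const livePolicy: RuntimePolicy = {
  code: { runtime: 'pi', model_hint: 'openai-codex/gpt-5.5:high' },
};
const mappedCoder = resolveSeatRuntime('code', false, false, livePolicy);
const mappedOrchestrator = resolveSeatRuntime('orchestrator', true, true, livePolicy);
```

Classes missing from the map fall back to claude, so the roster stays launchable whichever option is ratified.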

Tests (vitest, all green)

  • floor (default) → exactly orchestrator + enhancer, correct flags + valid scaffold.
  • --full → all seats, code ×2 → code0/code1, deterministic ordering.
  • round-trip: generated YAML fed back through loadFleetRoster (the real parser) → parses with the expected agent count/classes (key correctness proof; also asserts no reports_to emitted).
  • override-aware: a roles.local-only custom persona resolves (no false unresolved error, persona=override); a bogus class FAILS with a clear message.
  • --write refuses to clobber without --force; --write --force overwrites; --write to fresh home creates the file; dry-run writes nothing.

Gates

  • pnpm install ✅
  • turbo build --filter=@mosaicstack/mosaic^... ✅ (12/12)
  • vitest run ✅ 618 passed (44 files; +9 new provision tests, +1 updated fleet subcommand-list assertion)
  • typecheck ✅ · lint ✅ · prettier ✅

🤖 Generated with Claude Code

jason.woltje added 1 commit 2026-06-24 17:29:56 +00:00
feat(fleet): provision roster from system-type profile (H3)
Some checks failed
ci/woodpecker/push/ci Pipeline failed
ci/woodpecker/pr/ci Pipeline failed
b2e080df57
Add `mosaic fleet provision --profile <id> [--full] [--write] [--force]`:
materialize a concrete, parser-valid roster.yaml from a declared system-type
profile. DRY-RUN by default (prints roster + topology, writes nothing); --write
persists under <mosaicHome>/fleet/roster.yaml and refuses to clobber without
--force. Reuses loadProfile/validateProfile (fleet-profiles) and resolvePersona
(fleet-personas, override-aware); generation policy is local + documented.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jason.woltje added 1 commit 2026-06-24 19:02:20 +00:00
perf(fleet): scan persona dirs once per provision
Some checks failed
ci/woodpecker/push/ci Pipeline failed
ci/woodpecker/pr/ci Pipeline was successful
0dc283bc3c
Fix #665 test timeout.
jason.woltje added 1 commit 2026-06-24 19:22:28 +00:00
test(fleet): raise provision spec timeout for I/O-bound CI runs (#665)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
1f33cb135d
Author
Owner

Independent review-of-record — mos-claude-1 (two-perspective) — APPROVE (correctness)

Reviewed PR head 1f33cb1 (3 commits) against origin/main: 5 files, +697/-0.

Design / correctness — clean.

  • fleet-provision.ts owns only the profile→roster generation policy; DRY-reuses loadProfile/validateProfile (fleet-profiles) and persona resolution (fleet-personas). Every generation RULE (1 naming, 2 persistent_persona, 3 reset_between_tasks, 4 runtime, 5 scaffold, 6 reports_to) is documented inline and justified.
  • Safety: DRY-RUN-FIRST default; --write persists; refuses to clobber an existing roster.yaml without --force (protects operator customizations). Verified the access/exists guard logic.
  • RULE 6 (reports_to tracked for topology but NOT emitted) is correct — the fleet.ts parser rejects unknown agent keys, so emitting it would break round-trip. Output is rendered via the same yaml serializer the parser uses, so it round-trips.
  • Integration is minimal and correct: resolvePersona refactored to delegate to a new reusable resolvePersonaFrom (scan-once primitive — the perf commit; backward compatible); command registered in fleet.ts mirroring profile/persona; fleet.spec.ts allow-list updated (provision, alphabetical).
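The round-trip concern behind RULE 6 can be illustrated with a strict allow-list check. This is a stand-in, not the parser in fleet.ts: `rejectUnknownAgentKeys` is a hypothetical name, and the key list below is assumed from the fields mentioned in this PR, so the real allow-list may contain other keys.

```typescript
// Assumed allow-list for illustration only; the real one lives in fleet.ts.
const KNOWN_AGENT_KEYS = new Set([
  'name',
  'class',
  'runtime',
  'model_hint',
  'persistent_persona',
  'reset_between_tasks',
]);

// Hypothetical stand-in for the normalizeAgent → assertKnownKeys path.
function rejectUnknownAgentKeys(agent: Record<string, unknown>): void {
  const unknown = Object.keys(agent).filter((key) => !KNOWN_AGENT_KEYS.has(key));
  if (unknown.length > 0) {
    throw new Error(`Unknown agent key(s): ${unknown.join(', ')}`);
  }
}

// In-memory seat as the generator might track it, including topology data.
const topologySeat = {
  name: 'code0',
  class: 'code',
  runtime: 'claude',
  reset_between_tasks: true,
  reports_to: 'orchestrator',
};

// reports_to feeds the dry-run summary; the emitted seat drops it.
const { reports_to: reportsTo, ...emittedSeat } = topologySeat;
const summaryLine = `${emittedSeat.name} reports to ${reportsTo}`;

let emittedSeatAccepted = true;
try {
  rejectUnknownAgentKeys(emittedSeat);
} catch {
  emittedSeatAccepted = false;
}

let rawSeatRejected = false;
try {
  rejectUnknownAgentKeys(topologySeat);
} catch {
  rawSeatRejected = true;
}
```

Under these assumptions the stripped seat passes the check while the seat still carrying `reports_to` is rejected, which is why the field is kept to the topology summary.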

Tests: comprehensive. fleet-provision.spec.ts (270 lines) contains real FS integration tests against the committed library: floor-only default (seats+flags+scaffold), --full (multiplicity expansion + deterministic ordering + seat count), round-trip back through the real loadFleetRoster parser (key correctness proof), override-aware resolution of a roles.local-only persona, a bogus class that FAILS with a clear message, and all four --write paths (clobber-refused / --force overwrite / fresh create / dry-run writes nothing).

CI (independent confirmation): push pipeline 1541 on this exact HEAD 1f33cb13 = success, all steps OK (ci-postgres, install, sanitization, typecheck, lint, format, test). The PR-event flakes (1542 unrelated @mosaicstack/storage vitest timeout; 1545 clone-step) are infra, not this diff (mergeable=true, no conflict).

Non-blocking note: RULE 4 defaults every seat to runtime: claude. This is intentional and flagged in the PR body — it guarantees a structurally-valid, launchable roster, and the class→runtime/model map (e.g. coders on pi + gpt-5.5) is centralized in resolveSeatRuntime so it's a one-edit follow-up. Appropriate scope for H3.

Verdict: APPROVE (correctness). Recommend merging once a PR-event pipeline actually goes green.

jason.woltje merged commit d7eaa19380 into main 2026-06-24 19:48:55 +00:00

Reference: mosaicstack/stack#665