Compare commits: main...docs/fleet
1 commit: d5951090e8
@@ -58,7 +58,3 @@ Active workstream is **W1 — Federation v1**. Workers should:
 ## F3-m3 — mosaic update re-seeds framework + relaunches agents (#609) — feat/f3-m3-update-reseed
 
 - Status: implemented + tested. Closes R13: `mosaic update` now re-seeds the framework (data-safe MOSAIC_SYNC_ONLY) after the CLI install so shipped launcher/runtime changes activate; `--relaunch` restarts rostered agents; `--no-reseed` opts out. Detail: scratchpads/f3-m3-update-reseed.md.
-
-## Fleet-polish bundle — boot-survival symmetry (#611) — feat/fleet-polish-bundle
-
-- Status: implemented + tested. disable-on-remove (boot-resurrection bug, TDD) + add-enable + init-R5 hard guarantee. 4 new + 147 existing fleet tests green. Detail: scratchpads/fleet-polish-bundle.md.
@@ -73,6 +73,37 @@ diff-sanity → squash-merge → verify), **decide-and-inform** cadence, and a d
|
|||||||
this model. See `mosaicstack-aiguide` whitepapers 01 (inter-agent comms) and 03
|
this model. See `mosaicstack-aiguide` whitepapers 01 (inter-agent comms) and 03
|
||||||
(orchestration model) for the rationale.
|
(orchestration model) for the rationale.
|
||||||
|
|
||||||
|
## Fleet roster — the two-agent floor and the role library
|
||||||
|
|
||||||
|
A fleet is **never a single agent**. The minimum viable fleet is **two**:
|
||||||
|
|
||||||
|
| Role | Mandate | Boundaries |
|
||||||
|
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
|
||||||
|
| **Orchestrator** | The user's **single point of contact**. Owns the general flow, keeps agentic actions on-target, and **adds/removes agents from the fleet at will** to meet goals and user needs. Exactly **one** per fleet (the existing R5 invariant). | Delegates source work; never the sole worker. |
|
||||||
|
| **Enhancer** | The fleet's **continuous-improvement loop**. Monitors fleet activity, analyzes for enhancements/optimizations, builds a **plan of remediation**, and — **with the orchestrator** — upgrades fleet capability: tool creation/repair, skills, harness improvements, and **bug reports filed to Mosaic Stack** for proper remediation. Recommends which agents are needed. | **Does not code, review code, or perform delivery tasks.** Improvement and diagnosis only. |
|
||||||
|
|
||||||
|
> **Why two, not one:** the orchestrator drives delivery; the enhancer makes the fleet
|
||||||
|
> _get better at delivering_ over time. The enhancer is how the fleet self-heals its tools,
|
||||||
|
> skills, and harnesses, and how real defects flow back to Mosaic Stack as bug reports.
|
||||||
|
> Together they are the irreducible core — every other role is added on demand.
|
||||||
|
|
||||||
|
A **general** fleet starts at this floor: the orchestrator (advised by the enhancer)
|
||||||
|
materializes whatever roles prove necessary over the mission's life. Specialized presets
|
||||||
|
(coding, research, etc.) seed additional roles up front, but all reduce to the same two-agent
|
||||||
|
spine plus an on-demand **role library**:
|
||||||
|
|
||||||
|
| Role profile | Purpose |
|
||||||
|
| ------------------- | --------------------------------------------------------------------------------- |
|
||||||
|
| **orchestrator** | point of contact, flow control, fleet composition (1 per fleet) |
|
||||||
|
| **enhancer** | fleet monitoring, optimization, tool/skill/harness upgrades, upstream bug reports |
|
||||||
|
| **coder** | implementation (worker; stops at PR-open) |
|
||||||
|
| **code review** | independent code review gate |
|
||||||
|
| **security review** | security/auth/secret review gate |
|
||||||
|
| **research** | investigation, synthesis, options analysis |
|
||||||
|
| **board** | deliberation panel — moonshot, contrarian, technical, business, financial lenses |
|
||||||
|
| **operations** | infra, deploy, health, incident response |
|
||||||
|
| _…extensible_ | new profiles added as missions demand (orchestrator + enhancer decide) |
|
||||||
|
|
||||||
## Invariants — "maximal vision, incremental delivery, zero foreclosure"
|
## Invariants — "maximal vision, incremental delivery, zero foreclosure"
|
||||||
|
|
||||||
Every artifact, starting Phase 2, MUST:
|
Every artifact, starting Phase 2, MUST:
|
||||||
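The two-agent floor described in the roster section above lends itself to a mechanical check. The following is only a minimal sketch: the `RoleProfile` union, the `RosterEntry` shape, and the `missingFloorRoles` helper are hypothetical names introduced for illustration, and the diff does not show how (or whether) the real roster schema records a role profile.

```typescript
// Hypothetical types: the role names come from the role library table, but the
// union itself is an assumption (the doc marks the library as extensible).
type RoleProfile =
  | 'orchestrator'
  | 'enhancer'
  | 'coder'
  | 'code review'
  | 'security review'
  | 'research'
  | 'board'
  | 'operations';

// Hypothetical roster entry; the real roster schema is not shown in this diff.
interface RosterEntry {
  name: string;
  role: RoleProfile;
}

// The two roles every fleet is expected to carry.
const FLOOR_ROLES: RoleProfile[] = ['orchestrator', 'enhancer'];

// Returns the floor roles that are absent from the roster, in FLOOR_ROLES order.
// An empty result means the roster meets the two-agent floor.
function missingFloorRoles(roster: RosterEntry[]): RoleProfile[] {
  const present = new Set(roster.map((entry) => entry.role));
  return FLOOR_ROLES.filter((role) => !present.has(role));
}

// Example: an orchestrator plus a coder still lacks the enhancer.
const gaps = missingFloorRoles([
  { name: 'mos', role: 'orchestrator' },
  { name: 'coder0', role: 'coder' },
]);
console.log(gaps);
```

This sketch only checks presence. The "exactly one orchestrator" rule (R5) is a separate count-based invariant and is not covered here.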
@@ -102,7 +133,7 @@ Every artifact, starting Phase 2, MUST:
 | ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
 | 0–1 | tmux PoC, hardening, published CLI v0.0.34 (#565–#568) | ✅ done |
 | **2 — Observability** | `fleet ps` (host+tenant aware join), heartbeat protocol + dogfood stub answers it, `agent watch` (read-only), `agent send --verify` receipts | ▶ now |
-| 3 — Real runtimes | claude/codex/pi/opencode answer heartbeat; **hybrid lifecycle** (core always-on: orchestrator+reviewer; ephemeral workers per lane) | planned |
+| 3 — Real runtimes | claude/codex/pi/opencode answer heartbeat; **hybrid lifecycle** (core always-on: **orchestrator + enhancer**; ephemeral workers per lane) | planned |
 | 4 — Unified definition | one agent schema in gateway; `mosaic agent --new` → materialized per-tenant session; uid-tenant provisioning | planned |
 | 5 — Control plane | federation-backed cross-host × cross-tenant fleet view; **webUI** (surface chosen then) for MVP-X1 parity | planned |
 
@@ -121,6 +152,28 @@ Every artifact, starting Phase 2, MUST:
 runtime-bin on PATH (baked into the pane command) + boot-survival (`enable` + linger),
 which `fleet init` should automate.
 
+## Decisions of record (2026-06-22, with Jason)
+
+- **Two-agent floor:** every fleet has, at minimum, an **orchestrator** and an **enhancer**.
+  The orchestrator is the user's point of contact and composes the fleet; the enhancer runs the
+  continuous-improvement loop (monitor → analyze → remediate → upgrade tools/skills/harness →
+  file Mosaic Stack bug reports) and **does not code or review**.
+- **Role library:** orchestrator, enhancer, coder, code review, security review, research,
+  board (moonshot/contrarian/technical/business/financial), operations — extensible; the
+  orchestrator (advised by the enhancer) adds roles as missions demand.
+- **Orchestrator chat connector:** the orchestrator is reachable over a user-chosen connector
+  (tmux now; Telegram/Discord/Matrix/Slack configurable). Validated live: **"Mos" orchestrator
+  on Discord** via the Claude Code discord channel plugin (w-jarvis).
+
+## Future enhancements (north-star, post-MVP — not on the MVP track)
+
+- **Mosaic Claude Discord Plugin** — a first-party Mosaic Discord connector that properly
+  implements the basic Discord functions **and native Discord threads**. Threads let a user
+  separate conversation topics with the orchestrator (the pattern proven by the Hermes agent).
+  A major enhancement over the current third-party channel plugin; **not required for the MVP**,
+  but a committed north-star target. `ASSUMPTION:` ships as a Mosaic-owned plugin so the fleet
+  controls Discord UX (threads, reactions, attachments, per-thread context) end-to-end.
+
 ## Assumptions (veto-able)
 
 - `ASSUMPTION:` first-class runtimes = claude, codex, pi, opencode; a "role" (analyst,
@@ -1,20 +0,0 @@
-# Fleet-polish bundle — boot-survival symmetry (#611)
-
-- **Issue:** #611 · **Branch:** `feat/fleet-polish-bundle` · From the Lead's Codex symmetry-gap finding.
-
-## Three fixes
-
-1. **disable-on-remove (BUG, TDD).** `fleet remove` stopped + deleted roster/env/heartbeat but never
-   `systemctl --user disable mosaic-agent@NAME.service` → a removed-but-enabled unit could resurrect on
-   reboot pointing at deleted config. Fix: `buildSystemdDisableCommand` + disable in `remove`
-   (best-effort, gated on !--keep-files).
-2. **add-enable.** `fleet add` now enables the new agent's unit for boot-survival (best-effort,
-   independent of --start) — symmetry with disable-on-remove.
-3. **init-R5 guarantee.** `fleet init --write` now FAILS HARD when a non-minimal profile doesn't yield
-   exactly one orchestrator (was a soft warning). `minimal` (sanctioned no-orchestrator) still allowed.
-
-## Verification
-
-- 4 new tests (disable builder; remove-invokes-disable; add-invokes-enable; init general → exactly 1
-  orchestrator) + 147 existing fleet tests green (151 total). tsc/eslint/prettier clean.
-- TDD on the disable bug per contract.
@@ -14,7 +14,6 @@ import {
   buildEnableLingerCommand,
   buildFleetServiceCommand,
   buildSystemdEnableCommand,
-  buildSystemdDisableCommand,
   buildSystemdShowCommand,
   buildTmuxListPanesCommand,
   buildTmuxListSessionsCommand,
@@ -984,127 +983,6 @@ describe('fleet ps — drift detection', () => {
   });
 });
 
-describe('fleet-polish bundle — boot-survival symmetry', () => {
-  async function rosterHome(agents: string): Promise<string> {
-    const home = await tempDir();
-    await mkdir(join(home, 'fleet'), { recursive: true });
-    await writeFile(join(home, 'fleet', 'roster.yaml'), agents);
-    return home;
-  }
-
-  it('buildSystemdDisableCommand returns the systemctl --user disable array', () => {
-    expect(buildSystemdDisableCommand('mosaic-agent@coder0.service')).toEqual([
-      'systemctl',
-      '--user',
-      'disable',
-      'mosaic-agent@coder0.service',
-    ]);
-  });
-
-  it('fleet remove DISABLES the unit so a removed agent cannot resurrect on boot', async () => {
-    const home = await rosterHome(
-      [
-        'version: 1',
-        'transport: tmux',
-        'agents:',
-        '  - name: orchestrator',
-        '    runtime: pi',
-        '    class: orchestrator',
-        '  - name: coder0',
-        '    runtime: codex',
-        '    class: worker',
-      ].join('\n') + '\n',
-    );
-    const calls: string[][] = [];
-    const runner: CommandRunner = async (command, args) => {
-      calls.push([command, ...args]);
-      return { stdout: '', stderr: '', exitCode: 0 };
-    };
-    const program = new Command();
-    program.exitOverride();
-    registerFleetCommand(program, { runner, mosaicHome: home });
-    try {
-      await program.parseAsync(['node', 'mosaic', 'fleet', 'remove', 'coder0']);
-      expect(calls).toContainEqual([
-        'systemctl',
-        '--user',
-        'disable',
-        'mosaic-agent@coder0.service',
-      ]);
-      // stop must still happen too
-      expect(calls).toContainEqual(['systemctl', '--user', 'stop', 'mosaic-agent@coder0.service']);
-    } finally {
-      await rm(home, { recursive: true, force: true });
-    }
-  });
-
-  it('fleet add ENABLES the new agent unit for boot-survival', async () => {
-    const home = await rosterHome(
-      ['version: 1', 'transport: tmux', 'agents:', '  - name: coder0', '    runtime: codex'].join(
-        '\n',
-      ) + '\n',
-    );
-    const calls: string[][] = [];
-    const runner: CommandRunner = async (command, args) => {
-      calls.push([command, ...args]);
-      return { stdout: '', stderr: '', exitCode: 0 };
-    };
-    const program = new Command();
-    program.exitOverride();
-    registerFleetCommand(program, { runner, mosaicHome: home });
-    try {
-      await program.parseAsync([
-        'node',
-        'mosaic',
-        'fleet',
-        'add',
-        'coder1',
-        '--runtime',
-        'codex',
-        '--class',
-        'worker',
-        '--no-start',
-      ]);
-      expect(calls).toContainEqual([
-        'systemctl',
-        '--user',
-        'enable',
-        'mosaic-agent@coder1.service',
-      ]);
-    } finally {
-      await rm(home, { recursive: true, force: true });
-    }
-  });
-
-  it('fleet init --write fails hard when a non-minimal profile lacks exactly one orchestrator', async () => {
-    // The general profile must yield exactly one orchestrator; the guarantee is
-    // enforced (not just warned). We assert the happy path writes cleanly.
-    const home = await tempDir();
-    const program = new Command();
-    program.exitOverride();
-    registerFleetCommand(program, {
-      runner: async () => ({ stdout: '', stderr: '', exitCode: 0 }),
-      mosaicHome: home,
-    });
-    try {
-      await program.parseAsync([
-        'node',
-        'mosaic',
-        'fleet',
-        'init',
-        '--profile',
-        'general',
-        '--write',
-      ]);
-      const written = await readFile(join(home, 'fleet', 'roster.yaml'), 'utf8');
-      const orchestrators = (written.match(/class:\s*orchestrator/g) ?? []).length;
-      expect(orchestrators).toBe(1);
-    } finally {
-      await rm(home, { recursive: true, force: true });
-    }
-  });
-});
-
 describe('fleet install — auto-enable units for boot-survival', () => {
   it('buildSystemdEnableCommand and buildEnableLingerCommand return correct command arrays', () => {
     expect(buildSystemdEnableCommand('mosaic-tmux-holder.service')).toEqual([
@@ -227,15 +227,6 @@ export function buildSystemdEnableCommand(unit: string): string[] {
   return ['systemctl', '--user', 'enable', unit];
 }
 
-/**
- * Returns the systemctl --user disable command for a given unit.
- * Used by `fleet remove` so a removed agent's enabled unit cannot resurrect on
- * boot pointing at deleted config (boot-survival symmetry with enable-on-add).
- */
-export function buildSystemdDisableCommand(unit: string): string[] {
-  return ['systemctl', '--user', 'disable', unit];
-}
-
 /**
  * Returns the loginctl enable-linger command for a given user.
  * Linger allows user systemd services to survive logout.
@@ -881,19 +872,15 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
 await mkdir(dirname(destination), { recursive: true });
 await writeFile(destination, content);
 
-// Guarantee R5: exactly one orchestrator for every profile except the
-// sanctioned no-orchestrator `minimal` preset. A mismatch means a
-// corrupted/edited preset — fail hard rather than write a malformed fleet.
+// Validate: exactly one orchestrator required (R5) — friendly summary on success.
 const written = await loadFleetRoster(destination);
 const orchCount = countOrchestrators(written);
-if (profile === 'minimal') {
-  console.log(
-    `Initialized ${profile} fleet: ${written.agents.length} agent(s) (no orchestrator). Next: mosaic fleet install`,
+if (orchCount !== 1) {
+  process.stderr.write(
+    `Warning: fleet roster at ${destination} has ${orchCount} orchestrator agent(s) (expected exactly 1).\n`,
   );
-} else if (orchCount !== 1) {
-  throw new Error(
-    `Fleet init failed: the "${profile}" roster has ${orchCount} orchestrator agent(s), ` +
-      `expected exactly 1 (R5). The preset may be corrupted — re-install the framework.`,
+  console.log(
+    `Initialized ${profile} fleet: ${written.agents.length} agent(s). Next: mosaic fleet install`,
   );
 } else {
   const workerCount = written.agents.length - 1;
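The hard-fail variant of the R5 check in the hunk above can be read as a small pure function, which makes the `minimal` exemption easier to see and to test in isolation. This is a sketch under stated assumptions: `FleetAgent`, `FleetRoster`, and `assertR5` are hypothetical names, and the body of `countOrchestrators` is assumed, since the diff calls it but does not show its implementation.

```typescript
// Hypothetical roster shape; the real types are not shown in this diff.
interface FleetAgent {
  name: string;
  class?: string;
}
interface FleetRoster {
  agents: FleetAgent[];
}

// Assumed implementation: counts agents whose class is 'orchestrator'.
function countOrchestrators(roster: FleetRoster): number {
  return roster.agents.filter((agent) => agent.class === 'orchestrator').length;
}

// Throws when a non-minimal profile does not have exactly one orchestrator.
// The `minimal` profile is exempt and may have none. Returns the count.
function assertR5(profile: string, roster: FleetRoster): number {
  const orchCount = countOrchestrators(roster);
  if (profile !== 'minimal' && orchCount !== 1) {
    throw new Error(
      `Fleet init failed: the "${profile}" roster has ${orchCount} orchestrator agent(s), ` +
        `expected exactly 1 (R5).`,
    );
  }
  return orchCount;
}
```

Keeping the guard free of file I/O means a failure-path test does not need a corrupted preset on disk; it can pass a hand-built roster directly.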
@@ -1231,24 +1218,6 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
 
 console.log(`Added ${name} (${opts.runtime}/${opts.class}) to the fleet.`);
 
-// Enable the unit for boot-survival (non-fatal) — symmetry with
-// disable-on-remove. Independent of --start so a queued agent still
-// survives a reboot once its unit exists.
-try {
-  const enableResult = await runner(
-    ...splitCommand(buildSystemdEnableCommand(`mosaic-agent@${name}.service`)),
-  );
-  if (enableResult.exitCode !== 0) {
-    process.stderr.write(
-      `Warning: could not enable mosaic-agent@${name}.service: ${enableResult.stderr || enableResult.stdout || 'non-zero exit'}\n`,
-    );
-  }
-} catch (err) {
-  process.stderr.write(
-    `Warning: enable command failed for ${name}: ${err instanceof Error ? err.message : String(err)}\n`,
-  );
-}
-
 if (opts.start !== false) {
   await runChecked(runner, buildFleetServiceCommand('start', name));
   console.log(`Started mosaic-agent@${name}.service.`);
@@ -1285,26 +1254,6 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
   );
 }
 
-// Disable the unit (non-fatal) so an enabled instance cannot resurrect on
-// boot pointing at the now-deleted config — boot-survival symmetry with
-// enable-on-add. Skipped only when --keep-files keeps the config in place.
-if (!opts.keepFiles) {
-  try {
-    const disableResult = await runner(
-      ...splitCommand(buildSystemdDisableCommand(`mosaic-agent@${name}.service`)),
-    );
-    if (disableResult.exitCode !== 0) {
-      process.stderr.write(
-        `Warning: could not disable mosaic-agent@${name}.service: ${disableResult.stderr || disableResult.stdout || 'non-zero exit'}\n`,
-      );
-    }
-  } catch (err) {
-    process.stderr.write(
-      `Warning: disable command failed for ${name}: ${err instanceof Error ? err.message : String(err)}\n`,
-    );
-  }
-}
-
 // Write updated roster
 await writeFile(rosterPath, serializeRosterToYaml(updatedRoster));
 
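The enable-on-add and disable-on-remove blocks in the hunks above follow the same best-effort shape: run `systemctl --user <verb> <unit>`, warn on a non-zero exit, warn on a thrown error, and never abort the command. One way to express that shared shape is sketched below. `warningFor` and `bestEffortSystemctl` are hypothetical helpers that do not appear in the diff, and the `CommandRunner` signature is an assumption modelled on how the tests stub the runner.

```typescript
// Assumed result and runner shapes, modelled on the test stubs in this diff.
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}
type CommandRunner = (command: string, args: string[]) => Promise<CommandResult>;

type UnitVerb = 'enable' | 'disable';

// Hypothetical pure helper: maps a finished command to a warning line,
// or null when the command succeeded.
function warningFor(verb: UnitVerb, unit: string, result: CommandResult): string | null {
  if (result.exitCode === 0) {
    return null;
  }
  const detail = result.stderr || result.stdout || 'non-zero exit';
  return `Warning: could not ${verb} ${unit}: ${detail}\n`;
}

// Hypothetical wrapper: runs the verb and never throws. It resolves to a
// warning line for the caller to print, or null on success.
async function bestEffortSystemctl(
  runner: CommandRunner,
  verb: UnitVerb,
  unit: string,
): Promise<string | null> {
  try {
    const result = await runner('systemctl', ['--user', verb, unit]);
    return warningFor(verb, unit, result);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return `Warning: ${verb} command failed for ${unit}: ${message}\n`;
  }
}
```

A caller would write any returned warning to stderr and continue, which keeps both call sites non-fatal as the original comments describe. Whether the duplication is worth factoring out is a judgment call for the maintainers.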