refactor(fleet): rename tmux socket mosaic-factory → mosaic-fleet
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful

Pure rename of the named production-isolation socket to match the product brand
(Mosaic Fleet), per Jason. Behavior is identical — only the socket NAME changes.

- 6 example presets: socket_name: mosaic-factory → mosaic-fleet
- fleet.ts: DEFAULT_SOCKET_NAME = 'mosaic-fleet' (+ all literals)
- systemd units + READMEs, roster.schema.json, start-agent-session.sh
- docs/guides + fleet PRD/TASKS + scratchpads; tests updated

The PoC roster is socket-LESS (default socket, no -L), so it is unaffected.
docs/fleet/north-star.md is intentionally EXCLUDED (owned by the in-flight
doctrine PR #629 — its mosaic-factory references are renamed there to avoid a
merge conflict). The live legacy canary fleet still running on the old socket is
a separate retire/migrate op, not this PR.

Single find/replace — target trivially swappable if the brand is reconsidered.

Verified: 172 fleet + onboarding tests green; tsc/eslint/prettier/sanitize clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EsgTQzV5YUGk1JtCLP4B83
This commit is contained in:
2026-06-22 15:07:05 -05:00
parent e2336bb0ca
commit 4e1fa88076
21 changed files with 84 additions and 84 deletions

View File

@@ -7,10 +7,10 @@
## Problem
The durable tmux fleet runs on the isolated `mosaic-factory` socket. That isolation
The durable tmux fleet runs on the isolated `mosaic-fleet` socket. That isolation
(which protects the operator's default tmux) makes the fleet **invisible** to default
tooling, and truth is split across three planes no single command joins — systemd
(`systemctl --user`), tmux (`-L mosaic-factory`), and the process tree (`pstree`).
(`systemctl --user`), tmux (`-L mosaic-fleet`), and the process tree (`pstree`).
`agent tail` (`capture-pane`) returns **blank for full-screen TUIs**, and `agent send`
confirms only keystroke injection, not acceptance. Net: the operator has near-zero
observability and no safe way to watch a session.
@@ -56,7 +56,7 @@ observability and no safe way to watch a session.
## Acceptance criteria
- `mosaic fleet ps` shows all 5 live sessions on `mosaic-factory` with correct
- `mosaic fleet ps` shows all 5 live sessions on `mosaic-fleet` with correct
pane/pid/idle and flags the dogfood **drift** (`canary-pi` runtime=pi but pane runs
`dogfood-agent.py`) and the **boot-enable** gap (active but disabled).
- Killing one agent's pane flips its row to dead/stale within one `interval`.
@@ -72,7 +72,7 @@ observability and no safe way to watch a session.
- Unit/CLI specs in `packages/mosaic/src/commands/fleet.spec.ts` (and a new
`fleet-ps`/`watch`/`send-verify` spec) using the injected `CommandRunner` to assert
exact tmux/systemd command construction and JSON shape (tenant+host present).
- Situational: run against the live `mosaic-factory` fleet; capture `fleet ps` output,
- Situational: run against the live `mosaic-fleet` fleet; capture `fleet ps` output,
a kill-and-detect cycle, a read-only `watch`, and a `send --verify` pass/fail pair.
## Known limitations

View File

@@ -7,18 +7,18 @@
> Mission: `mvp-20260312` · PRD: [docs/fleet/PRD.md](./PRD.md) · North star: [docs/fleet/north-star.md](./north-star.md)
> Status: `not-started` | `in-progress` | `done` | `blocked` | `failed`
| id | status | description | depends_on | agent | pr | notes |
| ------------- | ----------- | ------------------------------------------------------------------------------------------------------------------ | --------------------- | ----------- | --- | ----------------------------------------------------------------------------------------------------------------------------- |
| FLEET-OBS-000 | done | Plan: north-star + Phase-2 PRD + workstream scaffolding | — | lead | — | persisted 2026-06-20 on `feat/fleet-observability` |
| FLEET-OBS-001 | done | Heartbeat protocol v1 spec finalized in PRD + framework doc | FLEET-OBS-000 | lead | — | file-based `~/.config/mosaic/fleet/run/<agent>.hb`; spec in PRD |
| FLEET-OBS-002 | in-progress | Implement heartbeat responder in `dogfood-agent.py` | FLEET-OBS-001 | fleet-coder | — | dispatched to ad-hoc `mosaic yolo` fleet agent (dogfood) |
| FLEET-OBS-003 | done | `mosaic fleet ps` — join systemd+tmux+proc+idle+heartbeat; tenant+host tagged; drift + boot-enable flags; `--json` | FLEET-OBS-001 | worker | — | commit ab47831; LIVE-verified on mosaic-factory; caught canary-pi DRIFT + BOOT-ENABLE. Polish: idleSeconds parse returns null |
| FLEET-OBS-004 | done | `mosaic agent watch <name>` — read-only join (no resize, no keystrokes) | FLEET-OBS-000 | worker | — | `attach -r`; verb wired |
| FLEET-OBS-005 | done | `mosaic agent send --verify` — delivery/acceptance receipt | FLEET-OBS-000 | worker | — | --verify flag; draft-heuristic verify |
| FLEET-OBS-006 | done | CLI specs for ps/watch/send-verify (tenant+host shape, command construction) | FLEET-OBS-003,004,005 | worker | — | 62 tests green (31 new); re-verified by lead |
| FLEET-OBS-007 | not-started | Framework doc: fleet observability guide + verbs | FLEET-OBS-003,004,005 | lead | — | `docs/guides/` or `framework/tools/.../README` |
| FLEET-OBS-008 | not-started | Independent review + dogfood verification on live fleet | FLEET-OBS-002..007 | reviewer | — | author ≠ reviewer; capture evidence in scratchpad |
| FLEET-OBS-009 | not-started | Open PR → green CI (queue guard) → squash-merge → close `fleet-observability-1` | FLEET-OBS-008 | lead | — | trunk merge; no direct push to main |
| id | status | description | depends_on | agent | pr | notes |
| ------------- | ----------- | ------------------------------------------------------------------------------------------------------------------ | --------------------- | ----------- | --- | --------------------------------------------------------------------------------------------------------------------------- |
| FLEET-OBS-000 | done | Plan: north-star + Phase-2 PRD + workstream scaffolding | — | lead | — | persisted 2026-06-20 on `feat/fleet-observability` |
| FLEET-OBS-001 | done | Heartbeat protocol v1 spec finalized in PRD + framework doc | FLEET-OBS-000 | lead | — | file-based `~/.config/mosaic/fleet/run/<agent>.hb`; spec in PRD |
| FLEET-OBS-002 | in-progress | Implement heartbeat responder in `dogfood-agent.py` | FLEET-OBS-001 | fleet-coder | — | dispatched to ad-hoc `mosaic yolo` fleet agent (dogfood) |
| FLEET-OBS-003 | done | `mosaic fleet ps` — join systemd+tmux+proc+idle+heartbeat; tenant+host tagged; drift + boot-enable flags; `--json` | FLEET-OBS-001 | worker | — | commit ab47831; LIVE-verified on mosaic-fleet; caught canary-pi DRIFT + BOOT-ENABLE. Polish: idleSeconds parse returns null |
| FLEET-OBS-004 | done | `mosaic agent watch <name>` — read-only join (no resize, no keystrokes) | FLEET-OBS-000 | worker | — | `attach -r`; verb wired |
| FLEET-OBS-005 | done | `mosaic agent send --verify` — delivery/acceptance receipt | FLEET-OBS-000 | worker | — | --verify flag; draft-heuristic verify |
| FLEET-OBS-006 | done | CLI specs for ps/watch/send-verify (tenant+host shape, command construction) | FLEET-OBS-003,004,005 | worker | — | 62 tests green (31 new); re-verified by lead |
| FLEET-OBS-007 | not-started | Framework doc: fleet observability guide + verbs | FLEET-OBS-003,004,005 | lead | — | `docs/guides/` or `framework/tools/.../README` |
| FLEET-OBS-008 | not-started | Independent review + dogfood verification on live fleet | FLEET-OBS-002..007 | reviewer | — | author ≠ reviewer; capture evidence in scratchpad |
| FLEET-OBS-009 | not-started | Open PR → green CI (queue guard) → squash-merge → close `fleet-observability-1` | FLEET-OBS-008 | lead | — | trunk merge; no direct push to main |
## Proposed MVP rollup row (for the MVP orchestrator — not written by this workstream)

View File

@@ -1,7 +1,7 @@
# Local Fleet Canary
The local fleet canary runs a small tmux-backed Mosaic agent fleet on an
isolated tmux socket. The default socket is `mosaic-factory`; the commands do
isolated tmux socket. The default socket is `mosaic-fleet`; the commands do
not use or stop the default tmux server.
## Files
@@ -67,7 +67,7 @@ mosaic agent tail canary-pi -n 80
These commands read the roster and target the configured tmux socket. The
generated systemd agent services use `start-agent-session.sh`; message delivery
uses the tmux send tools with `-L mosaic-factory`.
uses the tmux send tools with `-L mosaic-fleet`.
`mosaic agent send` is operator-origin traffic unless a caller explicitly says
otherwise. The CLI always passes a deterministic source label to
@@ -82,7 +82,7 @@ impersonating a known handoff lane. The lower-level inter-agent wrapper
Use these checks before expanding the roster:
```bash
tmux -L mosaic-factory ls
tmux -L mosaic-fleet ls
tmux ls
mosaic fleet verify
systemctl --user status mosaic-tmux-holder.service
@@ -90,7 +90,7 @@ systemctl --user status mosaic-tmux-holder.service
Expected results:
- `tmux -L mosaic-factory ls` shows `_holder` and roster agent sessions.
- `tmux -L mosaic-fleet ls` shows `_holder` and roster agent sessions.
- `tmux ls` shows only the default tmux server sessions and is not changed by
fleet start/stop operations.
- `mosaic fleet verify` checks exact session targets on the isolated socket.
@@ -108,7 +108,7 @@ Run this checklist before cutting or dogfooding a fleet release:
repeated `start` against the named socket; verify the default tmux server is
unchanged.
- Liveness verification: run `mosaic fleet verify` and confirm roster sessions
with `tmux -L mosaic-factory ls` or exact `has-session` checks.
with `tmux -L mosaic-fleet ls` or exact `has-session` checks.
- Package dry-run: run `npm pack --dry-run --json` from `packages/mosaic` and
confirm `framework/fleet`, `framework/systemd/user`,
`framework/tools/fleet`, and `framework/tools/tmux` assets are included.
@@ -140,5 +140,5 @@ This rollback leaves the default tmux server untouched. If a canary session is
still present after service stop, remove only the isolated socket server:
```bash
tmux -L mosaic-factory kill-server
tmux -L mosaic-fleet kill-server
```

View File

@@ -17,7 +17,7 @@ Implement enough product surface to use the fleet locally:
- roster schema and examples
- local canary docs and rollback instructions
- tests for CLI behavior where practical
- canary verification on named tmux socket `mosaic-factory`
- canary verification on named tmux socket `mosaic-fleet`
## Non-goals
@@ -30,7 +30,7 @@ Implement enough product surface to use the fleet locally:
- CLI can initialize a minimal roster outside product defaults.
- CLI can install user systemd units and fleet helper scripts to a configurable Mosaic home.
- CLI can start/stop/status/verify a canary fleet using `mosaic-factory`.
- CLI can start/stop/status/verify a canary fleet using `mosaic-fleet`.
- `mosaic agent send` uses existing named-socket/exact-target tmux tooling.
- `mosaic agent reset` targets only the named agent session on the named socket.
- Verification proves default tmux sessions remain untouched.

View File

@@ -31,7 +31,7 @@ with a second agent on `dragon-lin`.
## Environment facts (verified 2026-06-20)
- Fleet is live on `W-jarvis` (uid 1000, `jarvis`, `Linger=yes`) on tmux socket
`mosaic-factory`: `_holder`, `canary-pi`, `dogfood-coder`, `dogfood-orchestrator`,
`mosaic-fleet`: `_holder`, `canary-pi`, `dogfood-coder`, `dogfood-orchestrator`,
`dogfood-reviewer`. All panes run `~/.config/mosaic/fleet/dogfood-agent.py` (stub),
including `canary-pi` (roster says runtime=pi → **drift**).
- Holder + `mosaic-agent@*` units are `active (exited)` but `UnitFileState=disabled`
@@ -56,7 +56,7 @@ with a second agent on `dragon-lin`.
with dragon-lin coder, commit docs, begin Phase-2 delivery (heartbeat + `fleet ps`).
- 2026-06-20 (session 2): Built Phase-2 CLI via worker (commit ab47831): `fleet ps`,
`agent watch`, `agent send --verify`, 62 tests. LIVE-verified `fleet ps` on
mosaic-factory — correctly flagged canary-pi DRIFT + BOOT-ENABLE, tenant_id+host in JSON.
mosaic-fleet — correctly flagged canary-pi DRIFT + BOOT-ENABLE, tenant_id+host in JSON.
Heartbeat responder added to dogfood-agent.py (FLEET-OBS-002) — `fleet ps` HB now
`healthy` for all 4 agents.
- Coordination: dual-engine-reviewed (Claude+Codex) and merged framework PRs #572

View File

@@ -11,14 +11,14 @@
## FIX 2 — socket default trap (absent ⇒ literal default socket, no -L everywhere)
- THE TRAP (3 sites): parseRosterText fallback was DEFAULT_SOCKET_NAME; systemd unit had
`Environment=MOSAIC_TMUX_SOCKET=mosaic-factory` + `ExecStop ${…:-mosaic-factory}`; start-agent-session
defaulted `:-mosaic-factory`. All fixed → absent socket = '' = default tmux socket (no -L).
`Environment=MOSAIC_TMUX_SOCKET=mosaic-fleet` + `ExecStop ${…:-mosaic-fleet}`; start-agent-session
defaulted `:-mosaic-fleet`. All fixed → absent socket = '' = default tmux socket (no -L).
- `socketArgs(name)` helper → `name ? ['-L', name] : []`; replaced all ~15 -L render sites in fleet.ts.
- shellEnvValue('') now emits a **bare** `VAR=` (not `''`) — unambiguous empty in systemd EnvironmentFile
(a quoted '' could become a literal socket named "''").
- start-agent-session.sh: `_tmux` wrapper passes -L only when socket set; mosaic-agent@.service: dropped the
socket default + conditional ExecStop. So spawn == observe == onboarding cheat-sheet.
- CONTAINMENT: all 6 shipped presets set socket_name: mosaic-factory explicitly → unaffected; only
- CONTAINMENT: all 6 shipped presets set socket_name: mosaic-fleet explicitly → unaffected; only
socket-less rosters (the PoC) get default-socket behavior. DEFAULT_SOCKET_NAME exported for explicit use.
## Verification