fix(fleet): consume model_hint + fix socket-default trap (#626)

Two spawn-side blockers found while building the live PoC roster.

FIX 1 — model_hint not consumed: start-agent-session.sh built 'mosaic yolo
$RUNTIME' with no --model, so pi workers ignored the roster's model. Now
generateAgentEnv emits MOSAIC_AGENT_MODEL=<hint> and the launcher appends
${MOSAIC_AGENT_MODEL:+--model $MOSAIC_AGENT_MODEL} → workers run on e.g.
openai-codex/gpt-5.5:high.

FIX 2 — socket default trap: an ABSENT roster socket silently became
mosaic-factory in THREE places (parseRosterText fallback; the
mosaic-agent@.service Environment= default + ExecStop :-mosaic-factory;
start-agent-session :-mosaic-factory). The live PoC runs on the DEFAULT tmux
socket (socket_name absent). Now absent ⇒ '' ⇒ the literal default socket (no
-L) consistently across spawn, the systemd unit, fleet ps/watch observe, and
the onboarding cheat-sheet:
- socketArgs(name) → name ? ['-L', name] : []; replaces all ~15 -L sites in
  fleet.ts. parseRosterText fallback '' (was DEFAULT_SOCKET_NAME).
- shellEnvValue('') now emits a BARE 'VAR=' (not VAR=''), so a socket-less .env can
  never yield a literal socket named "''" under systemd EnvironmentFile.
- start-agent-session.sh _tmux wrapper passes -L only when a socket is set;
  mosaic-agent@.service drops the socket default + uses a conditional ExecStop.

CONTAINMENT: all 6 shipped presets set socket_name: mosaic-factory explicitly,
so they are unaffected — only socket-less rosters (the PoC) get default-socket
behavior. DEFAULT_SOCKET_NAME exported as a constant for explicit isolation.

Verified: 158 fleet + 201 fleet-adjacent tests green (socketArgs none/named,
model_hint→env, explicit-socket renders -L, socket-less bare env); shell bash -n
+ end-to-end sim (socket-less→no -L, model→--model); tsc/eslint/prettier/
sanitize clean.

Refs #626

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EsgTQzV5YUGk1JtCLP4B83
2026-06-22 13:26:25 -05:00
parent 095e19443b
commit fa2abd462d
6 changed files with 180 additions and 74 deletions


@@ -74,3 +74,7 @@ Active workstream is **W1 — Federation v1**. Workers should:
## Fleet onboarding-injection — comms cheat-sheet + peer roster (#620) — feat/fleet-comms-onboarding
- Status: implemented + tested. Injects # Fleet Comms (peer roster + cross-host agent-send commands + FLIP-reply + --verify) into each spawned fleet agent via composeContract; optional per-agent host/ssh/socket roster fields (socket: named → -L, unset → default socket no -L). 10 + 2 tests green. Detail: scratchpads/fleet-comms-onboarding.md.
## Fleet stand-up fixes — model_hint→--model + socket-default trap (#626) — feat/fleet-standup-fixes
- Status: implemented + tested. FIX1 model_hint→MOSAIC_AGENT_MODEL→--model. FIX2 absent socket = default tmux socket (no -L) across parse/spawn/systemd-unit/observe (socketArgs helper, bare-empty shellEnvValue, conditional -L). 158 fleet tests green; shipped presets unaffected (explicit socket_name). Detail: scratchpads/fleet-standup-fixes.md.


@@ -0,0 +1,28 @@
# Fleet stand-up fixes — model_hint→--model + socket-default trap (#626)
- **Issue:** #626 · **Branch:** `feat/fleet-standup-fixes` (off main). PoC-blocking, before doctrine doc.
## FIX 1 — model_hint consumed
- generateAgentEnv emits `MOSAIC_AGENT_MODEL=<modelHint>` (bare empty when unset).
- start-agent-session.sh default command → `mosaic yolo $RUNTIME ${MOSAIC_AGENT_MODEL:+--model $MOSAIC_AGENT_MODEL}`.
→ pi workers launch with `--model openai-codex/gpt-5.5:high`.
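
A minimal sketch of how the `${MOSAIC_AGENT_MODEL:+…}` expansion behaves, not the actual contents of start-agent-session.sh. `build_cmd` is a hypothetical helper, and using `pi` as the `$RUNTIME` value is an assumption for illustration. It prints the command line rather than running it.

```shell
# Hypothetical helper: assembles the default launch command as argv words.
# ${VAR:+word} expands to word only when VAR is set AND non-empty, so a bare
# MOSAIC_AGENT_MODEL= in the .env adds no --model flag at all.
build_cmd() {
  set -- mosaic yolo "$1" ${MOSAIC_AGENT_MODEL:+--model "$MOSAIC_AGENT_MODEL"}
  printf '%s\n' "$*"
}

MOSAIC_AGENT_MODEL="openai-codex/gpt-5.5:high"
build_cmd pi   # prints: mosaic yolo pi --model openai-codex/gpt-5.5:high

MOSAIC_AGENT_MODEL=""
build_cmd pi   # prints: mosaic yolo pi
```

Quoting only the inner `"$MOSAIC_AGENT_MODEL"` keeps the model string as a single word while still letting the whole expansion vanish when the variable is empty.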
## FIX 2 — socket default trap (absent ⇒ literal default socket, no -L everywhere)
- THE TRAP (3 sites): parseRosterText fallback was DEFAULT_SOCKET_NAME; systemd unit had
`Environment=MOSAIC_TMUX_SOCKET=mosaic-factory` + `ExecStop ${…:-mosaic-factory}`; start-agent-session
defaulted `:-mosaic-factory`. All fixed → absent socket = '' = default tmux socket (no -L).
- `socketArgs(name)` helper → `name ? ['-L', name] : []`; replaced all ~15 -L render sites in fleet.ts.
- shellEnvValue('') now emits a **bare** `VAR=` (not `''`) — unambiguous empty in systemd EnvironmentFile
(a quoted '' could become a literal socket named "''").
- start-agent-session.sh: `_tmux` wrapper passes -L only when socket set; mosaic-agent@.service: dropped the
socket default + conditional ExecStop. So spawn == observe == onboarding cheat-sheet.
- CONTAINMENT: all 6 shipped presets set socket_name: mosaic-factory explicitly → unaffected; only
socket-less rosters (the PoC) get default-socket behavior. DEFAULT_SOCKET_NAME exported for explicit use.
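
The conditional `-L` rule above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the real `_tmux` function: `_tmux_argv` is a hypothetical name, and it prints the argv it would run instead of invoking tmux.

```shell
# Hypothetical sketch: pass -L only when MOSAIC_TMUX_SOCKET is non-empty.
# An empty or unset value means "use the default tmux socket" (no -L).
_tmux_argv() {
  if [ -n "${MOSAIC_TMUX_SOCKET:-}" ]; then
    set -- tmux -L "$MOSAIC_TMUX_SOCKET" "$@"
  else
    set -- tmux "$@"
  fi
  printf '%s\n' "$*"
}

MOSAIC_TMUX_SOCKET=""                # socket-less roster (bare VAR= in the .env)
_tmux_argv list-sessions             # prints: tmux list-sessions

MOSAIC_TMUX_SOCKET="mosaic-factory"  # explicit socket, as in the shipped presets
_tmux_argv list-sessions             # prints: tmux -L mosaic-factory list-sessions
```

If the variable ever held the two literal quote characters `''`, the `-n` test in this sketch would succeed and `-L` would be passed with that string, which is the case the bare `VAR=` form is meant to rule out.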
## Verification
- 158 fleet + 201 fleet-adjacent tests green; new: socketArgs none/named, model_hint→env, explicit-socket
renders -L, socket-less env bare. tsc/eslint/prettier/sanitize clean. Shell bash -n + end-to-end sim
(socket-less→no -L, model→--model).