fix(fleet): durable runtime PATH for detached agent launch #581

Merged
jason.woltje merged 3 commits from feat/fleet-launch-path into main 2026-06-21 17:30:41 +00:00
Durable real-runtime launch fix for the fleet.

Problem: start-agent-session.sh launched the runtime in a detached tmux pane whose shell is login+non-interactive, so it missed ~/.npm-global/bin (added only in ~/.bashrc) -> 'mosaic: command not found' (127) -> pane died. tmux panes inherit the tmux server env, not the launcher's, so PATH must be baked into the pane command.
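
The failure mode can be reproduced without tmux. The sketch below is a hypothetical reproduction, not code from this PR: it assumes bash is the pane shell and uses a throwaway HOME whose only startup file is a `.bashrc` that extends PATH. The function name `demo_login_shell_path` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction; assumes bash is the pane shell.
demo_login_shell_path() {
  local home pane_path
  home="$(mktemp -d)"
  mkdir -p "${home}/.npm-global/bin"
  # Only ~/.bashrc extends PATH, mirroring the setup described above.
  printf 'export PATH="$HOME/.npm-global/bin:$PATH"\n' > "${home}/.bashrc"
  # -l = login shell, -c = non-interactive; stdin is detached, as in a detached pane.
  pane_path="$(HOME="$home" bash -l -c 'printf "%s" "$PATH"' </dev/null 2>/dev/null)"
  case ":${pane_path}:" in
    *":${home}/.npm-global/bin:"*) echo "present" ;;
    *) echo "missing" ;;
  esac
  rm -rf "$home"
}
```

A login bash reads `/etc/profile` and then the first of `~/.bash_profile`, `~/.bash_login`, `~/.profile`; it does not read `~/.bashrc` on its own, so the call should report `missing` unless a system-wide profile script adds the directory.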

Fix: derive a runtime-bin prefix (MOSAIC_RUNTIME_BIN override | $(npm config get prefix)/bin | ~/.npm-global/bin | ~/.local/bin; dedup, skip missing) and bake 'export PATH=<prefix>:$PATH; exec <cmd>' into the pane command. exec makes the runtime the pane foreground (fixes a fleet ps DRIFT false-positive).
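
As a rough illustration of the baked pane command, here is a minimal sketch. `build_pane_command` is a hypothetical helper, not the code in start-agent-session.sh; it assumes the shell that runs the pane command understands bash-style quoting as produced by `printf %q`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the actual start-agent-session.sh code.
# Usage: build_pane_command <prefix> <cmd> [args...]
build_pane_command() {
  local prefix="$1"
  shift
  local cmd
  # %q quotes each argument so it survives re-parsing by the pane shell.
  cmd="$(printf '%q ' "$@")"
  cmd="${cmd% }"
  if [ -n "$prefix" ]; then
    # ${PATH} stays literal here: the pane shell expands it later against
    # the tmux server environment, not the launcher's.
    printf 'export PATH=%q:"${PATH}"; exec %s' "$prefix" "$cmd"
  else
    printf 'exec %s' "$cmd"
  fi
}
```

The resulting string could then be passed as the shell command of a detached session, for example `tmux new-session -d -s agent "$(build_pane_command "$prefix" mosaic yolo pi)"`, where the session name `agent` is only a placeholder.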

Validated LIVE under a stripped PATH (the bug condition): 'mosaic yolo pi --model openai-codex/gpt-5.5:high' came up durable (pane=node), confirming the fallback chain. Sanitization gate + shell test pass.

Also records the corrected findings + runtime-default policy (Codex/pi-on-Codex default; Claude reserved for Claude Code) in docs/fleet/. Findings also in aiguide #7 (merged).

jason.woltje added 2 commits 2026-06-21 17:09:44 +00:00
Derive a runtime-bin PATH prefix (MOSAIC_RUNTIME_BIN override, then
npm-prefix/bin, then ~/.npm-global/bin / ~/.local/bin) and bake it
into the tmux pane command as `export PATH="<prefix>:${PATH}"; exec
<cmd>` so the runtime binary (mosaic/pi/codex) is always found in a
login+non-interactive pane shell that does not source ~/.bashrc.
Using `exec` makes the runtime the pane foreground process, eliminating
the DRIFT false-positive in `mosaic fleet ps`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi
docs(fleet): record durable-launch findings + runtime-default policy
Some checks failed
ci/woodpecker/push/ci Pipeline was canceled
ci/woodpecker/pr/ci Pipeline was canceled
1908dab373
Correct the launch-path finding (PATH, not TTY), record the validated durable
real-agent recipe (pi on openai-codex/gpt-5.5), the Codex-default/Claude-reserved
policy, and the fleet-init boot-survival automation TODO.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi
jason.woltje added 1 commit 2026-06-21 17:14:35 +00:00
fix(fleet): always bake runtime-bin into pane PATH (ignore launcher PATH)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
a2b11118e3
The previous `_build_runtime_bin_prefix()` skipped candidate dirs that
were already present in the LAUNCHER process's $PATH. This is wrong:
the tmux pane inherits the tmux SERVER environment, not the launcher's
env. A dir on the launcher's $PATH may be absent from the server env,
so the prefix could come back empty and the pane would fail with
'command not found'.

Remove the `case ":${PATH}:"` check that tested against the live launcher
PATH.  Keep the existence check (`[ -d "$dir" ]`) and the dedup-within-
the-constructed-prefix guard.  The pane command's `export PATH="<prefix>:${PATH}"`
harmlessly absorbs any overlap with the server PATH.
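
A minimal sketch of the corrected behavior, under stated assumptions: the real `_build_runtime_bin_prefix()` is not shown in this PR text, so the function below (named `build_runtime_bin_prefix` to mark it as illustrative) only follows the candidate order and guards described in this PR.

```shell
#!/usr/bin/env bash
# Sketch only; the real _build_runtime_bin_prefix() may differ.
build_runtime_bin_prefix() {
  local prefix="" dir npm_prefix=""
  local -a candidates=()

  # 1. Explicit override comes first.
  if [ -n "${MOSAIC_RUNTIME_BIN:-}" ]; then
    candidates+=("$MOSAIC_RUNTIME_BIN")
  fi
  # 2. npm's global prefix, when npm is available to the launcher.
  if command -v npm >/dev/null 2>&1; then
    npm_prefix="$(npm config get prefix 2>/dev/null || true)"
    if [ -n "$npm_prefix" ]; then
      candidates+=("${npm_prefix}/bin")
    fi
  fi
  # 3. Conventional fallbacks.
  candidates+=("${HOME}/.npm-global/bin" "${HOME}/.local/bin")

  for dir in "${candidates[@]}"; do
    [ -d "$dir" ] || continue        # existence check: skip missing dirs
    case ":${prefix}:" in
      *":${dir}:"*) continue ;;      # dedup within the constructed prefix only
    esac
    # Deliberately no check against the launcher's $PATH.
    prefix="${prefix:+${prefix}:}${dir}"
  done
  printf '%s' "$prefix"
}
```

With this shape, a directory that already sits on the launcher's PATH is still emitted, which is the property the regression test is meant to guard.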

Add test 5 to test-start-agent-session.sh: sets FAKE_RUNTIME_BIN5 on the
launcher's $PATH and asserts it still appears in the generated pane PATH
export, directly guarding this regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi
jason.woltje merged commit fc90c89913 into main 2026-06-21 17:30:41 +00:00
Reference: mosaicstack/stack#581