fix(fleet): durable runtime PATH for detached agent launch #581
Durable real-runtime launch fix for the fleet.
Problem: `start-agent-session.sh` launched the runtime in a detached tmux pane whose shell is a non-interactive login shell. That shell never sources ~/.bashrc, which is the only place ~/.npm-global/bin is added to PATH, so the launch failed with 'mosaic: command not found' (exit 127) and the pane died. tmux panes inherit the tmux server's environment, not the launcher's, so PATH must be baked into the pane command.
Fix: derive a runtime-bin prefix (MOSAIC_RUNTIME_BIN override | `npm config get prefix`/bin | ~/.npm-global/bin | ~/.local/bin; dedup, skip missing) and bake `export PATH="<prefix>:${PATH}"; exec <cmd>` into the pane command. `exec` makes the runtime the pane's foreground process, which fixes a DRIFT false-positive in `mosaic fleet ps`.
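The derivation described above can be sketched roughly as follows. This is a minimal illustration, not the code in `start-agent-session.sh`: the function names `npm_bin_dir`, `build_runtime_bin_prefix`, and `build_pane_command` are hypothetical, and the sketch assumes candidate directories contain no double quotes or `$` characters.

```shell
# Hypothetical sketch of the prefix derivation; names are illustrative only.

# Print "<npm prefix>/bin" if npm is available, otherwise print nothing.
npm_bin_dir() {
  local p
  command -v npm >/dev/null 2>&1 || return 0
  p="$(npm config get prefix 2>/dev/null)" || return 0
  if [ -n "$p" ]; then
    printf '%s/bin\n' "$p"
  fi
}

# Build a colon-separated prefix from the candidate dirs given as arguments,
# in order. Empty candidates and missing dirs are skipped; duplicates are
# dropped only against the prefix being built, never against the caller's PATH.
build_runtime_bin_prefix() {
  local prefix="" dir
  for dir in "$@"; do
    [ -n "$dir" ] || continue
    [ -d "$dir" ] || continue
    case ":${prefix}:" in
      *":${dir}:"*) continue ;;
    esac
    prefix="${prefix:+${prefix}:}${dir}"
  done
  printf '%s\n' "$prefix"
}

# Emit the string handed to tmux as the pane command. ${PATH} is left
# unexpanded on purpose so the pane shell resolves it against the tmux
# server environment at launch time.
build_pane_command() {
  local prefix="$1"
  shift
  if [ -n "$prefix" ]; then
    printf 'export PATH="%s:${PATH}"; exec %s\n' "$prefix" "$*"
  else
    printf 'exec %s\n' "$*"
  fi
}
```

Under these assumptions, a caller would pass the candidates in the documented order, for example `build_runtime_bin_prefix "${MOSAIC_RUNTIME_BIN:-}" "$(npm_bin_dir)" "${HOME}/.npm-global/bin" "${HOME}/.local/bin"`, and feed the result to `build_pane_command` together with the runtime command line.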
Validated LIVE under a stripped PATH (the bug condition): 'mosaic yolo pi --model openai-codex/gpt-5.5:high' came up durable (pane=node), confirming the fallback chain. Sanitization gate + shell test pass.
Also records the corrected findings + runtime-default policy (Codex/pi-on-Codex default; Claude reserved for Claude Code) in docs/fleet/. Findings also in aiguide #7 (merged).
Derive a runtime-bin PATH prefix (MOSAIC_RUNTIME_BIN override, then npm-prefix/bin, then ~/.npm-global/bin / ~/.local/bin) and bake it into the tmux pane command as `export PATH="<prefix>:${PATH}"; exec <cmd>` so the runtime binary (mosaic/pi/codex) is always found in a login+non-interactive pane shell that does not source ~/.bashrc.

Using `exec` makes the runtime the pane foreground process, eliminating the DRIFT false-positive in `mosaic fleet ps`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi

The previous `_build_runtime_bin_prefix()` skipped candidate dirs that were already present in the LAUNCHER process's $PATH. This is wrong: the tmux pane inherits the tmux SERVER environment, not the launcher's env. A dir on the launcher's $PATH may be absent from the server env, so the prefix could come back empty and the pane would fail with 'command not found'.

Remove the `case ":${PATH}:"` check that tested against the live launcher PATH. Keep the existence check (`[ -d "$dir" ]`) and the dedup-within-the-constructed-prefix guard. The pane command's `export PATH="<prefix>:${PATH}"` harmlessly absorbs any overlap with the server PATH.

Add test 5 to test-start-agent-session.sh: sets FAKE_RUNTIME_BIN5 on the launcher's $PATH and asserts it still appears in the generated pane PATH export, directly guarding this regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi
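The regression fixed by the second commit can be reproduced in isolation. The sketch below is a hypothetical re-creation, not the actual `_build_runtime_bin_prefix()` or test 5: the function `build_prefix` and its `skip_launcher_path` flag are invented here to contrast the old and new behaviour side by side.

```shell
# Hypothetical re-creation of the old and fixed behaviour; not the real helper.
# $1 = "yes" to also skip dirs already on the caller's PATH (old, buggy),
#      "no" to check only existence and in-prefix duplicates (fixed).
build_prefix() {
  local skip_launcher_path="$1" prefix="" dir
  shift
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    if [ "$skip_launcher_path" = "yes" ]; then
      # Old check: tests the launcher's PATH, which the tmux pane never sees.
      case ":${PATH}:" in
        *":${dir}:"*) continue ;;
      esac
    fi
    case ":${prefix}:" in
      *":${dir}:"*) continue ;;
    esac
    prefix="${prefix:+${prefix}:}${dir}"
  done
  printf '%s\n' "$prefix"
}

# Simulate a launcher whose PATH already contains the runtime bin dir.
fake_bin="$(mktemp -d)"
PATH="${fake_bin}:${PATH}"

buggy="$(build_prefix yes "$fake_bin")"
fixed="$(build_prefix no "$fake_bin")"

echo "old behaviour prefix: '${buggy}'"
echo "new behaviour prefix: '${fixed}'"

rmdir "$fake_bin"
```

With the old check the prefix comes back empty, so a pane started from a tmux server that lacks the directory would have nothing prepended to its PATH. With the check removed the directory is always emitted, and any overlap with the server PATH is a harmless duplicate entry.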