diff --git a/docs/fleet/north-star.md b/docs/fleet/north-star.md
index 22b6857..2e6dbba 100644
--- a/docs/fleet/north-star.md
+++ b/docs/fleet/north-star.md
@@ -115,6 +115,11 @@ Every artifact, starting Phase 2, MUST:
 - Observation: **read-only default, opt-in takeover**.
 - Multi-host: **designed-for from day one**; control plane **rides federation (W1)**.
 - Delivery: **CLI-first now**, dogfood against the live stub fleet; webUI deferred to Phase 5.
+- Runtimes: fleet agents default to **Codex / pi-on-Codex**; **Claude is reserved for Claude
+  Code only** (avoid alternate-harness API pricing). Validated durable recipe:
+  `mosaic yolo pi --model openai-codex/gpt-5.5:high`. Durable detached launch requires the
+  runtime-bin on PATH (baked into the pane command) + boot-survival (`enable` + linger),
+  which `fleet init` should automate.
 
 ## Assumptions (veto-able)
 
diff --git a/docs/scratchpads/fleet-observability-phase2.md b/docs/scratchpads/fleet-observability-phase2.md
index 22f0694..e499f6d 100644
--- a/docs/scratchpads/fleet-observability-phase2.md
+++ b/docs/scratchpads/fleet-observability-phase2.md
@@ -73,3 +73,28 @@ with a second agent on `dragon-lin`.
   tmux session-name fallback; the systemd/tmux env handoff needs a real fix.
 - Next: rebase on merged main, open Phase-2 PR, dual-engine review, merge, close
   `fleet-observability-1`. Defer launch-path + env-propagation fixes to Phase 3.
+- 2026-06-21 (session 3): Phase-2 PR #579 merged (3 dual-engine rounds hardened
+  verify+watch). Then closed the launch-path question with Jason's input — CORRECTING
+  earlier findings:
+  - The ad-hoc launch deaths were NOT a fundamental TTY blocker: (a) codex was a stale
+    version (Jason updated it); (b) pi was misconfigured to Claude auth (Jason removed it;
+    default is now Codex). The REAL durable-launch bug is **PATH**: the detached tmux
+    launch shell is login+non-interactive, so it misses `~/.npm-global/bin` (added only in
+    `~/.bashrc`) -> `mosaic: command not found` (127) -> pane dies. tmux panes inherit the
+    tmux _server_ env, so PATH must be baked into the pane command.
+  - **Durable real-agent recipe (validated live on gpt-5.5, Claude-free):**
+    `mosaic yolo pi --model openai-codex/gpt-5.5:high` — pi tolerates detached tmux; a raw
+    interactive TUI (codex CLI) exits without an attached client. Status line confirmed
+    `(openai-codex) gpt-5.5 • high`.
+  - PATH fix landed in `start-agent-session.sh` (commit 32efc13, branch
+    feat/fleet-launch-path): derive runtime-bin prefix (MOSAIC_RUNTIME_BIN | npm prefix |
+    ~/.npm-global/bin | ~/.local/bin), bake `export PATH=...; exec ` into the pane;
+    `exec` also fixes the drift false-positive. Live-tested under stripped PATH -> durable.
+  - Boot-survival: Jason ran `systemctl --user enable` (+ linger). TODO: auto-enable in
+    **fleet init** so operators never have to remember it (agentic-enhancement cycle).
+  - Future custom Pi harness build: pi cannot self-report its model (track
+    runtime/model/effort as fleet metadata); drift detection should recognize `node` as
+    pi's pane command (a node-wrapped pane can currently read as drift).
+  - Findings recorded in AI Guide playbooks/tmux-fleet.md (aiguide PR #7, merged).
+  - Policy: avoid Claude outside Claude Code (API pricing for alt-harness use) — fleet
+    runtimes default to Codex / pi-on-Codex; Claude stays in Claude Code only.
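
Review note: the PATH fix described in the scratchpad can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not the contents of `start-agent-session.sh` at commit 32efc13. The function names `resolve_runtime_bin` and `build_pane_command` are hypothetical, and the only detail taken from the notes is the fallback order (MOSAIC_RUNTIME_BIN, npm prefix, `~/.npm-global/bin`, `~/.local/bin`) and the `export PATH=...; exec` shape of the pane command.

```shell
# Hypothetical sketch; function names are illustrative, not from the repo.

# Print the first candidate directory that exists, using the fallback order
# recorded in the notes. Runs in a subshell so no variables leak out.
resolve_runtime_bin() (
  npm_prefix=""
  if command -v npm >/dev/null 2>&1; then
    # `npm config get prefix` prints npm's global prefix; this sketch assumes
    # a Unix layout where global binaries live in <prefix>/bin.
    npm_prefix="$(npm config get prefix 2>/dev/null || true)"
  fi
  for candidate in \
    "${MOSAIC_RUNTIME_BIN:-}" \
    "${npm_prefix:+$npm_prefix/bin}" \
    "$HOME/.npm-global/bin" \
    "$HOME/.local/bin"
  do
    if [ -n "$candidate" ] && [ -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      exit 0
    fi
  done
  exit 1
)

# Build the command string handed to the pane. The PATH export is part of the
# string itself, so it does not rely on the tmux server environment, and `exec`
# replaces the wrapper shell with the agent process.
# Simplification: arguments are joined with spaces and are not shell-quoted,
# so this sketch assumes they contain no whitespace or shell metacharacters.
build_pane_command() (
  runtime_bin="$1"
  shift
  printf 'export PATH=%s:"$PATH"; exec %s\n' "$runtime_bin" "$*"
)
```

A launch could then look like `tmux new-session -d -s agent "$(build_pane_command "$(resolve_runtime_bin)" mosaic yolo pi --model openai-codex/gpt-5.5:high)"`, where the session name `agent` is a placeholder. Prepending the directory rather than replacing PATH keeps whatever the server environment already provides.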