# Scratchpad — Fleet Phase 2: Observability (W-FLEET)

> Append-only. Mission `mvp-20260312` / workstream W-FLEET.
> Lead: Jarvis (Claude) at `W-jarvis:mos-claude-18`. Coordinating with `jwoltje@dragon-lin:coder0-0`.

## Mission prompt (2026-06-20)

Establish the north star for the Mosaic Fleet feature and prepare Phase-2 observability for delivery. The USC tmux PoC is the proven base. Jason granted lead authority: "The fleet is a great way to actually build the MVP — we are building the system that builds the system." Dogfood actual agent construction + ad-hoc deployment; coordinate with a second agent on `dragon-lin`.

## Decisions of record (with Jason, 2026-06-20)

- Agent model: config defines, session runs (gateway = definition/identity/auth; tmux = runtime).
- Tenancy: multi-tenant from the start; isolation = per-tenant Linux uid.
- Health: heartbeat required; dogfood stub implements protocol now.
- Lifecycle: hybrid (core always-on + ephemeral workers).
- Observation: read-only default, opt-in takeover.
- Multi-host: designed-for day one; control plane rides federation (W1), not a bespoke broker.
- Delivery: CLI-first, dogfood on the live stub fleet; webUI deferred to Phase 5.
- Fleet is dual-role: product AND means of production (bootstrapping the MVP).
- Code review = **dual-engine**: Claude **and** gpt-5.5/Codex, run together (Jason: the combination produces the best results). Launch reviewers via `mosaic yolo pi` / `codex` (proven path) or `~/.config/mosaic/tools/codex/codex-code-review.sh`. Applies to all code-review gates incl. FLEET-OBS-008. Per Jason 2026-06-20.
- Worktree discipline: do fleet work in `~/src/mosaicstack-stack-worktrees/`, NOT the shared main checkout — concurrent processes mutate `main` there (learned 2026-06-20).

## Environment facts (verified 2026-06-20)

- Fleet is live on `W-jarvis` (uid 1000, `jarvis`, `Linger=yes`) on tmux socket `mosaic-factory`: `_holder`, `canary-pi`, `dogfood-coder`, `dogfood-orchestrator`, `dogfood-reviewer`.
  All panes run `~/.config/mosaic/fleet/dogfood-agent.py` (stub), including `canary-pi` (roster says runtime=pi → **drift**).
- Holder + `mosaic-agent@*` units are `active (exited)` but `UnitFileState=disabled` (reboot loses fleet → boot-enable gap to surface).
- Observation blocked by: isolated socket (hidden from default `tmux ls`), `capture-pane` blank for TUIs, `attach` being read-write + resizing.
- Second agent: `jwoltje@dragon-lin`, session `coder0-0` (group `coder0`), running `node`, default socket. ssh forward reach confirmed.

## Governance / collision-safety

- `mosaicstack-stack` has active mission `mvp-20260312` with single-writer locks on `docs/MISSION-MANIFEST.md`, `docs/TASKS.md`, `docs/scratchpads/mvp-20260312.md`.
- This workstream touches NONE of those. All Fleet docs scoped under `docs/fleet/` + this scratchpad. Rollup row proposed, not written.

## Session log

- 2026-06-20: Researched AI guide + fleet code + live state. Established north star with Jason (8 forks decided). Branched `feat/fleet-observability`. Persisted `docs/fleet/{north-star.md,PRD.md,TASKS.md}` + this scratchpad. Next: establish comms with dragon-lin coder, commit docs, begin Phase-2 delivery (heartbeat + `fleet ps`).
- 2026-06-20 (session 2): Built Phase-2 CLI via worker (commit ab47831): `fleet ps`, `agent watch`, `agent send --verify`, 62 tests. LIVE-verified `fleet ps` on mosaic-factory — correctly flagged canary-pi DRIFT + BOOT-ENABLE, tenant_id+host in JSON. Heartbeat responder added to dogfood-agent.py (FLEET-OBS-002) — `fleet ps` HB now `healthy` for all 4 agents.
  - Coordination: dual-engine-reviewed (Claude+Codex) and merged framework PRs #572 (sanitization gate) + #575 (CONSTITUTION extraction) as Lead. Codex caught an Alpine blocker on #572 (refuted by CI); Claude caught a CI-breaking format failure on #575.
  - **FINDINGS (north-star / Phase-3 blockers):**
    1. Ad-hoc `mosaic yolo {codex,pi}` via `start-agent-session.sh` DIE immediately in a detached tmux pane (codex: "stdin is not a terminal"; pi: same). Only the python stub survives. => Real runtimes have NEVER run durably in the fleet. Launch path (PATH/TTY in the detached shell) must be fixed before Phase-3 real-runtime swap. `fleet ps` caught both dead panes instantly (tool validated).
    2. `MOSAIC_AGENT_NAME` (set in systemd EnvironmentFile) is NOT propagated into tmux's global env, so agents defaulted to `unknown`. Worked around in dogfood-agent.py via tmux session-name fallback; the systemd/tmux env handoff needs a real fix.
  - Next: rebase on merged main, open Phase-2 PR, dual-engine review, merge, close `fleet-observability-1`. Defer launch-path + env-propagation fixes to Phase 3.
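
Sketch for finding 2 (the `MOSAIC_AGENT_NAME` fallback): a rough illustration of how an agent could resolve its name when the systemd environment does not reach the pane. This is not the actual code in `dogfood-agent.py`. `resolve_agent_name` and its `env` / `run` parameters are hypothetical names introduced here, and the sketch assumes each tmux session is named after its agent, as the session list in the environment facts suggests.

```python
import os
import subprocess


def resolve_agent_name(env=None, run=subprocess.run):
    """Resolve the agent name: env var first, then tmux session name, then 'unknown'.

    Hypothetical helper; `env` and `run` are injectable so the logic can be
    exercised without a live tmux server.
    """
    env = os.environ if env is None else env

    # 1. Preferred source: the variable systemd was supposed to hand over.
    name = env.get("MOSAIC_AGENT_NAME", "").strip()
    if name:
        return name

    # 2. Fallback: ask tmux for the session name, but only inside a tmux pane.
    #    $TMUX carries the socket path, so the isolated socket is reached
    #    without an explicit -L/-S flag.
    if env.get("TMUX"):
        cmd = ["tmux", "display-message", "-p"]
        pane = env.get("TMUX_PANE")
        if pane:
            cmd += ["-t", pane]  # target our own pane, not the active one
        cmd.append("#S")  # tmux format alias for the session name
        try:
            result = run(cmd, capture_output=True, text=True, timeout=2, check=False)
        except (OSError, subprocess.SubprocessError):
            return "unknown"  # tmux missing, timed out, or failed to start
        session = result.stdout.strip()
        if result.returncode == 0 and session:
            return session

    # 3. Last resort: the value the agents reported before the workaround.
    return "unknown"


if __name__ == "__main__":
    print(resolve_agent_name())
```

This only masks the gap. The handoff between the systemd EnvironmentFile and tmux's global environment still needs the real fix deferred to Phase 3.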