Compare commits
8 Commits
feat/fleet...feat/fleet

| Author | SHA1 | Date |
|---|---|---|
| | 2674daede0 | |
| | 5bef2c35eb | |
| | 2849a8f9db | |
| | 7ced5588c9 | |
| | afcbbb302f | |
| | c2c0b5fe8d | |
| | c9cfe36204 | |
| | fc90c89913 | |
105
docs/fleet/PRD-fleet-suite.md
Normal file
@@ -0,0 +1,105 @@
# PRD: Mosaic Fleet Suite (init, configure, operate)

> **Workstream:** W-FLEET (Fleet) under mission `mvp-20260312` · **Phase:** 3→4 productization
> **North star:** [docs/fleet/north-star.md](./north-star.md) · prior: Phase-2 observability (#579), durable launch (#581), real-agent enablement (#583/#584/#586), releases 0.0.35–0.0.37
> **Lead:** Jarvis @ `w-jarvis`. **Collaborator:** coder agent @ `dragon-lin` (jwoltje@10.1.10.37:coder0-0).
> Owner of this file: Fleet workstream lead. Does not modify MVP single-writer control-plane files.

## Mission

Turn the proven fleet primitives into a **user-installable fleet product that can be configured without AI**:
a user runs `mosaic fleet init`, answers a few questions (general / coding / research / hybrid),
gets a recommended set of agents plus one always-on orchestrator wired for chat-ops, and can
operate, mutate, re-create, and observe the fleet, over tmux today and Matrix tomorrow, from
CLI/TUI and (designed-for) the webUI.

**Immediate tangible goal:** the **"Mos"** orchestrator agent running on `w-jarvis`, reachable
in **Discord channel `1517622518662434996`** (server `1112631390438166618`). Once the fleet is
functional, we use the fleet itself to continue the work.

## Requirements

### A. Configure-without-AI CLI

| ID | Requirement |
|---|---|
| R1 | `mosaic fleet` command set is functional end-to-end (init/install/start/stop/status/ps/verify + agent verbs). |
| R2 | `mosaic fleet init` is an interactive, **AI-free** CLI wizard. |
| R3 | Init asks the **configuration type**: `general`, `coding`, `research`, `hybrid`, … (extensible). |
| R4 | Based on the answer, the fleet is populated with a **recommended set of agents** (a preset). |
| R5 | **Exactly one main orchestrator agent** is always configured, regardless of type. |
| R10 | A set of **recommended configurations (presets)** ships for easy duplication. |
| R8 | User can **re-create** the fleet when config needs change (idempotent re-init / reconfigure). |
| R17 | Fleet controls are **simple and intuitive**. |

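R8 asks for an idempotent re-init. One way to get that property, sketched here under assumptions, is to render the new roster to a temporary file and replace the existing one only when the content actually differs. The helper name `write_if_changed` is hypothetical and not part of the `mosaic` CLI, and merging user/site overrides is out of scope for this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the mosaic CLI): write stdin to a target
# file only when the content differs, so repeated runs are no-ops.
write_if_changed() {
  local target="$1" tmp
  # Create the temp file next to the target so the final mv is a rename.
  tmp=$(mktemp "${target}.tmp.XXXXXX") || return 1
  cat > "$tmp"
  if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
    rm -f "$tmp"
    echo "unchanged"
  else
    mv "$tmp" "$target"
    echo "written"
  fi
}
```

Running the same input through the helper twice reports `written` the first time and `unchanged` the second, which is the behaviour a re-run of `fleet init` with the same answers would want.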
### B. Comms & orchestrator chat-ops

| ID | Requirement |
|---|---|
| R6 | Init can wire the orchestrator to a chat connector (**Telegram / Discord / Matrix / Slack**) for command + comms. |
| R7 | Designed with the end-goal of **Matrix comms on a locally-controlled server**. |
| R16 | Fleet supports **tmux AND Matrix** comms, **user-configurable** at init or any time. Not all users want Matrix. |
| R19 | **"Mos" orchestrator on Discord** (`chan 1517622518662434996` / `srv 1112631390438166618`) on `w-jarvis`: the first live target. |

### C. Runtime, health, lifecycle

| ID | Requirement |
|---|---|
| R9 | Fleet is **mutable by the orchestrator agent**: add/remove agents as needed. |
| R13 | Fleet **gracefully handles Pi + Claude harness updates** and keeps harnesses current. |
| R14 | The **Pi harness is customized** for proper tool usage, etc. |
| R15 | **Agent heartbeat** properly configured for **Claude AND GPT/Pi** agents. |

### D. Surfaces, testing, docs

| ID | Requirement |
|---|---|
| R18 | Fleet built so the **webUI can view / monitor / terminate / butt in** on a session. |
| R11 | Installed and **tested on both `w-jarvis` and `dragon-lin`**. |
| R12 | **Documentation**: how to install, configure, and use the fleet. |

## Architecture / approach

- **Config model:** `roster.yaml` is the source of truth (already exists). Add **presets** (`general`/`coding`/`research`/`hybrid`) as shipped example rosters; `init` selects a preset, always injects the orchestrator, and writes the roster. Re-init = regenerate the roster while preserving user/site overrides (mirrors the install env-merge from #567).
- **Orchestrator agent:** always present; carries the chat connector config (connector type + target IDs) so it can be commanded over chat. tmux is the substrate; the connector bridges chat ↔ the orchestrator session.
- **Comms layers (R16):** (1) **tmux** inter-agent (`agent-send`, proven): default, always available. (2) **chat connector** for human↔orchestrator (Discord now; Matrix the strategic target). (3) **Matrix** as the locally-controlled cross-agent bus (future). Connector is pluggable + reconfigurable.
- **Heartbeat (R15):** runtime-agnostic launcher sidecar already covers pi/claude/codex (#584). Refine per-runtime (native HB) with the **custom Pi harness** (R14) + a Claude path.
- **Updates (R13):** `mosaic update` (CLI) + a fleet-aware harness-update step that refreshes pi/claude/codex and re-launches agents safely (drain → update → relaunch via the durable launcher).
- **webUI (R18):** the fleet exposes machine-readable state (`fleet ps --json` already carries tenant/host/heartbeat/managed) + control verbs (start/stop/watch/send); webUI consumes these (control plane rides federation per north star). Ensure a stable JSON contract + a terminate/attach (butt-in) path.

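The heartbeat bullet relies on a sidecar that rewrites `<agent>.hb` while the pane process is alive and then stops, so the file goes stale. A minimal sketch of the reader side follows, assuming staleness is judged from the file's modification time. `hb_state` is a hypothetical name, the 45-second default is an assumption (three times the launcher's default 15-second interval), and this is not the actual `fleet ps` logic.

```bash
#!/usr/bin/env bash
# Hypothetical reader for heartbeat files; not the actual `fleet ps` logic.
# Prints "missing", "stale", or "fresh" based on the file's modification time.
hb_state() {
  local hb_file="$1" max_age="${2:-45}" now mtime
  [ -f "$hb_file" ] || { echo "missing"; return 0; }
  now=$(date +%s)
  # GNU stat first, then BSD stat as a fallback.
  mtime=$(stat -c %Y "$hb_file" 2>/dev/null || stat -f %m "$hb_file") || return 1
  if [ $((now - mtime)) -le "$max_age" ]; then
    echo "fresh"
  else
    echo "stale"
  fi
}
```

Using the modification time avoids parsing the `ts=` field, at the cost of trusting the filesystem clock; a stricter reader could compare both.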
## Phases (incremental, each shippable)

| Phase | Deliverable | Notes |
|---|---|---|
| **F1 Presets + init wizard** | preset rosters (general/coding/research/hybrid) + always-orchestrator + AI-free `fleet init` selecting a preset; re-init idempotent | R1–R5, R8, R10, R17 |
| **F2 Connector + Mos-on-Discord** | orchestrator chat-connector config (Discord first) + **Mos live on Discord `1517…`/`1112…`** on w-jarvis | R6, R19, partial R16 |
| **F3 Heartbeat + harness** | HB confirmed for claude + pi/gpt; **custom Pi harness** (tool usage, native HB, model self-report); graceful harness updates | R13, R14, R15 |
| **F4 Matrix + comms toggle** | Matrix connector (local server) + user toggle tmux/Matrix at init/anytime | R7, R16 |
| **F5 Orchestrator-mutable fleet** | orchestrator can add/remove agents at runtime | R9 |
| **F6 webUI hooks** | stable JSON contract + terminate/attach surface for webUI view/monitor/terminate/butt in | R18 |
| **F7 Test + docs** | install+test on w-jarvis AND dragon-lin; user docs (install/configure/use) | R11, R12 (runs alongside every phase) |

## Work division (proposed; confirm with dragon-lin)

- **Jarvis @ w-jarvis (Lead):** F1 presets+wizard, F2 connector+Mos-on-Discord, F5 mutability, F6 webUI hooks; merge authority + dual-engine reviews; co-testing on w-jarvis.
- **coder @ dragon-lin:** F3 custom Pi harness + harness-update flow (pi/codex-savvy); plus its in-flight constitution P4–P6 (P4 installer rework underpins `fleet init`/updates, so coordinate the install path). Co-testing on dragon-lin (R11).
- **Shared:** F4 Matrix (whoever has bandwidth); F7 testing/docs continuous.

## Immediate target: Mos on Discord (F2 first slice)

The Discord plugin is available (`~/.claude.json`). Path: configure the **orchestrator** as a durable
fleet session running Claude Code with the Discord plugin bridged to channel `1517622518662434996`
(server `1112631390438166618`) on w-jarvis, with the existing Discord Bridge Protocol (ack within
~3s, reply via `mcp__discord__reply`, no `AskUserQuestion`). Heartbeat via the launcher sidecar.

## Success criteria

- A user without AI assistance can run `mosaic fleet init`, pick a type, and get a working fleet + orchestrator.
- **Mos answers in Discord `1517…`** on w-jarvis.
- Fleet runs + is observable (`fleet ps`) on **both** w-jarvis and dragon-lin.
- Harness updates handled gracefully; HB healthy for claude + pi/gpt agents.
- Docs let a new operator install/configure/use the fleet.
- Re-init + orchestrator mutation work.

## Assumptions (veto-able)

- `ASSUMPTION:` presets ship as example rosters under the framework (`fleet/examples/*.yaml`), selected by `init`.
- `ASSUMPTION:` chat connectors are pluggable; Discord first (target exists), Matrix is the strategic default later.
- `ASSUMPTION:` "Mos" = a Claude Code orchestrator session with the discord plugin (reuses the documented Discord Bridge Protocol).
- `ASSUMPTION:` per north star, runtimes default to Codex/pi-on-Codex for workers; the orchestrator "Mos" runs Claude Code (in Claude Code, which is allowed).

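The first assumption (presets as example rosters selected by `init`) can be illustrated with a small lookup. This is only a sketch: `resolve_preset` is a hypothetical name, and the `<type>.yaml` file naming is an assumption, since the PRD only states `fleet/examples/*.yaml`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: map a configuration type to a preset roster path.
# The <type>.yaml naming is an assumption; the PRD only says fleet/examples/*.yaml.
resolve_preset() {
  local type="$1" examples_dir="${2:-fleet/examples}"
  case "$type" in
    general|coding|research|hybrid)
      printf '%s/%s.yaml\n' "$examples_dir" "$type"
      ;;
    *)
      # Unknown types fail loudly instead of silently picking a default.
      echo "unknown configuration type: $type" >&2
      return 1
      ;;
  esac
}
```

Keeping the accepted types in one `case` arm makes the list easy to extend, which matches the "extensible" note on R3.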
@@ -70,6 +70,9 @@ Skills, hooks, MCP, and plugins are force multipliers you MUST use when applicab
|
|||||||
## Missing core file
|
## Missing core file
|
||||||
|
|
||||||
If `CONSTITUTION.md`, `AGENTS.md`, `SOUL.md`, or the runtime contract is missing, stop and report it.
|
If `CONSTITUTION.md`, `AGENTS.md`, `SOUL.md`, or the runtime contract is missing, stop and report it.
|
||||||
|
This agent-facing strictness is intentional and stricter than the launcher: the launcher injects
|
||||||
|
`CONSTITUTION.md` tolerantly (skipping it if absent so pre-upgrade hosts keep working), but once a host
|
||||||
|
is re-seeded a genuinely missing core file is a stop-and-report condition — not something to proceed past.
|
||||||
|
|
||||||
## Session Closure
|
## Session Closure
|
||||||
|
|
||||||
|
|||||||
@@ -2,8 +2,11 @@
|
|||||||
|
|
||||||
The irreducible, non-negotiable law for every Mosaic agent on every harness.
|
The irreducible, non-negotiable law for every Mosaic agent on every harness.
|
||||||
|
|
||||||
**Framework-owned.** This file is overwritten verbatim on every upgrade — do not edit it. To change
|
**Framework-owned.** This file is overwritten verbatim on every upgrade — do not edit it. There is
|
||||||
behavior, add a `.local.md` overlay or a `policy/` file (tighten-only; see `constitution/LAYER-MODEL.md`).
|
**no `CONSTITUTION.local.md`**: hard gates are not locally overridable. A lower layer may only make
|
||||||
|
behavior _stricter_, never relax or override a gate (see Precedence). Operator customization lives in
|
||||||
|
other layers — `SOUL.md` / `USER.md` and the tighten-only overlays `STANDARDS.local.md` /
|
||||||
|
`SOUL.local.md` / `USER.local.md` / `policy/*.md` (see `constitution/LAYER-MODEL.md`).
|
||||||
Authored in **capability verbs**: where a gate names a capability ("structured reasoning", "queue
|
Authored in **capability verbs**: where a gate names a capability ("structured reasoning", "queue
|
||||||
guard"), the runtime adapter binds it to a concrete tool and states whether absence is a hard stop.
|
guard"), the runtime adapter binds it to a concrete tool and states whether absence is a hard stop.
|
||||||
|
|
||||||
|
|||||||
@@ -6,6 +6,8 @@ MOSAIC_TMUX_SOCKET=${MOSAIC_TMUX_SOCKET:-mosaic-factory}
 MOSAIC_AGENT_RUNTIME=${MOSAIC_AGENT_RUNTIME:-pi}
 MOSAIC_AGENT_WORKDIR=${MOSAIC_AGENT_WORKDIR:-$HOME}
 MOSAIC_AGENT_COMMAND=${MOSAIC_AGENT_COMMAND:-}
+MOSAIC_HEARTBEAT_RUN_DIR=${MOSAIC_HEARTBEAT_RUN_DIR:-$HOME/.config/mosaic/fleet/run}
+MOSAIC_HEARTBEAT_INTERVAL=${MOSAIC_HEARTBEAT_INTERVAL:-15}
 
 if [ -z "$AGENT_NAME" ]; then
   echo "ERROR: agent name argument or MOSAIC_AGENT_NAME is required" >&2
@@ -96,5 +98,55 @@ else
 fi
 
 mkdir -p "$MOSAIC_AGENT_WORKDIR"
-exec tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" \
+
+# ── Launch the tmux session (no exec: we continue to wire the heartbeat) ────
+tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" \
   bash -c "$PANE_SHELL_SNIPPET"
+
+# ── Resolve the pane PID (retry briefly to let the session initialise) ────────
+PANE_PID=""
+for _retry in 1 2 3 4 5; do
+  PANE_PID=$(tmux -L "$MOSAIC_TMUX_SOCKET" list-panes \
+    -t "=${AGENT_NAME}:0.0" -F '#{pane_pid}' 2>/dev/null || true)
+  [ -n "$PANE_PID" ] && break
+  sleep 0.2
+done
+
+# ── Spawn the heartbeat sidecar (detached, best-effort) ──────────────────────
+# The sidecar writes ~/.config/mosaic/fleet/run/<AGENT>.hb atomically while the
+# pane process is alive, then exits so the file goes stale (fleet ps shows stale
+# then PANE=dead). It is runtime-agnostic: it only cares about the pane PID.
+_start_heartbeat_sidecar() {
+  local agent="$1"
+  local pane_pid="$2"
+  local run_dir="$3"
+  local interval="$4"
+  local hb_file="${run_dir}/${agent}.hb"
+
+  mkdir -p "$run_dir"
+
+  # Write the sidecar as a self-contained bash one-liner so it carries no
+  # references to any variables from this script's environment.
+  local sidecar_script
+  sidecar_script=$(printf \
+    'hb=%s; pid=%s; iv=%s; mkdir -p "$(dirname "$hb")"; while kill -0 "$pid" 2>/dev/null; do tmp="$hb.tmp.$$"; printf "ts=%%s\npid=%%s\nstatus=ok\n" "$(date +%%Y-%%m-%%dT%%H:%%M:%%S%%z)" "$pid" > "$tmp" && mv "$tmp" "$hb"; sleep "$iv"; done' \
+    "$hb_file" "$pane_pid" "$interval")
+
+  # setsid + disown ensures the sidecar survives this script exiting.
+  # stderr/stdout go to /dev/null; failures are non-fatal.
+  if command -v setsid >/dev/null 2>&1; then
+    setsid bash -c "$sidecar_script" </dev/null >/dev/null 2>&1 &
+  else
+    bash -c "$sidecar_script" </dev/null >/dev/null 2>&1 &
+  fi
+  disown $! 2>/dev/null || true
+}
+
+if [ -n "$PANE_PID" ]; then
+  # Guard: do not let sidecar startup failures abort the launcher (set -e).
+  _start_heartbeat_sidecar "$AGENT_NAME" "$PANE_PID" \
+    "$MOSAIC_HEARTBEAT_RUN_DIR" "$MOSAIC_HEARTBEAT_INTERVAL" || \
+    echo "WARNING: heartbeat sidecar could not be started for $AGENT_NAME" >&2
+else
+  echo "WARNING: could not resolve pane PID for $AGENT_NAME - heartbeat sidecar not started" >&2
+fi
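The sidecar in the launcher hunk packs its loop into a single `printf`-built string, which is hard to read. The same pattern (write to a temp file, rename into place, repeat while a PID is alive) can be sketched as two plain functions. The names `write_heartbeat` and `watch_pid` are hypothetical and exist only for illustration; the file format copies the `ts=` / `pid=` / `status=ok` fields used by the launcher.

```bash
#!/usr/bin/env bash
# Hypothetical standalone version of the sidecar's write step.
# Writes ts/pid/status to a temp file, then renames it so readers never see
# a half-written heartbeat.
write_heartbeat() {
  local hb_file="$1" pid="$2" tmp
  tmp="${hb_file}.tmp.$$"
  printf 'ts=%s\npid=%s\nstatus=ok\n' \
    "$(date +%Y-%m-%dT%H:%M:%S%z)" "$pid" > "$tmp" && mv "$tmp" "$hb_file"
}

# Beat while the watched PID is alive, then return so the file goes stale.
watch_pid() {
  local hb_file="$1" pid="$2" interval="$3"
  while kill -0 "$pid" 2>/dev/null; do
    write_heartbeat "$hb_file" "$pid"
    sleep "$interval"
  done
}
```

In the launcher the watched PID is the tmux pane PID, so the loop ends when the pane process exits and no further heartbeats are written.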
@@ -50,6 +50,10 @@ grep -qF 'already running' /tmp/mosaic-start-agent-idempotent.out || fail "dupli
|
|||||||
# - Intercepts 'new-session' calls and records its arguments to a file.
|
# - Intercepts 'new-session' calls and records its arguments to a file.
|
||||||
# - For 'has-session' calls, exits 1 (session does not exist) so the script
|
# - For 'has-session' calls, exits 1 (session does not exist) so the script
|
||||||
# proceeds to launch instead of printing "already running".
|
# proceeds to launch instead of printing "already running".
|
||||||
|
# - For 'list-panes' calls, returns empty so PANE_PID stays unset and the
|
||||||
|
# heartbeat sidecar is NOT spawned (heartbeat is not the focus of this test;
|
||||||
|
# test 6 and 7 cover that path). This prevents any real-filesystem side
|
||||||
|
# effects or leaked background processes.
|
||||||
# - For all other subcommands, exits 0.
|
# - For all other subcommands, exits 0.
|
||||||
#
|
#
|
||||||
# Assertions:
|
# Assertions:
|
||||||
@@ -60,7 +64,8 @@ grep -qF 'already running' /tmp/mosaic-start-agent-idempotent.out || fail "dupli
|
|||||||
FAKE_BIN=$(mktemp -d)
|
FAKE_BIN=$(mktemp -d)
|
||||||
FAKE_RUNTIME_BIN=$(mktemp -d)
|
FAKE_RUNTIME_BIN=$(mktemp -d)
|
||||||
TMUX_ARGS_FILE=$(mktemp)
|
TMUX_ARGS_FILE=$(mktemp)
|
||||||
CLEANUP_DIRS+=("$FAKE_BIN" "$FAKE_RUNTIME_BIN")
|
HB_RUN_DIR3=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$FAKE_BIN" "$FAKE_RUNTIME_BIN" "$HB_RUN_DIR3")
|
||||||
|
|
||||||
# Write the fake tmux shim (uses only positional args, no sourced vars).
|
# Write the fake tmux shim (uses only positional args, no sourced vars).
|
||||||
cat > "$FAKE_BIN/tmux" <<SHIM
|
cat > "$FAKE_BIN/tmux" <<SHIM
|
||||||
@@ -74,6 +79,11 @@ if [ "\$subcmd" = "new-session" ]; then
|
|||||||
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE"
|
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE"
|
||||||
exit 0
|
exit 0
|
||||||
fi
|
fi
|
||||||
|
if [ "\$subcmd" = "list-panes" ]; then
|
||||||
|
# Return empty: no sidecar spawned (heartbeat is not the focus of this test).
|
||||||
|
echo ""
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
exit 0
|
exit 0
|
||||||
SHIM
|
SHIM
|
||||||
chmod +x "$FAKE_BIN/tmux"
|
chmod +x "$FAKE_BIN/tmux"
|
||||||
@@ -89,6 +99,7 @@ MOSAIC_AGENT_WORKDIR="$WORKDIR3" \
|
|||||||
MOSAIC_AGENT_RUNTIME="pi" \
|
MOSAIC_AGENT_RUNTIME="pi" \
|
||||||
MOSAIC_RUNTIME_BIN="$FAKE_RUNTIME_BIN" \
|
MOSAIC_RUNTIME_BIN="$FAKE_RUNTIME_BIN" \
|
||||||
MOSAIC_AGENT_COMMAND="mosaic yolo pi --model openai-codex/gpt-5.5:high" \
|
MOSAIC_AGENT_COMMAND="mosaic yolo pi --model openai-codex/gpt-5.5:high" \
|
||||||
|
MOSAIC_HEARTBEAT_RUN_DIR="$HB_RUN_DIR3" \
|
||||||
"$START" "$AGENT3"
|
"$START" "$AGENT3"
|
||||||
|
|
||||||
all_args=$(cat "$TMUX_ARGS_FILE" 2>/dev/null || true)
|
all_args=$(cat "$TMUX_ARGS_FILE" 2>/dev/null || true)
|
||||||
@@ -112,7 +123,8 @@ echo "$all_args" | grep -qF "mosaic yolo pi --model openai-codex/gpt-5.5:high" |
|
|||||||
# ── Test 4: when no extra runtime-bin dirs exist, exec still appears ───────────
|
# ── Test 4: when no extra runtime-bin dirs exist, exec still appears ───────────
|
||||||
TMUX_ARGS_FILE2=$(mktemp)
|
TMUX_ARGS_FILE2=$(mktemp)
|
||||||
FAKE_BIN2=$(mktemp -d)
|
FAKE_BIN2=$(mktemp -d)
|
||||||
CLEANUP_DIRS+=("$FAKE_BIN2")
|
HB_RUN_DIR4=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$FAKE_BIN2" "$HB_RUN_DIR4")
|
||||||
|
|
||||||
cat > "$FAKE_BIN2/tmux" <<SHIM2
|
cat > "$FAKE_BIN2/tmux" <<SHIM2
|
||||||
#!/usr/bin/env bash
|
#!/usr/bin/env bash
|
||||||
@@ -122,6 +134,11 @@ if [ "\$subcmd" = "new-session" ]; then
|
|||||||
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE2"
|
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE2"
|
||||||
exit 0
|
exit 0
|
||||||
fi
|
fi
|
||||||
|
if [ "\$subcmd" = "list-panes" ]; then
|
||||||
|
# Return empty: no sidecar spawned (heartbeat is not the focus of this test).
|
||||||
|
echo ""
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
exit 0
|
exit 0
|
||||||
SHIM2
|
SHIM2
|
||||||
chmod +x "$FAKE_BIN2/tmux"
|
chmod +x "$FAKE_BIN2/tmux"
|
||||||
@@ -139,6 +156,7 @@ MOSAIC_AGENT_WORKDIR="$WORKDIR4" \
|
|||||||
MOSAIC_AGENT_RUNTIME="pi" \
|
MOSAIC_AGENT_RUNTIME="pi" \
|
||||||
MOSAIC_RUNTIME_BIN="/nonexistent-dir-$$" \
|
MOSAIC_RUNTIME_BIN="/nonexistent-dir-$$" \
|
||||||
MOSAIC_AGENT_COMMAND="mosaic yolo pi" \
|
MOSAIC_AGENT_COMMAND="mosaic yolo pi" \
|
||||||
|
MOSAIC_HEARTBEAT_RUN_DIR="$HB_RUN_DIR4" \
|
||||||
"$START" "$AGENT4"
|
"$START" "$AGENT4"
|
||||||
|
|
||||||
all_args4=$(cat "$TMUX_ARGS_FILE2" 2>/dev/null || true)
|
all_args4=$(cat "$TMUX_ARGS_FILE2" 2>/dev/null || true)
|
||||||
@@ -161,7 +179,8 @@ echo "$all_args4" | grep -qF "mosaic yolo pi" || fail "pane command does not inc
|
|||||||
TMUX_ARGS_FILE5=$(mktemp)
|
TMUX_ARGS_FILE5=$(mktemp)
|
||||||
FAKE_BIN5=$(mktemp -d)
|
FAKE_BIN5=$(mktemp -d)
|
||||||
FAKE_RUNTIME_BIN5=$(mktemp -d) # this dir IS on the launcher's PATH below
|
FAKE_RUNTIME_BIN5=$(mktemp -d) # this dir IS on the launcher's PATH below
|
||||||
CLEANUP_DIRS+=("$FAKE_BIN5" "$FAKE_RUNTIME_BIN5")
|
HB_RUN_DIR5=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$FAKE_BIN5" "$FAKE_RUNTIME_BIN5" "$HB_RUN_DIR5")
|
||||||
|
|
||||||
cat > "$FAKE_BIN5/tmux" <<SHIM5
|
cat > "$FAKE_BIN5/tmux" <<SHIM5
|
||||||
#!/usr/bin/env bash
|
#!/usr/bin/env bash
|
||||||
@@ -171,6 +190,11 @@ if [ "\$subcmd" = "new-session" ]; then
|
|||||||
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE5"
|
printf '%s\n' "\$@" > "$TMUX_ARGS_FILE5"
|
||||||
exit 0
|
exit 0
|
||||||
fi
|
fi
|
||||||
|
if [ "\$subcmd" = "list-panes" ]; then
|
||||||
|
# Return empty: no sidecar spawned (heartbeat is not the focus of this test).
|
||||||
|
echo ""
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
exit 0
|
exit 0
|
||||||
SHIM5
|
SHIM5
|
||||||
chmod +x "$FAKE_BIN5/tmux"
|
chmod +x "$FAKE_BIN5/tmux"
|
||||||
@@ -190,6 +214,7 @@ MOSAIC_AGENT_WORKDIR="$WORKDIR5" \
|
|||||||
MOSAIC_AGENT_RUNTIME="pi" \
|
MOSAIC_AGENT_RUNTIME="pi" \
|
||||||
MOSAIC_RUNTIME_BIN="$FAKE_RUNTIME_BIN5" \
|
MOSAIC_RUNTIME_BIN="$FAKE_RUNTIME_BIN5" \
|
||||||
MOSAIC_AGENT_COMMAND="mosaic yolo pi" \
|
MOSAIC_AGENT_COMMAND="mosaic yolo pi" \
|
||||||
|
MOSAIC_HEARTBEAT_RUN_DIR="$HB_RUN_DIR5" \
|
||||||
"$START" "$AGENT5"
|
"$START" "$AGENT5"
|
||||||
|
|
||||||
all_args5=$(cat "$TMUX_ARGS_FILE5" 2>/dev/null || true)
|
all_args5=$(cat "$TMUX_ARGS_FILE5" 2>/dev/null || true)
|
||||||
@@ -205,4 +230,123 @@ echo "$all_args5" | grep -qF "export PATH=" || \
|
|||||||
echo "$all_args5" | grep -qF "$FAKE_RUNTIME_BIN5" || \
|
echo "$all_args5" | grep -qF "$FAKE_RUNTIME_BIN5" || \
|
||||||
fail "test5: candidate dir (already on launcher PATH) was NOT baked into pane PATH — regression"
|
fail "test5: candidate dir (already on launcher PATH) was NOT baked into pane PATH — regression"
|
||||||
|
|
||||||
|
# ── Test 6: heartbeat sidecar — pane PID resolved + .hb file written ──────────
|
||||||
|
#
|
||||||
|
# Uses a real tmux session (same socket as test 1 which already has $AGENT) so
|
||||||
|
# list-panes returns a real pane PID. We override MOSAIC_HEARTBEAT_RUN_DIR to
|
||||||
|
# a temp dir and set a 1-second interval, then wait up to 5 s for the .hb file
|
||||||
|
# to appear and check its content.
|
||||||
|
|
||||||
|
HB_RUN_DIR=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$HB_RUN_DIR")
|
||||||
|
|
||||||
|
# Re-use the session+agent created in Test 1 (still alive on $SOCKET / $AGENT).
|
||||||
|
# We need to invoke the script for a NEW agent on the same socket to exercise
|
||||||
|
# the heartbeat path with a real pane PID.
|
||||||
|
AGENT6="agent6-$RANDOM"
|
||||||
|
MOSAIC_TMUX_SOCKET="$SOCKET" \
|
||||||
|
MOSAIC_AGENT_WORKDIR="$WORKDIR" \
|
||||||
|
MOSAIC_AGENT_COMMAND='bash --noprofile --norc -i' \
|
||||||
|
MOSAIC_HEARTBEAT_RUN_DIR="$HB_RUN_DIR" \
|
||||||
|
MOSAIC_HEARTBEAT_INTERVAL="1" \
|
||||||
|
"$START" "$AGENT6"
|
||||||
|
|
||||||
|
HB_FILE="$HB_RUN_DIR/${AGENT6}.hb"
|
||||||
|
|
||||||
|
# Wait up to 5 seconds (10 polls of 0.5 s each) for the heartbeat file to appear.
_waited=0
until [ -f "$HB_FILE" ] || [ "$_waited" -ge 10 ]; do
  sleep 0.5
  _waited=$((_waited + 1))
done

[ -f "$HB_FILE" ] || fail "test6: heartbeat file not written at $HB_FILE within 5s"
|
||||||
|
|
||||||
|
hb_content=$(cat "$HB_FILE")
|
||||||
|
echo "--- test 6: heartbeat file content ---"
|
||||||
|
echo "$hb_content"
|
||||||
|
echo "--- end test 6 ---"
|
||||||
|
|
||||||
|
# Verify required fields are present.
|
||||||
|
echo "$hb_content" | grep -qE '^ts=[0-9]{4}-[0-9]{2}-[0-9]{2}T' || \
|
||||||
|
fail "test6: heartbeat ts field missing or malformed"
|
||||||
|
echo "$hb_content" | grep -qE '^pid=[0-9]+' || \
|
||||||
|
fail "test6: heartbeat pid field missing or malformed"
|
||||||
|
echo "$hb_content" | grep -qF 'status=ok' || \
|
||||||
|
fail "test6: heartbeat status=ok missing"
|
||||||
|
|
||||||
|
# ── Test 7: heartbeat sidecar — targets correct .hb path per agent name ────────
|
||||||
|
#
|
||||||
|
# Uses the fake-tmux shim approach (like tests 3-5) to capture the sidecar
|
||||||
|
# invocation without needing a real session. A fake setsid shim records its
|
||||||
|
# arguments so we can assert the sidecar script targets the expected .hb path
|
||||||
|
# and uses the configured interval.
|
||||||
|
|
||||||
|
FAKE_BIN7=$(mktemp -d)
|
||||||
|
FAKE_RUNTIME_BIN7=$(mktemp -d)
|
||||||
|
SETSID_ARGS_FILE=$(mktemp)
|
||||||
|
HB_RUN_DIR7=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$FAKE_BIN7" "$FAKE_RUNTIME_BIN7" "$HB_RUN_DIR7")
|
||||||
|
|
||||||
|
AGENT7="my-fleet-agent-$RANDOM"
|
||||||
|
INTERVAL7="42"
|
||||||
|
|
||||||
|
# Fake tmux: has-session → not found; new-session → ok; list-panes → known PID.
|
||||||
|
cat > "$FAKE_BIN7/tmux" <<SHIM7
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
subcmd="\$3"
|
||||||
|
if [ "\$subcmd" = "has-session" ]; then exit 1; fi
|
||||||
|
if [ "\$subcmd" = "new-session" ]; then exit 0; fi
|
||||||
|
if [ "\$subcmd" = "list-panes" ]; then echo "88888"; exit 0; fi
|
||||||
|
exit 0
|
||||||
|
SHIM7
|
||||||
|
chmod +x "$FAKE_BIN7/tmux"
|
||||||
|
|
||||||
|
# Fake setsid: capture the bash -c <script> arguments for inspection, then
# exit cleanly. The launcher backgrounds the call itself, so disown still succeeds.
|
||||||
|
cat > "$FAKE_BIN7/setsid" <<'SETSID_SHIM'
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
# argv: setsid bash -c <sidecar_script>
|
||||||
|
# Record the full argument list to the capture file, then exit cleanly.
|
||||||
|
printf '%s\0' "$@" > __SETSID_ARGS_FILE__
|
||||||
|
exit 0
|
||||||
|
SETSID_SHIM
|
||||||
|
# Patch the placeholder with the real capture-file path (avoids heredoc expansion issues).
|
||||||
|
sed -i "s|__SETSID_ARGS_FILE__|${SETSID_ARGS_FILE}|g" "$FAKE_BIN7/setsid"
|
||||||
|
chmod +x "$FAKE_BIN7/setsid"
|
||||||
|
|
||||||
|
SOCKET7="mosaic-agent-test7-$RANDOM-$$"
|
||||||
|
WORKDIR7=$(mktemp -d)
|
||||||
|
CLEANUP_DIRS+=("$WORKDIR7")
|
||||||
|
|
||||||
|
PATH="$FAKE_BIN7:$PATH" \
|
||||||
|
MOSAIC_TMUX_SOCKET="$SOCKET7" \
|
||||||
|
MOSAIC_AGENT_WORKDIR="$WORKDIR7" \
|
||||||
|
MOSAIC_AGENT_RUNTIME="pi" \
|
||||||
|
MOSAIC_RUNTIME_BIN="$FAKE_RUNTIME_BIN7" \
|
||||||
|
MOSAIC_AGENT_COMMAND="mosaic yolo pi" \
|
||||||
|
MOSAIC_HEARTBEAT_RUN_DIR="$HB_RUN_DIR7" \
|
||||||
|
MOSAIC_HEARTBEAT_INTERVAL="$INTERVAL7" \
|
||||||
|
"$START" "$AGENT7"
|
||||||
|
|
||||||
|
# Give the background setsid shim a moment to finish writing the capture file.
|
||||||
|
sleep 0.5
|
||||||
|
|
||||||
|
setsid_args=$(cat "$SETSID_ARGS_FILE" 2>/dev/null | tr '\0' '\n' || true)
|
||||||
|
rm -f "$SETSID_ARGS_FILE"
|
||||||
|
rm -rf "$WORKDIR7"
|
||||||
|
|
||||||
|
echo "--- test 7: captured setsid args ---"
|
||||||
|
echo "$setsid_args"
|
||||||
|
echo "--- end test 7 ---"
|
||||||
|
|
||||||
|
# The sidecar script (bash -c <script>) must reference the correct .hb path.
|
||||||
|
expected_hb="${HB_RUN_DIR7}/${AGENT7}.hb"
|
||||||
|
echo "$setsid_args" | grep -qF "$expected_hb" || \
|
||||||
|
fail "test7: sidecar script does not reference correct .hb path ($expected_hb)"
|
||||||
|
|
||||||
|
# The sidecar script must use the configured interval.
|
||||||
|
echo "$setsid_args" | grep -qF "$INTERVAL7" || \
|
||||||
|
fail "test7: sidecar script does not reference configured interval ($INTERVAL7)"
|
||||||
|
|
||||||
echo "ok - start-agent-session"
|
echo "ok - start-agent-session"
|
||||||
|
|||||||
@@ -12,7 +12,7 @@
|
|||||||
# 2. STRUCTURAL (private $HOME default in *.sh) — scanned everywhere EXCEPT examples/,
|
# 2. STRUCTURAL (private $HOME default in *.sh) — scanned everywhere EXCEPT examples/,
|
||||||
# because worked example overlays/personas legitimately show placeholder paths.
|
# because worked example overlays/personas legitimately show placeholder paths.
|
||||||
#
|
#
|
||||||
# File types: *.md, *.sh, *.ps1, *.json, and the extensionless CLI scripts under
|
# File types: *.md, *.sh, *.ps1, *.json, *.yml/*.yaml, *.toml, *.env, *.service, and the CLI scripts under
|
||||||
# tools/_scripts/. Excludes node_modules/ and this gate file.
|
# tools/_scripts/. Excludes node_modules/ and this gate file.
|
||||||
#
|
#
|
||||||
# NOTE: '\bPDA\b' intentionally matches "PDA-friendly" (the contamination removed in P2);
|
# NOTE: '\bPDA\b' intentionally matches "PDA-friendly" (the contamination removed in P2);
|
||||||
@@ -39,7 +39,7 @@ cd "$FRAMEWORK_ROOT" || { echo "FRAMEWORK_ROOT not found: $FRAMEWORK_ROOT" >&2;
|
|||||||
# Identity scope = ALL shipped text files (examples/ INCLUDED).
|
# Identity scope = ALL shipped text files (examples/ INCLUDED).
|
||||||
_files_identity() {
|
_files_identity() {
|
||||||
find . -type f \
|
find . -type f \
|
||||||
\( -name '*.md' -o -name '*.sh' -o -name '*.ps1' -o -name '*.json' -o -path '*/tools/_scripts/*' \) \
|
\( -name '*.md' -o -name '*.sh' -o -name '*.ps1' -o -name '*.json' -o -name '*.yml' -o -name '*.yaml' -o -name '*.toml' -o -name '*.env' -o -name '*.service' -o -path '*/tools/_scripts/*' \) \
|
||||||
-not -path '*/node_modules/*' -not -path "./$SELF_REL" -print0
|
-not -path '*/node_modules/*' -not -path "./$SELF_REL" -print0
|
||||||
}
|
}
|
||||||
# Structural scope = shipped scripts, examples/ EXCLUDED.
|
# Structural scope = shipped scripts, examples/ EXCLUDED.
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "@mosaicstack/mosaic",
|
"name": "@mosaicstack/mosaic",
|
||||||
"version": "0.0.34",
|
"version": "0.0.36",
|
||||||
"repository": {
|
"repository": {
|
||||||
"type": "git",
|
"type": "git",
|
||||||
"url": "https://git.mosaicstack.dev/mosaicstack/stack.git",
|
"url": "https://git.mosaicstack.dev/mosaicstack/stack.git",
|
||||||
|
|||||||
@@ -10,11 +10,15 @@ import {
|
|||||||
buildAgentWatchCreateViewerCommand,
|
buildAgentWatchCreateViewerCommand,
|
||||||
buildAgentWatchKillViewerCommand,
|
buildAgentWatchKillViewerCommand,
|
||||||
buildAgentVerifyAcceptedCommand,
|
buildAgentVerifyAcceptedCommand,
|
||||||
|
buildEnableLingerCommand,
|
||||||
buildFleetServiceCommand,
|
buildFleetServiceCommand,
|
||||||
|
buildSystemdEnableCommand,
|
||||||
buildSystemdShowCommand,
|
buildSystemdShowCommand,
|
||||||
buildTmuxListPanesCommand,
|
buildTmuxListPanesCommand,
|
||||||
|
buildTmuxListSessionsCommand,
|
||||||
classifySendResult,
|
classifySendResult,
|
||||||
detectDrift,
|
detectDrift,
|
||||||
|
enableFleetUnits,
|
||||||
generateAgentEnv,
|
generateAgentEnv,
|
||||||
getDefaultOperatorSourceLabel,
|
getDefaultOperatorSourceLabel,
|
||||||
getDefaultTenantAndHost,
|
getDefaultTenantAndHost,
|
||||||
@@ -26,12 +30,15 @@ import {
|
|||||||
parseHeartbeat,
|
parseHeartbeat,
|
||||||
parseSystemdShow,
|
parseSystemdShow,
|
||||||
parseTmuxListPanes,
|
parseTmuxListPanes,
|
||||||
|
parseTmuxListSessions,
|
||||||
registerFleetCommand,
|
registerFleetCommand,
|
||||||
resolveFleetPaths,
|
resolveFleetPaths,
|
||||||
|
RUNTIME_ACCEPTABLE_COMMANDS,
|
||||||
VERIFY_DEFAULT_TIMEOUT_MS,
|
VERIFY_DEFAULT_TIMEOUT_MS,
|
||||||
VERIFY_POLL_INTERVAL_MS,
|
VERIFY_POLL_INTERVAL_MS,
|
||||||
type AgentPsRow,
|
type AgentPsRow,
|
||||||
type CommandRunner,
|
type CommandRunner,
|
||||||
|
type FleetRoster,
|
||||||
type InteractiveRunner,
|
type InteractiveRunner,
|
||||||
type SleepFn,
|
type SleepFn,
|
||||||
} from './fleet.js';
|
} from './fleet.js';
|
@@ -909,6 +916,118 @@ describe('fleet ps — drift detection', () => {
   it('does NOT flag drift when pane command is null (pane dead)', () => {
     expect(detectDrift('pi', null)).toBe(false);
   });
+
+  it('does NOT flag drift when pane=node for wrapped pi agent (mosaic yolo pi)', () => {
+    expect(detectDrift('pi', 'node')).toBe(false);
+  });
+
+  it('does NOT flag drift when pane=node for wrapped codex agent (mosaic yolo codex)', () => {
+    expect(detectDrift('codex', 'node')).toBe(false);
+  });
+
+  it('flags drift when pane=python3 for pi runtime (canary-pi dogfood regression guard)', () => {
+    expect(detectDrift('pi', 'python3')).toBe(true);
+  });
+
+  it('does NOT flag drift when pane=python3 for dogfood runtime', () => {
+    expect(detectDrift('dogfood', 'python3')).toBe(false);
+  });
+
+  it('flags drift for unknown pane command on known runtime', () => {
+    expect(detectDrift('claude', 'bash')).toBe(true);
+  });
+
+  it('RUNTIME_ACCEPTABLE_COMMANDS is exported and contains expected entries', () => {
+    expect(RUNTIME_ACCEPTABLE_COMMANDS['pi']).toContain('node');
+    expect(RUNTIME_ACCEPTABLE_COMMANDS['pi']).not.toContain('python3');
+    expect(RUNTIME_ACCEPTABLE_COMMANDS['dogfood']).toContain('python3');
+    expect(RUNTIME_ACCEPTABLE_COMMANDS['codex']).toContain('node');
+  });
+});
+
+describe('fleet install — auto-enable units for boot-survival', () => {
+  it('buildSystemdEnableCommand and buildEnableLingerCommand return correct command arrays', () => {
+    expect(buildSystemdEnableCommand('mosaic-tmux-holder.service')).toEqual([
+      'systemctl',
+      '--user',
+      'enable',
+      'mosaic-tmux-holder.service',
+    ]);
+    expect(buildEnableLingerCommand('testuser')).toEqual(['loginctl', 'enable-linger', 'testuser']);
+  });
+
+  it('enables holder and each agent unit via injected runner after install', async () => {
+    const minimalRoster: FleetRoster = {
+      version: 1,
+      transport: 'tmux',
+      tmux: { socketName: 'mosaic-factory', holderSession: '_holder' },
+      defaults: { workingDirectory: '~/src' },
+      runtimes: { codex: { resetCommand: '/clear' } },
+      agents: [{ name: 'coder0', runtime: 'codex', className: 'worker' }],
+    };
+
+    const calls: string[][] = [];
+    const runner: CommandRunner = async (command, args) => {
+      calls.push([command, ...args]);
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    await enableFleetUnits(runner, minimalRoster, {});
+
+    expect(calls).toContainEqual(['systemctl', '--user', 'enable', 'mosaic-tmux-holder.service']);
+    expect(calls).toContainEqual(['systemctl', '--user', 'enable', 'mosaic-agent@coder0.service']);
+  });
+
+  it('install still succeeds when systemctl enable returns non-zero (non-fatal)', async () => {
+    const minimalRoster: FleetRoster = {
+      version: 1,
+      transport: 'tmux',
+      tmux: { socketName: 'mosaic-factory', holderSession: '_holder' },
+      defaults: { workingDirectory: '~/src' },
+      runtimes: { codex: { resetCommand: '/clear' } },
+      agents: [{ name: 'coder0', runtime: 'codex', className: 'worker' }],
+    };
+
+    const calls: string[][] = [];
+    const runner: CommandRunner = async (command, args) => {
+      calls.push([command, ...args]);
+      // Simulate systemctl enable failure
+      if (command === 'systemctl' && args.includes('enable')) {
+        return { stdout: '', stderr: 'Unit not found', exitCode: 1 };
+      }
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    // Must NOT reject/throw even when enable calls fail
+    await expect(enableFleetUnits(runner, minimalRoster, {})).resolves.toBeUndefined();
+
+    // The enable attempt must have been made
+    expect(calls.some((c) => c.includes('enable'))).toBe(true);
+  });
+
+  it('--no-enable skips all systemctl enable and loginctl linger calls', async () => {
+    const minimalRoster: FleetRoster = {
+      version: 1,
+      transport: 'tmux',
+      tmux: { socketName: 'mosaic-factory', holderSession: '_holder' },
+      defaults: { workingDirectory: '~/src' },
+      runtimes: { codex: { resetCommand: '/clear' } },
+      agents: [{ name: 'coder0', runtime: 'codex', className: 'worker' }],
+    };
+
+    const calls: string[][] = [];
+    const runner: CommandRunner = async (command, args) => {
+      calls.push([command, ...args]);
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    await enableFleetUnits(runner, minimalRoster, { enable: false });
+
+    // No calls should include 'enable'
+    expect(calls.every((c) => !c.includes('enable'))).toBe(true);
+    // No loginctl calls at all
+    expect(calls.every((c) => c[0] !== 'loginctl')).toBe(true);
+  });
 });

 describe('fleet ps — tenant and host', () => {
@@ -957,6 +1076,10 @@ describe('fleet ps — JSON output shape (FR-6)', () => {
           exitCode: 0,
         };
       }
+      if (fullArgs.includes('list-sessions')) {
+        // Only the roster agent session on the socket (no unmanaged sessions)
+        return { stdout: 'canary-pi\n', stderr: '', exitCode: 0 };
+      }
       return { stdout: '', stderr: '', exitCode: 0 };
     };

@@ -1000,11 +1123,15 @@ describe('fleet ps — JSON output shape (FR-6)', () => {
     expect(row.runtime).toBe('pi');
     expect(row.systemdActive).toBe('active');
     expect(row.systemdEnabled).toBe('disabled');
+
+    // managed/source fields for roster agents
+    expect(row.managed).toBe(true);
+    expect(row.source).toBe('roster');
   });
 });

 describe('fleet ps — command sequences issued', () => {
-  it('issues systemd show + tmux list-panes per agent', async () => {
+  it('issues systemd show + tmux list-panes per agent, then list-sessions for socket discovery', async () => {
     const home = await mkdtemp(join(tmpdir(), 'mosaic-fleet-'));
     const rosterPath = join(home, 'fleet', 'roster.yaml');
     await mkdir(join(home, 'fleet'), { recursive: true });
@@ -1018,6 +1145,10 @@ describe('fleet ps — command sequences issued', () => {
     const calls: string[][] = [];
     const runner: CommandRunner = async (command, args) => {
       calls.push([command, ...args]);
+      if ([command, ...args].join(' ').includes('list-sessions')) {
+        // Only the roster agent — no unmanaged sessions
+        return { stdout: 'coder0\n', stderr: '', exitCode: 0 };
+      }
       return {
         stdout: 'ActiveState=inactive\nSubState=dead\nUnitFileState=enabled\n',
         stderr: '',
@@ -1038,6 +1169,7 @@ describe('fleet ps — command sequences issued', () => {
       expect(calls).toEqual([
         buildSystemdShowCommand('coder0'),
         buildTmuxListPanesCommand('coder0', 'mosaic-factory'),
+        buildTmuxListSessionsCommand('mosaic-factory'),
       ]);
     } finally {
       console.log = origLog;
@@ -1046,6 +1178,258 @@ describe('fleet ps — command sequences issued', () => {
   });
 });

+describe('buildTmuxListSessionsCommand', () => {
+  it('builds exact list-sessions command with session_name format', () => {
+    expect(buildTmuxListSessionsCommand('mosaic-factory')).toEqual([
+      'tmux',
+      '-L',
+      'mosaic-factory',
+      'list-sessions',
+      '-F',
+      '#{session_name}',
+    ]);
+  });
+
+  it('uses DEFAULT_SOCKET_NAME when socket is omitted', () => {
+    const cmd = buildTmuxListSessionsCommand();
+    expect(cmd[2]).toBe('mosaic-factory');
+  });
+});
+
+describe('parseTmuxListSessions', () => {
+  it('splits newline-delimited session names', () => {
+    expect(parseTmuxListSessions('canary-pi\n_holder\nsome-adhoc\n')).toEqual([
+      'canary-pi',
+      '_holder',
+      'some-adhoc',
+    ]);
+  });
+
+  it('returns empty array for blank output', () => {
+    expect(parseTmuxListSessions('')).toEqual([]);
+    expect(parseTmuxListSessions(' \n \n')).toEqual([]);
+  });
+
+  it('trims whitespace from each line', () => {
+    expect(parseTmuxListSessions(' canary-pi \n some-adhoc \n')).toEqual([
+      'canary-pi',
+      'some-adhoc',
+    ]);
+  });
+});
+
+describe('fleet ps — unmanaged socket sessions', () => {
+  it('includes unmanaged session row flagged UNMANAGED and excludes _holder', async () => {
+    const home = await mkdtemp(join(tmpdir(), 'mosaic-fleet-'));
+    const rosterPath = join(home, 'fleet', 'roster.yaml');
+    await mkdir(join(home, 'fleet'), { recursive: true });
+    await writeFile(
+      rosterPath,
+      [
+        'version: 1',
+        'transport: tmux',
+        'agents:',
+        '  - name: canary-pi',
+        '    runtime: pi',
+        '    class: canary',
+      ].join('\n'),
+    );
+
+    const nowMs = Date.now();
+    const activityEpoch = Math.floor((nowMs - 10_000) / 1000);
+
+    const runner: CommandRunner = async (command, args) => {
+      const full = [command, ...args].join(' ');
+      if (full.includes('list-sessions')) {
+        // Socket has: canary-pi (roster), _holder (excluded), some-adhoc (unmanaged)
+        return { stdout: 'canary-pi\n_holder\nsome-adhoc\n', stderr: '', exitCode: 0 };
+      }
+      if (full.includes('list-panes')) {
+        return { stdout: `99999 bash 0 ${activityEpoch}\n`, stderr: '', exitCode: 0 };
+      }
+      if (full.includes('systemctl') && full.includes('show')) {
+        return {
+          stdout: 'ActiveState=inactive\nSubState=dead\nUnitFileState=unknown\n',
+          stderr: '',
+          exitCode: 0,
+        };
+      }
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    const lines: string[] = [];
+    const origLog = console.log;
+    console.log = (msg: string) => {
+      lines.push(msg);
+    };
+
+    const program = new Command();
+    program.exitOverride();
+    registerFleetCommand(program, { runner, mosaicHome: home });
+
+    try {
+      await program.parseAsync(['node', 'mosaic', 'fleet', 'ps', '--json']);
+    } finally {
+      console.log = origLog;
+      await rm(home, { recursive: true, force: true });
+    }
+
+    const json = JSON.parse(lines.join('')) as AgentPsRow[];
+    expect(Array.isArray(json)).toBe(true);
+
+    // Should have 2 rows: canary-pi (roster) + some-adhoc (unmanaged); _holder excluded
+    expect(json).toHaveLength(2);
+
+    // Roster agent comes first
+    const rosterRow = json[0]!;
+    expect(rosterRow.name).toBe('canary-pi');
+    expect(rosterRow.managed).toBe(true);
+    expect(rosterRow.source).toBe('roster');
+
+    // Unmanaged session comes second
+    const unmanagedRow = json[1]!;
+    expect(unmanagedRow.name).toBe('some-adhoc');
+    expect(unmanagedRow.managed).toBe(false);
+    expect(unmanagedRow.source).toBe('socket');
+    expect(unmanagedRow.runtime).toBe('unknown');
+
+    // _holder must not appear
+    expect(json.map((r) => r.name)).not.toContain('_holder');
+
+    // tenant_id and host must be present on unmanaged rows
+    expect(typeof unmanagedRow.tenant_id).toBe('string');
+    expect(unmanagedRow.tenant_id.length).toBeGreaterThan(0);
+    expect(typeof unmanagedRow.host).toBe('string');
+    expect(unmanagedRow.host.length).toBeGreaterThan(0);
+
+    // driftFlag must be false for unmanaged (no roster runtime to compare)
+    expect(unmanagedRow.driftFlag).toBe(false);
+  });
+
+  it('shows UNMANAGED flag in table output for unmanaged sessions', async () => {
+    const home = await mkdtemp(join(tmpdir(), 'mosaic-fleet-'));
+    const rosterPath = join(home, 'fleet', 'roster.yaml');
+    await mkdir(join(home, 'fleet'), { recursive: true });
+    await writeFile(
+      rosterPath,
+      [
+        'version: 1',
+        'transport: tmux',
+        'agents:',
+        '  - name: canary-pi',
+        '    runtime: pi',
+        '    class: canary',
+      ].join('\n'),
+    );
+
+    const runner: CommandRunner = async (command, args) => {
+      const full = [command, ...args].join(' ');
+      if (full.includes('list-sessions')) {
+        return { stdout: 'canary-pi\nsome-adhoc\n', stderr: '', exitCode: 0 };
+      }
+      if (full.includes('list-panes')) {
+        return { stdout: '0 bash 1 0\n', stderr: '', exitCode: 0 };
+      }
+      if (full.includes('systemctl') && full.includes('show')) {
+        return {
+          stdout: 'ActiveState=inactive\nSubState=dead\nUnitFileState=unknown\n',
+          stderr: '',
+          exitCode: 0,
+        };
+      }
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    const lines: string[] = [];
+    const origLog = console.log;
+    console.log = (msg: string) => {
+      lines.push(msg);
+    };
+
+    const program = new Command();
+    program.exitOverride();
+    registerFleetCommand(program, { runner, mosaicHome: home });
+
+    try {
+      await program.parseAsync(['node', 'mosaic', 'fleet', 'ps']);
+    } finally {
+      console.log = origLog;
+      await rm(home, { recursive: true, force: true });
+    }
+
+    const tableOutput = lines.join('\n');
+    // some-adhoc row must appear with UNMANAGED flag
+    expect(tableOutput).toMatch(/some-adhoc/);
+    expect(tableOutput).toMatch(/UNMANAGED/);
+    // canary-pi roster row must not have UNMANAGED
+    const rosterLine = lines.find((l) => l.includes('canary-pi'));
+    expect(rosterLine).toBeDefined();
+    expect(rosterLine).not.toMatch(/UNMANAGED/);
+  });
+
+  it('gracefully shows only roster rows when list-sessions fails (socket missing)', async () => {
+    const home = await mkdtemp(join(tmpdir(), 'mosaic-fleet-'));
+    const rosterPath = join(home, 'fleet', 'roster.yaml');
+    await mkdir(join(home, 'fleet'), { recursive: true });
+    await writeFile(
+      rosterPath,
+      [
+        'version: 1',
+        'transport: tmux',
+        'agents:',
+        '  - name: canary-pi',
+        '    runtime: pi',
+        '    class: canary',
+      ].join('\n'),
+    );
+
+    const runner: CommandRunner = async (command, args) => {
+      const full = [command, ...args].join(' ');
+      if (full.includes('list-sessions')) {
+        // Simulate socket missing
+        return { stdout: '', stderr: 'no server running on /tmp/...', exitCode: 1 };
+      }
+      if (full.includes('list-panes')) {
+        return { stdout: '12345 pi 0 0\n', stderr: '', exitCode: 0 };
+      }
+      if (full.includes('systemctl') && full.includes('show')) {
+        return {
+          stdout: 'ActiveState=inactive\nSubState=dead\nUnitFileState=enabled\n',
+          stderr: '',
+          exitCode: 0,
+        };
+      }
+      return { stdout: '', stderr: '', exitCode: 0 };
+    };
+
+    const lines: string[] = [];
+    const origLog = console.log;
+    console.log = (msg: string) => {
+      lines.push(msg);
+    };
+
+    const program = new Command();
+    program.exitOverride();
+    registerFleetCommand(program, { runner, mosaicHome: home });
+
+    try {
+      // Must not throw
+      await expect(
+        program.parseAsync(['node', 'mosaic', 'fleet', 'ps', '--json']),
+      ).resolves.toBeDefined();
+    } finally {
+      console.log = origLog;
+      await rm(home, { recursive: true, force: true });
+    }
+
+    const json = JSON.parse(lines.join('')) as AgentPsRow[];
+    // Only roster agent visible; no crash
+    expect(json).toHaveLength(1);
+    expect(json[0]!.name).toBe('canary-pi');
+    expect(json[0]!.managed).toBe(true);
+  });
+});
+
 describe('agent watch', () => {
   it('builds exact grouped-viewer creation command', () => {
     expect(
||||||
|
|||||||
@@ -210,6 +210,93 @@ export function buildFleetServiceCommand(action: FleetServiceAction, agentName?:
   return ['systemctl', '--user', action, service];
 }

+/**
+ * Returns the systemctl --user enable command for a given unit.
+ * Used by the install auto-enable step to persist units across reboots.
+ */
+export function buildSystemdEnableCommand(unit: string): string[] {
+  return ['systemctl', '--user', 'enable', unit];
+}
+
+/**
+ * Returns the loginctl enable-linger command for a given user.
+ * Linger allows user systemd services to survive logout.
+ */
+export function buildEnableLingerCommand(user: string): string[] {
+  return ['loginctl', 'enable-linger', user];
+}
+
+/**
+ * Enable fleet units for boot-survival after install.
+ * Non-fatal: if systemctl enable returns non-zero, a warning is printed and we continue.
+ * If opts.enable === false (--no-enable flag), the whole step is skipped.
+ */
+export async function enableFleetUnits(
+  runner: CommandRunner,
+  roster: FleetRoster,
+  opts: { enable?: boolean },
+): Promise<void> {
+  if (opts.enable === false) {
+    return;
+  }
+  try {
+    let succeeded = 0;
+    let failed = 0;
+
+    const holderResult = await runner(
+      ...splitCommand(buildSystemdEnableCommand('mosaic-tmux-holder.service')),
+    );
+    if (holderResult.exitCode === 0) {
+      succeeded++;
+    } else {
+      failed++;
+      process.stderr.write(
+        `Warning: could not enable mosaic-tmux-holder.service: ${holderResult.stderr || holderResult.stdout || 'non-zero exit'}\n`,
+      );
+    }
+
+    for (const agent of roster.agents) {
+      const unit = `mosaic-agent@${agent.name}.service`;
+      const result = await runner(...splitCommand(buildSystemdEnableCommand(unit)));
+      if (result.exitCode === 0) {
+        succeeded++;
+      } else {
+        failed++;
+        process.stderr.write(
+          `Warning: could not enable ${unit}: ${result.stderr || result.stdout || 'non-zero exit'}\n`,
+        );
+      }
+    }
+
+    if (succeeded > 0) {
+      console.log(`Enabled ${succeeded} unit(s) for boot-survival.`);
+    }
+    if (failed > 0) {
+      process.stderr.write(
+        `Warning: ${failed} unit(s) could not be enabled (systemctl unavailable?). Run manually if needed.\n`,
+      );
+    }
+
+    // Best-effort linger
+    let username: string;
+    try {
+      username = userInfo().username;
+    } catch {
+      username = process.env['USER'] ?? process.env['LOGNAME'] ?? 'unknown';
+    }
+    const lingerResult = await runner(...splitCommand(buildEnableLingerCommand(username)));
+    if (lingerResult.exitCode !== 0) {
+      process.stderr.write(
+        `Hint: run 'loginctl enable-linger ${username}' as root to survive logout.\n`,
+      );
+    }
+  } catch (err) {
+    process.stderr.write(
+      `Warning: auto-enable step failed unexpectedly: ${err instanceof Error ? err.message : String(err)}\n`,
+    );
+  }
+}
+
 export function buildAgentSendCommand(
   paths: FleetPaths,
   agentName: string,
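Review note: the order of commands that `enableFleetUnits` issues can be sketched as a pure function. This is a minimal sketch that assumes the unit names used in the hunk above (`mosaic-tmux-holder.service` and `mosaic-agent@<name>.service`). `planEnableCommands` is a hypothetical helper for illustration only and is not part of this diff; unlike the real function, it executes nothing and does no warning output.

```typescript
// Hypothetical helper (not in the diff): lists the argv arrays the auto-enable
// step would hand to the runner, in order, without executing anything.
function planEnableCommands(
  agentNames: readonly string[],
  user: string,
  enable = true,
): string[][] {
  // Mirrors the --no-enable short-circuit: no systemctl and no loginctl calls.
  if (!enable) return [];

  // Holder unit first, then one templated unit per roster agent.
  const units = [
    'mosaic-tmux-holder.service',
    ...agentNames.map((agent) => `mosaic-agent@${agent}.service`),
  ];

  return [
    ...units.map((unit) => ['systemctl', '--user', 'enable', unit]),
    // Linger is attempted last; in the diff a failure here only prints a hint.
    ['loginctl', 'enable-linger', user],
  ];
}

const enablePlan = planEnableCommands(['coder0'], 'testuser');
console.log(enablePlan.map((argv) => argv.join(' ')).join('\n'));
```

Keeping the plan separate from execution is one way to assert on ordering without an injected runner; the tests in this diff take the other route and record calls through a fake `CommandRunner`.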
@@ -302,6 +389,10 @@ export interface AgentPsRow {
   driftFlag: boolean;
   /** active but UnitFileState=disabled */
   bootEnableWarning: boolean;
+  /** true = came from roster; false = found on socket but not in roster */
+  managed: boolean;
+  /** "roster" = defined in roster.yaml; "socket" = discovered via tmux list-sessions */
+  source: 'roster' | 'socket';
 }

 /**
@@ -344,6 +435,26 @@ export function buildTmuxListPanesCommand(
   ];
 }

+/**
+ * Returns the tmux list-sessions command to enumerate all sessions on a socket.
+ * Format: `tmux -L <socket> list-sessions -F '#{session_name}'`
+ * Used to discover ad-hoc sessions that are not in the roster.
+ */
+export function buildTmuxListSessionsCommand(socketName = DEFAULT_SOCKET_NAME): string[] {
+  return ['tmux', '-L', socketName, 'list-sessions', '-F', '#{session_name}'];
+}
+
+/**
+ * Parse the output of `tmux list-sessions -F '#{session_name}'` into an array of session names.
+ * Returns an empty array on empty/blank output.
+ */
+export function parseTmuxListSessions(output: string): string[] {
+  return output
+    .split('\n')
+    .map((line) => line.trim())
+    .filter((line) => line.length > 0);
+}
+
 /**
  * Returns the heartbeat file path for an agent.
  */
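Review note: the `fleet ps` change later in this diff combines `parseTmuxListSessions` with a roster-name set and the holder session name to decide which sockets sessions are unmanaged. The sketch below isolates that filtering step. It carries a local copy of the parser so it runs standalone; `findUnmanagedSessions` is a hypothetical name introduced here for illustration and does not exist in the diff, where the same check is written inline in the `ps` action.

```typescript
// Local copy of the parser from the hunk above so the sketch runs standalone.
function parseTmuxListSessions(output: string): string[] {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Hypothetical helper (not in the diff): keeps only sessions that are neither
// roster agents nor the infrastructure holder session.
function findUnmanagedSessions(
  listSessionsStdout: string,
  rosterAgentNames: ReadonlySet<string>,
  holderSession: string,
): string[] {
  return parseTmuxListSessions(listSessionsStdout).filter(
    (session) => !rosterAgentNames.has(session) && session !== holderSession,
  );
}

// Same fixture as the 'unmanaged socket sessions' test in this diff.
const unmanagedSessions = findUnmanagedSessions(
  'canary-pi\n_holder\nsome-adhoc\n',
  new Set(['canary-pi']),
  '_holder',
);
console.log(unmanagedSessions);
```

With that fixture only `some-adhoc` survives the filter, which matches the two-row expectation in the JSON test (one roster row plus one unmanaged row).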
@@ -437,32 +548,41 @@ export function parseTmuxListPanes(
   return { pid, command, dead, idleSeconds };
 }

+/**
+ * Maps each known runtime to the set of acceptable pane commands.
+ * A pane running any of these commands for the given runtime is NOT considered drifted.
+ * Runtimes launched via `mosaic yolo` wrap in node, so 'node' is acceptable for most.
+ * The dogfood runtime accepts python3/python (the canary-pi dogfood stub).
+ */
+export const RUNTIME_ACCEPTABLE_COMMANDS: Record<string, readonly string[]> = {
+  claude: ['claude', 'node'],
+  codex: ['codex', 'node'],
+  opencode: ['opencode', 'node'],
+  pi: ['pi', 'node'],
+  dogfood: ['python3', 'python'],
+};
+
 /**
  * Determine if there is a runtime drift: roster says one runtime but the pane
  * is actually running something from a different runtime. We detect this by
- * checking if the pane command doesn't match a known canonical command for the
+ * checking if the pane command doesn't match a known acceptable command for the
  * roster's declared runtime.
  *
- * Known canonical commands per runtime:
- * claude → claude
- * codex → codex
- * opencode → opencode
- * pi → pi
+ * Known acceptable commands per runtime (see RUNTIME_ACCEPTABLE_COMMANDS):
+ * claude → claude, node (node covers mosaic yolo wrapper)
+ * codex → codex, node
+ * opencode → opencode, node
+ * pi → pi, node (python3 still flags drift for canary-pi dogfood stub)
+ * dogfood → python3, python
  *
  * If the pane is running something else (e.g., python3/dogfood-agent.py) for
  * an agent whose roster runtime is "pi", that's a drift.
  */
 export function detectDrift(rosterRuntime: string, paneCommand: string | null): boolean {
   if (!paneCommand) return false;
-  const knownCommands: Record<string, string[]> = {
-    claude: ['claude'],
-    codex: ['codex'],
-    opencode: ['opencode'],
-    pi: ['pi'],
-  };
-  const expected = knownCommands[rosterRuntime];
-  if (!expected) return false;
-  return !expected.includes(paneCommand);
+  const acceptable = RUNTIME_ACCEPTABLE_COMMANDS[rosterRuntime];
+  if (!acceptable) return false;
+  return !acceptable.includes(paneCommand);
 }

 /**
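Review note: the table-driven check above has three distinct "no drift" exits (dead pane, runtime missing from the table, command in the allow-list), which is easy to miss when reading the diff. The sketch below carries local copies of `RUNTIME_ACCEPTABLE_COMMANDS` and `detectDrift` as they appear in this hunk so it runs standalone; the `driftProbes` list is made up for illustration.

```typescript
// Local copies of the table and function from the hunk above.
const RUNTIME_ACCEPTABLE_COMMANDS: Record<string, readonly string[]> = {
  claude: ['claude', 'node'],
  codex: ['codex', 'node'],
  opencode: ['opencode', 'node'],
  pi: ['pi', 'node'],
  dogfood: ['python3', 'python'],
};

function detectDrift(rosterRuntime: string, paneCommand: string | null): boolean {
  // Exit 1: a dead pane has no command to compare, so it never counts as drift.
  if (!paneCommand) return false;
  const acceptable = RUNTIME_ACCEPTABLE_COMMANDS[rosterRuntime];
  // Exit 2: a runtime absent from the table is never flagged.
  if (!acceptable) return false;
  // Exit 3: drift only when the command is outside the runtime's allow-list.
  return !acceptable.includes(paneCommand);
}

// Illustrative probes (hypothetical data): [roster runtime, pane command].
const driftProbes: Array<[string, string | null]> = [
  ['pi', 'node'], // wrapped by mosaic yolo
  ['pi', 'python3'], // dogfood stub running under a pi roster entry
  ['dogfood', 'python3'],
  ['unknown', 'bash'], // runtime not in the table
  ['claude', null], // dead pane
];

const driftReport = driftProbes.map(
  ([runtime, pane]) => `${runtime} / ${pane ?? '(dead)'} -> ${detectDrift(runtime, pane)}`,
);
console.log(driftReport.join('\n'));
```

One consequence worth noting: because a runtime missing from the table returns `false`, an unmanaged row with `runtime: 'unknown'` would not be flagged even if it were passed through `detectDrift`, which is consistent with the diff hard-coding `driftFlag: false` for socket-discovered sessions.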
@@ -706,12 +826,22 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
   cmd
     .command('install')
     .description('Install local fleet tools and user systemd units')
-    .action(async () => installFleet(cmd, frameworkRoot));
+    .option('--no-enable', 'Skip enabling units for boot-survival')
+    .action(async (opts: { enable?: boolean }) => {
+      await installFleet(cmd, frameworkRoot);
+      const roster = await loadRosterForCommand(cmd);
+      await enableFleetUnits(runner, roster, opts);
+    });

   cmd
     .command('install-systemd')
     .description('Install local fleet tools and user systemd units')
-    .action(async () => installFleet(cmd, frameworkRoot));
+    .option('--no-enable', 'Skip enabling units for boot-survival')
+    .action(async (opts: { enable?: boolean }) => {
+      await installFleet(cmd, frameworkRoot);
+      const roster = await loadRosterForCommand(cmd);
+      await enableFleetUnits(runner, roster, opts);
+    });

   for (const action of ['start', 'stop', 'restart'] as const) {
     cmd
@@ -791,7 +921,9 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =

   cmd
     .command('ps')
-    .description('Show real-time status for all roster agents (systemd + tmux + heartbeat)')
+    .description(
+      'Show real-time status for all roster agents and unmanaged socket sessions (systemd + tmux + heartbeat)',
+    )
     .option('--json', 'Print JSON array')
     .action(async (opts: { json?: boolean }) => {
       const commandOpts = cmd.opts<{ mosaicHome: string; roster?: string }>();
@@ -802,6 +934,9 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =

       const rows: AgentPsRow[] = [];

+      // Build the set of roster agent names for quick lookup when filtering socket sessions.
+      const rosterAgentNames = new Set(roster.agents.map((a) => a.name));
+
       for (const agent of roster.agents) {
         // systemd show
         const showResult = await runner(...splitCommand(buildSystemdShowCommand(agent.name)));
@@ -842,9 +977,75 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
           heartbeat: hb,
           driftFlag,
           bootEnableWarning,
+          managed: true,
+          source: 'roster',
         });
       }

+      // Enumerate all live sessions on the socket to surface unmanaged (ad-hoc) sessions.
+      // If list-sessions fails (socket not up), silently skip — show roster rows only.
+      try {
+        const listSessionsResult = await runner(
+          ...splitCommand(buildTmuxListSessionsCommand(roster.tmux.socketName)),
+        );
+        if (listSessionsResult.exitCode === 0) {
+          const socketSessions = parseTmuxListSessions(listSessionsResult.stdout);
+          const holderSession = roster.tmux.holderSession;
+
+          for (const sessionName of socketSessions) {
+            // Skip roster agents (already in rows) and the holder session (infrastructure).
+            if (rosterAgentNames.has(sessionName) || sessionName === holderSession) {
+              continue;
+            }
+
+            // tmux list-panes for pane info
+            const panesResult = await runner(
+              ...splitCommand(buildTmuxListPanesCommand(sessionName, roster.tmux.socketName)),
+            );
+            const paneInfo = parseTmuxListPanes(panesResult.stdout, nowMs);
+
+            // heartbeat — try reading the .hb file using the same path convention
+            const hbFile = heartbeatPath(sessionName, activePaths.mosaicHome);
+            let hbContent: string | null = null;
+            try {
+              hbContent = await readFile(hbFile, 'utf8');
+            } catch {
+              hbContent = null;
+            }
+            const hb = parseHeartbeat(hbContent, nowMs);
+
+            // systemd — check if mosaic-agent@<name>.service exists (usually inactive for ad-hoc)
+            const showResult = await runner(...splitCommand(buildSystemdShowCommand(sessionName)));
+            const sysInfo = parseSystemdShow(showResult.stdout);
+
+            const bootEnableWarning =
+              sysInfo.ActiveState === 'active' && sysInfo.UnitFileState === 'disabled';
+
+            rows.push({
+              name: sessionName,
+              tenant_id,
+              host,
+              // runtime unknown — not in roster
+              runtime: 'unknown',
+              systemdActive: sysInfo.ActiveState,
+              systemdEnabled: sysInfo.UnitFileState,
+              paneAlive: !paneInfo.dead,
+              panePid: paneInfo.pid,
+              paneCommand: paneInfo.command,
+              idleSeconds: paneInfo.idleSeconds,
+              heartbeat: hb,
+              // No roster runtime to compare — drift is not meaningful for unmanaged sessions
+              driftFlag: false,
+              bootEnableWarning,
+              managed: false,
+              source: 'socket',
+            });
+          }
+        }
+      } catch {
+        // list-sessions failed (socket missing or permission error) — show roster rows only
+      }
+
       if (opts.json) {
         console.log(JSON.stringify(rows, null, 2));
         return;
||||||
@@ -876,6 +1077,7 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
           ? `${Math.round(row.heartbeat.ageMs / 1000)}s/${row.heartbeat.health}`
           : `unknown`;
         const flags: string[] = [];
+        if (!row.managed) flags.push('UNMANAGED');
         if (row.driftFlag) flags.push('DRIFT');
         if (row.bootEnableWarning) flags.push('BOOT-ENABLE');
 
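The two hunks above merge roster rows with sessions discovered on the tmux socket, then flag the non-roster ones. A minimal standalone sketch of that merge follows, for readers who want to exercise the logic without a live socket. The names `FleetRow`, `parseSessionNames`, `mergeUnmanagedSessions`, and `rowFlags` are hypothetical and not taken from the PR, and the sketch assumes the list-sessions output is one session name per line (as produced by, for example, `tmux list-sessions -F '#{session_name}'`). The actual `parseTmuxListSessions` helper may differ.

```typescript
// Hypothetical, reduced row shape; the real row in the diff carries many more fields.
interface FleetRow {
  name: string;
  managed: boolean;
  source: 'roster' | 'socket';
}

// Assumption: stdout holds one session name per line; blank lines are ignored.
function parseSessionNames(stdout: string): string[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Append a row for every socket session that is neither a roster agent
// nor the holder (infrastructure) session. Roster rows are kept unchanged.
function mergeUnmanagedSessions(
  rosterRows: FleetRow[],
  listSessionsStdout: string,
  holderSession: string,
): FleetRow[] {
  const rosterNames = new Set(rosterRows.map((row) => row.name));
  const rows: FleetRow[] = [...rosterRows];
  for (const name of parseSessionNames(listSessionsStdout)) {
    if (rosterNames.has(name) || name === holderSession) {
      continue;
    }
    rows.push({ name, managed: false, source: 'socket' });
  }
  return rows;
}

// Mirrors the flag logic of the second hunk for the UNMANAGED flag only.
function rowFlags(row: FleetRow): string[] {
  const flags: string[] = [];
  if (!row.managed) flags.push('UNMANAGED');
  return flags;
}
```

Keeping the merge as a pure function of the roster rows and the raw stdout would let the skip rules (roster name, holder session) be unit-tested separately from the `runner` calls, which is one possible way to structure the tests for this change.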