Compare commits

...

12 Commits

Author SHA1 Message Date
Jarvis
cd80ca1025 feat(fleet): add machine-readable NORTH_STAR.yaml + Markdown projection
All checks were successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Add docs/fleet/NORTH_STAR.yaml as the single machine-readable source of
truth for fleet planning, plus a pure, deterministic generator in
fleet.ts that projects it to docs/fleet/NORTH_STAR.md. The Markdown is a
projection (regenerate, do not hand-edit).

Self-contained Mosaic: the substrate names the native Postgres storage
service (@mosaicstack/db drizzle; PGlite-embedded by default, full
Postgres by config) as the backlog of record. No Hermes runtime
dependency.

- NORTH_STAR.yaml: mission, substrate, NS-1..NS-8 standing objectives,
  AC-NS-1..AC-NS-5 success criteria, workstreams A..F, seeded goal
  backlog (A1,A2,A3a,A3b,A4,B1,B2,B3a,B3b,G1) with depends_on DAG,
  vetoable assumptions, advisory spend block.
- parseNorthStar/renderNorthStarMarkdown/generateNorthStarMarkdown:
  pure parse + projection (no network/CLI/clock); writer is the only IO.
  Tables column-padded to stay prettier-clean and round-trip stable.
- fleet-north-star.spec.ts: YAML validates + reuses ids, NS-*/AC-NS-*
  present, depends_on DAG coherent, projection deterministic and matches
  the committed .md, no network/CLI.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:19:17 -05:00
937077f6be fix(fleet): report idle agents as available, reserve stuck for genuine blocks (#653)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 13:58:22 +00:00
1020cfaf9b chore(release): mosaic CLI 0.0.44 (#652)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:49:04 +00:00
70661e3fab fix(fleet): derive pane idle from window activity fallback (#651)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:37:45 +00:00
ec8dd7ca86 chore(release): mosaic CLI 0.0.43 (#650)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:08:20 +00:00
d887555852 feat(fleet): classify agent readiness in fleet ps (#649)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:55:47 +00:00
e3adc6a1bc chore(release): mosaic CLI 0.0.42 (#648)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:28:28 +00:00
aa27c42129 fix(fleet): pre-trust claude agent workdir to clear the folder-trust gate (#644) (#645)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:16:46 +00:00
16ae809442 fix(update): re-seed framework on version drift, not just in-command updates (#642) (#646)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:04:34 +00:00
6980e40e51 fix(db): stop pglite migration tests flaking CI (timeout + WASM OOM) (#647)
Some checks failed
ci/woodpecker/push/ci Pipeline was canceled
ci/woodpecker/push/publish Pipeline was canceled
2026-06-24 05:04:28 +00:00
e6b53ea103 fix(tools): default AGENT_WORK_ROOT to $HOME/mosaic/agent-work (#641)
Some checks failed
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was canceled
2026-06-23 13:40:13 +00:00
4da87640e8 feat(tmux): agent-send.sh --class triage tag for the comms daemon (#552)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-23 03:25:16 +00:00
20 changed files with 1722 additions and 71 deletions

7
.gitignore vendored
View File

@@ -15,3 +15,10 @@ infra/step-ca/dev-password
# Scratch dirs created by the framework git-wrapper shell test harnesses
.mosaic-test-work/
# Transient config files vite/vitest/esbuild write next to a *.config.ts while
# loading it, then unlink. They are untracked but were not ignored, so turbo's
# package traversal hashed them and intermittently failed CI with "Package
# traversal error: ... .timestamp-*.mjs: No such file or directory" when the
# file vanished mid-scan. Ignoring them removes the race.
*.timestamp-*.mjs

70
docs/fleet/NORTH_STAR.md Normal file
View File

@@ -0,0 +1,70 @@
# Mosaic Fleet — NORTH STAR
> **Generated file — do not edit by hand.**
> Projected deterministically from [`NORTH_STAR.yaml`](./NORTH_STAR.yaml) by the pure
> generator in `packages/mosaic/src/commands/fleet.ts` (`renderNorthStarMarkdown`).
> Edit the YAML, then regenerate. Self-contained Mosaic — no Hermes dependency.
## Mission
A self-driving Mosaic delivery fleet that 24/7 unattended converts a machine-readable goal set into merged, CI-green, budget-bounded change — looping plan→backlog→assign→execute→verify→merge→reassess — on Mosaic's OWN native backlog/dispatch engine.
## Substrate
The Mosaic Backlog is the backlog of record + dispatch engine, built on Mosaic's native Postgres storage service (@mosaicstack/db drizzle; PGlite-embedded by default, full Postgres by config). NOT Hermes.
## Standing objectives
- **NS-1** — Single machine-readable source (this file) drives planning; prose docs are projections.
- **NS-2** — Every backlog item is an independently-shippable unit with stable id, priority, depends_on DAG, represented as a Mosaic Backlog card; spend tracked as advisory projection.
- **NS-3** — The supervisor guarantees movement: no idle agent while ready dependency-satisfied work exists; no empty backlog without a replan request; assignment via Mosaic native dispatch/claim.
- **NS-4** — Exactly one merge-gate approver; nothing reaches main except via pr-merge.sh after pr-ci-wait.sh success; Gitea branch protection is the backstop.
- **NS-5** — Every unit bounded by wall-clock TTL on its claim; token caps enforced only where a real meter exists, else advisory.
- **NS-6** — Context cleared between tasks for ephemeral runners (reset_between_tasks); persona+mission re-injected per task.
- **NS-7** — Meta-loop (session-review + enhancer) continuously proposes small fleet-improvement PRs.
- **NS-8** — Single operator-flippable PAUSE kill-switch (fleet/run/PAUSED) honored before every dispatch and every merge.
## Success criteria
- **AC-NS-1** — The supervisor keeps a two-agent floor (1 orchestrator + >=1 enhancer) healthy across reboot.
- **AC-NS-2** — A goal added to this YAML is decomposed to cards and either merged or escalated, with no human in the loop.
- **AC-NS-3** — No PR merges with failure/error/no-status/timeout CI, and none bypass pr-merge.sh.
- **AC-NS-4** — TTL is enforced on claims; token caps remain advisory until a real meter exists.
- **AC-NS-5** — Flipping fleet/run/PAUSED halts dispatch and merges within one tick.
## Workstreams
| id | title |
| --- | ---------------------------------------------------------------- |
| A | Substrate — Mosaic Backlog on native Postgres storage service |
| B | Supervisor — movement guarantee, two-agent floor, dispatch/claim |
| C | Planner — goal decomposition into independently-shippable cards |
| D | Merge-gate — single approver, pr-merge.sh after CI wait |
| E | Meta-loop — session-review + enhancer improvement PRs |
| F | Safety-rails — TTL claims, advisory spend, PAUSE kill-switch |
## Goals (backlog projection)
| id | title | phase | priority | depends_on |
| --- | ---------------------------------------------------------------------- | ----- | ----------- | ---------- |
| A1 | Machine-readable NORTH_STAR.yaml + Markdown projection | 1 | must-have | — |
| A2 | Mosaic Backlog schema + storage-service card store (drizzle/PGlite) | 1 | must-have | A1 |
| A3a | Card lifecycle — create/claim/release with stable ids + depends_on DAG | 1 | must-have | A2 |
| A3b | TTL-bounded claim enforcement (wall-clock) on cards | 1 | must-have | A3a |
| A4 | Advisory spend projection per card (degrades to TTL, no real meter) | 1 | should-have | A3a |
| B1 | Supervisor tick — readiness scan, two-agent-floor health check | 2 | must-have | A3a |
| B2 | Native dispatch/claim — assign ready dependency-satisfied work | 2 | must-have | A3b, B1 |
| B3a | Planner decompose — goal added to YAML → cards | 2 | must-have | A2, B1 |
| B3b | Replan request on empty backlog; escalate on no-decompose | 2 | should-have | B3a |
| G1 | PAUSE kill-switch + merge-gate honored before dispatch and merge | 2 | must-have | B2 |
## Assumptions (vetoable)
- **ASM-1** (vetoable) — The Mosaic Backlog on the native Postgres storage service is the backlog of record.
- **ASM-2** (vetoable) — Claude gate roles have no native busy status, so readiness = pane-idle + heartbeat.
- **ASM-3** (vetoable) — Two-agent floor = 1 orchestrator + >=1 enhancer.
## Spend
- **advisory:** true
- No per-task token meter yet; budgets degrade to TTL. Spend is tracked only as an advisory projection alongside each card.

169
docs/fleet/NORTH_STAR.yaml Normal file
View File

@@ -0,0 +1,169 @@
# Mosaic Fleet — NORTH_STAR (machine-readable source of truth)
#
# This file is the single machine-readable source of truth for fleet planning.
# Prose docs (including NORTH_STAR.md) are deterministic PROJECTIONS of this file.
# Regenerate the Markdown projection with the pure generator in
# packages/mosaic/src/commands/fleet.ts (renderNorthStarMarkdown). Edit the YAML,
# never the .md.
#
# Self-contained Mosaic. NO Hermes runtime dependency. The backlog of record is
# the Mosaic Backlog on Mosaic's OWN native Postgres storage service.
version: 1
mission: >-
A self-driving Mosaic delivery fleet that 24/7 unattended converts a
machine-readable goal set into merged, CI-green, budget-bounded change —
looping plan→backlog→assign→execute→verify→merge→reassess — on Mosaic's OWN
native backlog/dispatch engine.
substrate:
note: >-
The Mosaic Backlog is the backlog of record + dispatch engine, built on
Mosaic's native Postgres storage service (@mosaicstack/db drizzle;
PGlite-embedded by default, full Postgres by config). NOT Hermes.
standing_objectives:
- id: NS-1
text: >-
Single machine-readable source (this file) drives planning; prose docs are
projections.
- id: NS-2
text: >-
Every backlog item is an independently-shippable unit with stable id,
priority, depends_on DAG, represented as a Mosaic Backlog card; spend
tracked as advisory projection.
- id: NS-3
text: >-
The supervisor guarantees movement: no idle agent while ready
dependency-satisfied work exists; no empty backlog without a replan
request; assignment via Mosaic native dispatch/claim.
- id: NS-4
text: >-
Exactly one merge-gate approver; nothing reaches main except via
pr-merge.sh after pr-ci-wait.sh success; Gitea branch protection is the
backstop.
- id: NS-5
text: >-
Every unit bounded by wall-clock TTL on its claim; token caps enforced
only where a real meter exists, else advisory.
- id: NS-6
text: >-
Context cleared between tasks for ephemeral runners
(reset_between_tasks); persona+mission re-injected per task.
- id: NS-7
text: >-
Meta-loop (session-review + enhancer) continuously proposes small
fleet-improvement PRs.
- id: NS-8
text: >-
Single operator-flippable PAUSE kill-switch (fleet/run/PAUSED) honored
before every dispatch and every merge.
success_criteria:
- id: AC-NS-1
text: >-
The supervisor keeps a two-agent floor (1 orchestrator + >=1 enhancer)
healthy across reboot.
- id: AC-NS-2
text: >-
A goal added to this YAML is decomposed to cards and either merged or
escalated, with no human in the loop.
- id: AC-NS-3
text: >-
No PR merges with failure/error/no-status/timeout CI, and none bypass
pr-merge.sh.
- id: AC-NS-4
text: >-
TTL is enforced on claims; token caps remain advisory until a real meter
exists.
- id: AC-NS-5
text: >-
Flipping fleet/run/PAUSED halts dispatch and merges within one tick.
workstreams:
- id: A
title: Substrate — Mosaic Backlog on native Postgres storage service
- id: B
title: Supervisor — movement guarantee, two-agent floor, dispatch/claim
- id: C
title: Planner — goal decomposition into independently-shippable cards
- id: D
title: Merge-gate — single approver, pr-merge.sh after CI wait
- id: E
title: Meta-loop — session-review + enhancer improvement PRs
- id: F
title: Safety-rails — TTL claims, advisory spend, PAUSE kill-switch
goals:
- id: A1
title: Machine-readable NORTH_STAR.yaml + Markdown projection
phase: 1
priority: must-have
depends_on: []
- id: A2
title: Mosaic Backlog schema + storage-service card store (drizzle/PGlite)
phase: 1
priority: must-have
depends_on: [A1]
- id: A3a
title: Card lifecycle — create/claim/release with stable ids + depends_on DAG
phase: 1
priority: must-have
depends_on: [A2]
- id: A3b
title: TTL-bounded claim enforcement (wall-clock) on cards
phase: 1
priority: must-have
depends_on: [A3a]
- id: A4
title: Advisory spend projection per card (degrades to TTL, no real meter)
phase: 1
priority: should-have
depends_on: [A3a]
- id: B1
title: Supervisor tick — readiness scan, two-agent-floor health check
phase: 2
priority: must-have
depends_on: [A3a]
- id: B2
title: Native dispatch/claim — assign ready dependency-satisfied work
phase: 2
priority: must-have
depends_on: [A3b, B1]
- id: B3a
title: Planner decompose — goal added to YAML → cards
phase: 2
priority: must-have
depends_on: [A2, B1]
- id: B3b
title: Replan request on empty backlog; escalate on no-decompose
phase: 2
priority: should-have
depends_on: [B3a]
- id: G1
title: PAUSE kill-switch + merge-gate honored before dispatch and merge
phase: 2
priority: must-have
depends_on: [B2]
assumptions:
- id: ASM-1
vetoable: true
text: >-
The Mosaic Backlog on the native Postgres storage service is the backlog
of record.
- id: ASM-2
vetoable: true
text: >-
Claude gate roles have no native busy status, so readiness = pane-idle +
heartbeat.
- id: ASM-3
vetoable: true
text: 'Two-agent floor = 1 orchestrator + >=1 enhancer.'
spend:
advisory: true
note: >-
No per-task token meter yet; budgets degrade to TTL. Spend is tracked only
as an advisory projection alongside each card.

View File

@@ -0,0 +1,66 @@
# H1 — heartbeat readiness detection
## Objective
Add runtime-agnostic readiness classification to `mosaic fleet ps` so an agent can be reported as working/idle/stuck/stale/dead/unknown instead of treating pane liveness as progress.
## Scope
- `packages/mosaic/src/commands/fleet.ts`
- exported readiness state/types/default thresholds/helpers/classifier
- `AgentPsRow.readiness` additive JSON field
- table HB column and IDLE/STUCK flags
- `packages/mosaic/src/commands/fleet.spec.ts`
- pure classifier branch/boundary coverage
- threshold helper coverage
- legitimate render/JSON assertion updates for new HB text
## Acceptance Criteria
- Branches covered: dead, unknown, stale, busy working, null-idle working, stuck boundary, idle boundary, working below idle.
- Threshold env helpers default to 300s/900s and honor positive integer env values.
- `fleet ps` rows populate `readiness` for roster and unmanaged socket sessions.
- Table HB text becomes `<age>s/<readiness>` when heartbeat age exists; remains `unknown` when absent.
- Flags include `IDLE`/`STUCK` for matching readiness.
- Local gates green: `pnpm typecheck`, `pnpm lint`, `pnpm format:check`, fleet vitest.
- Pre-push queue guard passes; PR opened off `origin/main`; no merge by worker.
## Constraints / Assumptions
- Source branch: `origin/main` @ `e3adc6a`.
- No scope creep beyond readiness detection.
- `docs/TASKS.md` and `docs/fleet/TASKS.md` are orchestrator-owned; worker will not modify them.
- PRD alignment source: `docs/fleet/PRD.md` Phase 2 observability; this is a refinement of heartbeat observability, preserving existing unknown/stale behavior.
## Plan
1. Install dependencies with requested PNPM environment.
2. Add readiness types/helpers/classifier near heartbeat constants.
3. Add `readiness` to `AgentPsRow` and populate both row paths.
4. Update table render and flags.
5. Add unit tests and update affected ps render/JSON assertions.
6. Run build precheck + required gates.
7. Run automated independent review, remediate findings.
8. Queue guard, push, open PR.
## Progress
- 2026-06-24: Branch created from `origin/main` @ `e3adc6a`.
- 2026-06-24: Implemented readiness thresholds/classifier, JSON row field, HB column label, and IDLE/STUCK flags.
- 2026-06-24: Added classifier branch/boundary tests, threshold helper tests, JSON shape assertions, and readiness table rendering assertions.
## Verification Evidence
- `pnpm install --store-dir "$HOME/.pnpm-store"` — pass.
- `npx turbo build --filter=@mosaicstack/mosaic^...` — pass, 12/12 tasks successful.
- `pnpm typecheck` — pass, 41/41 tasks successful.
- `pnpm lint` — pass, 23/23 tasks successful.
- `pnpm format:check` — pass, all matched files use Prettier style.
- `pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts` — pass, 171 tests.
- `pnpm --filter @mosaicstack/mosaic test` — pass, 39 files / 547 tests; `fleet.spec.ts` 171 tests.
- `~/.config/mosaic/tools/codex/codex-code-review.sh --uncommitted` — approve, 0 findings (reviewed supplied diff; sandbox file-inspection limitation noted by tool).
## Risks / Blockers
- No current blocker.
- Review tool could not inspect repo files directly due sandbox wrapper limitation, but it reviewed the supplied diff and approved with no findings.

View File

@@ -0,0 +1,53 @@
# H1b — tmux pane idle signal wiring
## Objective
Feed `classifyReadiness()` a real idle signal on tmux 3.4 by deriving `idleSeconds` from the first available tmux timestamp source: pane activity, then window activity, then session activity.
## Scope
- `packages/mosaic/src/commands/fleet.ts`
- Extend `buildTmuxListPanesCommand()` format to include `#{window_activity}` and `#{session_activity}` after the existing fields.
- Update `parseTmuxListPanes()` to choose the first non-empty finite positive timestamp and clamp future idle values to 0.
- `packages/mosaic/src/commands/fleet.spec.ts`
- Cover pane/window/session activity parsing behavior, empty-field index alignment, null idle, future clamping, math correctness, and exact tmux format.
## Out of Scope
- No changes to `classifyReadiness()`, thresholds, `AgentPsRow`, or `fleet ps` rendering.
- No merge by worker; orchestrator routes review/merge.
- Workers do not modify `docs/TASKS.md`.
## PRD Alignment
Aligned with `docs/fleet/PRD.md` FR-1 and acceptance criteria for truthful `mosaic fleet ps` pane/pid/idle observability.
## Plan
1. Sync branch from latest `origin/main` and install dependencies with required pnpm env.
2. Add/confirm reproducer tests for tmux 3.4 empty `pane_activity` and new fallback behavior.
3. Implement the focused parser/format change only.
4. Run required build, baseline gates, fleet vitest, and independent review.
5. Run pre-push queue guard, push branch, and open PR to `main` with Mosaic wrapper.
## Progress
- 2026-06-24: Branch `fix/fleet-pane-idle-activity` created from `origin/main` @ `ec8dd7c` after fetching.
- 2026-06-24: Session-start generated local `.mosaic/orchestrator/*` changes on the previous release branch; stashed as `coder1 session-start state before H1b` to keep this branch clean.
- 2026-06-24: Added TDD coverage for the tmux 3.4 production case (`pane_activity` empty, `window_activity` populated), exact new list-panes format, null/future/multiple-source behavior.
- 2026-06-24: Implemented parser fallback without changing readiness classifier thresholds or render shape.
## Verification Evidence
- `pnpm install --store-dir "$HOME/.pnpm-store"` — pass.
- Reproducer before implementation: `pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts` — failed as expected (old format, no fallback, negative future idle).
- `npx turbo build --filter=@mosaicstack/mosaic^...` — pass, 12/12 tasks successful.
- `pnpm typecheck` — pass, 41/41 tasks successful.
- `pnpm lint` — pass, 23/23 tasks successful.
- `pnpm format:check` — pass, all matched files use Prettier style.
- `pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts` — pass, 176 tests.
- `~/.config/mosaic/tools/codex/codex-code-review.sh --uncommitted` — approve, 0 findings (reviewed supplied diff; sandbox file-inspection limitation noted by tool).
## Risks / Blockers
- No current blocker.

View File

@@ -0,0 +1,70 @@
# H2 — readiness semantics: available, not stuck
## Objective
Correct fleet readiness semantics so a healthy long-idle agent is reported as `available` (good/assignable) instead of `stuck` (fault). Reserve `stuck` in the type/JSON value space for future positive block evidence.
## Scope
- `packages/mosaic/src/commands/fleet.ts`
- replace `idle` readiness state with `available`
- keep `stuck` in the union but stop emitting it from idle-only heuristics
- remove stuck threshold helper/env handling
- remove IDLE/STUCK alarm flags from table rendering
- `packages/mosaic/src/commands/fleet.spec.ts`
- update classifier branch/boundary tests
- assert very long idle maps to `available`, not `stuck`
- update table/JSON assertions for available with no alarm flags
- remove stuck threshold helper tests
## Acceptance Criteria
- `classifyReadiness()` remains pure/total/never-throw and maps:
- dead/stale/unknown unchanged
- busy/null/undefined/non-finite idle to `working`
- idle >= activity threshold to `available`
- idle < activity threshold to `working`
- No idle-derived path emits `stuck`.
- `MOSAIC_HEARTBEAT_IDLE_THRESHOLD` remains backward compatible as the working→available activity threshold.
- `MOSAIC_HEARTBEAT_STUCK_THRESHOLD` and helper/default are removed.
- `fleet ps` keeps the idle-seconds column header `IDLE`, renders `available` in HB label, and does not add IDLE/STUCK warning flags.
- Local gates green: build precheck, typecheck, lint, format:check, fleet vitest.
- PR opened against `main`; no merge by worker.
## Constraints / Assumptions
- Source branch: `origin/main` @ `1020cfa`.
- `docs/TASKS.md` is orchestrator-owned; worker will not modify it.
- Documentation impact is captured in this scratchpad and PR description; no user/admin guide behavior beyond CLI readiness label semantics.
## Plan
1. Install dependencies with requested PNPM environment.
2. Inspect current H1/H1b readiness implementation and tests.
3. Update classifier types/helpers/rendering.
4. Update focused tests.
5. Run build precheck + required gates.
6. Run automated code review, remediate any findings.
7. Queue guard, push, open PR.
## Progress
- 2026-06-24: Branch created from `origin/main` @ `1020cfa`.
- 2026-06-24: Replaced idle-derived `idle`/`stuck` outputs with `available`; retained `stuck` in type union for future positive block evidence.
- 2026-06-24: Removed stuck threshold env/helper plumbing and IDLE/STUCK alarm flags.
- 2026-06-24: Updated classifier and table-render tests for available semantics.
## Verification Evidence
- `pnpm install --store-dir "$HOME/.pnpm-store"` — pass.
- `npx turbo build --filter=@mosaicstack/mosaic^...` — pass, 12/12 tasks successful.
- `pnpm typecheck` — pass, 41/41 tasks successful.
- `pnpm lint` — pass, 23/23 tasks successful.
- `pnpm format:check` — pass, all matched files use Prettier style.
- `pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts` — pass, 177 tests.
- `~/.config/mosaic/tools/codex/codex-code-review.sh --uncommitted` — approve, 0 findings (reviewed supplied diff; sandbox file-inspection limitation noted by tool).
## Risks / Blockers
- No current blocker.
- Review tool could not inspect repo files directly due sandbox wrapper limitation, but it reviewed the supplied diff and approved with no findings.

View File

@@ -28,6 +28,7 @@ export default tseslint.config(
'apps/web/e2e/helpers/*.ts',
'apps/web/playwright.config.ts',
'apps/gateway/vitest.config.ts',
'packages/db/vitest.config.ts',
'packages/storage/vitest.config.ts',
'packages/mosaic/__tests__/*.ts',
'tools/federation-harness/*.ts',

View File

@@ -4,5 +4,22 @@ export default defineConfig({
test: {
globals: true,
environment: 'node',
// The migration suite spins up a real PGlite (WASM Postgres) instance per
// test and applies the full drizzle migration set. Each case legitimately
// takes ~5s locally and considerably longer on CI, where turbo runs many
// packages' test suites concurrently. The 5s vitest default then expires
// mid-migration and the run fails as a phantom "Test timed out in 5000ms"
// (often surfacing the underlying WASM `memory access out of bounds` when
// the heap is starved). Give migrations real headroom.
testTimeout: 120_000,
hookTimeout: 120_000,
// Each PGlite instance carries a multi-hundred-MB WASM heap. Running test
// files in parallel forks multiplies that peak and is what tips the CI
// runner into the WASM OOM. A single fork keeps only one instance resident
// at a time — slightly slower, but deterministic.
pool: 'forks',
poolOptions: {
forks: { singleFork: true },
},
},
});

View File

@@ -122,6 +122,85 @@ fi
mkdir -p "$MOSAIC_AGENT_WORKDIR"
# ── Pre-trust the workdir for the Claude runtime ─────────────────────────────
# Claude Code shows a one-time "Is this a project you trust?" folder-trust gate
# the first time it opens a directory. A fleet-launched agent has no human to
# answer it, so the pane stalls forever at the prompt while its heartbeat keeps
# reporting "healthy" (the pane process IS alive — it's just blocked).
#
# IMPORTANT: --dangerously-skip-permissions does NOT bypass this gate, and
# neither does `trustedProjectDirectories` in settings.json (verified empirically
# 2026-06-24). The ONLY thing the gate honors is the per-project record in
# ~/.claude.json: projects["<dir>"].hasTrustDialogAccepted == true (exactly what
# answering the prompt writes). So we pre-seed that record here.
#
# Idempotent, atomic, best-effort: any failure is non-fatal (the agent still
# launches — worst case it stalls on the gate, i.e. the pre-fix status quo).
# Only the claude runtime needs this; codex/pi have no such gate.
_ensure_claude_workdir_trusted() {
local workdir="$1"
# The path claude keys on is the resolved cwd it is launched in.
local rp
rp=$(cd "$workdir" 2>/dev/null && pwd -P) || rp="$workdir"
# ~/.claude.json lives next to the claude config dir; honor CLAUDE_CONFIG_DIR.
local claude_json="${MOSAIC_CLAUDE_JSON:-${CLAUDE_CONFIG_DIR:+$CLAUDE_CONFIG_DIR/.claude.json}}"
claude_json="${claude_json:-$HOME/.claude.json}"
if ! command -v python3 >/dev/null 2>&1; then
echo "WARNING: python3 not found; cannot pre-trust '$rp' for claude (agent may stall on the folder-trust gate)" >&2
return 1
fi
# Serialize concurrent agent launches that share ~/.claude.json (flock if available).
local lock="${claude_json}.mosaic-lock"
_seed() {
MOSAIC_CJ="$claude_json" MOSAIC_TRUST_DIR="$rp" python3 - <<'PY'
import json, os, sys, tempfile
cj = os.environ["MOSAIC_CJ"]
d = os.environ["MOSAIC_TRUST_DIR"]
try:
data = json.load(open(cj)) if os.path.exists(cj) else {}
if not isinstance(data, dict):
data = {}
except Exception:
# Never corrupt an unreadable/partial file — bail without writing.
sys.exit(2)
projects = data.setdefault("projects", {})
entry = projects.get(d)
if not isinstance(entry, dict):
entry = {}
projects[d] = entry
if entry.get("hasTrustDialogAccepted") is True:
sys.exit(0) # already trusted — nothing to do
entry["hasTrustDialogAccepted"] = True
tmp_dir = os.path.dirname(cj) or "."
fd, tmp = tempfile.mkstemp(dir=tmp_dir, prefix=".claude.json.mosaic.")
try:
with os.fdopen(fd, "w") as f:
json.dump(data, f, indent=2)
os.replace(tmp, cj) # atomic
except Exception:
try:
os.unlink(tmp)
except OSError:
pass
sys.exit(3)
PY
}
if command -v flock >/dev/null 2>&1; then
( flock 9; _seed ) 9>"$lock" 2>/dev/null || _seed
else
_seed
fi
}
case "$MOSAIC_AGENT_RUNTIME" in
claude)
_ensure_claude_workdir_trusted "$MOSAIC_AGENT_WORKDIR" \
|| echo "WARNING: could not pre-trust workdir for claude agent $AGENT_NAME" >&2
;;
esac
# ── Launch the tmux session (no exec — we continue to wire the heartbeat) ────
_tmux new-session -d -s "$AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" \
bash -c "$PANE_SHELL_SNIPPET"

View File

@@ -128,8 +128,8 @@ PY
merge_gitea_with_api() {
local host="$1" api_url token basic_auth body_file raw_code payload
api_url="https://${host}/api/v1/repos/${OWNER}/${REPO}/pulls/${PR_NUMBER}/merge"
mkdir -p "${AGENT_WORK_ROOT:-/home/hermes/agent-work}"
body_file=$(mktemp "${AGENT_WORK_ROOT:-/home/hermes/agent-work}/pr-merge-api-response.XXXXXX")
mkdir -p "${AGENT_WORK_ROOT:-${HOME:-/tmp}/mosaic/agent-work}"
body_file=$(mktemp "${AGENT_WORK_ROOT:-${HOME:-/tmp}/mosaic/agent-work}/pr-merge-api-response.XXXXXX")
payload='{"Do":"squash"}'
token=$(get_gitea_token "$host" || true)
@@ -214,8 +214,8 @@ case "$PLATFORM" in
TEA_LOGIN="$(get_gitea_login_for_host "$HOST" || true)"
if [[ -n "$TEA_LOGIN" ]]; then
mkdir -p "${AGENT_WORK_ROOT:-/home/hermes/agent-work}"
TEA_ERROR_FILE=$(mktemp "${AGENT_WORK_ROOT:-/home/hermes/agent-work}/pr-merge-tea-error.XXXXXX")
mkdir -p "${AGENT_WORK_ROOT:-${HOME:-/tmp}/mosaic/agent-work}"
TEA_ERROR_FILE=$(mktemp "${AGENT_WORK_ROOT:-${HOME:-/tmp}/mosaic/agent-work}/pr-merge-tea-error.XXXXXX")
if tea pr merge "$PR_NUMBER" --style squash --repo "$OWNER/$REPO" --login "$TEA_LOGIN" 2> "$TEA_ERROR_FILE"; then
rm -f "$TEA_ERROR_FILE"
elif is_known_tea_empty_identity_failure "$TEA_ERROR_FILE"; then

View File

@@ -4,7 +4,7 @@
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
WORK_ROOT="${AGENT_WORK_ROOT:-/home/hermes/agent-work}"
WORK_ROOT="${AGENT_WORK_ROOT:-${HOME:-/tmp}/mosaic/agent-work}"
SANDBOX="$WORK_ROOT/pr-merge-empty-uid-test-$$"
MOCK_BIN="$SANDBOX/bin"
REPO_DIR="$SANDBOX/repo"

View File

@@ -12,6 +12,10 @@
# ambiguity about lanes or origin. Recipients replying should FLIP the
# preamble: [<dst> -> <src>] ... (this tool sends; it does not auto-reply).
#
# Optionally tags the message with a TRIAGE CLASS (see -C / --class) so a
# comms daemon can route it (deliver-to-agent vs log-and-drop) from an exact
# field instead of re-deriving intent from the body.
#
# WHY A WRAPPER
# Reliable submission into an interactive REPL (Claude Code / Codex) is fiddly:
# a trailing Enter is often swallowed and the message sits as an unsubmitted
@@ -26,6 +30,7 @@
# agent-send.sh [-L socket] -s <dst_session> -m "message" # local target
# agent-send.sh [-L socket] -H user@host -s <dst_session> -m "message" # remote target
# agent-send.sh [-L socket] -H user@host -n <dst_hostname> -s <sess> -f msg.txt
# agent-send.sh -s mos-claude --class terminal-log -m "ACK — received"
# echo "msg" | agent-send.sh [-L socket] -H user@host -s <dst_session>
#
# OPTIONS
@@ -36,27 +41,61 @@
# Default: local hostname, or (remote) resolved via one ssh.
# -m MESSAGE message text (single- or multi-line)
# -f FILE read message from FILE instead of -m
# -C CLASS triage class for a comms daemon. One of:
# terminal-log log-only; never needs the agent's attention
# actionable carries a decision/blocker/gate — deliver
# human from a human operator — deliver
# reaction an emoji/ack reaction
# Long form: --class CLASS (or --class=CLASS). When SET, the
# preamble carries a ` class=<CLASS>` token INSIDE the bracket:
# [<src> -> <dst> class=terminal-log] <message>
# When OMITTED, NO token is emitted and the preamble is
# byte-for-byte identical to the classic format. Consumers MUST
# treat an absent class as 'actionable' (fail-safe: agent sees it).
# -S SRC_LABEL override source label "<host>:<session>" (default: auto)
# -r N Enter-flush attempts passed through (default 2)
# -v verbose: print pane tail after delivery
# -h help
#
# PREAMBLE GRAMMAR (for consumers / daemons mirroring this producer)
# ^\[(\S+) -> (\S+?)(?: class=(terminal-log|actionable|human|reaction))?\] (.*)$
# group 1 = src label group 2 = dst host:session
# group 3 = class (absent => actionable) group 4 = message body
#
# EXIT CODES (passed through from send-message.sh)
# 0 delivered/queued · 1 target not found · 2 still draft · 3 usage error
set -uo pipefail
SELF_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
SENDER="$SELF_DIR/send-message.sh"
# Sender is overridable via env purely for testing (inject a capture stub). The
# default is the canonical send-message.sh beside this script; production callers
# never set AGENT_SEND_SENDER, so behavior is unchanged.
SENDER="${AGENT_SEND_SENDER:-$SELF_DIR/send-message.sh}"
# Translate the long option --class[=value] into "-C value" so getopts (which is
# short-option-only) can parse it. Every other argument passes through untouched,
# so callers that never use --class hit the exact original getopts path.
args=()
while [ $# -gt 0 ]; do
case "$1" in
--class) [ $# -ge 2 ] || { echo "ERROR: --class requires a value" >&2; exit 3; }
args+=(-C "$2"); shift 2 ;;
--class=*) args+=(-C "${1#*=}"); shift ;;
*) args+=("$1"); shift ;;
esac
done
set -- ${args[@]+"${args[@]}"}
DST_SESSION=""; SSH_TARGET=""; DST_HOST=""; MSG=""; FILE=""; SOCKET_NAME=""
SRC_LABEL=""; RETRIES=2; VERBOSE=0
usage() { sed -n '2,44p' "$0"; exit "${1:-3}"; }
SRC_LABEL=""; RETRIES=2; VERBOSE=0; CLASS=""
usage() { sed -n '2,/^set -uo pipefail/{/^set -uo pipefail/d;p}' "$0"; exit "${1:-3}"; }
while getopts "L:s:H:n:m:f:S:r:vh" o; do
while getopts "L:s:H:n:m:f:S:r:C:vh" o; do
case "$o" in
L) SOCKET_NAME=$OPTARG ;;
s) DST_SESSION=$OPTARG ;; H) SSH_TARGET=$OPTARG ;; n) DST_HOST=$OPTARG ;;
m) MSG=$OPTARG ;; f) FILE=$OPTARG ;; S) SRC_LABEL=$OPTARG ;;
C) CLASS=$OPTARG ;;
r) RETRIES=$OPTARG ;; v) VERBOSE=1 ;; h) usage 0 ;; *) usage 3 ;;
esac
done
@@ -64,6 +103,17 @@ done
[ -n "$DST_SESSION" ] || { echo "ERROR: -s DST_SESSION is required" >&2; usage 3; }
[ -x "$SENDER" ] || { echo "ERROR: send-message.sh not found beside this script" >&2; exit 3; }
# Validate the triage class only when one was given. An absent class emits NO
# token (preamble byte-identical to the classic format); the consumer defaults
# absent => actionable.
CLASS_TOKEN=""
if [ -n "$CLASS" ]; then
case "$CLASS" in
terminal-log|actionable|human|reaction) CLASS_TOKEN=" class=${CLASS}" ;;
*) echo "ERROR: invalid --class '$CLASS' (allowed: terminal-log, actionable, human, reaction)" >&2; exit 3 ;;
esac
fi
# Message body from -f / -m / stdin.
if [ -n "$FILE" ]; then [ -r "$FILE" ] || { echo "ERROR: cannot read $FILE" >&2; exit 3; }; MSG=$(cat -- "$FILE")
elif [ -z "$MSG" ] && [ ! -t 0 ]; then MSG=$(cat)
@@ -90,7 +140,7 @@ if [ -z "$DST_HOST" ]; then
fi
fi
PREAMBLE="[${SRC_LABEL} -> ${DST_HOST}:${DST_SESSION}]"
PREAMBLE="[${SRC_LABEL} -> ${DST_HOST}:${DST_SESSION}${CLASS_TOKEN}]"
FULL="${PREAMBLE} ${MSG}"
B64=$(printf '%s' "$FULL" | base64 -w0)

View File

@@ -0,0 +1,97 @@
#!/usr/bin/env bash
# agent-send.test.sh — regression + grammar lock for agent-send.sh --class.
#
# Strategy: inject a capture stub via AGENT_SEND_SENDER that decodes the -b
# base64 payload and prints the FULL message (preamble + body) so we can assert
# the exact bytes on the wire. Local path only (no ssh), -n pins the dst host so
# the preamble is deterministic across machines.
#
# Guarantees locked here:
# 1. REGRESSION BAR — no --class => preamble byte-for-byte identical to classic.
# 2. --class <c> => ` class=<c>` token emitted inside the bracket.
# 3. --class=<c> (equals form) parses identically to the space form.
# 4. -C <c> short form parses identically.
# 5. invalid class => exit 3, nothing sent.
# 6. --class with no value => exit 3.
# 7. the documented consumer regex parses producer output for every class.
set -uo pipefail
HERE=$(cd -- "$(dirname -- "$0")" && pwd)
TOOL="$HERE/agent-send.sh"
# Capture stub: stands in for send-message.sh. Decodes -b and prints the payload.
STUB=$(mktemp)
trap 'rm -f "$STUB"' EXIT
cat >"$STUB" <<'STUB_EOF'
#!/usr/bin/env bash
set -uo pipefail
b64=""
while getopts "t:b:r:v" o; do case "$o" in b) b64=$OPTARG ;; *) : ;; esac; done
printf '%s' "$b64" | base64 -d
STUB_EOF
chmod +x "$STUB"
PASS=0; FAIL=0
ok() { PASS=$((PASS+1)); printf 'ok %s\n' "$1"; }
no() { FAIL=$((FAIL+1)); printf 'FAIL %s\n %s\n' "$1" "$2"; }
# Run the tool with the stub injected; echoes captured payload on stdout.
run() { AGENT_SEND_SENDER="$STUB" bash "$TOOL" -S a:src -n dsthost "$@"; }
# Documented consumer grammar — the daemon will mirror exactly this.
GRAMMAR='^\[(\S+) -> (\S+) class=(terminal-log|actionable|human|reaction)\] (.*)$'
GRAMMAR_NOCLASS='^\[(\S+) -> (\S+)\] (.*)$'
# 1. REGRESSION BAR: classic preamble, byte-for-byte.
got=$(run -s mos -m "hello world")
want='[a:src -> dsthost:mos] hello world'
[ "$got" = "$want" ] && ok "regression: no --class is byte-identical" \
|| no "regression: no --class is byte-identical" "got=[$got] want=[$want]"
# 2. --class space form emits the token.
got=$(run -s mos --class terminal-log -m "ACK")
want='[a:src -> dsthost:mos class=terminal-log] ACK'
[ "$got" = "$want" ] && ok "--class terminal-log emits token" \
|| no "--class terminal-log emits token" "got=[$got] want=[$want]"
# 3. --class=value equals form.
got=$(run -s mos --class=actionable -m "decide X")
want='[a:src -> dsthost:mos class=actionable] decide X'
[ "$got" = "$want" ] && ok "--class=actionable (equals form)" \
|| no "--class=actionable (equals form)" "got=[$got] want=[$want]"
# 4. -C short form.
got=$(run -s mos -C human -m "from a person")
want='[a:src -> dsthost:mos class=human] from a person'
[ "$got" = "$want" ] && ok "-C human (short form)" \
|| no "-C human (short form)" "got=[$got] want=[$want]"
# 5. invalid class => exit 3, no send.
if out=$(run -s mos --class bogus -m "x" 2>/dev/null); then
no "invalid class rejected" "expected non-zero exit, got 0 (out=[$out])"
else
rc=$?
[ "$rc" = 3 ] && [ -z "$out" ] && ok "invalid class => exit 3, nothing sent" \
|| no "invalid class => exit 3, nothing sent" "rc=$rc out=[$out]"
fi
# 6. --class with no value => exit 3.
if run -s mos -m "x" --class 2>/dev/null; then
no "--class with no value rejected" "expected non-zero exit, got 0"
else
[ "$?" = 3 ] && ok "--class with no value => exit 3" || no "--class with no value => exit 3" "wrong rc"
fi
# 7. consumer grammar parses every class + classic line.
for c in terminal-log actionable human reaction; do
line=$(run -s mos --class "$c" -m "body $c")
[[ "$line" =~ $GRAMMAR ]] && [ "${BASH_REMATCH[3]}" = "$c" ] && [ "${BASH_REMATCH[4]}" = "body $c" ] \
&& ok "grammar parses class=$c" || no "grammar parses class=$c" "line=[$line]"
done
classic=$(run -s mos -m "plain body")
[[ "$classic" =~ $GRAMMAR_NOCLASS ]] && [ "${BASH_REMATCH[3]}" = "plain body" ] \
&& ok "grammar (no-class) parses classic line" || no "grammar (no-class) parses classic line" "line=[$classic]"
echo "---"
echo "PASS=$PASS FAIL=$FAIL"
[ "$FAIL" -eq 0 ]

View File

@@ -1,6 +1,6 @@
{
"name": "@mosaicstack/mosaic",
"version": "0.0.41",
"version": "0.0.44",
"repository": {
"type": "git",
"url": "https://git.mosaicstack.dev/mosaicstack/stack.git",

View File

@@ -30,6 +30,7 @@ import {
refreshActiveFleetUnits,
readRosterAgentNames,
buildRelaunchCommands,
checkFrameworkDrift,
FRAMEWORK_RESEED_PACKAGE,
} from './runtime/update-checker.js';
import { runWizard } from './wizard.js';
@@ -418,6 +419,48 @@ program
// checkForAllUpdates imported statically above
const { execSync } = await import('node:child_process');
// Re-seed the framework from the freshly-installed package, propagate shipped
// systemd unit fixes to the active units, and (opt-in) relaunch durable
// agents. Shared by the "packages updated" and the "framework drift" paths.
const reseedFramework = (reason: string): void => {
console.log(reason);
const reseed = runFrameworkReseed();
if (!reseed.ok) {
console.error(
`\n⚠ Framework re-seed skipped: ${reseed.reason ?? 'unknown'}.\n` +
' Activate manually: bash "$(npm root -g)/@mosaicstack/mosaic/framework/install.sh" ' +
'(MOSAIC_SYNC_ONLY=1 MOSAIC_INSTALL_MODE=keep)',
);
return;
}
console.log('✔ Framework re-seeded.');
// Propagate shipped systemd unit fixes to the ACTIVE units (re-seed only
// touches ~/.config/mosaic/systemd/user; systemd runs ~/.config/systemd/user).
const units = refreshActiveFleetUnits();
if (units.refreshed.length > 0) {
console.log(`✔ Refreshed ${units.refreshed.length} active systemd unit(s).`);
}
const agents = readRosterAgentNames();
if (agents.length === 0) return;
if (opts.relaunch) {
console.log(`\nRelaunching ${agents.length} fleet agent(s) to pick up the new runtime…`);
for (const restart of buildRelaunchCommands(agents)) {
try {
execSync(restart.join(' '), { stdio: 'inherit', timeout: 30_000 });
} catch {
console.error(` ⚠ failed to restart agent — run: ${restart.join(' ')}`);
}
}
console.log('✔ Agents relaunched.');
} else {
console.log(
`\n ${agents.length} fleet agent(s) are still running the previous runtime. ` +
'Restart them to activate the update:\n mosaic update --relaunch ' +
'(or: mosaic fleet restart <agent>)',
);
}
};
console.log('Checking for updates…');
const results = checkForAllUpdates({ skipCache: true });
@@ -432,6 +475,18 @@ program
process.exit(1);
}
console.log('\n✔ All packages up to date.');
// #642: the CLI may have been upgraded outside `mosaic update` (e.g. a
// direct `npm i -g`), leaving the framework files stale even though no
// package is reported outdated. Detect that via the framework version and
// re-seed so shipped launcher/runtime fixes still activate.
const drift = checkFrameworkDrift();
if (drift.drifted && opts.reseed !== false) {
reseedFramework(
`\nFramework drift detected (on-disk v${drift.installed} < bundled v${drift.bundled}) — ` +
'the CLI was updated outside `mosaic update`. Re-seeding framework files into ' +
'~/.config/mosaic (data-safe; keeps your edits)…',
);
}
return;
}
@@ -456,52 +511,17 @@ program
// F3-m3 / R13: the CLI is updated, but the framework files in
// ~/.config/mosaic/ are still the previous version. Re-seed them from the
// freshly-installed package so shipped launcher/runtime changes ACTIVATE.
// Only when the framework-bearing package itself updated.
// Re-seed when the framework-bearing package itself updated OR the on-disk
// framework is older than the freshly-installed one (#642 — e.g. only
// sibling packages were outdated but the CLI was already ahead).
const mosaicUpdated = outdated.some(
(r: { package: string }) => r.package === FRAMEWORK_RESEED_PACKAGE,
);
if (mosaicUpdated && opts.reseed !== false) {
console.log(
const drift = checkFrameworkDrift();
if ((mosaicUpdated || drift.drifted) && opts.reseed !== false) {
reseedFramework(
'\nRe-seeding framework files into ~/.config/mosaic (data-safe; keeps your edits)…',
);
const reseed = runFrameworkReseed();
if (reseed.ok) {
console.log('✔ Framework re-seeded.');
// Propagate shipped systemd unit fixes to the ACTIVE units (re-seed only
// touches ~/.config/mosaic/systemd/user; systemd runs ~/.config/systemd/user).
const units = refreshActiveFleetUnits();
if (units.refreshed.length > 0) {
console.log(`✔ Refreshed ${units.refreshed.length} active systemd unit(s).`);
}
const agents = readRosterAgentNames();
if (agents.length > 0) {
if (opts.relaunch) {
console.log(
`\nRelaunching ${agents.length} fleet agent(s) to pick up the new runtime…`,
);
for (const restart of buildRelaunchCommands(agents)) {
try {
execSync(restart.join(' '), { stdio: 'inherit', timeout: 30_000 });
} catch {
console.error(` ⚠ failed to restart agent — run: ${restart.join(' ')}`);
}
}
console.log('✔ Agents relaunched.');
} else {
console.log(
`\n ${agents.length} fleet agent(s) are still running the previous runtime. ` +
'Restart them to activate the update:\n mosaic update --relaunch ' +
'(or: mosaic fleet restart <agent>)',
);
}
}
} else {
console.error(
`\n⚠ Framework re-seed skipped: ${reseed.reason ?? 'unknown'}.\n` +
' Activate manually: bash "$(npm root -g)/@mosaicstack/mosaic/framework/install.sh" ' +
'(MOSAIC_SYNC_ONLY=1 MOSAIC_INSTALL_MODE=keep)',
);
}
}
});

View File

@@ -0,0 +1,199 @@
import { readFile } from 'node:fs/promises';
import { dirname, join, resolve } from 'node:path';
import { fileURLToPath } from 'node:url';
import { describe, expect, it, vi } from 'vitest';
import {
parseNorthStar,
renderNorthStarMarkdown,
resolveNorthStarPaths,
type NorthStar,
} from './fleet.js';
// Repo root resolved from this spec file: packages/mosaic/src/commands → up 4.
const repoRoot = resolve(dirname(fileURLToPath(import.meta.url)), '..', '..', '..', '..');
const yamlPath = join(repoRoot, 'docs', 'fleet', 'NORTH_STAR.yaml');
async function loadYamlText(): Promise<string> {
return readFile(yamlPath, 'utf8');
}
async function loadParsed(): Promise<NorthStar> {
return parseNorthStar(await loadYamlText());
}
describe('NORTH_STAR.yaml', () => {
it('parses to a typed object with the required top-level keys', async () => {
const ns = await loadParsed();
expect(ns.version).toBeTypeOf('number');
expect(ns.mission).toContain('self-driving Mosaic delivery fleet');
expect(ns.substrate.note).toBeTruthy();
expect(ns.standing_objectives.length).toBeGreaterThan(0);
expect(ns.success_criteria.length).toBeGreaterThan(0);
expect(ns.workstreams.length).toBeGreaterThan(0);
expect(ns.goals.length).toBeGreaterThan(0);
expect(ns.assumptions.length).toBeGreaterThan(0);
expect(ns.spend.advisory).toBe(true);
});
it('names the native Postgres storage layer and declares no Hermes runtime dependency', async () => {
const rawText = await loadYamlText();
const lower = rawText.toLowerCase();
expect(rawText).toContain('@mosaicstack/db');
expect(lower).toContain('postgres');
expect(lower).toContain('pglite');
// The doctrine explicitly disowns Hermes ("NOT Hermes"); the only mentions
// are negations. Assert there is no Hermes RUNTIME dependency: no hermes
// CLI/kanban invocation and no ~/.hermes storage reference.
expect(lower).not.toContain('hermes kanban');
expect(lower).not.toContain('~/.hermes');
expect(lower).not.toContain('hermes mcp');
});
it('declares all NS-1..NS-8 standing objectives', async () => {
const ns = await loadParsed();
const ids = ns.standing_objectives.map((o) => o.id);
for (let n = 1; n <= 8; n += 1) {
expect(ids).toContain(`NS-${n}`);
}
});
it('declares all AC-NS-1..AC-NS-5 success criteria', async () => {
const ns = await loadParsed();
const ids = ns.success_criteria.map((c) => c.id);
for (let n = 1; n <= 5; n += 1) {
expect(ids).toContain(`AC-NS-${n}`);
}
});
it('seeds the expected backlog goal ids', async () => {
const ns = await loadParsed();
const ids = ns.goals.map((g) => g.id);
expect(ids).toEqual(
expect.arrayContaining(['A1', 'A2', 'A3a', 'A3b', 'A4', 'B1', 'B2', 'B3a', 'B3b', 'G1']),
);
});
it('has a coherent depends_on DAG (every dependency references a known goal)', async () => {
const ns = await loadParsed();
const ids = new Set(ns.goals.map((g) => g.id));
for (const goal of ns.goals) {
for (const dep of goal.depends_on) {
expect(ids.has(dep)).toBe(true);
}
// No goal may depend on itself.
expect(goal.depends_on).not.toContain(goal.id);
}
// A1 is the root: no dependencies.
const a1 = ns.goals.find((g) => g.id === 'A1');
expect(a1?.depends_on).toEqual([]);
});
it('marks spend as advisory with a degrade-to-TTL note', async () => {
const ns = await loadParsed();
expect(ns.spend.advisory).toBe(true);
expect(ns.spend.note.toLowerCase()).toContain('ttl');
});
});
describe('renderNorthStarMarkdown', () => {
it('is a pure deterministic projection (round-trip stable)', async () => {
const ns = await loadParsed();
const first = renderNorthStarMarkdown(ns);
const second = renderNorthStarMarkdown(ns);
expect(first).toBe(second);
// Re-parsing the same YAML and re-rendering yields identical bytes.
const reparsed = parseNorthStar(await loadYamlText());
expect(renderNorthStarMarkdown(reparsed)).toBe(first);
});
it('matches the committed NORTH_STAR.md projection (regenerate if this fails)', async () => {
const ns = await loadParsed();
const rendered = `${renderNorthStarMarkdown(ns)}\n`;
const committed = await readFile(join(repoRoot, 'docs', 'fleet', 'NORTH_STAR.md'), 'utf8');
expect(rendered).toBe(committed);
});
it('projects mission, objectives, criteria, goals, assumptions, and spend', async () => {
const ns = await loadParsed();
const md = renderNorthStarMarkdown(ns);
expect(md).toContain('# Mosaic Fleet — NORTH STAR');
expect(md).toContain('## Mission');
expect(md).toContain('## Standing objectives');
expect(md).toContain('**NS-1**');
expect(md).toContain('**AC-NS-5**');
expect(md).toContain('## Goals (backlog projection)');
// Tables are column-padded (prettier-style); match the row id, not exact spacing.
expect(md).toMatch(/\| A1\s+\|/);
expect(md).toContain('## Assumptions (vetoable)');
expect(md).toContain('**advisory:** true');
// The banner disowns Hermes; the projection carries no Hermes runtime hook.
expect(md.toLowerCase()).not.toContain('hermes kanban');
expect(md.toLowerCase()).not.toContain('~/.hermes');
});
it('does no network or CLI work (pure functions; only the writer touches IO)', () => {
// parseNorthStar + renderNorthStarMarkdown take strings and return strings.
// Guard against accidental IO by asserting fetch/spawn are never invoked.
const fetchSpy = vi.spyOn(globalThis, 'fetch' as never).mockImplementation((() => {
throw new Error('network access is forbidden in the NORTH_STAR generator');
}) as never);
try {
const yaml = [
'version: 1',
'mission: m',
'substrate:',
' note: n',
'standing_objectives:',
' - { id: NS-1, text: t }',
'success_criteria:',
' - { id: AC-NS-1, text: t }',
'workstreams:',
' - { id: A, title: t }',
'goals:',
' - { id: A1, title: t, phase: 1, priority: must-have, depends_on: [] }',
'assumptions:',
' - { id: ASM-1, vetoable: true, text: t }',
'spend:',
' advisory: true',
' note: TTL',
'',
].join('\n');
const ns = parseNorthStar(yaml);
const md = renderNorthStarMarkdown(ns);
expect(md).toContain('# Mosaic Fleet — NORTH STAR');
expect(fetchSpy).not.toHaveBeenCalled();
} finally {
fetchSpy.mockRestore();
}
});
});
describe('parseNorthStar validation', () => {
it('throws on a missing required key', () => {
expect(() => parseNorthStar('version: 1\n')).toThrow();
});
it('throws when spend.advisory is not a boolean', () => {
const yaml = [
'version: 1',
'mission: m',
'substrate: { note: n }',
'standing_objectives: [{ id: NS-1, text: t }]',
'success_criteria: [{ id: AC-NS-1, text: t }]',
'workstreams: [{ id: A, title: t }]',
'goals: [{ id: A1, title: t, phase: 1, priority: must-have, depends_on: [] }]',
'assumptions: [{ id: ASM-1, vetoable: true, text: t }]',
'spend: { advisory: maybe, note: TTL }',
'',
].join('\n');
expect(() => parseNorthStar(yaml)).toThrow(/spend\.advisory/);
});
});
describe('resolveNorthStarPaths', () => {
it('resolves YAML + Markdown under docs/fleet from a given repo root', () => {
const paths = resolveNorthStarPaths('/repo');
expect(paths.yamlPath).toBe('/repo/docs/fleet/NORTH_STAR.yaml');
expect(paths.markdownPath).toBe('/repo/docs/fleet/NORTH_STAR.md');
});
});

View File

@@ -19,17 +19,20 @@ import {
buildSystemdShowCommand,
buildTmuxListPanesCommand,
buildTmuxListSessionsCommand,
classifyReadiness,
classifySendResult,
countOrchestrators,
countEnhancers,
detectDrift,
enableFleetUnits,
FLEET_PROFILES,
HEARTBEAT_IDLE_THRESHOLD_SECONDS,
generateAgentEnv,
getDefaultOperatorSourceLabel,
getDefaultTenantAndHost,
getRosterAgent,
heartbeatPath,
idleThresholdSeconds,
isSendAccepted,
loadFleetRoster,
mergeAgentEnv,
@@ -850,7 +853,7 @@ describe('fleet ps — command construction', () => {
'-t',
'=canary-pi:0.0',
'-F',
'#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity}',
'#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity} #{window_activity} #{session_activity}',
]);
});
@@ -933,6 +936,125 @@ describe('fleet ps — heartbeat parsing', () => {
});
});
describe('fleet ps — readiness thresholds', () => {
const savedIdle = process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD;
afterEach(() => {
if (savedIdle === undefined) delete process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD;
else process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD = savedIdle;
});
it('uses the default activity threshold when env is unset', () => {
delete process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD;
expect(idleThresholdSeconds()).toBe(HEARTBEAT_IDLE_THRESHOLD_SECONDS);
});
it('honors a positive integer activity threshold from env', () => {
process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD = '120';
expect(idleThresholdSeconds()).toBe(120);
});
it('falls back to the default for invalid activity thresholds', () => {
process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD = '0';
expect(idleThresholdSeconds()).toBe(HEARTBEAT_IDLE_THRESHOLD_SECONDS);
});
});
describe('fleet ps — readiness classification', () => {
const thresholds = { idleThresholdSeconds: 300 };
it('reports dead when the pane is not alive', () => {
expect(
classifyReadiness(
{ paneAlive: false, hbHealth: 'healthy', hbStatus: 'busy', idleSeconds: 0 },
thresholds,
),
).toBe('dead');
});
it('reports unknown when heartbeat health is unknown', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'unknown', hbStatus: null, idleSeconds: 0 },
thresholds,
),
).toBe('unknown');
});
it('reports stale when heartbeat health is stale', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'stale', hbStatus: 'busy', idleSeconds: 1_000 },
thresholds,
),
).toBe('stale');
});
it('reports working when heartbeat status is busy, even after the activity threshold', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'busy', idleSeconds: 2_000 },
thresholds,
),
).toBe('working');
});
it('reports working when pane idle seconds are null', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok', idleSeconds: null },
thresholds,
),
).toBe('working');
});
it('reports working when pane idle seconds are undefined', () => {
expect(
classifyReadiness({ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok' }, thresholds),
).toBe('working');
});
it('reports working when pane idle seconds are non-finite', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok', idleSeconds: Number.NaN },
thresholds,
),
).toBe('working');
});
it('reports available at the activity threshold boundary', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok', idleSeconds: 300 },
thresholds,
),
).toBe('available');
});
it('reports working below the activity threshold', () => {
expect(
classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok', idleSeconds: 299 },
thresholds,
),
).toBe('working');
});
it('reports very long idle as available, not stuck', () => {
const readiness = classifyReadiness(
{ paneAlive: true, hbHealth: 'healthy', hbStatus: 'ok', idleSeconds: 100_000 },
thresholds,
);
expect(readiness).toBe('available');
expect(readiness).not.toBe('stuck');
});
});
describe('fleet ps — systemd show parsing', () => {
it('parses ActiveState, SubState, UnitFileState from systemctl show output', () => {
const output = 'ActiveState=active\nSubState=running\nUnitFileState=enabled\n';
@@ -953,9 +1075,11 @@ describe('fleet ps — systemd show parsing', () => {
describe('fleet ps — tmux list-panes parsing', () => {
const NOW_MS = 1_700_000_000_000;
it('parses alive pane with pid, command, and idle time', () => {
const activityEpoch = Math.floor((NOW_MS - 30_000) / 1000); // 30s ago
const output = `12345 claude 0 ${activityEpoch}\n`;
it('uses pane_activity when present', () => {
const paneActivityEpoch = Math.floor((NOW_MS - 30_000) / 1000); // 30s ago
const windowActivityEpoch = Math.floor((NOW_MS - 60_000) / 1000); // 60s ago
const sessionActivityEpoch = Math.floor((NOW_MS - 90_000) / 1000); // 90s ago
const output = `12345 claude 0 ${paneActivityEpoch} ${windowActivityEpoch} ${sessionActivityEpoch}\n`;
const result = parseTmuxListPanes(output, NOW_MS);
expect(result.pid).toBe(12345);
expect(result.command).toBe('claude');
@@ -963,8 +1087,45 @@ describe('fleet ps — tmux list-panes parsing', () => {
expect(result.idleSeconds).toBe(30);
});
it('uses window_activity when pane_activity is empty', () => {
const windowActivityEpoch = Math.floor((NOW_MS - 45_000) / 1000); // 45s ago
const sessionActivityEpoch = Math.floor((NOW_MS - 90_000) / 1000); // 90s ago
const output = `12345 node 0 ${windowActivityEpoch} ${sessionActivityEpoch}\n`;
expect(output).toContain('0 '); // empty pane_activity preserves index alignment
const result = parseTmuxListPanes(output, NOW_MS);
expect(result.pid).toBe(12345);
expect(result.command).toBe('node');
expect(result.dead).toBe(false);
expect(result.idleSeconds).toBe(45);
});
it('uses session_activity when pane_activity and window_activity are empty', () => {
const sessionActivityEpoch = Math.floor((NOW_MS - 75_000) / 1000); // 75s ago
const output = `12345 node 0 ${sessionActivityEpoch}\n`;
const result = parseTmuxListPanes(output, NOW_MS);
expect(result.idleSeconds).toBe(75);
});
it('reports null idleSeconds when all activity sources are empty', () => {
const output = '12345 node 0 \n';
const result = parseTmuxListPanes(output, NOW_MS);
expect(result.idleSeconds).toBeNull();
});
it('computes exact idle seconds from now minus epoch seconds', () => {
const activityEpoch = 1_699_999_877;
const result = parseTmuxListPanes(`12345 claude 0 ${activityEpoch} 0 0\n`, NOW_MS);
expect(result.idleSeconds).toBe(123);
});
it('clamps future activity epochs to 0 idle seconds', () => {
const futureActivityEpoch = Math.floor((NOW_MS + 30_000) / 1000);
const result = parseTmuxListPanes(`12345 claude 0 ${futureActivityEpoch} 0 0\n`, NOW_MS);
expect(result.idleSeconds).toBe(0);
});
it('reports dead pane when pane_dead=1', () => {
const output = `0 bash 1 0\n`;
const output = `0 bash 1 0 0 0\n`;
const result = parseTmuxListPanes(output, NOW_MS);
expect(result.dead).toBe(true);
});
@@ -1324,8 +1485,9 @@ describe('fleet ps — JSON output shape (FR-6)', () => {
// boot-enable warning: active + disabled
expect(row.bootEnableWarning).toBe(true);
// heartbeat missing → unknown
// heartbeat missing → unknown readiness preserves existing display semantics
expect(row.heartbeat.health).toBe('unknown');
expect(row.readiness).toBe('unknown');
expect(row.name).toBe('canary-pi');
expect(row.runtime).toBe('pi');
@@ -1387,6 +1549,88 @@ describe('fleet ps — command sequences issued', () => {
});
});
describe('fleet ps — readiness table output', () => {
it('renders available in HB column without idle/stuck alarm flags', async () => {
const home = await mkdtemp(join(tmpdir(), 'mosaic-fleet-'));
const rosterPath = join(home, 'fleet', 'roster.yaml');
const runDir = join(home, 'fleet', 'run');
await mkdir(runDir, { recursive: true });
await writeFile(
rosterPath,
[
'version: 1',
'transport: tmux',
'agents:',
' - name: working-agent',
' runtime: pi',
' - name: available-agent',
' runtime: pi',
].join('\n'),
);
const nowMs = 1_700_000_000_000;
const workingActivityEpoch = Math.floor((nowMs - 2_000) / 1000);
const availableActivityEpoch = Math.floor((nowMs - 40_000) / 1000);
const hbTs = new Date(nowMs - 1_000).toISOString();
await writeFile(join(runDir, 'working-agent.hb'), `ts=${hbTs}\npid=111\nstatus=ok\n`);
await writeFile(join(runDir, 'available-agent.hb'), `ts=${hbTs}\npid=222\nstatus=ok\n`);
const savedIdle = process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD;
process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD = '5';
const dateNow = vi.spyOn(Date, 'now').mockReturnValue(nowMs);
const runner: CommandRunner = async (command, args) => {
const full = [command, ...args].join(' ');
if (full.includes('list-sessions')) {
return { stdout: 'working-agent\navailable-agent\n', stderr: '', exitCode: 0 };
}
if (full.includes('=working-agent:0.0')) {
return { stdout: `111 pi 0 ${workingActivityEpoch}\n`, stderr: '', exitCode: 0 };
}
if (full.includes('=available-agent:0.0')) {
return { stdout: `222 pi 0 ${availableActivityEpoch}\n`, stderr: '', exitCode: 0 };
}
if (full.includes('systemctl') && full.includes('show')) {
return {
stdout: 'ActiveState=active\nSubState=running\nUnitFileState=enabled\n',
stderr: '',
exitCode: 0,
};
}
return { stdout: '', stderr: '', exitCode: 0 };
};
const lines: string[] = [];
const origLog = console.log;
console.log = (msg: string) => {
lines.push(msg);
};
const program = new Command();
program.exitOverride();
registerFleetCommand(program, { runner, mosaicHome: home });
try {
await program.parseAsync(['node', 'mosaic', 'fleet', 'ps']);
} finally {
console.log = origLog;
dateNow.mockRestore();
if (savedIdle === undefined) delete process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD;
else process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD = savedIdle;
await rm(home, { recursive: true, force: true });
}
const workingLine = lines.find((line) => line.includes('working-agent'));
const availableLine = lines.find((line) => line.includes('available-agent'));
expect(workingLine).toBeDefined();
expect(workingLine).toContain('1s/working');
expect(availableLine).toBeDefined();
expect(availableLine).toContain('1s/available');
expect(availableLine).not.toMatch(/\bIDLE\b/);
expect(availableLine).not.toMatch(/\bSTUCK\b/);
});
});
describe('buildTmuxListSessionsCommand', () => {
it('builds exact list-sessions command with session_name format', () => {
expect(buildTmuxListSessionsCommand('mosaic-fleet')).toEqual([
@@ -1514,6 +1758,7 @@ describe('fleet ps — unmanaged socket sessions', () => {
// driftFlag must be false for unmanaged (no roster runtime to compare)
expect(unmanagedRow.driftFlag).toBe(false);
expect(unmanagedRow.readiness).toBe('unknown');
});
it('shows UNMANAGED flag in table output for unmanaged sessions', async () => {

View File

@@ -197,6 +197,292 @@ export function getRosterAgent(roster: FleetRoster, name: string): FleetAgent {
return agent;
}
// ---------------------------------------------------------------------------
// NORTH_STAR — machine-readable fleet planning source + Markdown projection
//
// docs/fleet/NORTH_STAR.yaml is the single source of truth. The Markdown file
// (docs/fleet/NORTH_STAR.md) is a deterministic, pure projection of the YAML —
// no network, no CLI, no clock. Edit the YAML, regenerate the .md.
// ---------------------------------------------------------------------------
export interface NorthStarIdText {
id: string;
text: string;
}
export interface NorthStarWorkstream {
id: string;
title: string;
}
export interface NorthStarGoal {
id: string;
title: string;
phase: number;
priority: string;
depends_on: string[];
}
export interface NorthStarAssumption {
id: string;
vetoable: boolean;
text: string;
}
export interface NorthStarSpend {
advisory: boolean;
note: string;
}
export interface NorthStar {
version: number;
mission: string;
substrate: { note: string };
standing_objectives: NorthStarIdText[];
success_criteria: NorthStarIdText[];
workstreams: NorthStarWorkstream[];
goals: NorthStarGoal[];
assumptions: NorthStarAssumption[];
spend: NorthStarSpend;
}
/**
* Parse + validate the NORTH_STAR YAML text into a typed NorthStar object.
* Pure: no IO, no network. Throws a descriptive error when a required key is
* missing or malformed so the generator/tests fail loudly rather than emit a
* partial projection.
*/
export function parseNorthStar(rawText: string): NorthStar {
const parsed = YAML.parse(rawText) as Record<string, unknown> | null;
if (!parsed || typeof parsed !== 'object') {
throw new Error('NORTH_STAR.yaml did not parse to a mapping.');
}
const requireString = (value: unknown, key: string): string => {
if (typeof value !== 'string' || value.trim() === '') {
throw new Error(`NORTH_STAR.yaml: "${key}" must be a non-empty string.`);
}
return value.trim();
};
const requireArray = (value: unknown, key: string): unknown[] => {
if (!Array.isArray(value) || value.length === 0) {
throw new Error(`NORTH_STAR.yaml: "${key}" must be a non-empty array.`);
}
return value;
};
const idText = (value: unknown, key: string, index: number): NorthStarIdText => {
const row = value as Record<string, unknown>;
return {
id: requireString(row?.id, `${key}[${index}].id`),
text: requireString(row?.text, `${key}[${index}].text`),
};
};
const version = parsed.version;
if (typeof version !== 'number') {
throw new Error('NORTH_STAR.yaml: "version" must be a number.');
}
const substrate = parsed.substrate as Record<string, unknown> | undefined;
const spendRaw = parsed.spend as Record<string, unknown> | undefined;
if (!spendRaw || typeof spendRaw.advisory !== 'boolean') {
throw new Error('NORTH_STAR.yaml: "spend.advisory" must be a boolean.');
}
return {
version,
mission: requireString(parsed.mission, 'mission'),
substrate: { note: requireString(substrate?.note, 'substrate.note') },
standing_objectives: requireArray(parsed.standing_objectives, 'standing_objectives').map(
(row, i) => idText(row, 'standing_objectives', i),
),
success_criteria: requireArray(parsed.success_criteria, 'success_criteria').map((row, i) =>
idText(row, 'success_criteria', i),
),
workstreams: requireArray(parsed.workstreams, 'workstreams').map((row, i) => {
const ws = row as Record<string, unknown>;
return {
id: requireString(ws?.id, `workstreams[${i}].id`),
title: requireString(ws?.title, `workstreams[${i}].title`),
};
}),
goals: requireArray(parsed.goals, 'goals').map((row, i) => {
const goal = row as Record<string, unknown>;
const dependsRaw = goal?.depends_on ?? [];
if (!Array.isArray(dependsRaw)) {
throw new Error(`NORTH_STAR.yaml: goals[${i}].depends_on must be an array.`);
}
const phase = goal?.phase;
if (typeof phase !== 'number') {
throw new Error(`NORTH_STAR.yaml: goals[${i}].phase must be a number.`);
}
return {
id: requireString(goal?.id, `goals[${i}].id`),
title: requireString(goal?.title, `goals[${i}].title`),
phase,
priority: requireString(goal?.priority, `goals[${i}].priority`),
depends_on: dependsRaw.map((dep, j) => requireString(dep, `goals[${i}].depends_on[${j}]`)),
};
}),
assumptions: requireArray(parsed.assumptions, 'assumptions').map((row, i) => {
const asm = row as Record<string, unknown>;
if (typeof asm?.vetoable !== 'boolean') {
throw new Error(`NORTH_STAR.yaml: assumptions[${i}].vetoable must be a boolean.`);
}
return {
id: requireString(asm?.id, `assumptions[${i}].id`),
vetoable: asm.vetoable,
text: requireString(asm?.text, `assumptions[${i}].text`),
};
}),
spend: {
advisory: spendRaw.advisory,
note: requireString(spendRaw?.note, 'spend.note'),
},
};
}
/**
* Render a GitHub-Flavored-Markdown table with prettier-compatible column
* alignment: each column is padded to the widest cell (minimum 3 for the
* `---` divider) so the generated bytes survive `prettier --check` unchanged.
* Pure; the row strings use the same single-code-unit dash/arrow glyphs that
* prettier's string-width counts as width 1.
*/
function renderMarkdownTable(headers: string[], rows: string[][]): string[] {
const widths = headers.map((header, col) =>
Math.max(3, header.length, ...rows.map((row) => row[col]?.length ?? 0)),
);
const pad = (cell: string, col: number): string => cell.padEnd(widths[col] ?? 0, ' ');
const formatRow = (cells: string[]): string =>
`| ${cells.map((cell, col) => pad(cell, col)).join(' | ')} |`;
const divider = `| ${widths.map((w) => '-'.repeat(w)).join(' | ')} |`;
return [formatRow(headers), divider, ...rows.map(formatRow)];
}
/**
* Deterministically project a parsed NorthStar into the Markdown doctrine doc.
* Pure function of its input — same input always yields byte-identical output,
* so the round-trip (YAML → render → write) is stable across runs. No clock, no
* network, no CLI. Layout follows the repo's existing doctrine-doc convention
* (heading, blockquote banner, then sections + tables, e.g. north-star.md /
* mission-control/BOARD.md).
*/
export function renderNorthStarMarkdown(ns: NorthStar): string {
const lines: string[] = [];
lines.push('# Mosaic Fleet — NORTH STAR');
lines.push('');
lines.push('> **Generated file — do not edit by hand.**');
lines.push(
'> Projected deterministically from [`NORTH_STAR.yaml`](./NORTH_STAR.yaml) by the pure',
);
lines.push('> generator in `packages/mosaic/src/commands/fleet.ts` (`renderNorthStarMarkdown`).');
lines.push('> Edit the YAML, then regenerate. Self-contained Mosaic — no Hermes dependency.');
lines.push('');
lines.push('## Mission');
lines.push('');
lines.push(ns.mission);
lines.push('');
lines.push('## Substrate');
lines.push('');
lines.push(ns.substrate.note);
lines.push('');
lines.push('## Standing objectives');
lines.push('');
for (const obj of ns.standing_objectives) {
lines.push(`- **${obj.id}** — ${obj.text}`);
}
lines.push('');
lines.push('## Success criteria');
lines.push('');
for (const ac of ns.success_criteria) {
lines.push(`- **${ac.id}** — ${ac.text}`);
}
lines.push('');
lines.push('## Workstreams');
lines.push('');
lines.push(
...renderMarkdownTable(
['id', 'title'],
ns.workstreams.map((ws) => [ws.id, ws.title]),
),
);
lines.push('');
lines.push('## Goals (backlog projection)');
lines.push('');
lines.push(
...renderMarkdownTable(
['id', 'title', 'phase', 'priority', 'depends_on'],
ns.goals.map((goal) => [
goal.id,
goal.title,
String(goal.phase),
goal.priority,
goal.depends_on.length > 0 ? goal.depends_on.join(', ') : '—',
]),
),
);
lines.push('');
lines.push('## Assumptions (vetoable)');
lines.push('');
for (const asm of ns.assumptions) {
const veto = asm.vetoable ? 'vetoable' : 'fixed';
lines.push(`- **${asm.id}** (${veto}) — ${asm.text}`);
}
lines.push('');
lines.push('## Spend');
lines.push('');
lines.push(`- **advisory:** ${ns.spend.advisory ? 'true' : 'false'}`);
lines.push(`- ${ns.spend.note}`);
// No trailing blank line: the writer appends a single newline, yielding the
// one-newline EOF prettier expects (round-trip stays format:check-clean).
return lines.join('\n');
}
/**
* Resolve the repo's docs/fleet directory from this compiled module's location.
* fleet.ts lives at packages/mosaic/src/commands; docs/fleet sits at the repo
* root. Exposed so the generator + tests share one path resolution.
*/
export function resolveNorthStarPaths(repoRoot?: string): {
yamlPath: string;
markdownPath: string;
} {
const root = repoRoot ?? resolve(dirname(fileURLToPath(import.meta.url)), '..', '..', '..', '..');
const dir = join(root, 'docs', 'fleet');
return {
yamlPath: join(dir, 'NORTH_STAR.yaml'),
markdownPath: join(dir, 'NORTH_STAR.md'),
};
}
/**
* Read NORTH_STAR.yaml, project it to Markdown, and write NORTH_STAR.md.
* The only IO in the NORTH_STAR pipeline; the parse + render steps it composes
* are pure. Returns the rendered Markdown so callers/tests can assert on it.
*/
export async function generateNorthStarMarkdown(repoRoot?: string): Promise<string> {
const { yamlPath, markdownPath } = resolveNorthStarPaths(repoRoot);
const rawText = await readFile(yamlPath, 'utf8');
const ns = parseNorthStar(rawText);
const markdown = renderNorthStarMarkdown(ns);
await writeFile(markdownPath, `${markdown}\n`, 'utf8');
return markdown;
}
export function generateAgentEnv(roster: FleetRoster, agent: FleetAgent): string {
const workingDirectory = agent.workingDirectory ?? roster.defaults.workingDirectory;
return [
@@ -394,6 +680,7 @@ export function buildAgentTailCommand(agentName: string, lines: number, socketNa
// ---------------------------------------------------------------------------
export const HEARTBEAT_INTERVAL_MS = 15_000;
export const HEARTBEAT_IDLE_THRESHOLD_SECONDS = 300;
/**
* Heartbeat interval in ms, honoring MOSAIC_HEARTBEAT_INTERVAL (seconds) so the
@@ -404,8 +691,57 @@ export function heartbeatIntervalMs(): number {
const sec = Number.parseInt(process.env.MOSAIC_HEARTBEAT_INTERVAL ?? '', 10);
return Number.isFinite(sec) && sec > 0 ? sec * 1000 : HEARTBEAT_INTERVAL_MS;
}
/** Activity threshold in seconds, honoring MOSAIC_HEARTBEAT_IDLE_THRESHOLD. */
export function idleThresholdSeconds(): number {
const sec = Number.parseInt(process.env.MOSAIC_HEARTBEAT_IDLE_THRESHOLD ?? '', 10);
return Number.isFinite(sec) && sec > 0 ? sec : HEARTBEAT_IDLE_THRESHOLD_SECONDS;
}
export const HEARTBEAT_HEALTHY_MULTIPLIER = 3;
export type ReadinessState = 'working' | 'available' | 'stuck' | 'stale' | 'dead' | 'unknown';
export interface ReadinessSignals {
paneAlive: boolean;
hbHealth: 'healthy' | 'stale' | 'unknown';
hbStatus: 'ok' | 'busy' | null;
idleSeconds: number | null;
}
export interface ReadinessThresholds {
idleThresholdSeconds: number;
}
/**
* Classify whether an agent is progressing based on already-parsed heartbeat/tmux signals.
* Best-effort and runtime-agnostic: it never probes, never throws, and preserves existing
* unknown/stale behavior when heartbeat data is absent or old.
*/
export function classifyReadiness(
signals: Partial<ReadinessSignals> | null | undefined,
thresholds: Partial<ReadinessThresholds> | null | undefined = {},
): ReadinessState {
try {
if (signals?.paneAlive !== true) return 'dead';
if (signals.hbHealth === 'unknown' || signals.hbHealth === undefined) return 'unknown';
if (signals.hbHealth === 'stale') return 'stale';
if (signals.hbStatus === 'busy') return 'working';
if (signals.idleSeconds === null || signals.idleSeconds === undefined) return 'working';
const idleSeconds = Number.isFinite(signals.idleSeconds) ? signals.idleSeconds : null;
if (idleSeconds === null) return 'working';
const idleThreshold = Number.isFinite(thresholds?.idleThresholdSeconds)
? Number(thresholds?.idleThresholdSeconds)
: idleThresholdSeconds();
// Follow-up: stuck pending per-agent assignment awareness: assigned task + idle past threshold => stuck.
if (idleSeconds >= idleThreshold) return 'available';
return 'working';
} catch {
return 'unknown';
}
}
export interface HeartbeatInfo {
ts: Date | null;
pid: number | null;
@@ -429,6 +765,7 @@ export interface AgentPsRow {
paneCommand: string | null;
idleSeconds: number | null;
heartbeat: HeartbeatInfo;
readiness: ReadinessState;
/** roster runtime !== actual pane command */
driftFlag: boolean;
/** active but UnitFileState=disabled */
@@ -461,7 +798,7 @@ export function buildSystemdShowCommand(agentName: string): string[] {
/**
* Returns the tmux list-panes command for an agent pane.
* Format: `#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity}`
* Format: `#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity} #{window_activity} #{session_activity}`
*/
export function buildTmuxListPanesCommand(agentName: string, socketName = ''): string[] {
return [
@@ -471,7 +808,7 @@ export function buildTmuxListPanesCommand(agentName: string, socketName = ''): s
'-t',
`=${agentName}:0.0`,
'-F',
'#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity}',
'#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity} #{window_activity} #{session_activity}',
];
}
@@ -571,8 +908,8 @@ export function parseSystemdShow(output: string): {
}
/**
* Parse the output of `tmux list-panes -F '#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity}'`
* pane_activity is a Unix epoch timestamp (seconds).
* Parse the output of `tmux list-panes -F '#{pane_pid} #{pane_current_command} #{pane_dead} #{pane_activity} #{window_activity} #{session_activity}'`
* Activity fields are Unix epoch timestamps (seconds), ordered most precise to coarsest.
*/
export function parseTmuxListPanes(
output: string,
@@ -582,16 +919,18 @@ export function parseTmuxListPanes(
if (!line) {
return { pid: null, command: null, dead: true, idleSeconds: null };
}
// format: <pid> <command> <dead(0|1)> <activity_epoch>
// format: <pid> <command> <dead(0|1)> <pane_activity> <window_activity> <session_activity>
const parts = line.split(' ');
const pid = parts[0] ? (Number.isFinite(Number(parts[0])) ? Number(parts[0]) : null) : null;
const command = parts[1] ?? null;
const dead = parts[2] === '1';
const activityEpoch = parts[3] ? Number(parts[3]) : NaN;
const idleSeconds =
Number.isFinite(activityEpoch) && activityEpoch > 0
? Math.floor((nowMs - activityEpoch * 1000) / 1000)
: null;
const activityEpoch = parts
.slice(3, 6)
.map((part) => (part ? Number(part) : NaN))
.find((epoch) => Number.isFinite(epoch) && epoch > 0);
const idleSeconds = activityEpoch
? Math.max(0, Math.floor((nowMs - activityEpoch * 1000) / 1000))
: null;
return { pid, command, dead, idleSeconds };
}
@@ -1022,6 +1361,9 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
const nowMs = Date.now();
const rows: AgentPsRow[] = [];
const readinessThresholds = {
idleThresholdSeconds: idleThresholdSeconds(),
};
// Build the set of roster agent names for quick lookup when filtering socket sessions.
const rosterAgentNames = new Set(roster.agents.map((a) => a.name));
@@ -1052,6 +1394,17 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
const bootEnableWarning =
sysInfo.ActiveState === 'active' && sysInfo.UnitFileState === 'disabled';
const paneAlive = !paneInfo.dead;
const readiness = classifyReadiness(
{
paneAlive,
hbHealth: hb.health,
hbStatus: hb.status,
idleSeconds: paneInfo.idleSeconds,
},
readinessThresholds,
);
rows.push({
name: agent.name,
tenant_id,
@@ -1059,11 +1412,12 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
runtime: agent.runtime,
systemdActive: sysInfo.ActiveState,
systemdEnabled: sysInfo.UnitFileState,
paneAlive: !paneInfo.dead,
paneAlive,
panePid: paneInfo.pid,
paneCommand: paneInfo.command,
idleSeconds: paneInfo.idleSeconds,
heartbeat: hb,
readiness,
driftFlag,
bootEnableWarning,
managed: true,
@@ -1110,6 +1464,17 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
const bootEnableWarning =
sysInfo.ActiveState === 'active' && sysInfo.UnitFileState === 'disabled';
const paneAlive = !paneInfo.dead;
const readiness = classifyReadiness(
{
paneAlive,
hbHealth: hb.health,
hbStatus: hb.status,
idleSeconds: paneInfo.idleSeconds,
},
readinessThresholds,
);
rows.push({
name: sessionName,
tenant_id,
@@ -1118,11 +1483,12 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
runtime: 'unknown',
systemdActive: sysInfo.ActiveState,
systemdEnabled: sysInfo.UnitFileState,
paneAlive: !paneInfo.dead,
paneAlive,
panePid: paneInfo.pid,
paneCommand: paneInfo.command,
idleSeconds: paneInfo.idleSeconds,
heartbeat: hb,
readiness,
// No roster runtime to compare — drift is not meaningful for unmanaged sessions
driftFlag: false,
bootEnableWarning,
@@ -1164,7 +1530,7 @@ export function registerFleetCommand(program: Command, deps: FleetCommandDeps =
const idle = row.idleSeconds !== null ? `${row.idleSeconds}s` : '-';
const hbAge =
row.heartbeat.ageMs !== null
? `${Math.round(row.heartbeat.ageMs / 1000)}s/${row.heartbeat.health}`
? `${Math.round(row.heartbeat.ageMs / 1000)}s/${row.readiness}`
: `unknown`;
const model = row.heartbeat.model ?? '-';
const flags: string[] = [];

View File

@@ -8,6 +8,9 @@ import {
readRosterAgentNames,
runFrameworkReseed,
refreshActiveFleetUnits,
readInstalledFrameworkVersion,
readBundledFrameworkVersion,
checkFrameworkDrift,
} from './update-checker.js';
import { existsSync, readFileSync } from 'node:fs';
@@ -123,3 +126,73 @@ describe('refreshActiveFleetUnits', () => {
expect(existsSync(join(configHome, 'systemd', 'user', 'mosaic-agent@.service'))).toBe(false);
});
});
/**
* #642: re-seed when the on-disk framework is older than the bundled one even
* if no package is reported outdated (CLI upgraded outside `mosaic update`).
*/
describe('framework drift detection', () => {
let home: string; // stand-in for ~/.config/mosaic
let fw: string; // stand-in for the bundled framework root
beforeEach(() => {
const root = mkdtempSync(join(tmpdir(), 'mosaic-drift-'));
home = join(root, 'mosaic');
fw = join(root, 'framework');
mkdirSync(home, { recursive: true });
mkdirSync(fw, { recursive: true });
});
afterEach(() => {
rmSync(join(home, '..'), { recursive: true, force: true });
});
const writeInstalled = (v: string) => writeFileSync(join(home, '.framework-version'), v);
const writeBundled = (v: string) =>
writeFileSync(join(fw, 'install.sh'), `#!/usr/bin/env bash\nFRAMEWORK_VERSION=${v}\n`);
describe('readInstalledFrameworkVersion', () => {
it('returns undefined when the version file is absent', () => {
expect(readInstalledFrameworkVersion(home)).toBeUndefined();
});
it('parses the integer (tolerating surrounding whitespace)', () => {
writeInstalled(' 3\n');
expect(readInstalledFrameworkVersion(home)).toBe(3);
});
it('returns undefined for non-numeric content', () => {
writeInstalled('not-a-number\n');
expect(readInstalledFrameworkVersion(home)).toBeUndefined();
});
});
describe('readBundledFrameworkVersion', () => {
it('returns undefined when install.sh is absent', () => {
expect(readBundledFrameworkVersion(fw)).toBeUndefined();
});
it('parses FRAMEWORK_VERSION=<n> from install.sh', () => {
writeBundled('4');
expect(readBundledFrameworkVersion(fw)).toBe(4);
});
});
describe('checkFrameworkDrift', () => {
it('reports drift when on-disk is older than bundled', () => {
writeInstalled('3');
writeBundled('4');
expect(checkFrameworkDrift(home, fw)).toEqual({ drifted: true, installed: 3, bundled: 4 });
});
it('no drift when versions match', () => {
writeInstalled('4');
writeBundled('4');
expect(checkFrameworkDrift(home, fw)).toMatchObject({ drifted: false });
});
it('no drift when on-disk is newer than bundled', () => {
writeInstalled('5');
writeBundled('4');
expect(checkFrameworkDrift(home, fw)).toMatchObject({ drifted: false });
});
it('no drift (conservative) when a version cannot be read', () => {
writeBundled('4'); // installed version file missing
expect(checkFrameworkDrift(home, fw)).toMatchObject({ drifted: false, bundled: 4 });
});
});
});

View File

@@ -521,6 +521,75 @@ export function runFrameworkReseed(
}
}
// ─── Framework drift detection (#642) ────────────────────────────────────────
//
// `mosaic update` only re-seeds the framework when the @mosaicstack/mosaic
// package itself is upgraded *within that command*. When the CLI is upgraded
// some OTHER way — a direct `npm i -g @mosaicstack/mosaic`, or an upgrade run
// where only sibling packages were outdated — the framework files in
// ~/.config/mosaic stay stale and shipped launcher/runtime fixes never
// activate. Comparing the on-disk framework schema version against the version
// bundled in the installed package detects exactly that situation.
/** Read the framework schema version recorded on disk (~/.config/mosaic/.framework-version). */
export function readInstalledFrameworkVersion(
mosaicHome = join(homedir(), '.config', 'mosaic'),
): number | undefined {
const vf = join(mosaicHome, '.framework-version');
if (!existsSync(vf)) return undefined;
try {
const n = parseInt(readFileSync(vf, 'utf-8').trim(), 10);
return Number.isFinite(n) ? n : undefined;
} catch {
return undefined;
}
}
/**
* Read the framework schema version shipped in the installed package by parsing
* `FRAMEWORK_VERSION=<n>` out of the bundled install.sh (the authoritative
* source the installer writes to .framework-version).
*/
export function readBundledFrameworkVersion(
frameworkRoot = resolveBundledFrameworkRoot(),
): number | undefined {
const installer = join(frameworkRoot, 'install.sh');
if (!existsSync(installer)) return undefined;
try {
const m = readFileSync(installer, 'utf-8').match(/^\s*FRAMEWORK_VERSION=(\d+)/m);
const raw = m?.[1];
if (!raw) return undefined;
const n = parseInt(raw, 10);
return Number.isFinite(n) ? n : undefined;
} catch {
return undefined;
}
}
export interface FrameworkDrift {
/** True only when both versions are known AND the on-disk one is older. */
drifted: boolean;
installed?: number;
bundled?: number;
}
/**
* Detect whether the on-disk framework is older than the framework bundled in
* the installed CLI (#642). Conservative: if either version can't be read the
* result is no-drift, so a missing/unreadable version file never triggers an
* unexpected re-seed.
*/
export function checkFrameworkDrift(
mosaicHome = join(homedir(), '.config', 'mosaic'),
frameworkRoot = resolveBundledFrameworkRoot(),
): FrameworkDrift {
const installed = readInstalledFrameworkVersion(mosaicHome);
const bundled = readBundledFrameworkVersion(frameworkRoot);
const drifted =
typeof installed === 'number' && typeof bundled === 'number' && installed < bundled;
return { drifted, installed, bundled };
}
/**
* Best-effort parse of the fleet roster for agent names (used to relaunch
* durable agents after a re-seed). Returns [] when no roster exists.