Commit Graph

175 Commits

Author SHA1 Message Date
Jarvis
786762587d fix(fleet): serialize restart-lock transitions to close concurrent-breaker race (review #680)
All checks were successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Stale/max-wait takeover was not safe against concurrent breakers: two
breakers could both judge the lock stale and both proceed, re-introducing
the tight-loop. POSIX/Node has no content- or inode-conditional unlink or
rename, so "judge stale, then replace" can never be atomic with pure path
ops.

Serialize ALL lock transitions (acquire, release, takeover) under one
short-lived registry mutex held only across a few fs ops, never across the
restart itself. This makes check-then-mutate atomic, so exactly one breaker
can take over a stale lock while the others wait and re-evaluate.

The mutex itself uses mtime-based staleness (open('wx') creates an empty
inode before the token is written; a content check would reap a lock that is
still being acquired). The mutex populates-or-cleans-up on write failure so a
half-created mutex never leaks.

Regression coverage at two widths: a 2-breaker barrier test (exactly one
takes over, the other waits) and the existing 3-breaker test (maxActive===1,
distinct tokens, final lock released).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 20:26:39 -05:00
Jarvis
43ad813e0d fix(fleet): make restart-lock release/break ownership-safe (review #680)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was canceled
Addresses the reviewer's blocker (comment 15915): release() unconditionally
unlinked restart.lock, so after a stale/max-wait break an OLD owner could
delete a NEWER owner's lock, letting a third restart interleave and defeating
the guard.

- Each acquire writes a unique owner token (randomUUID) into the lock file.
- release() only unlinks while that token is still on disk; once another caller
  has broken and re-owned the lock, the timed-out original owner's release() is
  a no-op and leaves the new owner's lock intact.
- Breaking a stale/timed-out lock now takes ownership atomically via
  write-temp + rename (atomic replace) instead of a blind unlink-then-recreate;
  a breaker that loses a concurrent takeover reads back a foreign token and
  keeps waiting rather than assuming ownership.

Regression test (does not let a timed-out owner drop a lock another restart
broke and re-owned) reproduces the three-restart interleave: R1 hangs (stale),
R2 breaks + re-owns, R1.release() must NOT drop R2's lock. Fails on the old
blind-unlink path (ENOENT), passes now. Also adds explicit single-agent
restart-path guard coverage (review should-fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:18:38 -05:00
Jarvis
9c2e4f0b2d fix(fleet): guard mosaic fleet restart against tight-loop re-entry race
`mosaic fleet restart` runs as a fresh process each invocation, issuing
`systemctl --user restart` for the tmux holder and then each agent. The
agent sessions live inside the holder's tmux, so restarting the holder
tears them down. With no mutual exclusion, a second restart entering
while the first is still mid-teardown (upgrade `--relaunch`, a watchdog,
or a hurried operator) interleaves: agents relaunch against a
half-torn-down holder, fail, and tight-loop.

Add a cross-process teardown-settle guard: a lock file under
`<mosaicHome>/fleet/run/restart.lock` acquired with O_CREAT|O_EXCL.
A re-entrant restart waits (bounded, injectable sleep) for the in-flight
restart to release the lock before relaunching, breaks a stale lock left
by a crashed owner, and after a max wait breaks the lock to avoid a
permanent deadlock. Both full-fleet and single-agent restart paths are
guarded; start/stop/status are unchanged.

Regression test reproduces the race: with an in-flight lock held, the
restart must wait before issuing any systemctl command — fails on the
unguarded code path, passes with the guard. Adds stale-lock-break and
lock-release coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 19:18:38 -05:00
79eae2ffce fix(fleet): raise fleet-personas.spec timeout to 30s (mirror #665) (#677)
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
ci/woodpecker/push/publish Pipeline is pending
2026-06-24 22:16:49 +00:00
e0e7be70f5 chore(release): @mosaicstack/mosaic 0.0.48 (#670)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 20:01:17 +00:00
d7eaa19380 feat(fleet): provision roster from system-type profile (H3) (#665)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 19:48:54 +00:00
248193cd3b fix(fleet): export MOSAIC_AGENT_CLASS into the agent pane so personas inject (#669)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 19:38:15 +00:00
a9857c5043 fix(release): republish @mosaicstack/db 0.0.4 with BacklogService; mosaic 0.0.47 (#668)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
2026-06-24 18:54:33 +00:00
e4ede69144 chore(release): mosaic 0.0.46 (persona contracts live) (#666)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 18:29:01 +00:00
0d17a29ebe feat(fleet): export MOSAIC_AGENT_CLASS into agent env (A3a) (#663)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 17:19:59 +00:00
28cfecda94 feat(fleet): inject persona contract at launch (A3b) (#664)
Some checks failed
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline failed
2026-06-24 17:06:51 +00:00
6c84ccd0b1 feat(fleet): dedicated orchestrator persona, split from planner (#662)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 16:42:23 +00:00
84d2757817 feat(fleet): update-surviving persona customization (H4) (#661)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 16:21:01 +00:00
a738ac1410 feat(fleet): system-type profiles (H2) (#660)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 16:02:25 +00:00
538f0556d5 feat(fleet): cross-domain baseline persona library (H1) (#659)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 15:31:56 +00:00
a094c86eea feat(fleet): North Star scope — general-purpose system, personas & system profiles (workstream H) (#658)
Some checks failed
ci/woodpecker/push/publish Pipeline was canceled
ci/woodpecker/push/ci Pipeline was canceled
2026-06-24 15:25:57 +00:00
f852250419 feat(fleet): native Mosaic backlog on @mosaicstack/db (atomic claim + TTL) (#657)
Some checks failed
ci/woodpecker/push/ci-image Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was canceled
2026-06-24 14:55:10 +00:00
61b1bdac2a feat(fleet): add machine-readable NORTH_STAR.yaml + Markdown projection (#656)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 14:40:09 +00:00
cabb179d5a feat(fleet): seed role registry markdown library (#655)
Some checks failed
ci/woodpecker/push/publish Pipeline was canceled
ci/woodpecker/push/ci Pipeline was canceled
2026-06-24 14:39:54 +00:00
eb795bab18 chore(release): mosaic CLI 0.0.45 (#654)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 14:11:33 +00:00
937077f6be fix(fleet): report idle agents as available, reserve stuck for genuine blocks (#653)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 13:58:22 +00:00
1020cfaf9b chore(release): mosaic CLI 0.0.44 (#652)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:49:04 +00:00
70661e3fab fix(fleet): derive pane idle from window activity fallback (#651)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:37:45 +00:00
ec8dd7ca86 chore(release): mosaic CLI 0.0.43 (#650)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 06:08:20 +00:00
d887555852 feat(fleet): classify agent readiness in fleet ps (#649)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:55:47 +00:00
e3adc6a1bc chore(release): mosaic CLI 0.0.42 (#648)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:28:28 +00:00
aa27c42129 fix(fleet): pre-trust claude agent workdir to clear the folder-trust gate (#644) (#645)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:16:46 +00:00
16ae809442 fix(update): re-seed framework on version drift, not just in-command updates (#642) (#646)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-24 05:04:34 +00:00
e6b53ea103 fix(tools): default AGENT_WORK_ROOT to $HOME/mosaic/agent-work (#641)
Some checks failed
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was canceled
2026-06-23 13:40:13 +00:00
4da87640e8 feat(tmux): agent-send.sh --class triage tag for the comms daemon (#552)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-23 03:25:16 +00:00
a38a491403 chore(release): mosaic CLI 0.0.41 (#640)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-23 02:21:04 +00:00
94e5cd7a81 ci: eliminate cold pnpm install via pre-baked CI base image (Phase 1) (#635)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
2026-06-22 22:50:21 +00:00
4e84f8e850 feat(fleet): comms-block emitter + FLEET-LAUNCH runbook (#633) (#638)
Some checks failed
ci/woodpecker/push/ci Pipeline was canceled
ci/woodpecker/push/publish Pipeline was canceled
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 22:23:50 +00:00
bf2a6745c8 fix(install): preserve user fleet data on re-seed + refresh active units (CRITICAL) (#632)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 21:38:09 +00:00
d539d61e0e refactor(fleet): rename tmux socket mosaic-factory → mosaic-fleet (#630)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 21:08:43 +00:00
e2336bb0ca chore(release): mosaic CLI 0.0.40 (#624)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
2026-06-22 19:49:45 +00:00
7342415a32 fix(fleet): consume model_hint + fix socket-default trap (stand-up fixes) (#627)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 19:18:01 +00:00
095e19443b feat(fleet): onboarding-injection — comms cheat-sheet + peer roster per agent (#621)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 17:54:54 +00:00
fabc413407 feat(fleet): F4 Phase 2a — Matrix CS-API connector client + factory (#618)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 16:48:17 +00:00
858d90329d feat(fleet): F4 Phase 1 — chat connector abstraction + Matrix design (#617)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 16:14:32 +00:00
2bf66136e4 feat(fleet): enhancer role + two-agent floor (orchestrator + enhancer) (#615)
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 13:15:59 +00:00
d46ac40890 fix(fleet): boot-survival symmetry — disable-on-remove + add-enable + init-R5 (#612)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 08:12:58 +00:00
8ddd48c843 feat(mosaic): mosaic update re-seeds framework + relaunches agents (R13) (#610)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 03:34:05 +00:00
528700ceea feat(framework): P6 — docs + compliance matrix + resident-budget CI (#607)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was canceled
ci/woodpecker/tag/publish Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 02:20:35 +00:00
32f4215461 chore(release): bump @mosaicstack/mosaic 0.0.38 -> 0.0.39 (#608)
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
ci/woodpecker/push/publish Pipeline is pending
2026-06-22 02:16:12 +00:00
23343bb7f0 feat(mosaic): P5 — overlay composer (compose-contract + *.local overlays) (#605)
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
ci/woodpecker/push/publish Pipeline is pending
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 02:16:05 +00:00
c8b2dab0ca chore(release): bump @mosaicstack/mosaic 0.0.37 -> 0.0.38 (#603)
Some checks failed
ci/woodpecker/push/publish Pipeline was canceled
ci/woodpecker/push/ci Pipeline was canceled
2026-06-22 01:48:27 +00:00
6dbe452a9f fix(fleet): watch viewer-session leak + workdir test settle-race (#601)
Some checks failed
ci/woodpecker/push/publish Pipeline was canceled
ci/woodpecker/push/ci Pipeline was canceled
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 01:43:21 +00:00
59c755067e feat(fleet): F3-m2 — native Pi heartbeat + model surface + mosaic_mission_status tool (#602)
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
ci/woodpecker/push/publish Pipeline is pending
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 01:43:18 +00:00
6ffb27787e fix(fleet): complete HB reader/writer consistency + sidecar hardening (#599)
Some checks failed
ci/woodpecker/push/ci Pipeline was canceled
ci/woodpecker/push/publish Pipeline was canceled
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-06-22 01:22:35 +00:00