stack

Author	SHA1	Message	Date
Jarvis	786762587d	fix(fleet): serialize restart-lock transitions to close concurrent-breaker race (review #680) Stale/max-wait takeover was not safe against concurrent breakers: two breakers could both judge the lock stale and both proceed, re-introducing the tight-loop. POSIX/Node has no content- or inode-conditional unlink or rename, so "judge stale, then replace" can never be atomic with pure path ops. Serialize ALL lock transitions (acquire, release, takeover) under one short-lived registry mutex held only across a few fs ops, never across the restart itself. This makes check-then-mutate atomic, so exactly one breaker can take over a stale lock while the others wait and re-evaluate. The mutex itself uses mtime-based staleness (open('wx') creates an empty inode before the token is written; a content check would reap a lock that is still being acquired). The mutex populates-or-cleans-up on write failure so a half-created mutex never leaks. Regression coverage at two widths: a 2-breaker barrier test (exactly one takes over, the other waits) and the existing 3-breaker test (maxActive===1, distinct tokens, final lock released). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 20:26:39 -05:00
Jarvis	43ad813e0d	fix(fleet): make restart-lock release/break ownership-safe (review #680) Addresses the reviewer's blocker (comment 15915): release() unconditionally unlinked restart.lock, so after a stale/max-wait break an OLD owner could delete a NEWER owner's lock, letting a third restart interleave and defeating the guard. - Each acquire writes a unique owner token (randomUUID) into the lock file. - release() only unlinks while that token is still on disk; once another caller has broken and re-owned the lock, the timed-out original owner's release() is a no-op and leaves the new owner's lock intact. - Breaking a stale/timed-out lock now takes ownership atomically via write-temp + rename (atomic replace) instead of a blind unlink-then-recreate; a breaker that loses a concurrent takeover reads back a foreign token and keeps waiting rather than assuming ownership. Regression test (does not let a timed-out owner drop a lock another restart broke and re-owned) reproduces the three-restart interleave: R1 hangs (stale), R2 breaks + re-owns, R1.release() must NOT drop R2's lock. Fails on the old blind-unlink path (ENOENT), passes now. Also adds explicit single-agent restart-path guard coverage (review should-fix). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 19:18:38 -05:00
Jarvis	9c2e4f0b2d	fix(fleet): guard `mosaic fleet restart` against tight-loop re-entry race `mosaic fleet restart` runs as a fresh process each invocation, issuing `systemctl --user restart` for the tmux holder and then each agent. The agent sessions live inside the holder's tmux, so restarting the holder tears them down. With no mutual exclusion, a second restart entering while the first is still mid-teardown (upgrade `--relaunch`, a watchdog, or a hurried operator) interleaves: agents relaunch against a half-torn-down holder, fail, and tight-loop. Add a cross-process teardown-settle guard: a lock file under `<mosaicHome>/fleet/run/restart.lock` acquired with O_CREAT|O_EXCL. A re-entrant restart waits (bounded, injectable sleep) for the in-flight restart to release the lock before relaunching, breaks a stale lock left by a crashed owner, and after a max wait breaks the lock to avoid a permanent deadlock. Both full-fleet and single-agent restart paths are guarded; start/stop/status are unchanged. Regression test reproduces the race: with an in-flight lock held, the restart must wait before issuing any systemctl command — fails on the unguarded code path, passes with the guard. Adds stale-lock-break and lock-release coverage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 19:18:38 -05:00
jason.woltje	d7eaa19380	feat(fleet): provision roster from system-type profile (H3) (#665)	2026-06-24 19:48:54 +00:00
jason.woltje	0d17a29ebe	feat(fleet): export MOSAIC_AGENT_CLASS into agent env (A3a) (#663)	2026-06-24 17:19:59 +00:00
jason.woltje	84d2757817	feat(fleet): update-surviving persona customization (H4) (#661)	2026-06-24 16:21:01 +00:00
jason.woltje	a738ac1410	feat(fleet): system-type profiles (H2) (#660)	2026-06-24 16:02:25 +00:00
jason.woltje	f852250419	feat(fleet): native Mosaic backlog on @mosaicstack/db (atomic claim + TTL) (#657)	2026-06-24 14:55:10 +00:00
jason.woltje	937077f6be	fix(fleet): report idle agents as available, reserve stuck for genuine blocks (#653)	2026-06-24 13:58:22 +00:00
jason.woltje	70661e3fab	fix(fleet): derive pane idle from window activity fallback (#651)	2026-06-24 06:37:45 +00:00
jason.woltje	d887555852	feat(fleet): classify agent readiness in fleet ps (#649)	2026-06-24 05:55:47 +00:00
Jason Woltje	4e84f8e850	feat(fleet): comms-block emitter + FLEET-LAUNCH runbook (#633) (#638) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 22:23:50 +00:00
Jason Woltje	d539d61e0e	refactor(fleet): rename tmux socket mosaic-factory → mosaic-fleet (#630) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 21:08:43 +00:00
Jason Woltje	7342415a32	fix(fleet): consume model_hint + fix socket-default trap (stand-up fixes) (#627) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 19:18:01 +00:00
Jason Woltje	2bf66136e4	feat(fleet): enhancer role + two-agent floor (orchestrator + enhancer) (#615) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 13:15:59 +00:00
Jason Woltje	d46ac40890	fix(fleet): boot-survival symmetry — disable-on-remove + add-enable + init-R5 (#612) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 08:12:58 +00:00
Jason Woltje	59c755067e	feat(fleet): F3-m2 — native Pi heartbeat + model surface + mosaic_mission_status tool (#602) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 01:43:18 +00:00
Jason Woltje	6ffb27787e	fix(fleet): complete HB reader/writer consistency + sidecar hardening (#599) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-22 01:22:35 +00:00
jason.woltje	67df06f1c4	feat(fleet): orchestrator-mutable fleet — fleet add/remove (F5/R9) (#596)	2026-06-21 23:26:21 +00:00
Jason Woltje	60a309d5a4	fix(fleet): heartbeat consistency — MOSAIC_HOME path + configurable interval (#595) Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>	2026-06-21 23:25:53 +00:00
jason.woltje	ca19d57bba	feat(fleet): config-type presets + AI-free init wizard (F1) (#591)	2026-06-21 23:07:41 +00:00
jason.woltje	5bef2c35eb	feat(fleet): fleet ps surfaces unmanaged socket sessions (#586)	2026-06-21 22:37:34 +00:00
jason.woltje	afcbbb302f	feat(fleet): auto-enable units on install + drift recognizes wrapped runtimes (#583)	2026-06-21 20:02:19 +00:00
jason.woltje	af2eede7a9	feat(fleet): Phase-2 observability — fleet ps + watch + send verify (#579)	2026-06-21 04:23:51 +00:00
jason.woltje	7498fcb20d	fix(fleet): preserve agent env overrides on install (#567)	2026-06-20 21:50:46 +00:00
jason.woltje	b5c1381e45	fix(fleet): harden operator sends for release (#565)	2026-06-20 20:41:11 +00:00
jason.woltje	6dfd78f643	feat(fleet): add local canary CLI (#563)	2026-06-20 17:49:01 +00:00

27 Commits
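
The restart-lock commits (9c2e4f0b2d and 43ad813e0d) describe an exclusive-create lock file whose release is conditional on an owner token. A minimal sketch of that idea follows, assuming synchronous Node `fs` calls for brevity. The names `tryAcquire` and `release` and the temp-file path are hypothetical, not taken from the repository, and the registry mutex added in 786762587d is deliberately left out, so the read-then-unlink in `release` is not atomic on its own here.

```typescript
import {
  closeSync, existsSync, mkdtempSync, openSync, readFileSync,
  renameSync, unlinkSync, writeFileSync, writeSync,
} from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: returns an owner token, or null if the lock is held.
function tryAcquire(lockPath: string): string | null {
  const token = randomUUID();
  let fd: number;
  try {
    // The "wx" flag includes O_CREAT | O_EXCL, so this fails with EEXIST
    // when the lock file already exists.
    fd = openSync(lockPath, "wx");
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return null;
    throw err;
  }
  try {
    writeSync(fd, token); // record who owns the lock
  } finally {
    closeSync(fd);
  }
  return token;
}

// Hypothetical helper: unlinks only while our own token is still on disk.
// Returns true if the lock was removed, false if it was a no-op.
function release(lockPath: string, token: string): boolean {
  let onDisk: string;
  try {
    onDisk = readFileSync(lockPath, "utf8");
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") return false;
    throw err;
  }
  if (onDisk !== token) return false; // someone else re-owned the lock
  unlinkSync(lockPath);
  return true;
}

// Demo of the three-restart interleave, in a throwaway temp directory.
const dir = mkdtempSync(join(tmpdir(), "restart-lock-"));
const lockPath = join(dir, "restart.lock");

const r1 = tryAcquire(lockPath);      // R1 takes the lock, then "hangs"
if (r1 === null) throw new Error("unexpected: lock already held");
const blocked = tryAcquire(lockPath); // a second restart cannot acquire

// R2 breaks the stale lock by writing a temp file and renaming it over
// the lock path (atomic replace), as the commit message describes.
const r2 = randomUUID();
const tmpPath = join(dir, "restart.lock.tmp");
writeFileSync(tmpPath, r2);
renameSync(tmpPath, lockPath);

const staleRelease = release(lockPath, r1); // R1's late release is a no-op
const ownerRelease = release(lockPath, r2); // R2 releases its own lock
const lockGone = !existsSync(lockPath);
console.log({ blocked, staleRelease, ownerRelease, lockGone });
```

Comparing the token before unlinking is what keeps a timed-out owner from deleting a newer owner's lock. As 786762587d notes, the comparison and the unlink still need to be serialized against concurrent takeovers, which this sketch does not attempt.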