docs(fleet): session-2 log — heartbeat live + launch-path/env findings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RMoEx7hfdFGjUiCHuN1RRi
@@ -54,3 +54,22 @@ with a second agent on `dragon-lin`.
Jason (8 forks decided). Branched `feat/fleet-observability`. Persisted
`docs/fleet/{north-star.md,PRD.md,TASKS.md}` + this scratchpad. Next: establish comms
with dragon-lin coder, commit docs, begin Phase-2 delivery (heartbeat + `fleet ps`).
- 2026-06-20 (session 2): Built Phase-2 CLI via worker (commit ab47831): `fleet ps`,
`agent watch`, `agent send --verify`, 62 tests. LIVE-verified `fleet ps` on
mosaic-factory — correctly flagged canary-pi DRIFT + BOOT-ENABLE, tenant_id+host in JSON.
Heartbeat responder added to dogfood-agent.py (FLEET-OBS-002) — `fleet ps` HB now
`healthy` for all 4 agents.
- Coordination: dual-engine-reviewed (Claude+Codex) and merged framework PRs #572
(sanitization gate) + #575 (CONSTITUTION extraction) as Lead. Codex caught an Alpine
blocker on #572 (refuted by CI); Claude caught a CI-breaking format failure on #575.
- **FINDINGS (north-star / Phase-3 blockers):**
1. Ad-hoc `mosaic yolo {codex,pi}` via `start-agent-session.sh` DIE immediately in a
detached tmux pane (codex: "stdin is not a terminal"; pi: same). Only the python stub
survives. => Real runtimes have NEVER run durably in the fleet. Launch path (PATH/TTY
in the detached shell) must be fixed before Phase-3 real-runtime swap. `fleet ps`
caught both dead panes instantly (tool validated).
2. `MOSAIC_AGENT_NAME` (set in systemd EnvironmentFile) is NOT propagated into tmux's
global env, so agents defaulted to `unknown`. Worked around in dogfood-agent.py via
tmux session-name fallback; the systemd/tmux env handoff needs a real fix.
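
The session-name fallback in finding 2 could look roughly like the sketch below. It is an illustration only, not the code in dogfood-agent.py: the function name `resolve_agent_name` and its `env`/`run` parameters are hypothetical, and it assumes the tmux session is named after the agent. `tmux display-message -p '#S'` is the standard tmux command for printing the current session name.

```python
import os
import subprocess


def resolve_agent_name(env=None, run=subprocess.run):
    """Hypothetical helper: env var first, then tmux session name, then 'unknown'."""
    env = os.environ if env is None else env

    # Preferred source: the variable set in the systemd EnvironmentFile.
    name = env.get("MOSAIC_AGENT_NAME", "").strip()
    if name:
        return name

    # Fallback: ask tmux for the session name (assumed to match the agent name).
    cmd = ["tmux", "display-message", "-p"]
    pane = env.get("TMUX_PANE")
    if pane:
        # Target our own pane explicitly when tmux exported its id.
        cmd += ["-t", pane]
    cmd.append("#S")
    try:
        result = run(cmd, capture_output=True, text=True, timeout=2, check=False)
    except (OSError, subprocess.SubprocessError):
        # tmux missing or not responding: keep the old default.
        return "unknown"

    session = result.stdout.strip()
    if result.returncode == 0 and session:
        return session
    return "unknown"
```

This only masks the problem inside the agent; the variable is still absent from the tmux global environment for anything else launched in the pane.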
- Next: rebase on merged main, open Phase-2 PR, dual-engine review, merge, close
`fleet-observability-1`. Defer launch-path + env-propagation fixes to Phase 3.
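
Until the launch path is fixed, a small preflight run inside the detached pane could surface the two suspected causes from finding 1 (PATH and TTY) before a real runtime is started. The sketch below is an assumption about how such a check might be written, not an existing part of the fleet tooling; `launch_preflight` and its parameters are hypothetical names.

```python
import shutil
import sys


def launch_preflight(binary, stdin=None, path=None):
    """Hypothetical check: return a list of reasons a runtime would likely die at launch."""
    stdin = sys.stdin if stdin is None else stdin
    problems = []

    # PATH check: a detached shell may not inherit the interactive PATH.
    if shutil.which(binary, path=path) is None:
        problems.append(f"{binary}: not found on PATH")

    # TTY check: mirrors the "stdin is not a terminal" failure seen in the pane.
    try:
        is_tty = stdin.isatty()
    except (AttributeError, ValueError):
        # stdin is None or already closed.
        is_tty = False
    if not is_tty:
        problems.append("stdin is not a terminal")

    return problems


if __name__ == "__main__":
    # Example: check the runtime named on the command line, defaulting to "codex".
    target = sys.argv[1] if len(sys.argv) > 1 else "codex"
    for problem in launch_preflight(target):
        print(f"preflight: {problem}")
```

An empty list means neither condition was detected; it does not prove the runtime will stay up.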