Files
stack/docs/scratchpads/h2-readiness-available.md
jason.woltje 937077f6be
All checks were successful
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/push/ci Pipeline was successful
fix(fleet): report idle agents as available, reserve stuck for genuine blocks (#653)
2026-06-24 13:58:22 +00:00

71 lines
3.3 KiB
Markdown

# H2 — readiness semantics: available, not stuck
## Objective
Correct fleet readiness semantics so a healthy long-idle agent is reported as `available` (good/assignable) instead of `stuck` (fault). Reserve `stuck` in the type/JSON value space for future positive block evidence.
## Scope
- `packages/mosaic/src/commands/fleet.ts`
- replace `idle` readiness state with `available`
- keep `stuck` in the union but stop emitting it from idle-only heuristics
- remove stuck threshold helper/env handling
- remove IDLE/STUCK alarm flags from table rendering
- `packages/mosaic/src/commands/fleet.spec.ts`
- update classifier branch/boundary tests
- assert very long idle maps to `available`, not `stuck`
- update table/JSON assertions for available with no alarm flags
- remove stuck threshold helper tests
## Acceptance Criteria
- `classifyReadiness()` remains pure/total/never-throw and maps:
- dead/stale/unknown unchanged
- busy/null/undefined/non-finite idle to `working`
- idle >= activity threshold to `available`
- idle < activity threshold to `working`
- No idle-derived path emits `stuck`.
- `MOSAIC_HEARTBEAT_IDLE_THRESHOLD` remains backward compatible as the working→available activity threshold.
- `MOSAIC_HEARTBEAT_STUCK_THRESHOLD` and helper/default are removed.
- `fleet ps` keeps the idle-seconds column header `IDLE`, renders `available` in HB label, and does not add IDLE/STUCK warning flags.
- Local gates green: build precheck, typecheck, lint, format:check, fleet vitest.
- PR opened against `main`; no merge by worker.
## Constraints / Assumptions
- Source branch: `origin/main` @ `1020cfa`.
- `docs/TASKS.md` is orchestrator-owned; worker will not modify it.
- Documentation impact is captured in this scratchpad and PR description; no user/admin guide behavior beyond CLI readiness label semantics.
## Plan
1. Install dependencies with requested PNPM environment.
2. Inspect current H1/H1b readiness implementation and tests.
3. Update classifier types/helpers/rendering.
4. Update focused tests.
5. Run build precheck + required gates.
6. Run automated code review, remediate any findings.
7. Queue guard, push, open PR.
## Progress
- 2026-06-24: Branch created from `origin/main` @ `1020cfa`.
- 2026-06-24: Replaced idle-derived `idle`/`stuck` outputs with `available`; retained `stuck` in type union for future positive block evidence.
- 2026-06-24: Removed stuck threshold env/helper plumbing and IDLE/STUCK alarm flags.
- 2026-06-24: Updated classifier and table-render tests for available semantics.
## Verification Evidence
- `pnpm install --store-dir "$HOME/.pnpm-store"` — pass.
- `npx turbo build --filter=@mosaicstack/mosaic^...` — pass, 12/12 tasks successful.
- `pnpm typecheck` — pass, 41/41 tasks successful.
- `pnpm lint` — pass, 23/23 tasks successful.
- `pnpm format:check` — pass, all matched files use Prettier style.
- `pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts` — pass, 177 tests.
- `~/.config/mosaic/tools/codex/codex-code-review.sh --uncommitted` — approve, 0 findings (reviewed supplied diff; sandbox file-inspection limitation noted by tool).
## Risks / Blockers
- No current blocker.
- Review tool could not inspect repo files directly due sandbox wrapper limitation, but it reviewed the supplied diff and approved with no findings.