stack/docs/scratchpads/h2-readiness-available.md
Jarvis f91bbeea48
fix(fleet): report idle agents as available
2026-06-24 08:48:23 -05:00

H2 — readiness semantics: available, not stuck

Objective

Correct fleet readiness semantics so a healthy long-idle agent is reported as available (good/assignable) instead of stuck (fault). Reserve stuck in the type/JSON value space for future positive block evidence.

Scope

  • packages/mosaic/src/commands/fleet.ts
    • replace idle readiness state with available
    • keep stuck in the union but stop emitting it from idle-only heuristics
    • remove stuck threshold helper/env handling
    • remove IDLE/STUCK alarm flags from table rendering
  • packages/mosaic/src/commands/fleet.spec.ts
    • update classifier branch/boundary tests
    • assert very long idle maps to available, not stuck
    • update table/JSON assertions for available with no alarm flags
    • remove stuck threshold helper tests

Acceptance Criteria

  • classifyReadiness() remains pure/total/never-throw and maps:
    • dead/stale/unknown unchanged
    • busy/null/undefined/non-finite idle to working
    • idle >= activity threshold to available
    • idle < activity threshold to working
  • No idle-derived path emits stuck.
  • MOSAIC_HEARTBEAT_IDLE_THRESHOLD remains backward compatible as the working→available activity threshold.
  • MOSAIC_HEARTBEAT_STUCK_THRESHOLD and helper/default are removed.
  • fleet ps keeps the idle-seconds column header IDLE, renders available in HB label, and does not add IDLE/STUCK warning flags.
  • Local gates green: build precheck, typecheck, lint, format:check, fleet vitest.
  • PR opened against main; no merge by worker.
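
The classifier mapping in the criteria above can be illustrated with a minimal sketch. It is not the code in fleet.ts: the `AgentSample` shape, its field names (`liveness`, `busy`, `idleSeconds`), and the two-argument signature are assumptions made for illustration, and the upstream dead/stale/unknown decision is simply passed through.

```typescript
// Hypothetical types: the real union and input shape in fleet.ts may differ.
type Readiness = 'dead' | 'stale' | 'unknown' | 'working' | 'available' | 'stuck';

interface AgentSample {
  // Liveness verdict assumed to be decided before idle is considered.
  liveness: 'dead' | 'stale' | 'unknown' | 'alive';
  busy?: boolean | null;
  idleSeconds?: number | null;
}

// Pure and total: every input maps to a Readiness value and nothing throws.
function classifyReadiness(sample: AgentSample, activityThresholdSeconds: number): Readiness {
  // dead/stale/unknown pass through unchanged.
  if (sample.liveness !== 'alive') return sample.liveness;

  // A busy agent is working regardless of its idle reading.
  if (sample.busy === true) return 'working';

  // null, undefined, NaN, and +/-Infinity idle values all fall back to working.
  const idle = sample.idleSeconds;
  if (typeof idle !== 'number' || !Number.isFinite(idle)) return 'working';

  // Idle at or past the activity threshold means available; 'stuck' is never
  // produced here because no positive block evidence exists on this path.
  return idle >= activityThresholdSeconds ? 'available' : 'working';
}
```

In this sketch `stuck` stays in the union so that a future evidence-based path can emit it without a type change, while no idle-derived branch returns it.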

Constraints / Assumptions

  • Source branch: origin/main @ 1020cfa.
  • docs/TASKS.md is orchestrator-owned; worker will not modify it.
  • Documentation impact is captured in this scratchpad and the PR description; no user/admin guide changes are needed beyond the CLI readiness label semantics.

Plan

  1. Install dependencies with the requested pnpm environment.
  2. Inspect current H1/H1b readiness implementation and tests.
  3. Update classifier types/helpers/rendering.
  4. Update focused tests.
  5. Run build precheck + required gates.
  6. Run automated code review, remediate any findings.
  7. Queue guard, push, open PR.

Progress

  • 2026-06-24: Branch created from origin/main @ 1020cfa.
  • 2026-06-24: Replaced idle-derived idle/stuck outputs with available; retained stuck in type union for future positive block evidence.
  • 2026-06-24: Removed stuck threshold env/helper plumbing and IDLE/STUCK alarm flags.
  • 2026-06-24: Updated classifier and table-render tests for available semantics.
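
With the stuck threshold plumbing removed, only one environment variable still influences readiness. A minimal sketch of how that remaining threshold could be resolved follows. The helper name `resolveIdleThresholdSeconds`, the 300-second default, the unit (seconds), and the fallback rules are all assumptions for illustration; this scratchpad does not state the real default or the parsing behaviour.

```typescript
// Hypothetical default: the actual default in fleet.ts is not stated here.
const ASSUMED_DEFAULT_IDLE_THRESHOLD_SECONDS = 300;

// The environment is passed in explicitly so the helper stays pure and easy to test.
function resolveIdleThresholdSeconds(
  env: Record<string, string | undefined>,
  fallback: number = ASSUMED_DEFAULT_IDLE_THRESHOLD_SECONDS,
): number {
  // MOSAIC_HEARTBEAT_IDLE_THRESHOLD keeps its name but now marks the
  // working -> available boundary.
  const raw = env['MOSAIC_HEARTBEAT_IDLE_THRESHOLD'];
  if (raw === undefined || raw.trim() === '') return fallback;

  // Reject non-numeric, non-finite, and negative values instead of throwing.
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
}

// MOSAIC_HEARTBEAT_STUCK_THRESHOLD is intentionally never read.
```

A caller would pass `process.env` in a Node.js context; a stale `MOSAIC_HEARTBEAT_STUCK_THRESHOLD` left in someone's shell profile has no effect in this sketch.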

Verification Evidence

  • pnpm install --store-dir "$HOME/.pnpm-store" — pass.
  • npx turbo build --filter=@mosaicstack/mosaic^... — pass, 12/12 tasks successful.
  • pnpm typecheck — pass, 41/41 tasks successful.
  • pnpm lint — pass, 23/23 tasks successful.
  • pnpm format:check — pass, all matched files use Prettier style.
  • pnpm --filter @mosaicstack/mosaic exec vitest run src/commands/fleet.spec.ts — pass, 177 tests.
  • ~/.config/mosaic/tools/codex/codex-code-review.sh --uncommitted — approve, 0 findings (reviewed supplied diff; sandbox file-inspection limitation noted by tool).

Risks / Blockers

  • No current blocker.
  • The review tool could not inspect repo files directly due to a sandbox wrapper limitation, but it reviewed the supplied diff and approved with no findings.