Fleet Backlog Conventions
The backlog is Mosaic's native backlog-of-record for fleet work. It is built
end-to-end on Mosaic's own storage layer (@mosaicstack/db, drizzle/Postgres)
and surfaced as mosaic fleet backlog <sub> --json.
Mosaic-native, no Hermes. This backlog REPLACES the former Hermes adapter. There is no runtime dependency on Hermes,
hermes kanban, or ~/.hermes anywhere in this feature. Anything previously delegated to Hermes is recreated here on Mosaic's own Postgres storage layer.
Storage tier — PGlite by default, Postgres by config
The backlog uses the existing Mosaic storage layer; there is no new database engine (no sqlite, no raw client).
| Condition | Tier | Data location |
|---|---|---|
| DATABASE_URL set | Full server Postgres | the configured database |
| PGLITE_DATA_DIR set (no URL) | Embedded PGlite | that directory |
| neither (default) | Embedded PGlite | ~/.config/mosaic/fleet/backlog |
PGlite provides real Postgres semantics in-process, including the row locks the atomic claim relies on, so the same code runs on a laptop (embedded, single-host default) and on a full Postgres deployment. Switching tiers is config-only.
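The tier table can be read as a simple precedence rule. The sketch below is illustrative only: `resolveBacklogTier` and `BacklogTier` are hypothetical names, not part of Mosaic's API, and treating an empty string as "unset" is an assumption.

```typescript
// Hypothetical helper mirroring the tier table; not Mosaic's actual resolver.
interface BacklogTier {
  tier: "postgres" | "pglite";
  location: string;
}

// Default embedded data directory, as documented in the table.
const DEFAULT_PGLITE_DIR = "~/.config/mosaic/fleet/backlog";

function resolveBacklogTier(
  env: Record<string, string | undefined>,
): BacklogTier {
  // DATABASE_URL takes precedence: full server Postgres.
  if (env.DATABASE_URL) {
    return { tier: "postgres", location: env.DATABASE_URL };
  }
  // No URL, but an explicit data directory: embedded PGlite there.
  if (env.PGLITE_DATA_DIR) {
    return { tier: "pglite", location: env.PGLITE_DATA_DIR };
  }
  // Neither variable set: embedded PGlite at the default location.
  return { tier: "pglite", location: DEFAULT_PGLITE_DIR };
}
```

In a real process, the argument would typically be `process.env`.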
The schema (backlog table) is created automatically on first CLI use:
runMigrations() for Postgres, runPgliteMigrations() for embedded PGlite.
Update safety
The embedded PGlite store lives under ~/.config/mosaic/fleet/backlog, which is
listed in PRESERVE_PATHS in packages/mosaic/framework/install.sh. This means
mosaic update (which runs the framework sync with rsync --delete) will not
wipe the operator's backlog — same protection as the roster, per-agent env, and
heartbeat run dir.
Card schema
A card is one row in the backlog table:
| Column | Type | Notes |
|---|---|---|
| id | text (PK) | Stable, caller-supplied id (e.g. A4, fleet-001). |
| title | text | Required. |
| body | text (nullable) | Free-form description. |
| phase | text (nullable) | Board/phase grouping (see below). |
| priority | int (default 0) | Higher = sooner. Claim picks the max-priority ready card. |
| status | enum | ready \| claimed \| blocked \| done. |
| depends_on | jsonb string[] | DAG edges: ids of cards this one depends on. |
| claim_owner | text (nullable) | Owner token of the active claim. |
| claim_ttl_seconds | int (nullable) | TTL of the active claim. |
| claimed_at | timestamptz (null) | When the claim was taken. claimed_at + ttl = expiry. |
| attempts | int (default 0) | Incremented each time the card is claimed. |
| idempotency_key | text (unique, null) | Dedups create; NULLs are distinct in Postgres. |
| acceptance | jsonb (nullable) | Acceptance criteria (array of strings or object). |
| created_at | timestamptz | |
| updated_at | timestamptz | |
depends_on is modeled as a jsonb array column rather than a separate edge
table. Justification: it matches the repo's existing style (e.g. tasks.tags,
agents.skills, routing_rules.conditions are all jsonb arrays), keeps a card
self-contained, and the DAG is small (per-card dependency lists), so a join table
would add ceremony without benefit.
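For readers who think in types, a card row might be modeled roughly as follows. This is a sketch under assumptions: the `BacklogCard` and `NewCardInput` types and the `makeCard` factory are hypothetical, the initial `ready` status is inferred from the lifecycle described below, and an empty `depends_on` default is assumed rather than documented.

```typescript
// Sketch of one backlog row; field names follow the documented columns.
type CardStatus = "ready" | "claimed" | "blocked" | "done";

interface BacklogCard {
  id: string;
  title: string;
  body: string | null;
  phase: string | null;
  priority: number;
  status: CardStatus;
  depends_on: string[]; // DAG edges stored inline, not in a join table
  claim_owner: string | null;
  claim_ttl_seconds: number | null;
  claimed_at: Date | null;
  attempts: number;
  idempotency_key: string | null;
  acceptance: string[] | Record<string, unknown> | null;
  created_at: Date;
  updated_at: Date;
}

// Hypothetical input shape: only id and title are required.
interface NewCardInput {
  id: string;
  title: string;
  body?: string;
  phase?: string;
  priority?: number;
  depends_on?: string[];
}

// Hypothetical factory applying the documented defaults (priority 0, attempts 0).
function makeCard(input: NewCardInput, now: Date = new Date()): BacklogCard {
  return {
    id: input.id,
    title: input.title,
    body: input.body ?? null,
    phase: input.phase ?? null,
    priority: input.priority ?? 0,
    status: "ready", // assumed initial status
    depends_on: input.depends_on ?? [], // assumed default
    claim_owner: null,
    claim_ttl_seconds: null,
    claimed_at: null,
    attempts: 0,
    idempotency_key: null,
    acceptance: null,
    created_at: now,
    updated_at: now,
  };
}
```

Keeping `depends_on` as a plain string array on the card is what makes each card self-contained: a dependency check needs only the card and the statuses of the ids it names.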
Board / phase convention
phase is a free-form grouping string used as the board column / milestone label
(e.g. M1, fleet, infra). list --phase <phase> filters to one board lane.
priority orders cards within the ready pool regardless of phase.
Status lifecycle

            create
               │
               ▼
    ┌──────► ready ───── claim ─────► claimed ───── complete ─────► done
    │          │                         │
    │        block                    reclaim (TTL expiry or --id)
    │          ▼                         │
    │       blocked  └───────────────────┘  (back to ready)
    └──────────┘  (reclaim / re-create can return a card to ready)

- ready: eligible to be claimed once every depends_on card is done.
- claimed: a worker holds it; claim_owner + claimed_at set.
- blocked: explicitly parked; never auto-claimed.
- done: completed; satisfies dependents.
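The lifecycle can also be written down as a transition map. The sketch below encodes only the arrows shown in the diagram; the names `TRANSITIONS` and `nextStatus` are hypothetical, and the real CLI may accept transitions that the diagram does not draw (for example, `complete` is documented as releasing any claim).

```typescript
type CardStatus = "ready" | "claimed" | "blocked" | "done";
type Action = "claim" | "complete" | "block" | "reclaim";

// Transitions read off the lifecycle diagram; an assumption, not the CLI's rule set.
const TRANSITIONS: Record<CardStatus, Partial<Record<Action, CardStatus>>> = {
  ready: { claim: "claimed", block: "blocked" },
  claimed: { complete: "done", reclaim: "ready" },
  blocked: { reclaim: "ready" },
  done: {},
};

// Returns the resulting status, or null when the diagram shows no such arrow.
function nextStatus(from: CardStatus, action: Action): CardStatus | null {
  return TRANSITIONS[from][action] ?? null;
}
```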
Atomic claim (FOR UPDATE SKIP LOCKED) + TTL
claim is atomic. Inside a single transaction it locks candidate ready rows
with SELECT ... FOR UPDATE SKIP LOCKED (via the drizzle sql operator), picks
the highest-priority deps-satisfied card, and flips it to claimed. Because a row
already locked by a concurrent claimer is skipped, two claimers can never
both win the same card — the loser falls through to the next candidate or gets
null. (Proven by the concurrency tests in packages/db/src/backlog.spec.ts.)
- Deps gate: a card is only claimable when every id in depends_on is done.
- TTL: claim --ttl <sec> (default 900s) records claim_ttl_seconds.
- reclaim: releases claims whose claimed_at + ttl is in the past (expired) back to ready, clearing the claim fields. reclaim --id <id> force-releases a specific card regardless of expiry. This is how a crashed worker's card returns to the pool.
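The selection and expiry rules above can be modeled in memory. This is a single-threaded sketch of the rules only: it does not reproduce the transaction or the FOR UPDATE SKIP LOCKED row locking that makes the real claim atomic. The names `Card`, `depsSatisfied`, `claim`, and `reclaimExpired` are hypothetical, a dependency on an unknown id is assumed to block the card, and the tie-break between equal priorities is unspecified in the source.

```typescript
type Status = "ready" | "claimed" | "blocked" | "done";

interface Card {
  id: string;
  priority: number;
  status: Status;
  depends_on: string[];
  claim_owner: string | null;
  claim_ttl_seconds: number | null;
  claimed_at: Date | null;
  attempts: number;
}

// Deps gate: every id in depends_on must name a card that is done.
// Assumption: an id with no matching card counts as unsatisfied.
function depsSatisfied(card: Card, cards: Card[]): boolean {
  return card.depends_on.every((id) =>
    cards.some((c) => c.id === id && c.status === "done"),
  );
}

// Claim the highest-priority ready card whose deps are done, or return null.
function claim(cards: Card[], owner: string, now: Date, ttlSeconds = 900): Card | null {
  const candidates = cards
    .filter((c) => c.status === "ready" && depsSatisfied(c, cards))
    .sort((a, b) => b.priority - a.priority);
  if (candidates.length === 0) return null;
  const card = candidates[0];
  card.status = "claimed";
  card.claim_owner = owner;
  card.claim_ttl_seconds = ttlSeconds;
  card.claimed_at = now;
  card.attempts += 1;
  return card;
}

// Release claims whose claimed_at + ttl is in the past; returns released ids.
function reclaimExpired(cards: Card[], now: Date): string[] {
  const released: string[] = [];
  cards.forEach((card) => {
    if (card.status !== "claimed") return;
    if (card.claimed_at === null || card.claim_ttl_seconds === null) return;
    const expiry = card.claimed_at.getTime() + card.claim_ttl_seconds * 1000;
    if (expiry < now.getTime()) {
      card.status = "ready";
      card.claim_owner = null;
      card.claim_ttl_seconds = null;
      card.claimed_at = null;
      released.push(card.id);
    }
  });
  return released;
}
```

Note that `attempts` is not reset on reclaim, so a card that repeatedly times out accumulates a visible retry count.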
CLI — mosaic fleet backlog <sub> --json
All subcommands support --json.
| Subcommand | Purpose |
|---|---|
| create --id --title [--body --phase --priority --depends-on --acceptance --idempotency-key] | Create a card; idempotency_key dedups (repeat returns the existing card). |
| list [--status --phase --ready-only] | List cards. --ready-only = status ready AND all deps done. |
| claim --owner [--ttl <sec> --id <id>] | Atomically claim the highest-priority ready card (or --id). Returns the card or null. |
| reclaim [--id <id>] | Release expired claims (or a specific card) back to ready. |
| link --from --to | Add a depends_on edge (--from depends on --to). |
| stats | Counts by status, oldest-ready age, expired-claim count. |
| block --id | Set a card to blocked. |
| complete --id | Set a card to done (releases any claim). |
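A worker written in TypeScript might wrap the claim subcommand along these lines. Only the flags come from the table above; `ClaimOptions`, `buildClaimArgs`, and `parseClaimOutput` are hypothetical helpers, and the parser assumes that `--json` prints a single JSON document to stdout, either a card object with an `id` field or `null`.

```typescript
// Hypothetical wrapper helpers; only the flag names are taken from the CLI table.
interface ClaimOptions {
  owner: string;
  ttlSeconds?: number;
  id?: string;
}

// Build the argument vector for `mosaic fleet backlog claim ... --json`.
function buildClaimArgs(opts: ClaimOptions): string[] {
  const args = ["fleet", "backlog", "claim", "--owner", opts.owner];
  if (opts.ttlSeconds !== undefined) args.push("--ttl", String(opts.ttlSeconds));
  if (opts.id !== undefined) args.push("--id", opts.id);
  args.push("--json");
  return args;
}

// Assumption: stdout is one JSON document, a card object or null.
function parseClaimOutput(stdout: string): { id: string } | null {
  const parsed: unknown = JSON.parse(stdout);
  if (parsed === null) return null;
  if (typeof parsed === "object" && typeof (parsed as { id?: unknown }).id === "string") {
    return parsed as { id: string };
  }
  throw new Error("unexpected claim output");
}
```

The resulting array can be handed to Node's `child_process.execFile("mosaic", args, ...)`, which avoids shell quoting issues with owner tokens.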
Example
# Seed two cards, the second depends on the first.
mosaic fleet backlog create --id A1 --title "schema" --priority 5
mosaic fleet backlog create --id A2 --title "service" --depends-on A1 --priority 9
# A2 is gated on A1, so claim returns A1 first.
mosaic fleet backlog claim --owner worker-1 --ttl 600 --json
# Finish A1; now A2 is ready.
mosaic fleet backlog complete --id A1
mosaic fleet backlog list --ready-only --json
# Recover stalled work.
mosaic fleet backlog reclaim --json