feat(gateway): tier-detector with fail-fast PG/Valkey/pgvector probes (FED-M1-04)
Implements `apps/gateway/src/bootstrap/tier-detector.ts` invoked from
`main.ts` before NestJS bootstraps. For each tier:
- `local`: no-op (PGlite is in-process)
- `standalone`: probe Postgres + Valkey
- `federated`: probe Postgres + Valkey + pgvector extension; reject
config upfront if `queue.type !== 'bullmq'`
Each probe has a 5-second hard cap and emits a structured
`TierDetectionError` with service / host / port / remediation. The
remediation field discriminates pgvector failure modes ("library not
available" vs "permission denied") so operators get actionable hints
without leaking credentials.
Adds `postgres` and `ioredis` as direct gateway deps; previously only
transitive. 12 unit tests cover happy paths and each fail-fast branch.
Refs #460
This commit is contained in:
@@ -19,8 +19,8 @@ Goal: Gateway runs in `federated` tier with containerized PG+pgvector+Valkey. No
|
||||
| --------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | ------ | ------------------------------- | ---------- | -------- | --------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| FED-M1-01 | done | Extend `mosaic.config.json` schema: add `"federated"` to `tier` enum in validator + TS types. Keep `local` and `standalone` working. Update schema docs/README where referenced. | #460 | sonnet | feat/federation-m1-tier-config | — | 4K | Shipped in PR #470. Renamed `team` → `standalone`; added `team` deprecation alias; added `DEFAULT_FEDERATED_CONFIG`. |
|
||||
| FED-M1-02 | in-progress | Author `docker-compose.federated.yml` as an overlay profile: Postgres 17 + pgvector extension (port 5433), Valkey (6380), named volumes, healthchecks. Compose-up should boot cleanly on a clean machine. | #460 | sonnet | feat/federation-m1-compose | FED-M1-01 | 5K | Bumped PG16→PG17 to match base compose. Overlay defines distinct `postgres-federated`/`valkey-federated` services, profile-gated. |
|
||||
| FED-M1-03 | not-started | Add pgvector support to `packages/storage/src/adapters/postgres.ts`: create extension on init (idempotent), expose vector column type in schema helpers. No adapter changes for non-federated tiers. | #460 | codex | feat/federation-m1-pgvector | FED-M1-02 | 8K | Extension create is idempotent `CREATE EXTENSION IF NOT EXISTS vector`. Gate on tier = federated. |
|
||||
| FED-M1-04 | not-started | Implement `apps/gateway/src/bootstrap/tier-detector.ts`: reads config, asserts PG/Valkey/pgvector reachable for `federated`, fail-fast with actionable error message on failure. Unit tests for each failure mode. | #460 | codex | feat/federation-m1-detector | FED-M1-03 | 8K | Structured error type with remediation hints. Logs which service failed, with host:port attempted. |
|
||||
| FED-M1-03 | done | Add pgvector support to `packages/storage/src/adapters/postgres.ts`: create extension on init (idempotent), expose vector column type in schema helpers. No adapter changes for non-federated tiers. | #460 | sonnet | feat/federation-m1-pgvector | FED-M1-02 | 8K | Shipped in PR #472. `enableVector` flag on postgres StorageConfig; idempotent CREATE EXTENSION before migrations. |
|
||||
| FED-M1-04 | in-progress | Implement `apps/gateway/src/bootstrap/tier-detector.ts`: reads config, asserts PG/Valkey/pgvector reachable for `federated`, fail-fast with actionable error message on failure. Unit tests for each failure mode. | #460 | sonnet | feat/federation-m1-detector | FED-M1-03 | 8K | Worker delivered; reviewer flagged 3 issues (Valkey timeout, pgvector error discrimination, federated/non-bullmq guard) — fixed. |
|
||||
| FED-M1-05 | not-started | Write `scripts/migrate-to-federated.ts`: one-way migration from `local` (PGlite) / `standalone` (PG without pgvector) → `federated`. Dumps, transforms, loads; dry-run + confirm UX. Idempotent on re-run. | #460 | codex | feat/federation-m1-migrate | FED-M1-04 | 10K | Do NOT run automatically. CLI subcommand `mosaic migrate tier --to federated --dry-run`. Safety rails. |
|
||||
| FED-M1-06 | not-started | Update `mosaic doctor`: report current tier, required services, actual health per service, pgvector presence, overall green/yellow/red. Machine-readable JSON output flag for CI use. | #460 | sonnet | feat/federation-m1-doctor | FED-M1-04 | 6K | Existing doctor output evolves; add `--json` flag. Green/yellow/red + remediation suggestions per issue. |
|
||||
| FED-M1-07 | not-started | Integration test: gateway boots in `federated` tier with docker-compose `federated` profile; refuses to boot when PG unreachable (asserts fail-fast); pgvector extension query succeeds. | #460 | sonnet | feat/federation-m1-integration | FED-M1-04 | 8K | Vitest + docker-compose test profile. One test file per assertion; real services, no mocks. |
|
||||
|
||||
@@ -343,3 +343,39 @@ Affected files (storage-tier semantics only — Team/workspace usages unaffected
|
||||
|
||||
- `MVP-T04` (sync `.mosaic/orchestrator/mission.json`) still deferred.
|
||||
- `team` tier rename touches install wizard headless env vars (`MOSAIC_STORAGE_TIER=team`); will need 0.0.x deprecation note in scratchpad if release notes are written this milestone.
|
||||
|
||||
---
|
||||
|
||||
## Session 17 — 2026-04-19 — claude
|
||||
|
||||
**Mode:** Delivery (W1 / FED-M1 execution; resumed after compaction)
|
||||
**Branches landed this run:** `feat/federation-m1-tier-config` (PR #470), `feat/federation-m1-compose` (PR #471), `feat/federation-m1-pgvector` (PR #472)
|
||||
**Branch active at end:** `feat/federation-m1-detector` (FED-M1-04, ready to push)
|
||||
|
||||
**Tasks closed:** FED-M1-01, FED-M1-02, FED-M1-03 (all merged to `main` via squash, CI green, issue #460 still open as milestone).
|
||||
|
||||
**FED-M1-04 — tier-detector:** Worker delivered `apps/gateway/src/bootstrap/tier-detector.ts` (~210 lines) + `tier-detector.spec.ts` (12 tests). Independent code review (sonnet) returned `changes-required` with 3 issues:
|
||||
|
||||
1. CRITICAL: `probeValkey` missing `connectTimeout: 5000` on the ioredis Redis client (defaulted to 10s, violated fail-fast spec).
|
||||
2. IMPORTANT: `probePgvector` catch block did not discriminate "library not installed" (use `pgvector/pgvector:pg17`) from permission errors.
|
||||
3. IMPORTANT: Federated tier silently skipped Valkey probe when `queue.type !== 'bullmq'` (computed Valkey URL conditionally).
|
||||
|
||||
Worker fix-up round addressed all three:
|
||||
|
||||
- L147: `connectTimeout: 5000` added to Redis options
|
||||
- L113-117: catch block branches on `extension "vector" is not available` substring → distinct remediation per failure mode
|
||||
- L206-215: federated branch fails fast with `service: 'config'` if `queue.type !== 'bullmq'`, then probes Valkey unconditionally
|
||||
- 4 new tests (8 → 12 total) cover each fix specifically
|
||||
|
||||
Independent verifier (haiku) confirmed all 6 verification claims (line numbers, test presence, suite green: 12/12 PASS).
|
||||
|
||||
**Process note — review pipeline working as designed:**
|
||||
|
||||
Initial verifier (haiku) on the first delivery returned "OK to ship" but missed the 3 deeper issues that the sonnet code-reviewer caught. This validates the user's "always verify subagent claims independently with another subagent" rule — but specifically with the **right tier** for the task: code review needs sonnet-level reasoning, while haiku is fine for verifying surface claims (line counts, file existence) once review issues are known. Going forward: code review uses sonnet (`feature-dev:code-reviewer`), claim verification uses haiku.
|
||||
|
||||
**Followup tasks tracked but deferred:**
|
||||
|
||||
- #7: `tier=local` hardcoded in gateway-config resume branches (~262, ~317) — pre-existing bug, fix during M1-06 (doctor) or M1-09 (regression).
|
||||
- #8: confirm `packages/config/dist` not git-tracked.
|
||||
|
||||
**Next:** PR for FED-M1-04 → CI wait → merge. Then FED-M1-05 (migration script, codex/sonnet, 10K).
|
||||
|
||||
Reference in New Issue
Block a user