stack/docs/federation/TASKS.md
jason.woltje c56dda74aa
feat(federation): Step-CA sidecar in federated compose [FED-M2-02] (#490)
2026-04-22 02:21:49 +00:00

Tasks — Federation v1

Single-writer: orchestrator only. Workers read but never modify.

Mission: federation-v1-20260419

Schema: | id | status | description | issue | agent | branch | depends_on | estimate | notes |

Status values: not-started | in-progress | done | blocked | failed | needs-qa

Agent values: codex | glm-5.1 | haiku | sonnet | opus | (auto)
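
Because workers read this file but never modify it, a reader-side check of each row against the schema can be useful. The following is a minimal sketch, assuming rows are stored as pipe-delimited Markdown table rows in the column order given by the schema; `TaskRow` and `parseTaskRow` are hypothetical names, not part of the Mosaic tooling.

```typescript
// Status and agent values copied from the schema header of this file.
const STATUSES = ["not-started", "in-progress", "done", "blocked", "failed", "needs-qa"] as const;
const AGENTS = ["codex", "glm-5.1", "haiku", "sonnet", "opus", "(auto)"] as const;

type Status = (typeof STATUSES)[number];
type Agent = (typeof AGENTS)[number];

// Hypothetical row shape; one field per schema column.
interface TaskRow {
  id: string;
  status: Status;
  description: string;
  issue: string;
  agent: Agent;
  branch: string;
  dependsOn: string[];
  estimate: string;
  notes: string;
}

const isStatus = (v: string): v is Status => (STATUSES as readonly string[]).includes(v);
const isAgent = (v: string): v is Agent => (AGENTS as readonly string[]).includes(v);

// Parse one pipe-delimited row; returns null when the row has the wrong
// number of cells or an unknown status/agent value.
function parseTaskRow(line: string): TaskRow | null {
  const cells = line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((c) => c.trim());
  if (cells.length !== 9) return null;
  const [id, status, description, issue, agent, branch, dependsOn, estimate, notes] =
    cells as [string, string, string, string, string, string, string, string, string];
  if (!isStatus(status) || !isAgent(agent)) return null;
  return {
    id,
    status,
    description,
    issue,
    agent,
    branch,
    // An empty depends_on cell becomes an empty list.
    dependsOn: dependsOn.split(",").map((d) => d.trim()).filter((d) => d.length > 0),
    estimate,
    notes,
  };
}
```

A row whose status is outside the listed values is rejected rather than coerced, so a typo by the orchestrator surfaces immediately to any worker reading the file.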

Scope of this file: M1 is fully decomposed below. M2–M7 are placeholders pending each milestone's entry into active planning; the orchestrator expands them one milestone at a time to avoid speculative decomposition of work whose shape will depend on what M1 surfaces.


Milestone 1 — Federated tier infrastructure (FED-M1)

Goal: Gateway runs in federated tier with containerized PG+pgvector+Valkey. No federation logic yet. Existing standalone behavior does not regress.

| id | status | description | issue | agent | branch | depends_on | estimate | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FED-M1-01 | done | Extend mosaic.config.json schema: add "federated" to tier enum in validator + TS types. Keep local and standalone working. Update schema docs/README where referenced. | #460 | sonnet | feat/federation-m1-tier-config | | 4K | Shipped in PR #470. Renamed team → standalone; added team deprecation alias; added DEFAULT_FEDERATED_CONFIG. |
| FED-M1-02 | done | Author docker-compose.federated.yml as an overlay profile: Postgres 17 + pgvector extension (port 5433), Valkey (6380), named volumes, healthchecks. Compose-up should boot cleanly on a clean machine. | #460 | sonnet | feat/federation-m1-compose | FED-M1-01 | 5K | Shipped in PR #471. Overlay defines postgres-federated/valkey-federated, profile-gated, with pg-init for pgvector extension. |
| FED-M1-03 | done | Add pgvector support to packages/storage/src/adapters/postgres.ts: create extension on init (idempotent), expose vector column type in schema helpers. No adapter changes for non-federated tiers. | #460 | sonnet | feat/federation-m1-pgvector | FED-M1-02 | 8K | Shipped in PR #472. enableVector flag on postgres StorageConfig; idempotent CREATE EXTENSION before migrations. |
| FED-M1-04 | done | Implement apps/gateway/src/bootstrap/tier-detector.ts: reads config, asserts PG/Valkey/pgvector reachable for federated, fail-fast with actionable error message on failure. Unit tests for each failure mode. | #460 | sonnet | feat/federation-m1-detector | FED-M1-03 | 8K | Shipped in PR #473. 12 tests; 5s timeouts on probes; pgvector library/permission discrimination; rejects non-bullmq for federated. |
| FED-M1-05 | done | Write scripts/migrate-to-federated.ts: one-way migration from local (PGlite) / standalone (PG without pgvector) → federated. Dumps, transforms, loads; dry-run + confirm UX. Idempotent on re-run. | #460 | sonnet | feat/federation-m1-migrate | FED-M1-04 | 10K | Shipped in PR #474. mosaic storage migrate-tier; DrizzleMigrationSource (corrects P0 found in review); 32 tests; idempotent. |
| FED-M1-06 | done | Update mosaic doctor: report current tier, required services, actual health per service, pgvector presence, overall green/yellow/red. Machine-readable JSON output flag for CI use. | #460 | sonnet | feat/federation-m1-doctor | FED-M1-04 | 6K | Shipped in PR #475 as mosaic gateway doctor. Probes lifted to @mosaicstack/storage; structural TierConfig breaks dep cycle. |
| FED-M1-07 | done | Integration test: gateway boots in federated tier with docker-compose federated profile; refuses to boot when PG unreachable (asserts fail-fast); pgvector extension query succeeds. | #460 | sonnet | feat/federation-m1-integration | FED-M1-04 | 8K | Shipped in PR #476. 3 test files, 4 tests, gated by FEDERATED_INTEGRATION=1; reserved-port helper avoids host collisions. |
| FED-M1-08 | done | Integration test for migration script: seed a local PGlite with representative data (tasks, notes, users, teams), run migration, assert row counts + key samples equal on federated PG. | #460 | sonnet | feat/federation-m1-migrate-test | FED-M1-05 | 6K | Shipped in PR #477. Caught P0 in M1-05 (camelCase→snake_case) missed by mocked unit tests; fix in same PR. |
| FED-M1-09 | done | Standalone regression: full agent-session E2E on existing standalone tier with a gateway built from this branch. Must pass without referencing any federation module. | #460 | sonnet | feat/federation-m1-regression | FED-M1-07 | 4K | Clean canary. 351 gateway tests + 85 storage unit tests + full pnpm test all green; only FEDERATED_INTEGRATION-gated tests skip. |
| FED-M1-10 | done | Code review pass: security-focused on the migration script (data-at-rest during migration) + tier detector (error-message sensitivity leakage). Independent reviewer, not authors of tasks 01-09. | #460 | sonnet | feat/federation-m1-security-review | FED-M1-09 | 8K | 2 review rounds caught 7 issues: credential leak in pg/valkey/pgvector errors + redact-error util; missing advisory lock; SKIP_TABLES rationale. |
| FED-M1-11 | done | Docs update: docs/federation/ operator notes for tier setup; README blurb on federated tier; docs/guides/ entry for migration. Do NOT touch runbook yet (deferred to FED-M7). | #460 | haiku | feat/federation-m1-docs | FED-M1-10 | 4K | Shipped: docs/federation/SETUP.md (119 lines), docs/guides/migrate-tier.md (147 lines), README Configuration blurb. |
| FED-M1-12 | done | PR, CI green, merge to main, close #460. | #460 | sonnet | feat/federation-m1-close | FED-M1-11 | 3K | M1 closed. PRs #470-#480 merged across 11 tasks. Issue #460 closed; release tag fed-v0.1.0-m1 published. |
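
The FED-M1-10 notes mention a credential leak in pg/valkey/pgvector error messages, addressed with a redact-error util. As an illustration of that class of fix, here is a minimal sketch that assumes the leak comes from connection URLs embedded in error text. The name `redactConnectionSecrets` is hypothetical, and this is not the util shipped in the review.

```typescript
// Hypothetical helper: mask the password portion of any URL-style
// connection string (scheme://user:password@host) inside a message.
// Limitation of this sketch: passwords containing "/" or "@" are not matched.
function redactConnectionSecrets(message: string): string {
  return message.replace(
    /\b([a-z][a-z0-9+.-]*:\/\/[^\s:/@]*):([^\s@/]+)@/gi,
    "$1:***@",
  );
}

// Wrap an unknown thrown value into an Error whose message is safe to log.
function toSafeError(err: unknown): Error {
  const raw = err instanceof Error ? err.message : String(err);
  return new Error(redactConnectionSecrets(raw));
}
```

URLs without credentials pass through unchanged, so host and port stay available for the actionable error messages that the tier detector is required to produce.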

M1 total estimate: ~74K tokens (over-budget vs 20K PRD estimate — explanation below)

Why over-budget: The PRD's 20K estimate reflected implementation complexity only. The per-task breakdown lists tests, review, and docs as separate tasks per the delivery cycle, which captures the real cost. The final per-milestone budgets in MISSION-MANIFEST will be updated with actuals after M1 completes.
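
The M1 total can be re-derived from the estimate column of the M1 table. The snippet below is only a worked check of that arithmetic; the variable names are illustrative.

```typescript
// Per-task estimates in K tokens for FED-M1-01 through FED-M1-12,
// copied from the estimate column of the M1 table.
const m1EstimatesK = [4, 5, 8, 8, 10, 6, 8, 6, 4, 8, 4, 3];

// Sum of the per-task estimates.
const m1TotalK = m1EstimatesK.reduce((sum, k) => sum + k, 0);

// Ratio of the per-task total to the PRD's 20K figure.
const prdEstimateK = 20;
const m1OverrunFactor = m1TotalK / prdEstimateK;

console.log(`M1 total: ${m1TotalK}K, ${m1OverrunFactor}x the PRD estimate`);
```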


Pre-M2 — Test deployment infrastructure (FED-M2-DEPLOY)

Goal: Two federated-tier gateways stood up on Portainer at mos-test-1.woltje.com and mos-test-2.woltje.com running the M1 release (gateway:fed-v0.1.0-m1). This is the test bed for M2 enrollment work and the M3 federation E2E harness. No federation logic exercised yet — pure infrastructure validation.

Why now: M2 enrollment requires a real second gateway to test peer-add flows; standing the test hosts up before M2 code lands gives both code and deployment streams a fast feedback loop.

Parallelizable: This workstream runs in parallel with the M2 code workstream (M2-01 → M2-13). They re-converge at M2-10 (E2E test).

Tracking issue: #482.

| id | status | description | issue | agent | branch | depends_on | estimate | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FED-M2-DEPLOY-01 | done | Verify gateway:fed-v0.1.0-m1 image was published by .woodpecker/publish.yml on tag push; if not, investigate and remediate. Document image URI in deployment artifact. | #482 | sonnet | (verified inline, no PR) | | 2K | Tag exists; digest sha256:9b72e202a9eecc27d31920b87b475b9e96e483c0323acc57856be4b1355db1ec captured for digest-pinned deploys. |
| FED-M2-DEPLOY-02 | done | Author Portainer git-stack compose file deploy/portainer/federated-test.stack.yml (gateway + PG-pgvector + Valkey, env-driven). Use immutable tag, not latest. | #482 | sonnet | feat/federation-deploy-stack-template | DEPLOY-01 | 5K | Shipped in PR #485. Digest-pinned. Env: STACK_NAME, HOST_FQDN, POSTGRES_PASSWORD, BETTER_AUTH_SECRET, BETTER_AUTH_URL. |
| FED-M2-DEPLOY-IMG-FIX | in-progress | Gateway image runtime broken (ERR_MODULE_NOT_FOUND for dotenv); Dockerfile copies .pnpm/ store but not apps/gateway/node_modules symlinks. Switch to pnpm deploy for self-contained runtime. | #482 | sonnet | (subagent in flight) | DEPLOY-02 | 4K | Subagent a78a9ab0ddae91fbc in flight. Triggers Kaniko rebuild on merge; capture new digest; bump stack template in follow-up PR before redeploy. |
| FED-M2-DEPLOY-03 | blocked | Deploy stack to mos-test-1.woltje.com via ~/.config/mosaic/tools/portainer/. Verify M1 acceptance: federated-tier boot succeeds; mosaic gateway doctor --json returns green; pgvector vector(3) round-trip works. | #482 | sonnet | feat/federation-deploy-test-1 | IMG-FIX | 3K | Stack created on Portainer endpoint 3 (Swarm local), but blocked on image fix. Container fails on boot until IMG-FIX merges + redeploy. |
| FED-M2-DEPLOY-04 | blocked | Deploy stack to mos-test-2.woltje.com via Portainer wrapper. Same M1 acceptance probes as DEPLOY-03. | #482 | sonnet | feat/federation-deploy-test-2 | IMG-FIX | 3K | Same status as DEPLOY-03. Stack created; blocked on image fix. |
| FED-M2-DEPLOY-05 | not-started | Document deployment in docs/federation/TEST-INFRA.md: hosts, image tags, secrets sourcing, redeploy procedure, teardown. Update MISSION-MANIFEST with deployment status. | #482 | haiku | feat/federation-deploy-docs | DEPLOY-03,04 | 3K | Operator-facing doc; mentions but does not duplicate tools/portainer/README.md. |

Deploy workstream estimate: ~16K tokens for DEPLOY-01 through DEPLOY-05; the DEPLOY-IMG-FIX task adds a further 4K, for ~20K in total.


Milestone 2 — Step-CA + grant schema + admin CLI (FED-M2)

Goal: An admin can create a federation grant; counterparty enrolls; cert is signed by Step-CA with SAN OIDs for grantId + subjectUserId. No runtime federation traffic flows yet (that's M3).
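
The enrollment half of this goal combines two rules from the task notes: grant status moves pending → active → revoked (FED-M2-06), and enrollment tokens are single-use with a 15-minute TTL and a 410 on replay (FED-M2-07). The following is a minimal in-memory sketch of how those rules interact. All names are hypothetical, and several behaviors are assumptions rather than decisions recorded in this file: expired tokens return 410, unknown grants return 404, and non-pending grants return 409.

```typescript
type GrantStatus = "pending" | "active" | "revoked";

// Only the forward transitions named in FED-M2-06 are allowed here.
// Whether pending may go directly to revoked is not stated; this sketch rejects it.
const NEXT_STATUS: Record<GrantStatus, GrantStatus | null> = {
  pending: "active",
  active: "revoked",
  revoked: null,
};

function transitionGrant(current: GrantStatus, next: GrantStatus): GrantStatus {
  if (NEXT_STATUS[current] !== next) {
    throw new Error(`invalid grant transition: ${current} -> ${next}`);
  }
  return next;
}

interface EnrollmentToken {
  grantId: string;
  expiresAt: number; // epoch milliseconds
  used: boolean;
}

const TOKEN_TTL_MS = 15 * 60 * 1000; // 15 minutes, per the FED-M2-07 notes

function issueToken(grantId: string, now: number): EnrollmentToken {
  return { grantId, expiresAt: now + TOKEN_TTL_MS, used: false };
}

// Returns an HTTP-style status code for a redemption attempt.
function redeemToken(
  token: EnrollmentToken,
  grants: Map<string, GrantStatus>,
  now: number,
): number {
  if (token.used || now > token.expiresAt) return 410; // replay, or expired (assumed)
  const current = grants.get(token.grantId);
  if (current === undefined) return 404; // assumed
  if (current !== "pending") return 409; // assumed
  token.used = true; // burn the token before activating the grant
  grants.set(token.grantId, transitionGrant(current, "active"));
  return 200;
}
```

A real implementation would persist the token and the grant update atomically (the notes mention a simple lock in M2); the in-memory map here only illustrates the ordering of the checks.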

| id | status | description | issue | agent | branch | depends_on | estimate | notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FED-M2-01 | needs-qa | DB migration: federation_grants, federation_peers, federation_audit_log tables + enum types (grant_status, peer_state). Drizzle schema + migration generation; migration tests. | #461 | sonnet | feat/federation-m2-schema | | 5K | PR #486 open. First review NEEDS CHANGES (missing DESC indexes + reserved cols). Remediation subagent a673dd9355dc26f82 in flight in worktree agent-a4404ac1. |
| FED-M2-02 | not-started | Add Step-CA sidecar to docker-compose.federated.yml: official smallstep/step-ca image, persistent CA volume, JWK provisioner config baked into init script. | #461 | sonnet | feat/federation-m2-stepca | DEPLOY-02 | 4K | Profile-gated under federated. CA password from secret; dev compose uses dev-only password file. |
| FED-M2-03 | not-started | Scope JSON schema + validator: resources allowlist, excluded_resources, include_teams, include_personal, max_rows_per_query. Vitest unit tests for valid + invalid scopes. | #461 | sonnet | feat/federation-m2-scope-schema | | 4K | Validator independent of CA; reusable from grant CRUD + (later) M3 scope enforcement. |
| FED-M2-04 | not-started | apps/gateway/src/federation/ca.service.ts: Step-CA client (CSR submission, OID-bearing cert retrieval). Mocked + integration tests against real Step-CA container. | #461 | sonnet | feat/federation-m2-ca-service | M2-02 | 6K | SAN OIDs: grantId (custom OID 1.3.6.1.4.1.99999.1) + subjectUserId (1.3.6.1.4.1.99999.2). Document OID assignments in PRD/SETUP. Acceptance: must (a) wire federation.tpl template into mosaic-fed provisioner config and (b) include a unit/integration test asserting issued certs contain BOTH OIDs, as a fails-loud guard against silent OID stripping (carry-forward from M2-02 review). |
| FED-M2-05 | not-started | Sealed storage for client_key_pem reusing existing provider_credentials sealing key. Tests prove DB-at-rest is ciphertext, not PEM. Key rotation path documented (deferred impl). | #461 | sonnet | feat/federation-m2-key-sealing | M2-01 | 5K | Separate from M2-06 to keep crypto seam isolated; reviewer focus is sealing only. |
| FED-M2-06 | not-started | grants.service.ts: CRUD + status transitions (pending → active → revoked); integrates M2-03 (scope) + M2-05 (sealing). Unit tests cover all transitions including invalid ones. | #461 | sonnet | feat/federation-m2-grants-service | M2-03, M2-05 | 6K | Business logic only; CSR + cert work delegated to M2-04. Revocation handler is M6. |
| FED-M2-07 | not-started | enrollment.controller.ts: short-lived single-use token endpoint; CSR signing; updates grant pending → active; emits enrollment audit (table-only write, M4 tightens). | #461 | sonnet | feat/federation-m2-enrollment | M2-04, M2-06 | 6K | Tokens single-use with 410 on replay; tokens TTL'd at 15min; rate-limited at request layer (M4 introduces guard, M2 uses simple lock). |
| FED-M2-08 | not-started | Admin CLI: mosaic federation grant create/list/show + peer add/list. Integration with grants.service (no API duplication). Help output + machine-readable JSON option. | #461 | sonnet | feat/federation-m2-cli | M2-06, M2-07 | 7K | peer add <enrollment-url> is the client-side flow; resolves enrollment URL → CSR → store sealed key + cert. |
| FED-M2-09 | not-started | Integration tests covering MILESTONES.md M2 acceptance tests #1, #2, #3, #5, #7, #8 (single-gateway suite). Real Step-CA container; vitest profile gated by FEDERATED_INTEGRATION=1. | #461 | sonnet | feat/federation-m2-integration | M2-08 | 8K | Tests #4 (cert OID match) + #6 (two-gateway peer-add) handled separately by M2-10 (E2E). |
| FED-M2-10 | not-started | E2E test against deployed mos-test-1 + mos-test-2 (or local two-gateway docker-compose if Portainer not ready): MILESTONES test #6 peer add yields active peer record with valid cert + key. | #461 | sonnet | feat/federation-m2-e2e | M2-08, DEPLOY-04 | 6K | Falls back to local docker-compose-two-gateways if remote test hosts not yet available. Documents both paths. |
| FED-M2-11 | not-started | Independent security review (sonnet, not author of M2-04/05/06/07): focus on single-use token replay, sealing leak surfaces, OID match enforcement, scope schema bypass paths. | #461 | sonnet | feat/federation-m2-security-review | M2-10 | 8K | Apply M1 two-round pattern. Reviewer should explicitly attempt enrollment-token replay, OID-spoofing CSR, and key leak in error messages. |
| FED-M2-12 | not-started | Docs update: docs/federation/SETUP.md Step-CA section; new docs/federation/ADMIN-CLI.md with grant/peer commands; scope schema reference; OID registration note. Runbook still M7-deferred. | #461 | haiku | feat/federation-m2-docs | M2-11 | 4K | Adds CA bootstrap section to SETUP.md with docker compose --profile federated up step-ca example. |
| FED-M2-13 | not-started | PR aggregate close, CI green, merge to main, close #461. Release tag fed-v0.2.0-m2. Mark deploy stream complete. Update mission manifest M2 row. | #461 | sonnet | feat/federation-m2-close | M2-12 | 3K | Same close pattern as M1-12; queue-guard before merge; tea release-create with notes including deploy-stream PRs. |
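
FED-M2-03 names the scope fields but not their types. As a rough illustration of a validator that stays independent of the CA, the sketch below assumes that `resources`, `excluded_resources`, and `include_teams` are string arrays, `include_personal` is a boolean, and `max_rows_per_query` is a positive integer. The allowlist contents and the function name `validateScope` are hypothetical.

```typescript
// Hypothetical allowlist; the real resource names are not given in this file.
const ALLOWED_RESOURCES: readonly string[] = ["tasks", "notes"];

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Returns a list of problems; an empty list means the scope is valid.
function validateScope(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["scope must be an object"];
  }
  const scope = input as Record<string, unknown>;
  const problems: string[] = [];

  const resources = scope["resources"];
  if (!isStringArray(resources) || resources.length === 0) {
    problems.push("resources must be a non-empty string array");
  } else {
    for (const r of resources) {
      if (!ALLOWED_RESOURCES.includes(r)) {
        problems.push(`resource not in allowlist: ${r}`);
      }
    }
  }
  if (!isStringArray(scope["excluded_resources"])) {
    problems.push("excluded_resources must be a string array");
  }
  if (!isStringArray(scope["include_teams"])) {
    problems.push("include_teams must be a string array");
  }
  if (typeof scope["include_personal"] !== "boolean") {
    problems.push("include_personal must be a boolean");
  }
  const maxRows = scope["max_rows_per_query"];
  if (typeof maxRows !== "number" || !Number.isInteger(maxRows) || maxRows <= 0) {
    problems.push("max_rows_per_query must be a positive integer");
  }
  return problems;
}
```

Returning every problem at once, rather than throwing on the first, suits both the grant CRUD path and the CLI's machine-readable JSON output, since a caller can report all invalid fields in one response.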

M2 code workstream estimate: ~72K tokens (vs 30K in MILESTONES.md; same over-budget pattern as M1, where the per-task breakdown including tests, review, and docs captures the real cost).

Deploy + code combined: ~88K tokens (~92K once the 4K DEPLOY-IMG-FIX task is counted).

Milestone 3 — mTLS handshake + list/get + scope enforcement (FED-M3)

Deferred. Issue #462.

Milestone 4 — search + audit + rate limit (FED-M4)

Deferred. Issue #463.

Milestone 5 — cache + offline + OTEL (FED-M5)

Deferred. Issue #464.

Milestone 6 — revocation + auto-renewal + CRL (FED-M6)

Deferred. Issue #465.

Milestone 7 — multi-user hardening + acceptance suite (FED-M7)

Deferred. Issue #466.


Execution Notes

Agent assignment rationale:

  • codex for most implementation tasks (OpenAI credit pool preferred for feature code)
  • sonnet for tests (pattern-based, moderate complexity), doctor work (cross-cutting), and independent code review
  • haiku for docs and the standalone regression canary (cheapest tier for mechanical/verification work)
  • No opus in M1 — save for cross-cutting architecture decisions if they surface later

Branch strategy: Each task gets its own feature branch off main. Tasks within a milestone merge in dependency order. Final aggregate PR (FED-M1-12) isn't a branch of its own — it's the merge of the last upstream task that closes the issue.

Queue guard: Every push and every merge in this mission must run ~/.config/mosaic/tools/git/ci-queue-wait.sh --purpose push|merge per Mosaic hard gate #6.