From 78841f228a522520ad023d3602fc1871df0c063f Mon Sep 17 00:00:00 2001
From: "jason.woltje"
Date: Mon, 20 Apr 2026 02:07:15 +0000
Subject: [PATCH] docs(federation): operator setup + migration guides (FED-M1-11) (#480)

---
 README.md                   |   2 +
 docs/federation/SETUP.md    | 119 +++++++++++++++++++++++++++++
 docs/federation/TASKS.md    |   2 +-
 docs/guides/migrate-tier.md | 147 ++++++++++++++++++++++++++++++++++++
 4 files changed, 269 insertions(+), 1 deletion(-)
 create mode 100644 docs/federation/SETUP.md
 create mode 100644 docs/guides/migrate-tier.md

diff --git a/README.md b/README.md
index da3654d..2784568 100644
--- a/README.md
+++ b/README.md
@@ -80,6 +80,8 @@ If you already have a gateway account but no token, use `mosaic gateway config r
 
 ### Configuration
 
+Mosaic supports three storage tiers: `local` (PGlite, single-host), `standalone` (PostgreSQL, single-host), and `federated` (PostgreSQL + pgvector + Valkey, multi-host). See [Federated Tier Setup](docs/federation/SETUP.md) for multi-user and production deployments, or [Migrating to Federated](docs/guides/migrate-tier.md) to upgrade from existing tiers.
+
 ```bash
 mosaic config show # Print full config as JSON
 mosaic config get # Read a specific key
diff --git a/docs/federation/SETUP.md b/docs/federation/SETUP.md
new file mode 100644
index 0000000..5b3f486
--- /dev/null
+++ b/docs/federation/SETUP.md
@@ -0,0 +1,119 @@
+# Federated Tier Setup Guide
+
+## What is the federated tier?
+
+The federated tier is designed for multi-user and multi-host deployments. It consists of PostgreSQL 17 with pgvector extension (for embeddings and RAG), Valkey for distributed task queueing and caching, and a shared configuration across multiple Mosaic gateway instances. Use this tier when running Mosaic in production or when scaling beyond a single-host deployment.
+
+## Prerequisites
+
+- Docker and Docker Compose installed
+- Ports 5433 (PostgreSQL) and 6380 (Valkey) available on your host (or adjust environment variables)
+- At least 2 GB free disk space for data volumes
+
+## Start the federated stack
+
+Run the federated overlay:
+
+```bash
+docker compose -f docker-compose.federated.yml --profile federated up -d
+```
+
+This starts PostgreSQL 17 with pgvector and Valkey 8. The pgvector extension is created automatically on first boot.
+
+Verify the services are running:
+
+```bash
+docker compose -f docker-compose.federated.yml ps
+```
+
+Expected output shows `postgres-federated` and `valkey-federated` both healthy.
+
+## Configure mosaic for federated tier
+
+Create or update your `mosaic.config.json`:
+
+```json
+{
+  "tier": "federated",
+  "database": "postgresql://mosaic:mosaic@localhost:5433/mosaic",
+  "queue": "redis://localhost:6380"
+}
+```
+
+If you're using environment variables instead:
+
+```bash
+export DATABASE_URL="postgresql://mosaic:mosaic@localhost:5433/mosaic"
+export REDIS_URL="redis://localhost:6380"
+```
+
+## Verify health
+
+Run the health check:
+
+```bash
+mosaic gateway doctor
+```
+
+Expected output (green):
+
+```
+Tier: federated Config: mosaic.config.json
+  ✓ postgres localhost:5433 (42ms)
+  ✓ valkey localhost:6380 (8ms)
+  ✓ pgvector (embedded) (15ms)
+```
+
+For JSON output (useful in CI/automation):
+
+```bash
+mosaic gateway doctor --json
+```
+
+## Troubleshooting
+
+### Port conflicts
+
+**Symptom:** `bind: address already in use`
+
+**Fix:** Stop the base dev stack first:
+
+```bash
+docker compose down
+docker compose -f docker-compose.federated.yml --profile federated up -d
+```
+
+Or change the host port with an environment variable:
+
+```bash
+PG_FEDERATED_HOST_PORT=5434 VALKEY_FEDERATED_HOST_PORT=6381 \
+  docker compose -f docker-compose.federated.yml --profile federated up -d
+```
+
+### pgvector extension error
+
+**Symptom:** `ERROR: could not open extension control file`
+
+**Fix:** pgvector is created at first boot. Check logs:
+
+```bash
+docker compose -f docker-compose.federated.yml logs postgres-federated | grep -i vector
+```
+
+If missing, exec into the container and create it manually:
+
+```bash
+docker compose -f docker-compose.federated.yml exec postgres-federated psql -U mosaic -d mosaic -c "CREATE EXTENSION vector;"
+```
+
+### Valkey connection refused
+
+**Symptom:** `Error: connect ECONNREFUSED 127.0.0.1:6380`
+
+**Fix:** Check service health:
+
+```bash
+docker compose -f docker-compose.federated.yml logs valkey-federated
+```
+
+If Valkey is running, verify your firewall allows port 6380. On macOS, Docker Desktop may require binding to `host.docker.internal` instead of `localhost`.
diff --git a/docs/federation/TASKS.md b/docs/federation/TASKS.md
index d3afefd..676f0c3 100644
--- a/docs/federation/TASKS.md
+++ b/docs/federation/TASKS.md
@@ -27,7 +27,7 @@ Goal: Gateway runs in `federated` tier with containerized PG+pgvector+Valkey. No
 | FED-M1-08 | done | Integration test for migration script: seed a local PGlite with representative data (tasks, notes, users, teams), run migration, assert row counts + key samples equal on federated PG. | #460 | sonnet | feat/federation-m1-migrate-test | FED-M1-05 | 6K | Shipped in PR #477. Caught P0 in M1-05 (camelCase→snake_case) missed by mocked unit tests; fix in same PR. |
 | FED-M1-09 | done | Standalone regression: full agent-session E2E on existing `standalone` tier with a gateway built from this branch. Must pass without referencing any federation module. | #460 | sonnet | feat/federation-m1-regression | FED-M1-07 | 4K | Clean canary. 351 gateway tests + 85 storage unit tests + full pnpm test all green; only FEDERATED_INTEGRATION-gated tests skip. |
 | FED-M1-10 | done | Code review pass: security-focused on the migration script (data-at-rest during migration) + tier detector (error-message sensitivity leakage). Independent reviewer, not authors of tasks 01-09. | #460 | sonnet | feat/federation-m1-security-review | FED-M1-09 | 8K | 2 review rounds caught 7 issues: credential leak in pg/valkey/pgvector errors + redact-error util; missing advisory lock; SKIP_TABLES rationale. |
-| FED-M1-11 | not-started | Docs update: `docs/federation/` operator notes for tier setup; README blurb on federated tier; `docs/guides/` entry for migration. Do NOT touch runbook yet (deferred to FED-M7). | #460 | haiku | feat/federation-m1-docs | FED-M1-10 | 4K | Short, actionable. Link from MISSION-MANIFEST. No decisions captured here — those belong in PRD. |
+| FED-M1-11 | done | Docs update: `docs/federation/` operator notes for tier setup; README blurb on federated tier; `docs/guides/` entry for migration. Do NOT touch runbook yet (deferred to FED-M7). | #460 | haiku | feat/federation-m1-docs | FED-M1-10 | 4K | Shipped: `docs/federation/SETUP.md` (119 lines), `docs/guides/migrate-tier.md` (147 lines), README Configuration blurb. |
 | FED-M1-12 | not-started | PR, CI green, merge to main, close #460. | #460 | — | (aggregate) | FED-M1-11 | 3K | Queue-guard before push; wait for green; merge squashed; tea `issue-close` #460. |
 
 **M1 total estimate:** ~74K tokens (over-budget vs 20K PRD estimate — explanation below)
diff --git a/docs/guides/migrate-tier.md b/docs/guides/migrate-tier.md
new file mode 100644
index 0000000..c920545
--- /dev/null
+++ b/docs/guides/migrate-tier.md
@@ -0,0 +1,147 @@
+# Migrating to the Federated Tier
+
+Step-by-step guide to migrate from `local` (PGlite) or `standalone` (PostgreSQL without pgvector) to `federated` (PostgreSQL 17 + pgvector + Valkey).
+
+## When to migrate
+
+Migrate to the federated tier when:
+
+- Scaling from single-user to multi-user deployments
+- Adding vector embeddings or RAG features
+- Running Mosaic across multiple hosts
+- Requiring distributed task queueing and caching
+- Moving to production with high availability
+
+## Prerequisites
+
+- Federated stack running and healthy (see [Federated Tier Setup](../federation/SETUP.md))
+- Source database accessible, and an empty target database at the federated URL
+- Backup of source database (recommended before any migration)
+
+## Dry-run first
+
+Always run a dry-run to validate the migration:
+
+```bash
+mosaic storage migrate-tier --to federated \
+  --target-url postgresql://mosaic:mosaic@localhost:5433/mosaic \
+  --dry-run
+```
+
+Expected output (partial example):
+
+```
+[migrate-tier] Analyzing source tier: pglite
+[migrate-tier] Analyzing target tier: federated
+[migrate-tier] Precondition: target is empty ✓
+  users: 5 rows
+  teams: 2 rows
+  conversations: 12 rows
+  messages: 187 rows
+  ... (all tables listed)
+[migrate-tier] NOTE: Source tier has no pgvector support. insights.embedding will be NULL on all migrated rows.
+[migrate-tier] DRY-RUN COMPLETE (no data written). 206 total rows would be migrated.
+```
+
+Review the output. If it shows an error (e.g., target not empty), address it before proceeding.
+
+## Run the migration
+
+When ready, run without `--dry-run`:
+
+```bash
+mosaic storage migrate-tier --to federated \
+  --target-url postgresql://mosaic:mosaic@localhost:5433/mosaic \
+  --yes
+```
+
+The `--yes` flag skips the confirmation prompt (required in non-TTY environments like CI).
+
+The command will:
+
+1. Acquire an advisory lock (blocks concurrent invocations)
+2. Copy data from source to target in dependency order
+3. Report rows migrated per table
+4. Display any warnings (e.g., null vector embeddings)
+
+## What gets migrated
+
+All persistent, user-bound data is migrated in dependency order:
+
+- **users, teams, team_members** — user and team ownership
+- **accounts** — OAuth provider tokens (durable credentials)
+- **projects, agents, missions, tasks** — all project and agent definitions
+- **conversations, messages** — all chat history
+- **preferences, insights, agent_logs** — preferences and observability
+- **provider_credentials** — stored API keys and secrets
+- **tickets, events, skills, routing_rules, appreciations** — auxiliary records
+
+Full order is defined in code (`MIGRATION_ORDER` in `packages/storage/src/migrate-tier.ts`).
+
+## What gets skipped and why
+
+Three tables are intentionally not migrated:
+
+| Table | Reason |
+| ----------------- | ----------------------------------------------------------------------------------------------- |
+| **sessions** | TTL'd auth sessions from the old environment; they will fail JWT verification on the new target |
+| **verifications** | One-time tokens (email verify, password reset) that have either expired or been consumed |
+| **admin_tokens** | Hashed tokens bound to the old environment's secret keys; must be re-issued |
+
+**Note on accounts and provider_credentials:** These durable credentials ARE migrated because they are user-bound and required for resuming agent work on the target environment. After migration to a multi-tenant federated deployment, operators may want to audit or wipe these if users are untrusted or credentials should not be shared.
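As a starting point for the credential audit mentioned in the note above, the following sketch counts the rows that landed in the two credential tables on the target. It assumes `psql` is installed and reuses the connection URL from this guide. The helper names `count_query` and `audit_credential_tables` are illustrative and are not part of the Mosaic CLI.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Mosaic.
TARGET_URL="${TARGET_URL:-postgresql://mosaic:mosaic@localhost:5433/mosaic}"

# Build a row-count query for a single table name.
count_query() {
  printf 'SELECT COUNT(*) FROM %s;' "$1"
}

# Print "<table>: <row count>" for each migrated credential table.
# psql -A (unaligned) and -t (tuples only) leave just the number.
audit_credential_tables() {
  local table
  for table in accounts provider_credentials; do
    printf '%s: %s\n' "$table" \
      "$(psql "$TARGET_URL" -A -t -c "$(count_query "$table")")"
  done
}
```

Source the script and call `audit_credential_tables` once the migration has finished. Whether to keep, rotate, or delete the rows it reports is a policy decision for the operator.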
+
+## Idempotency and concurrency
+
+The migration is **idempotent**:
+
+- Re-running is safe (uses `ON CONFLICT DO UPDATE` internally)
+- Ideal for retries on transient failures
+- Concurrent invocations are blocked by a Postgres advisory lock; the second caller will wait
+
+If a previous run is stuck, check for advisory locks:
+
+```sql
+SELECT * FROM pg_locks WHERE locktype='advisory';
+```
+
+If you need to force-unlock (dangerous; `pg_advisory_unlock` only releases a lock held by the session that calls it):
+
+```sql
+SELECT pg_advisory_unlock(<lock_key>);
+```
+
+## Verify the migration
+
+After migration completes, spot-check the target:
+
+```bash
+# Count rows on a few critical tables
+psql postgresql://mosaic:mosaic@localhost:5433/mosaic -c \
+  "SELECT 'users' as table, COUNT(*) FROM users UNION ALL
+   SELECT 'conversations' as table, COUNT(*) FROM conversations UNION ALL
+   SELECT 'messages' as table, COUNT(*) FROM messages;"
+```
+
+Verify that a known user exists, for example by email:
+
+```bash
+psql postgresql://mosaic:mosaic@localhost:5433/mosaic -c \
+  "SELECT id, email FROM users WHERE email='<user-email>';"
+```
+
+Confirm that vector embeddings are NULL (if the source was PGlite) or populated (if the source was PostgreSQL + pgvector):
+
+```bash
+psql postgresql://mosaic:mosaic@localhost:5433/mosaic -c \
+  "SELECT embedding IS NOT NULL as has_vector FROM insights LIMIT 5;"
+```
+
+## Rollback
+
+There is no in-place rollback. If the migration fails:
+
+1. Restore the target database from a pre-migration backup
+2. Investigate the failure logs
+3. Rerun the migration
+
+Always test migrations in a staging environment first.
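The restore in step 1 only works if a backup was taken before the migration started. A minimal sketch of that backup and restore, assuming a PostgreSQL target, the standard `pg_dump` and `pg_restore` client tools, and the connection URL used in this guide. The function names and the dump file naming are illustrative and are not something Mosaic provides.

```shell
#!/usr/bin/env bash
# Sketch only: function and file names are hypothetical, not part of Mosaic.
TARGET_URL="${TARGET_URL:-postgresql://mosaic:mosaic@localhost:5433/mosaic}"

# Build a dump file path from a directory and a timestamp label.
backup_path() {
  printf '%s/mosaic-target-%s.dump' "$1" "$2"
}

# Dump the target in pg_dump's custom format, which pg_restore can read.
backup_target() {
  pg_dump --format=custom --file="$1" "$TARGET_URL"
}

# Restore a dump, dropping existing objects before recreating them.
restore_target() {
  pg_restore --clean --if-exists --no-owner --dbname="$TARGET_URL" "$1"
}
```

One way to use it: run `backup_target "$(backup_path /var/backups "$(date +%Y%m%d-%H%M%S)")"` before migrating, and call `restore_target` with that same file if the run fails. A `local` (PGlite) source is not reachable with `pg_dump` this way, so it would need a separate, file-level backup.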