fix(storage): redact credentials in driver errors + advisory lock (FED-M1-10) #479

Merged
jason.woltje merged 1 commits from feat/federation-m1-security-review into main 2026-04-20 02:02:58 +00:00
Owner

Summary

FED-M1-10 — Independent security review of the federation tier code (migration script + tier detector). Two review rounds surfaced 7 distinct findings; all addressed in this PR.

Security findings (all fixed)

1. Credential leak via driver error messages (HIGH)

postgres and ioredis packages embed full DSN strings (user:password@host) in connection errors. Those messages were being captured raw in:

  • migrate-tier.ts per-table catch block (re-thrown to caller)
  • tier-detection.ts probePostgresMeasured / probePgvectorMeasured / probeValkeyMeasured (surfaced in TierHealthReport.services[].error.message — emitted as JSON by mosaic gateway doctor --json)
  • cli.ts storage status reachability check + migrate-tier outer catch

The doctor JSON path was the most exposed surface: any monitoring pipeline ingesting it would have collected production credentials.

Fix: New internal helper packages/storage/src/redact-error.ts:

const CREDENTIAL_URL_RE = /(postgres(?:ql)?|rediss?):\/\/[^@\s]*@/gi;
export function redactErrMsg(msg: string): string {
  return msg.replace(CREDENTIAL_URL_RE, (_match, scheme: string) => `${scheme.toLowerCase()}://***@`);
}

Covers postgres / postgresql / redis / rediss. Case-insensitive. Idempotent. NOT exported from package public surface. 10 unit tests covering all schemes, multi-URL strings, no-creds variants, and case sensitivity.

2. No advisory lock on migrate-tier (LOW-MEDIUM)

Two concurrent invocations against the same target both passed checkTargetPreconditions (non-atomic) and proceeded, racing on row counts and defeating the non-empty guard.

Fix: PostgresMigrationTarget now exposes tryAcquireAdvisoryLock() and releaseAdvisoryLock() using session-scoped pg_try_advisory_lock(hashtext('mosaic-migrate-tier')). runMigrateTier acquires the lock before preflight, releases in finally. PG releases automatically on session end. Dry-run skips lock acquisition. Non-blocking — fails fast with a clear message if held.

3. SKIP_TABLES rationale (advisory)

Comment expanded to document:

  • WHY sessions, verifications, admin_tokens are skipped (TTL'd / one-time / env-bound)
  • WHY accounts (OAuth tokens) and provider_credentials (AI provider keys) are NOT skipped — they are durable user-bound credentials, not deployment-bound. Operators migrating to a shared/multi-tenant federated tier should review whether to wipe these manually post-migration.

Process notes

  • Two review rounds were necessary. The first round caught Postgres credential leaks; the second round caught the parallel Valkey leak that the worker missed when extending the regex.
  • This validates the "always re-review fixes for security work" rule: a single review-then-fix cycle would have shipped the Valkey leak.

Verification

$ pnpm --filter @mosaicstack/storage test
 Test Files  7 passed | 1 skipped (8)
      Tests  95 passed | 1 skipped (96)

Quality gates: typecheck PASS · lint PASS · format:check PASS

Test plan

  • CI green
  • Manual smoke: trigger a mosaic storage migrate-tier against a closed PG port; confirm error has no DSN

Refs #460

## Summary FED-M1-10 — Independent security review of the federation tier code (migration script + tier detector). Two review rounds surfaced 7 distinct findings; all addressed in this PR. ## Security findings (all fixed) ### 1. Credential leak via driver error messages (HIGH) `postgres` and `ioredis` packages embed full DSN strings (`user:password@host`) in connection errors. Those messages were being captured raw in: - `migrate-tier.ts` per-table catch block (re-thrown to caller) - `tier-detection.ts` probePostgresMeasured / probePgvectorMeasured / probeValkeyMeasured (surfaced in `TierHealthReport.services[].error.message` — emitted as JSON by `mosaic gateway doctor --json`) - `cli.ts` storage status reachability check + migrate-tier outer catch The doctor JSON path was the most exposed surface: any monitoring pipeline ingesting it would have collected production credentials. **Fix**: New internal helper `packages/storage/src/redact-error.ts`: ```ts const CREDENTIAL_URL_RE = /(postgres(?:ql)?|rediss?):\/\/[^@\s]*@/gi; export function redactErrMsg(msg: string): string { return msg.replace(CREDENTIAL_URL_RE, (_match, scheme: string) => `${scheme.toLowerCase()}://***@`); } ``` Covers postgres / postgresql / redis / rediss. Case-insensitive. Idempotent. NOT exported from package public surface. 10 unit tests covering all schemes, multi-URL strings, no-creds variants, and case sensitivity. ### 2. No advisory lock on `migrate-tier` (LOW-MEDIUM) Two concurrent invocations against the same target both passed `checkTargetPreconditions` (non-atomic) and proceeded, racing on row counts and defeating the non-empty guard. **Fix**: `PostgresMigrationTarget` now exposes `tryAcquireAdvisoryLock()` and `releaseAdvisoryLock()` using session-scoped `pg_try_advisory_lock(hashtext('mosaic-migrate-tier'))`. `runMigrateTier` acquires the lock before preflight, releases in `finally`. PG releases automatically on session end. Dry-run skips lock acquisition. Non-blocking — fails fast with a clear message if held. ### 3. SKIP_TABLES rationale (advisory) Comment expanded to document: - WHY `sessions`, `verifications`, `admin_tokens` are skipped (TTL'd / one-time / env-bound) - WHY `accounts` (OAuth tokens) and `provider_credentials` (AI provider keys) are NOT skipped — they are durable user-bound credentials, not deployment-bound. Operators migrating to a shared/multi-tenant federated tier should review whether to wipe these manually post-migration. ## Process notes - Two review rounds were necessary. The first round caught Postgres credential leaks; the second round caught the parallel Valkey leak that the worker missed when extending the regex. - This validates the "always re-review fixes for security work" rule: a single review-then-fix cycle would have shipped the Valkey leak. ## Verification ``` $ pnpm --filter @mosaicstack/storage test Test Files 7 passed | 1 skipped (8) Tests 95 passed | 1 skipped (96) ``` Quality gates: typecheck PASS · lint PASS · format:check PASS ## Test plan - [ ] CI green - [ ] Manual smoke: trigger a `mosaic storage migrate-tier` against a closed PG port; confirm error has no DSN Refs #460
jason.woltje added 1 commit 2026-04-20 02:02:51 +00:00
fix(storage): redact credentials in driver errors + advisory lock for migrate-tier (FED-M1-10)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
a2277c91cb
Independent two-round security review surfaced credential leak vectors and
a concurrency footgun in the federation tier code. All findings addressed:

Credential redaction (HIGH):
- New packages/storage/src/redact-error.ts strips user:password from
  postgres://, postgresql://, redis://, rediss:// URLs (case-insensitive,
  global). Internal — not exported from package index.
- Applied to: migrate-tier inner catch, tier-detection postgres+pgvector+
  valkey probe error fields, cli.ts storage status + migrate-tier outer
  catch. The TierHealthReport JSON emitted by `mosaic gateway doctor --json`
  no longer leaks DSNs to monitoring pipelines.
- 10 unit tests covering both schemes, multi-URL, no-creds, case variants.

Advisory lock for migrate-tier (LOW-MEDIUM):
- PostgresMigrationTarget gains tryAcquireAdvisoryLock /
  releaseAdvisoryLock using session-scoped pg_try_advisory_lock with key
  hashtext('mosaic-migrate-tier'). Non-blocking — fails fast with a clear
  message if another invocation is in progress. Released in finally; PG
  releases automatically on session end. Dry-run skips lock acquisition.

SKIP_TABLES rationale (advisory):
- Comment expanded to document why sessions/verifications/admin_tokens
  are skipped AND why accounts/provider_credentials are intentionally
  migrated (durable user-bound credentials). Operators migrating to a
  shared/multi-tenant federated tier should review whether to wipe these
  manually post-migration.

Tests: 95 storage tests pass + 1 integration test skipped (FEDERATED_INTEGRATION).

Refs #460
jason.woltje merged commit dc4afee848 into main 2026-04-20 02:02:58 +00:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: mosaicstack/stack#479