From 51ce32cc765d3266e51f3ee620e90ad61e3340dc Mon Sep 17 00:00:00 2001
From: Jason Woltje
Date: Sat, 7 Feb 2026 11:15:58 -0600
Subject: [PATCH] docs(#346): Add credential security architecture design document

Comprehensive design document for M7-CredentialSecurity milestone covering
hybrid OpenBao Transit + PostgreSQL encryption approach, threat model,
UserCredential data model, API design, RLS enforcement strategy, turnkey
OpenBao Docker integration, and 5-phase implementation plan.

Co-Authored-By: Claude Sonnet 4.5
---
 docs/design/credential-security.md | 412 +++++++++++++++++++++++++++++
 1 file changed, 412 insertions(+)
 create mode 100644 docs/design/credential-security.md

diff --git a/docs/design/credential-security.md b/docs/design/credential-security.md
new file mode 100644
index 0000000..f512845
--- /dev/null
+++ b/docs/design/credential-security.md
@@ -0,0 +1,412 @@
+# Credential Security Architecture
+
+**Version:** 0.0.1
+**Status:** Approved
+**Author:** Mosaic Stack Team
+**Date:** 2026-02-07
+**Epic:** [#346](https://git.mosaicstack.dev/mosaic/stack/issues/346)
+**Milestone:** M7-CredentialSecurity
+
+## Table of Contents
+
+1. [Problem Statement](#problem-statement)
+2. [Threat Model](#threat-model)
+3. [Architecture Decision](#architecture-decision)
+4. [System Architecture](#system-architecture)
+5. [Data Model](#data-model)
+6. [API Design](#api-design)
+7. [RLS Enforcement](#rls-enforcement)
+8. [OpenBao Integration](#openbao-integration)
+9. [Federation Isolation](#federation-isolation)
+10. [Implementation Phases](#implementation-phases)
+11. [Risk Mitigation](#risk-mitigation)
+
+---
+
+## Problem Statement
+
+Mosaic Stack stores sensitive user credentials with critical security gaps:
+
+1. **OAuth tokens stored plaintext** in the `accounts` table (`access_token`, `refresh_token`,
+   `id_token`)
+2. **LLM API keys stored plaintext** in `llm_provider_instances.config` JSON field
+3. **RLS enabled but never enforced** — all 23 tables have policies but no `FORCE ROW LEVEL
+SECURITY`, and Prisma connects as table owner, silently bypassing all policies
+4. **No RLS on auth tables** — `accounts`, `sessions`, `verifications` have no policies
+5. **No user credential management** — no model, API, or UI for storing user-provided tokens
+6. **Master encryption key on disk** — `ENCRYPTION_KEY` in `.env` file
+
+Users will store API keys, git tokens, and OAuth tokens for integrations. This data is private
+and must never leak between users or across federation boundaries.
+
+## Threat Model
+
+### At-Rest Threats (mitigated by encryption)
+
+| Threat | Impact | Mitigation |
+| --------------------------- | ------------------------------ | ------------------------------------------- |
+| Database backup exposure | All credentials leaked | Column-level encryption via OpenBao Transit |
+| SQL injection | Attacker reads encrypted blobs | Encrypted data useless without Transit key |
+| Database admin access | Full table reads | Encrypted columns, RLS enforcement |
+| Filesystem access to `.env` | Master key compromised | OpenBao Shamir key splitting (production) |
+
+### In-Use Threats (mitigated by access control)
+
+| Threat | Impact | Mitigation |
+| ----------------------- | --------------------------------- | ------------------------------------ |
+| Cross-user data access | User A sees User B's tokens | RLS policies with FORCE enforcement |
+| Federation data leakage | Remote instance gets credentials | Explicit deny-list in QueryService |
+| Application logic bugs | Wrong user gets wrong credential | RLS as defense-in-depth layer |
+| Compromised app server | Memory access to decrypted values | Short-lived plaintext, audit logging |
+
+### Not Mitigated
+
+Full application server compromise with code execution grants access to decrypted credentials
+in memory. This is an accepted risk — no encryption scheme protects against a fully compromised
+application process.
+
+## Architecture Decision
+
+### Approach: Hybrid OpenBao + PostgreSQL Encryption
+
+After evaluating three approaches, the hybrid model was selected:
+
+| Concern | Pure DB (pgcrypto) | Pure Vault | Hybrid (selected) |
+| ----------------------------- | ------------------ | ----------------- | ------------------------------ |
+| Key on disk (turtles problem) | `.env` on disk | Shamir-split | Shamir-split |
+| Audit trail | Custom logging | Built-in | Built-in |
+| New infrastructure | None | OpenBao container | OpenBao container |
+| Per-user isolation | RLS only | Vault policies | RLS + encryption |
+| Turnkey deployment | Yes | Manual unsealing | Auto-unseal via init container |
+| Dynamic secrets | No | Yes | Yes |
+| License cost | Free | Free (OpenBao) | Free |
+
+**Why not pure DB?** The "turtles all the way down" problem — encrypting in the DB still
+requires a master key in an environment variable on disk. If the server is compromised, the
+key is compromised.
+
+**Why not pure Vault?** Operational complexity. Storing all credentials in Vault requires
+significant Vault policy management. PostgreSQL with RLS provides a more natural data model
+for user-scoped credentials.
+
+**Why hybrid?** Best of both worlds — PostgreSQL stores encrypted credentials with RLS
+enforcement, OpenBao handles key management via Transit engine. The master key never exists
+on disk as a single value (Shamir-split in production).
+
+### Why OpenBao (not HashiCorp Vault)?
+
+- Truly open-source (Linux Foundation, OSI license)
+- Drop-in Vault replacement (API-compatible)
+- No Business Source License concerns
+- Production-ready (v2.0)
+- Smaller, focused ecosystem
+
+## System Architecture
+
+```
+                    ┌──────────────────────┐
+                    │   Next.js Frontend   │
+                    │   /settings/creds    │
+                    └──────────┬───────────┘
+                               │ HTTPS
+                    ┌──────────▼───────────┐
+                    │     NestJS API       │
+                    │  CredentialsService  │
+                    │    VaultService      │
+                    └───┬──────────────┬───┘
+                        │              │
+             Ciphertext │              │ Transit API
+              (storage) │              │ (encrypt/decrypt)
+                        │              │
+             ┌──────────▼──┐    ┌──────▼──────────┐
+             │ PostgreSQL  │    │    OpenBao      │
+             │ + RLS       │    │ Transit Engine  │
+             │ + pgcrypto  │    │ + AppRole Auth  │
+             └─────────────┘    │ + Audit Log     │
+                                └─────────────────┘
+```
+
+### Data Flow: Store Credential
+
+1. User submits API key via frontend form
+2. NestJS `CredentialsController` receives plaintext value
+3. `CredentialsService` calls `VaultService.encrypt(value, TransitKey.CREDENTIALS)`
+4. `VaultService` calls OpenBao Transit API: `POST /v1/transit/encrypt/mosaic-credentials`
+5. Transit returns ciphertext: `vault:v1:base64data`
+6. Ciphertext stored in `user_credentials.encrypted_value`
+7. Masked value (`****abcd`) stored in `user_credentials.masked_value`
+8. Activity log entry: `CREDENTIAL_CREATED`
+9. Response includes masked value only — never the ciphertext or plaintext
+
+### Data Flow: Retrieve Credential
+
+1. User clicks "Reveal" on credential card
+2. Frontend calls `GET /api/credentials/:id/value`
+3. RLS-scoped query fetches row (user can only see own rows)
+4. `VaultService.decrypt(ciphertext, TransitKey.CREDENTIALS)`
+5. Transit returns plaintext
+6. `lastUsedAt` updated on credential row
+7. Activity log entry: `CREDENTIAL_ACCESSED`
+8. Plaintext returned to frontend, auto-hidden after 30 seconds
+
+### Fallback: No OpenBao Available
+
+When OpenBao is unavailable (local dev, CI), `VaultService` falls back to the existing
+`CryptoService` (AES-256-GCM with `ENCRYPTION_KEY` from environment).
+
+Ciphertext format distinguishes the source:
+
+- `vault:v1:...` — OpenBao Transit ciphertext
+- `aes:iv:authTag:encrypted` — AES-256-GCM fallback
+- No prefix — legacy plaintext (backward compatible, triggers encryption on next write)
+
+## Data Model
+
+### UserCredential Table
+
+```
+user_credentials
+├── id               UUID (PK)
+├── user_id          UUID (FK -> users)
+├── workspace_id     UUID? (FK -> workspaces, nullable for user-global)
+├── name             VARCHAR -- "GitHub Personal Token"
+├── provider         VARCHAR -- "github", "openai", "custom"
+├── type             CredentialType (API_KEY, OAUTH_TOKEN, ACCESS_TOKEN, SECRET, PASSWORD, CUSTOM)
+├── scope            CredentialScope (USER, WORKSPACE, SYSTEM)
+├── encrypted_value  TEXT -- OpenBao Transit ciphertext
+├── masked_value     VARCHAR? -- "****abcd"
+├── description      TEXT?
+├── expires_at       TIMESTAMPTZ?
+├── last_used_at     TIMESTAMPTZ?
+├── metadata         JSONB -- provider-specific data
+├── is_active        BOOLEAN -- soft delete
+├── rotated_at       TIMESTAMPTZ?
+├── created_at       TIMESTAMPTZ
+└── updated_at       TIMESTAMPTZ
+
+UNIQUE(user_id, workspace_id, provider, name)
+```
+
+### Scope Semantics
+
+| Scope | Who Can Access | Use Case |
+| --------- | ------------------ | ----------------------------- |
+| USER | Owner only | Personal API keys, git tokens |
+| WORKSPACE | Workspace admins | Shared integration tokens |
+| SYSTEM | System admins only | Platform-level secrets |
+
+### Enum Additions
+
+- `EntityType`: add `CREDENTIAL`
+- `ActivityAction`: add `CREDENTIAL_CREATED`, `CREDENTIAL_ACCESSED`, `CREDENTIAL_ROTATED`,
+  `CREDENTIAL_REVOKED`
+
+## API Design
+
+### User Credential Endpoints
+
+```
+POST   /api/credentials              Create credential (encrypt + store)
+GET    /api/credentials              List credentials (masked values only)
+GET    /api/credentials/:id          Get single credential (masked)
+GET    /api/credentials/:id/value    Decrypt and return value (audit logged)
+PATCH  /api/credentials/:id          Update metadata (not value)
+POST   /api/credentials/:id/rotate   Replace with new encrypted value
+DELETE /api/credentials/:id          Soft-delete (isActive=false)
+```
+
+Guards: `AuthGuard` + `WorkspaceGuard` + `PermissionGuard`
+
+### Admin Secret Endpoints
+
+```
+POST   /api/admin/secrets        Create system-level secret
+GET    /api/admin/secrets        List system secrets (masked)
+PATCH  /api/admin/secrets/:id    Update system secret
+DELETE /api/admin/secrets/:id    Revoke system secret
+```
+
+Guards: `AuthGuard` + `AdminGuard`
+
+### Security Invariant
+
+**Listing endpoints never return plaintext or ciphertext.** Only `maskedValue` appears in
+list/get responses. Decryption requires an explicit `GET /value` call, which is always
+audit-logged.
+
+## RLS Enforcement
+
+### Current Problem
+
+All 23 RLS-enabled tables use `ENABLE ROW LEVEL SECURITY` but never `FORCE ROW LEVEL SECURITY`.
+Prisma connects as the database owner role (`mosaic`), which bypasses all RLS policies by default.
+The RLS context utilities in `apps/api/src/lib/db-context.ts` are fully implemented but never
+called by any service.
+
+### Solution
+
+1. **FORCE ROW LEVEL SECURITY** on auth and credential tables
+2. **Owner bypass policy** for migration compatibility
+3. **RLS context interceptor** sets session variables in every authenticated request
+
+```sql
+ALTER TABLE user_credentials FORCE ROW LEVEL SECURITY;
+
+-- Owner bypass for migrations
+CREATE POLICY credentials_owner_bypass ON user_credentials
+  FOR ALL TO mosaic USING (true);
+
+-- User access policy
+CREATE POLICY credentials_user_access ON user_credentials
+  FOR ALL USING (
+    (scope = 'USER' AND user_id = current_user_id())
+    OR (scope = 'WORKSPACE' AND workspace_id IS NOT NULL
+      AND is_workspace_admin(workspace_id, current_user_id()))
+  );
+```
+
+### RLS Context Interceptor
+
+Registered as `APP_INTERCEPTOR`, wraps all authenticated requests:
+
+1. Extracts `userId` from `AuthGuard`
+2. Extracts `workspaceId` from `WorkspaceGuard`
+3. Executes `SET LOCAL app.current_user_id = '{userId}'` in Prisma transaction
+4. Uses `AsyncLocalStorage` to propagate transaction client to services
+
+## OpenBao Integration
+
+### Turnkey Docker Deployment
+
+Two containers added to `docker/docker-compose.yml`:
+
+1. **openbao** — OpenBao server with file storage backend
+2. **openbao-init** — Sidecar that auto-initializes, auto-unseals, and configures Transit
+
+On first `docker compose up -d`:
+
+- OpenBao initializes with 1-of-1 key share (turnkey simplicity)
+- Transit secrets engine enabled
+- Four named encryption keys created
+- AppRole created with Transit-only policy
+- Credentials saved to shared Docker volume
+
+On restart:
+
+- `openbao-init` reads stored unseal key and auto-unseals
+
+### Named Transit Keys
+
+| Key | Purpose |
+| ----------------------- | ------------------------------------------------ |
+| `mosaic-credentials` | User-stored credentials (API keys, git tokens) |
+| `mosaic-account-tokens` | BetterAuth OAuth tokens in accounts table |
+| `mosaic-federation` | Federation private keys (replaces CryptoService) |
+| `mosaic-llm-config` | LLM provider API keys |
+
+### Production Hardening
+
+For production deployments (documented in `docs/OPENBAO.md`):
+
+- Upgrade to 3-of-5 Shamir key splitting: `bao operator rekey -key-shares=5 -key-threshold=3`
+- Enable TLS on listener
+- Use external KMS for auto-unseal (AWS KMS, GCP CKMS, Azure Key Vault)
+- Enable audit logging: `bao audit enable file file_path=/bao/logs/audit.log`
+- Use Raft or Consul storage backend for HA
+- Revoke root token after initial setup
+
+## Federation Isolation
+
+Credentials must never leak across federation boundaries:
+
+1. **RLS enforcement** — Federated queries go through `QueryService` which operates within a
+   specific workspace context. RLS policies restrict to authenticated user.
+2. **Explicit deny-list** — `QueryService` denies queries for `UserCredential` entity type
+3. **Transit key isolation** — Each credential type uses a separate named key. Federation keys
+   (`mosaic-federation`) cannot decrypt user credentials (`mosaic-credentials`).
+4. **Endpoint isolation** — Credential API requires session auth. Federated requests use
+   signature-based auth and cannot access credential endpoints.
+
+## Implementation Phases
+
+### Phase 1: Security Foundations (p0)
+
+Fix immediate security gaps:
+
+| Issue | Title |
+| ----- | ------------------------------------------------------ |
+| #351 | Create RLS context interceptor (fix SEC-API-4) |
+| #350 | Add RLS policies to auth tables with FORCE enforcement |
+| #352 | Encrypt existing plaintext Account tokens |
+
+### Phase 2: OpenBao Integration (p1)
+
+Add OpenBao and VaultService:
+
+| Issue | Title |
+| ----- | ---------------------------------------------------------- |
+| #357 | Add OpenBao to Docker Compose (turnkey setup) |
+| #353 | Create VaultService NestJS module for OpenBao Transit |
+| #354 | Write OpenBao documentation and production hardening guide |
+
+### Phase 3: User Credential Storage (p1)
+
+Build the credential management system:
+
+| Issue | Title |
+| ----- | ---------------------------------------------------- |
+| #355 | Create UserCredential Prisma model with RLS policies |
+| #356 | Build credential CRUD API endpoints |
+
+### Phase 4: Frontend (p1)
+
+User-facing credential management:
+
+| Issue | Title |
+| ----- | ------------------------------------------ |
+| #358 | Build frontend credential management pages |
+
+### Phase 5: Migration and Hardening (p1-p3)
+
+Encrypt remaining plaintext and harden federation:
+
+| Issue | Title |
+| ----- | ----------------------------------------- |
+| #359 | Encrypt LLM provider API keys in database |
+| #360 | Federation credential isolation |
+| #361 | Credential audit log viewer (stretch) |
+
+### Phase Dependencies
+
+```
+Phase 1 (RLS + Token Encryption)
+  └── Phase 2 (OpenBao + VaultService)
+        ├── Phase 3 (Credential Model + API)
+        │     └── Phase 4 (Frontend)
+        └── Phase 5 (LLM Migration + Federation)
+```
+
+## Risk Mitigation
+
+| Risk | Mitigation |
+| ---------------------------------- | -------------------------------------------------------------------- |
+| FORCE RLS breaks Prisma migrations | Owner bypass policy grants full access to `mosaic` role |
+| FORCE RLS breaks BetterAuth writes | Interceptor sets user context; BetterAuth uses same client |
+| OpenBao container fails to start | VaultService falls back to AES-256-GCM; app stays functional |
+| Data migration corrupts tokens | Run in transaction; backup first; format prefix tracking |
+| BetterAuth reads encrypted tokens | Prisma middleware transparently decrypts on read |
+| Transit key rotation | OpenBao handles versioning transparently; old ciphertext stays valid |
+
+## Key Files Reference
+
+| Purpose | Path |
+| ---------------------- | -------------------------------------------------------------------------- |
+| Existing CryptoService | `apps/api/src/federation/crypto.service.ts` |
+| RLS context utilities | `apps/api/src/lib/db-context.ts` |
+| Prisma schema | `apps/api/prisma/schema.prisma` |
+| RLS migration | `apps/api/prisma/migrations/20260129221004_add_rls_policies/migration.sql` |
+| Docker Compose | `docker/docker-compose.yml` |
+| App module | `apps/api/src/app.module.ts` |
+| Auth guards | `apps/api/src/auth/guards/auth.guard.ts` |
+| Workspace guard | `apps/api/src/common/guards/workspace.guard.ts` |
+| Security review | `docs/reports/codebase-review-2026-02-05/01-security-review.md` |
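
Reviewer sketch (not part of the patch): the three stored-value formats listed under "Fallback: No OpenBao Available" and the `****abcd` mask from the store flow could be handled roughly as follows. The names `StoredValueSource`, `classifyStoredValue`, and `maskValue` are hypothetical and do not appear in the patch. Matching `vault:v<N>:` rather than only `vault:v1:` is an assumption, based on Transit bumping the version in the prefix after a key rotation. Returning a bare `****` for very short secrets is also an assumption, made so the mask never reveals the whole value.

```typescript
// Hypothetical helpers illustrating the stored-value formats from the design doc.
// Only the prefixes ("vault:v1:...", "aes:iv:authTag:encrypted", no prefix) and
// the "****abcd" mask shape come from the document; everything else is assumed.

type StoredValueSource = "transit" | "aes-fallback" | "legacy-plaintext";

// Transit ciphertext looks like "vault:v<version>:<base64>".
const TRANSIT_PREFIX = /^vault:v\d+:/;

function classifyStoredValue(stored: string): StoredValueSource {
  if (TRANSIT_PREFIX.test(stored)) {
    return "transit";
  }

  // Fallback format: "aes:iv:authTag:encrypted" -> exactly four non-empty parts.
  const parts = stored.split(":");
  if (parts.length === 4 && parts[0] === "aes" && parts.every((p) => p.length > 0)) {
    return "aes-fallback";
  }

  // Anything else is treated as legacy plaintext, which the document says
  // should be encrypted on the next write.
  return "legacy-plaintext";
}

function maskValue(plaintext: string, visible = 4): string {
  // Assumption: if the secret is not longer than the visible tail,
  // show nothing rather than the entire secret.
  if (plaintext.length <= visible) {
    return "****";
  }
  return "****" + plaintext.slice(-visible);
}
```

A value classified as `legacy-plaintext` would be the trigger for the encrypt-on-next-write behaviour the document describes. One caveat: a legacy plaintext secret that happens to start with `aes:` and contains exactly three colons would be misclassified, so a real migration may want a stricter check (for example, validating the encoding of each segment).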