docs(#346): Add credential security architecture design document
Comprehensive design document for M7-CredentialSecurity milestone covering hybrid OpenBao Transit + PostgreSQL encryption approach, threat model, UserCredential data model, API design, RLS enforcement strategy, turnkey OpenBao Docker integration, and 5-phase implementation plan.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
412
docs/design/credential-security.md
Normal file
@@ -0,0 +1,412 @@
# Credential Security Architecture

**Version:** 0.0.1
**Status:** Approved
**Author:** Mosaic Stack Team
**Date:** 2026-02-07
**Epic:** [#346](https://git.mosaicstack.dev/mosaic/stack/issues/346)
**Milestone:** M7-CredentialSecurity

## Table of Contents

1. [Problem Statement](#problem-statement)
2. [Threat Model](#threat-model)
3. [Architecture Decision](#architecture-decision)
4. [System Architecture](#system-architecture)
5. [Data Model](#data-model)
6. [API Design](#api-design)
7. [RLS Enforcement](#rls-enforcement)
8. [OpenBao Integration](#openbao-integration)
9. [Federation Isolation](#federation-isolation)
10. [Implementation Phases](#implementation-phases)
11. [Risk Mitigation](#risk-mitigation)
12. [Key Files Reference](#key-files-reference)

---

## Problem Statement

Mosaic Stack stores sensitive user credentials with critical security gaps:

1. **OAuth tokens stored in plaintext** in the `accounts` table (`access_token`, `refresh_token`,
   `id_token`)
2. **LLM API keys stored in plaintext** in the `llm_provider_instances.config` JSON field
3. **RLS enabled but never enforced**: all 23 tables have policies but no
   `FORCE ROW LEVEL SECURITY`, and Prisma connects as the table owner, silently bypassing all
   policies
4. **No RLS on auth tables**: `accounts`, `sessions`, and `verifications` have no policies
5. **No user credential management**: no model, API, or UI for storing user-provided tokens
6. **Master encryption key on disk**: `ENCRYPTION_KEY` lives in the `.env` file

Users will store API keys, git tokens, and OAuth tokens for integrations. This data is private
and must never leak between users or across federation boundaries.

## Threat Model

### At-Rest Threats (mitigated by encryption)

| Threat                      | Impact                         | Mitigation                                  |
| --------------------------- | ------------------------------ | ------------------------------------------- |
| Database backup exposure    | All credentials leaked         | Column-level encryption via OpenBao Transit |
| SQL injection               | Attacker reads encrypted blobs | Encrypted data useless without Transit key  |
| Database admin access       | Full table reads               | Encrypted columns, RLS enforcement          |
| Filesystem access to `.env` | Master key compromised         | OpenBao Shamir key splitting (production)   |

### In-Use Threats (mitigated by access control)

| Threat                  | Impact                            | Mitigation                           |
| ----------------------- | --------------------------------- | ------------------------------------ |
| Cross-user data access  | User A sees User B's tokens       | RLS policies with FORCE enforcement  |
| Federation data leakage | Remote instance gets credentials  | Explicit deny-list in QueryService   |
| Application logic bugs  | Wrong user gets wrong credential  | RLS as defense-in-depth layer        |
| Compromised app server  | Memory access to decrypted values | Short-lived plaintext, audit logging |

### Not Mitigated

Full application server compromise with code execution grants access to decrypted credentials
in memory. This is an accepted risk: no encryption scheme protects against a fully compromised
application process.

## Architecture Decision

### Approach: Hybrid OpenBao + PostgreSQL Encryption

After evaluating three approaches, the hybrid model was selected:

| Concern                       | Pure DB (pgcrypto) | Pure Vault        | Hybrid (selected)              |
| ----------------------------- | ------------------ | ----------------- | ------------------------------ |
| Key on disk (turtles problem) | `.env` on disk     | Shamir-split      | Shamir-split                   |
| Audit trail                   | Custom logging     | Built-in          | Built-in                       |
| New infrastructure            | None               | OpenBao container | OpenBao container              |
| Per-user isolation            | RLS only           | Vault policies    | RLS + encryption               |
| Turnkey deployment            | Yes                | Manual unsealing  | Auto-unseal via init container |
| Dynamic secrets               | No                 | Yes               | Yes                            |
| License cost                  | Free               | Free (OpenBao)    | Free                           |

**Why not pure DB?** The "turtles all the way down" problem: encrypting in the DB still
requires a master key in an environment variable on disk. If the server is compromised, the
key is compromised.

**Why not pure Vault?** Operational complexity. Storing all credentials in Vault requires
significant Vault policy management. PostgreSQL with RLS provides a more natural data model
for user-scoped credentials.

**Why hybrid?** It combines the strengths of both: PostgreSQL stores encrypted credentials with
RLS enforcement, and OpenBao handles key management via the Transit engine. The master key never
exists on disk as a single value (Shamir-split in production).

### Why OpenBao (not HashiCorp Vault)?

- Truly open-source (Linux Foundation project, OSI-approved license)
- Drop-in Vault replacement (API-compatible)
- No Business Source License concerns
- Production-ready (v2.0)
- Smaller, focused ecosystem

## System Architecture

```
        ┌──────────────────────┐
        │   Next.js Frontend   │
        │   /settings/creds    │
        └──────────┬───────────┘
                   │ HTTPS
        ┌──────────▼───────────┐
        │      NestJS API      │
        │  CredentialsService  │
        │     VaultService     │
        └───┬──────────────┬───┘
            │              │
 Ciphertext │              │ Transit API
  (storage) │              │ (encrypt/decrypt)
            │              │
 ┌──────────▼──┐    ┌──────▼──────────┐
 │ PostgreSQL  │    │     OpenBao     │
 │ + RLS       │    │ Transit Engine  │
 │ + pgcrypto  │    │ + AppRole Auth  │
 └─────────────┘    │ + Audit Log     │
                    └─────────────────┘
```

### Data Flow: Store Credential

1. User submits API key via frontend form
2. NestJS `CredentialsController` receives plaintext value
3. `CredentialsService` calls `VaultService.encrypt(value, TransitKey.CREDENTIALS)`
4. `VaultService` calls OpenBao Transit API: `POST /v1/transit/encrypt/mosaic-credentials`
5. Transit returns ciphertext: `vault:v1:base64data`
6. Ciphertext stored in `user_credentials.encrypted_value`
7. Masked value (`****abcd`) stored in `user_credentials.masked_value`
8. Activity log entry: `CREDENTIAL_CREATED`
9. Response includes the masked value only, never the ciphertext or plaintext
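
Steps 3 to 7 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `VaultService`: `maskValue`, `buildTransitEncryptRequest`, and `parseTransitEncryptResponse` are hypothetical helper names, and the four-character suffix is inferred from the `****abcd` example. The request and response shapes follow the Vault-compatible Transit API, which expects the plaintext to be base64-encoded.

```typescript
// Hypothetical helpers for steps 3-7; names are assumptions, not project code.

/** Keep only the last four characters visible, as in the "****abcd" example. */
function maskValue(plaintext: string): string {
  // Very short secrets are fully hidden so the mask never reveals the whole value.
  const visible = plaintext.length > 4 ? plaintext.slice(-4) : "";
  return `****${visible}`;
}

/** Shape of a Transit encrypt call: the API expects base64-encoded plaintext. */
interface TransitEncryptRequest {
  path: string;
  body: { plaintext: string };
}

function buildTransitEncryptRequest(keyName: string, plaintext: string): TransitEncryptRequest {
  return {
    path: `/v1/transit/encrypt/${encodeURIComponent(keyName)}`,
    body: { plaintext: Buffer.from(plaintext, "utf8").toString("base64") },
  };
}

/** Extract the ciphertext from a Transit response and sanity-check its prefix. */
function parseTransitEncryptResponse(response: { data: { ciphertext: string } }): string {
  const { ciphertext } = response.data;
  if (!ciphertext.startsWith("vault:v")) {
    throw new Error("Unexpected Transit ciphertext format");
  }
  return ciphertext;
}
```

Only the output of `maskValue` would ever be returned to the client; the ciphertext goes to `encrypted_value` and the plaintext is discarded after the Transit call.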

### Data Flow: Retrieve Credential

1. User clicks "Reveal" on credential card
2. Frontend calls `GET /api/credentials/:id/value`
3. RLS-scoped query fetches row (user can only see own rows)
4. `VaultService.decrypt(ciphertext, TransitKey.CREDENTIALS)`
5. Transit returns plaintext
6. `lastUsedAt` updated on credential row
7. Activity log entry: `CREDENTIAL_ACCESSED`
8. Plaintext returned to frontend, auto-hidden after 30 seconds
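
The retrieve flow might look like the sketch below. It uses in-memory arrays in place of PostgreSQL and the activity log, and takes the decrypt call as an injected function; `revealCredential`, `CredentialRow`, `ActivityEntry`, and `Decrypt` are hypothetical names introduced for illustration only.

```typescript
// Hypothetical in-memory sketch of the retrieve flow; not the project's service code.

interface CredentialRow {
  id: string;
  userId: string;
  encryptedValue: string;
  lastUsedAt: Date | null;
}

interface ActivityEntry {
  action: "CREDENTIAL_ACCESSED";
  credentialId: string;
  userId: string;
}

// Stand-in for VaultService.decrypt; the real call goes to OpenBao Transit.
type Decrypt = (ciphertext: string) => Promise<string>;

async function revealCredential(
  rows: CredentialRow[],
  activityLog: ActivityEntry[],
  decrypt: Decrypt,
  userId: string,
  credentialId: string,
): Promise<string> {
  // Step 3: stand-in for the RLS-scoped query (only the caller's own rows are visible).
  const row = rows.find((r) => r.id === credentialId && r.userId === userId);
  if (!row) {
    throw new Error("Credential not found");
  }
  // Steps 4-5: decrypt through the injected Transit client.
  const plaintext = await decrypt(row.encryptedValue);
  // Step 6: record the access time.
  row.lastUsedAt = new Date();
  // Step 7: audit every successful reveal.
  activityLog.push({ action: "CREDENTIAL_ACCESSED", credentialId, userId });
  // Step 8: the controller returns this value; hiding it after 30 seconds is a frontend concern.
  return plaintext;
}
```

A row owned by another user is reported as "not found" rather than "forbidden", which mirrors what an RLS-filtered query would return.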

### Fallback: No OpenBao Available

When OpenBao is unavailable (local dev, CI), `VaultService` falls back to the existing
`CryptoService` (AES-256-GCM with `ENCRYPTION_KEY` from the environment).

The ciphertext format distinguishes the source:

- `vault:v1:...`: OpenBao Transit ciphertext
- `aes:iv:authTag:encrypted`: AES-256-GCM fallback
- No prefix: legacy plaintext (backward compatible, triggers encryption on next write)
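
The prefix scheme above could be implemented along these lines. This is a sketch, not the existing `CryptoService`: the function names are hypothetical, and the hex encoding of each segment and the 12-byte IV are assumptions, since the document only specifies the `aes:iv:authTag:encrypted` layout.

```typescript
import { createCipheriv, createDecipheriv } from "node:crypto";
import { randomBytes } from "node:crypto";

type CiphertextSource = "openbao" | "aes-fallback" | "legacy-plaintext";

/** Classify a stored value by its prefix. */
function detectCiphertextSource(stored: string): CiphertextSource {
  if (/^vault:v\d+:/.test(stored)) return "openbao";
  if (stored.startsWith("aes:") && stored.split(":").length === 4) return "aes-fallback";
  return "legacy-plaintext";
}

/** AES-256-GCM fallback; segments are hex-encoded (an assumption). */
function encryptFallback(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the size recommended for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const authTag = cipher.getAuthTag();
  return ["aes", iv.toString("hex"), authTag.toString("hex"), encrypted.toString("hex")].join(":");
}

function decryptFallback(stored: string, key: Buffer): string {
  const [prefix, ivHex, tagHex, dataHex] = stored.split(":");
  if (prefix !== "aes" || !ivHex || !tagHex || dataHex === undefined) {
    throw new Error("Not an AES fallback ciphertext");
  }
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(tagHex, "hex"));
  // final() throws if the auth tag does not match, so tampering is detected here.
  const decrypted = Buffer.concat([decipher.update(Buffer.from(dataHex, "hex")), decipher.final()]);
  return decrypted.toString("utf8");
}
```

One caveat of prefix-based detection: a legacy plaintext value that happens to start with `vault:v1:` or `aes:` would be misclassified, so a migration may want to check for that case explicitly.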

## Data Model

### UserCredential Table

```
user_credentials
├── id               UUID (PK)
├── user_id          UUID (FK -> users)
├── workspace_id     UUID? (FK -> workspaces, nullable for user-global)
├── name             VARCHAR          -- "GitHub Personal Token"
├── provider         VARCHAR          -- "github", "openai", "custom"
├── type             CredentialType   (API_KEY, OAUTH_TOKEN, ACCESS_TOKEN, SECRET, PASSWORD, CUSTOM)
├── scope            CredentialScope  (USER, WORKSPACE, SYSTEM)
├── encrypted_value  TEXT             -- OpenBao Transit ciphertext
├── masked_value     VARCHAR?         -- "****abcd"
├── description      TEXT?
├── expires_at       TIMESTAMPTZ?
├── last_used_at     TIMESTAMPTZ?
├── metadata         JSONB            -- provider-specific data
├── is_active        BOOLEAN          -- soft delete
├── rotated_at       TIMESTAMPTZ?
├── created_at       TIMESTAMPTZ
└── updated_at       TIMESTAMPTZ

UNIQUE(user_id, workspace_id, provider, name)
```

### Scope Semantics

| Scope     | Who Can Access     | Use Case                      |
| --------- | ------------------ | ----------------------------- |
| USER      | Owner only         | Personal API keys, git tokens |
| WORKSPACE | Workspace admins   | Shared integration tokens     |
| SYSTEM    | System admins only | Platform-level secrets        |
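
The scope table can be read as a single access predicate. The sketch below is an application-level illustration only; `Requester`, `ScopedCredential`, and `canAccess` are hypothetical names, and in the design itself this rule is enforced by RLS policies rather than by application code.

```typescript
// Hypothetical access predicate mirroring the scope table; names are assumptions.

type CredentialScope = "USER" | "WORKSPACE" | "SYSTEM";

interface Requester {
  userId: string;
  adminOfWorkspaceIds: string[];
  isSystemAdmin: boolean;
}

interface ScopedCredential {
  scope: CredentialScope;
  userId: string;
  workspaceId: string | null;
}

function canAccess(requester: Requester, credential: ScopedCredential): boolean {
  switch (credential.scope) {
    case "USER":
      // Owner only.
      return credential.userId === requester.userId;
    case "WORKSPACE":
      // Workspace admins of the credential's workspace.
      return (
        credential.workspaceId !== null &&
        requester.adminOfWorkspaceIds.includes(credential.workspaceId)
      );
    case "SYSTEM":
      // System admins only.
      return requester.isSystemAdmin;
  }
}
```

Keeping a predicate like this next to the RLS policy makes it easier to test that both layers agree on who may read a row.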

### Enum Additions

- `EntityType`: add `CREDENTIAL`
- `ActivityAction`: add `CREDENTIAL_CREATED`, `CREDENTIAL_ACCESSED`, `CREDENTIAL_ROTATED`,
  `CREDENTIAL_REVOKED`

## API Design

### User Credential Endpoints

```
POST   /api/credentials              Create credential (encrypt + store)
GET    /api/credentials              List credentials (masked values only)
GET    /api/credentials/:id          Get single credential (masked)
GET    /api/credentials/:id/value    Decrypt and return value (audit logged)
PATCH  /api/credentials/:id          Update metadata (not value)
POST   /api/credentials/:id/rotate   Replace with new encrypted value
DELETE /api/credentials/:id          Soft-delete (isActive=false)
```

Guards: `AuthGuard` + `WorkspaceGuard` + `PermissionGuard`

### Admin Secret Endpoints

```
POST   /api/admin/secrets            Create system-level secret
GET    /api/admin/secrets            List system secrets (masked)
PATCH  /api/admin/secrets/:id        Update system secret
DELETE /api/admin/secrets/:id        Revoke system secret
```

Guards: `AuthGuard` + `AdminGuard`

### Security Invariant

**List and get endpoints never return plaintext or ciphertext.** Only `maskedValue` appears in
their responses. Decryption requires an explicit `GET /api/credentials/:id/value` call, which is
always audit-logged.
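
One way to make the invariant hard to break is to build responses from an explicit allow-list of fields instead of spreading the database row. The sketch below assumes hypothetical `StoredCredential` and `CredentialResponse` shapes and a `toCredentialResponse` mapper; none of these names come from the codebase.

```typescript
// Hypothetical response mapper; field names follow the data model, type names are assumptions.

interface StoredCredential {
  id: string;
  name: string;
  provider: string;
  encryptedValue: string;
  maskedValue: string | null;
  isActive: boolean;
}

interface CredentialResponse {
  id: string;
  name: string;
  provider: string;
  maskedValue: string | null;
}

function toCredentialResponse(row: StoredCredential): CredentialResponse {
  // Copy fields one by one: a column added to the table later cannot leak by accident.
  return {
    id: row.id,
    name: row.name,
    provider: row.provider,
    maskedValue: row.maskedValue,
  };
}
```

Because `encryptedValue` is absent from `CredentialResponse`, returning it from a list or get handler would also fail type checking.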

## RLS Enforcement

### Current Problem

All 23 RLS-enabled tables use `ENABLE ROW LEVEL SECURITY` but never `FORCE ROW LEVEL SECURITY`.
Prisma connects as the database owner role (`mosaic`), which bypasses all RLS policies by default.
The RLS context utilities in `apps/api/src/lib/db-context.ts` are fully implemented but never
called by any service.

### Solution

1. **FORCE ROW LEVEL SECURITY** on auth and credential tables
2. **Owner bypass policy** for migration compatibility
3. **RLS context interceptor** sets session variables in every authenticated request

```sql
ALTER TABLE user_credentials FORCE ROW LEVEL SECURITY;

-- Owner bypass for migrations
CREATE POLICY credentials_owner_bypass ON user_credentials
  FOR ALL TO mosaic USING (true);

-- User access policy
CREATE POLICY credentials_user_access ON user_credentials
  FOR ALL USING (
    (scope = 'USER' AND user_id = current_user_id())
    OR (scope = 'WORKSPACE' AND workspace_id IS NOT NULL
        AND is_workspace_admin(workspace_id, current_user_id()))
  );
```

PostgreSQL combines permissive policies with `OR`, so `credentials_owner_bypass` also matches
every row for application queries for as long as the API connects as `mosaic`. The user access
policy only restricts results once the application uses a separate, non-owner role or the bypass
is limited to migration sessions.

### RLS Context Interceptor

Registered as `APP_INTERCEPTOR`, it wraps all authenticated requests:

1. Extracts `userId` from `AuthGuard`
2. Extracts `workspaceId` from `WorkspaceGuard`
3. Executes `SET LOCAL app.current_user_id = '{userId}'` in a Prisma transaction
4. Uses `AsyncLocalStorage` to propagate the transaction client to services
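
The propagation in steps 3 and 4 can be sketched without NestJS or Prisma. Everything here is an assumption for illustration: `TransactionClient` is a hypothetical minimal interface standing in for the Prisma transaction client, `runWithRlsContext` and `currentTx` are invented names, and `app.current_workspace_id` is a guessed setting name. The sketch uses `set_config(name, value, true)`, PostgreSQL's function form of `SET LOCAL`, because it accepts a bound parameter instead of an interpolated string.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for a transaction-scoped database client.
interface TransactionClient {
  execute(sql: string, params: string[]): Promise<void>;
}

interface RlsContext {
  userId: string;
  workspaceId: string | undefined;
  tx: TransactionClient;
}

const rlsStorage = new AsyncLocalStorage<RlsContext>();

/** Set the session variables, then run the request handler inside the context. */
async function runWithRlsContext<T>(
  tx: TransactionClient,
  userId: string,
  workspaceId: string | undefined,
  handler: () => Promise<T>,
): Promise<T> {
  // The third argument `true` makes the setting transaction-local, like SET LOCAL.
  await tx.execute("SELECT set_config('app.current_user_id', $1, true)", [userId]);
  if (workspaceId !== undefined) {
    // Setting name is an assumption; the document only names app.current_user_id.
    await tx.execute("SELECT set_config('app.current_workspace_id', $1, true)", [workspaceId]);
  }
  return rlsStorage.run({ userId, workspaceId, tx }, handler);
}

/** Services call this instead of using a global client. */
function currentTx(): TransactionClient {
  const ctx = rlsStorage.getStore();
  if (!ctx) {
    throw new Error("No RLS context: called outside an authenticated request");
  }
  return ctx.tx;
}
```

Failing loudly in `currentTx` when no context exists avoids silently running a query on a connection that has no user context set.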

## OpenBao Integration

### Turnkey Docker Deployment

Two containers added to `docker/docker-compose.yml`:

1. **openbao**: OpenBao server with file storage backend
2. **openbao-init**: Sidecar that auto-initializes, auto-unseals, and configures Transit

On first `docker compose up -d`:

- OpenBao initializes with 1-of-1 key share (turnkey simplicity)
- Transit secrets engine enabled
- Four named encryption keys created
- AppRole created with Transit-only policy
- Credentials saved to shared Docker volume

On restart:

- `openbao-init` reads stored unseal key and auto-unseals

### Named Transit Keys

| Key                     | Purpose                                          |
| ----------------------- | ------------------------------------------------ |
| `mosaic-credentials`    | User-stored credentials (API keys, git tokens)   |
| `mosaic-account-tokens` | BetterAuth OAuth tokens in accounts table        |
| `mosaic-federation`     | Federation private keys (replaces CryptoService) |
| `mosaic-llm-config`     | LLM provider API keys                            |
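
The data flows above refer to `TransitKey.CREDENTIALS`; a typed mapping from such identifiers to the four key names could look like the sketch below. Only `CREDENTIALS` appears in this document, so the other three member names and the `transitPath` helper are assumptions.

```typescript
// Hypothetical key registry; only CREDENTIALS is named in the design, the rest are guesses.
const TransitKey = {
  CREDENTIALS: "mosaic-credentials",
  ACCOUNT_TOKENS: "mosaic-account-tokens",
  FEDERATION: "mosaic-federation",
  LLM_CONFIG: "mosaic-llm-config",
} as const;

// Union of the four key name strings.
type TransitKeyName = (typeof TransitKey)[keyof typeof TransitKey];

/** Build the Transit API path for an operation on a named key. */
function transitPath(operation: "encrypt" | "decrypt", key: TransitKeyName): string {
  return `/v1/transit/${operation}/${key}`;
}
```

Restricting `key` to `TransitKeyName` means a typo in a key name is caught at compile time rather than surfacing as a failed Transit request.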

### Production Hardening

For production deployments (documented in `docs/OPENBAO.md`):

- Upgrade to 3-of-5 Shamir key splitting: start with
  `bao operator rekey -init -key-shares=5 -key-threshold=3`, then submit the current unseal key
  to complete the rekey
- Enable TLS on the listener
- Use an external KMS for auto-unseal (AWS KMS, GCP Cloud KMS, Azure Key Vault)
- Enable audit logging: `bao audit enable file file_path=/bao/logs/audit.log`
- Use the Raft or Consul storage backend for HA
- Revoke the root token after initial setup

## Federation Isolation

Credentials must never leak across federation boundaries:

1. **RLS enforcement**: Federated queries go through `QueryService`, which operates within a
   specific workspace context. RLS policies restrict access to the authenticated user.
2. **Explicit deny-list**: `QueryService` denies queries for the `UserCredential` entity type.
3. **Transit key isolation**: Each credential type uses a separate named key. Federation keys
   (`mosaic-federation`) cannot decrypt user credentials (`mosaic-credentials`).
4. **Endpoint isolation**: The credential API requires session auth. Federated requests use
   signature-based auth and cannot access credential endpoints.
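
The deny-list in item 2 might be as small as the check below. This is a sketch only: the constant and function names are hypothetical, and the case-insensitive comparison is an added assumption meant to avoid trivial bypasses such as `usercredential`.

```typescript
// Hypothetical deny-list check for federated queries; names are assumptions.
const FEDERATION_DENIED_ENTITY_TYPES: ReadonlySet<string> = new Set(["usercredential"]);

function isFederationQueryAllowed(entityType: string): boolean {
  // Normalize before comparing so casing or stray whitespace cannot bypass the list.
  return !FEDERATION_DENIED_ENTITY_TYPES.has(entityType.trim().toLowerCase());
}

function assertFederationQueryAllowed(entityType: string): void {
  if (!isFederationQueryAllowed(entityType)) {
    throw new Error(`Entity type "${entityType}" is not available over federation`);
  }
}
```

Calling `assertFederationQueryAllowed` at the top of the federated query path keeps the rule in one place, independent of RLS and Transit key isolation.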

## Implementation Phases

### Phase 1: Security Foundations (p0)

Fix immediate security gaps:

| Issue | Title                                                  |
| ----- | ------------------------------------------------------ |
| #351  | Create RLS context interceptor (fix SEC-API-4)         |
| #350  | Add RLS policies to auth tables with FORCE enforcement |
| #352  | Encrypt existing plaintext Account tokens              |

### Phase 2: OpenBao Integration (p1)

Add OpenBao and VaultService:

| Issue | Title                                                      |
| ----- | ---------------------------------------------------------- |
| #357  | Add OpenBao to Docker Compose (turnkey setup)              |
| #353  | Create VaultService NestJS module for OpenBao Transit      |
| #354  | Write OpenBao documentation and production hardening guide |

### Phase 3: User Credential Storage (p1)

Build the credential management system:

| Issue | Title                                                |
| ----- | ---------------------------------------------------- |
| #355  | Create UserCredential Prisma model with RLS policies |
| #356  | Build credential CRUD API endpoints                  |

### Phase 4: Frontend (p1)

User-facing credential management:

| Issue | Title                                      |
| ----- | ------------------------------------------ |
| #358  | Build frontend credential management pages |

### Phase 5: Migration and Hardening (p1-p3)

Encrypt remaining plaintext and harden federation:

| Issue | Title                                     |
| ----- | ----------------------------------------- |
| #359  | Encrypt LLM provider API keys in database |
| #360  | Federation credential isolation           |
| #361  | Credential audit log viewer (stretch)     |

### Phase Dependencies

```
Phase 1 (RLS + Token Encryption)
  └── Phase 2 (OpenBao + VaultService)
        ├── Phase 3 (Credential Model + API)
        │     └── Phase 4 (Frontend)
        └── Phase 5 (LLM Migration + Federation)
```

## Risk Mitigation

| Risk                               | Mitigation                                                           |
| ---------------------------------- | -------------------------------------------------------------------- |
| FORCE RLS breaks Prisma migrations | Owner bypass policy grants full access to `mosaic` role              |
| FORCE RLS breaks BetterAuth writes | Interceptor sets user context; BetterAuth uses same client           |
| OpenBao container fails to start   | VaultService falls back to AES-256-GCM; app stays functional         |
| Data migration corrupts tokens     | Run in transaction; backup first; format prefix tracking             |
| BetterAuth reads encrypted tokens  | Prisma middleware transparently decrypts on read                     |
| Transit key rotation               | OpenBao handles versioning transparently; old ciphertext stays valid |

## Key Files Reference

| Purpose                | Path                                                                       |
| ---------------------- | -------------------------------------------------------------------------- |
| Existing CryptoService | `apps/api/src/federation/crypto.service.ts`                                |
| RLS context utilities  | `apps/api/src/lib/db-context.ts`                                           |
| Prisma schema          | `apps/api/prisma/schema.prisma`                                            |
| RLS migration          | `apps/api/prisma/migrations/20260129221004_add_rls_policies/migration.sql` |
| Docker Compose         | `docker/docker-compose.yml`                                                |
| App module             | `apps/api/src/app.module.ts`                                               |
| Auth guards            | `apps/api/src/auth/guards/auth.guard.ts`                                   |
| Workspace guard        | `apps/api/src/common/guards/workspace.guard.ts`                            |
| Security review        | `docs/reports/codebase-review-2026-02-05/01-security-review.md`            |