# Vault Secrets Management Guide

This guide applies when the project uses HashiCorp Vault for secrets management.

## Before Starting

1. Verify Vault access: `vault status`
2. Authenticate: `vault login` (method depends on environment)
3. Check your permissions for the required paths

## Canonical Structure

**ALL Vault secrets MUST follow this structure:**

```
{mount}/{service}/{component}/{secret-name}
```

### Components

- **mount**: Environment-specific mount point
- **service**: The service or application name
- **component**: Logical grouping (database, api, oauth, etc.)
- **secret-name**: Specific secret identifier

## Environment Mounts

| Mount             | Environment | Usage                  |
| ----------------- | ----------- | ---------------------- |
| `secret-dev/`     | Development | Local dev, CI          |
| `secret-staging/` | Staging     | Pre-production testing |
| `secret-prod/`    | Production  | Live systems           |

## Examples

```bash
# Database credentials
secret-prod/postgres/database/app
secret-prod/mysql/database/readonly
secret-staging/redis/auth/default

# API tokens
secret-prod/authentik/admin/token
secret-prod/stripe/api/live-key
secret-dev/sendgrid/api/test-key

# JWT/Authentication
secret-prod/backend-api/jwt/signing-key
secret-prod/auth-service/session/secret

# OAuth providers
secret-prod/backend-api/oauth/google
secret-prod/backend-api/oauth/github

# Internal services
secret-prod/loki/read-auth/admin
secret-prod/grafana/admin/password
```

## Standard Field Names

Use consistent field names within secrets:

| Purpose     | Fields                       |
| ----------- | ---------------------------- |
| Credentials | `username`, `password`       |
| Tokens      | `token`                      |
| OAuth       | `client_id`, `client_secret` |
| Connection  | `url`, `host`, `port`        |
| Keys        | `public_key`, `private_key`  |

### Example Secret Structure

```json
// secret-prod/postgres/database/app
{
  "username": "app_user",
  "password": "secure-password-here",
  "host": "db.example.com",
  "port": "5432",
  "database": "myapp"
}
```

## Rules

1. **DO NOT GUESS** secret paths - Always verify the path exists
2. **Use helper scripts** in `scripts/vault/` when available
3. **All lowercase, hyphenated** (kebab-case) for all path segments
4. **Standard field names** - Use the conventions above
5. **No sensitive data in path names** - Path itself should not reveal secrets
6. **Environment separation** - Never reference prod secrets from dev

## Deprecated Paths (DO NOT USE)

These legacy patterns are deprecated and should be migrated:

| Deprecated                | Migrate To                                  |
| ------------------------- | ------------------------------------------- |
| `secret/infrastructure/*` | `secret-{env}/{service}/...`                |
| `secret/oauth/*`          | `secret-{env}/{service}/oauth/{provider}`   |
| `secret/database/*`       | `secret-{env}/{service}/database/{user}`    |
| `secret/credentials/*`    | `secret-{env}/{service}/{component}/{name}` |

## Reading Secrets

### CLI

```bash
# Read a secret
vault kv get secret-prod/postgres/database/app

# Get specific field
vault kv get -field=password secret-prod/postgres/database/app

# JSON output
vault kv get -format=json secret-prod/postgres/database/app
```

### Application Code

**Python (hvac):**

```python
import hvac

client = hvac.Client(url='https://vault.example.com')
secret = client.secrets.kv.v2.read_secret_version(
    path='postgres/database/app',
    mount_point='secret-prod'
)
password = secret['data']['data']['password']
```

**Node.js (node-vault):**

```javascript
const vault = require('node-vault')({ endpoint: 'https://vault.example.com' });
const secret = await vault.read('secret-prod/data/postgres/database/app');
const password = secret.data.data.password;
```

**Go:**

```go
secret, err := client.Logical().Read("secret-prod/data/postgres/database/app")
password := secret.Data["data"].(map[string]interface{})["password"].(string)
```

## Writing Secrets

Only authorized personnel should write secrets. If you need a new secret:

1. Request through proper channels (ticket, PR to IaC repo)
2. Follow the canonical structure
3. Document the secret's purpose
4. Set appropriate access policies

```bash
# Example (requires write permissions)
vault kv put secret-dev/myapp/database/app \
  username="dev_user" \
  password="dev-password" \
  host="localhost" \
  port="5432"
```

## Troubleshooting

### Permission Denied

```
Error: permission denied
```

- Verify your token has read access to the path
- Check if you're using the correct mount point
- Confirm the secret path exists

### Secret Not Found

```
Error: no value found at secret-prod/data/service/component/name
```

- Verify the exact path (use `vault kv list` to explore)
- Check for typos in service/component names
- Confirm you're using the correct environment mount

### Token Expired

```
Error: token expired
```

- Re-authenticate: `vault login`
- Check token TTL: `vault token lookup`

## Security Best Practices

1. **Least privilege** - Request only the permissions you need
2. **Short-lived tokens** - Use tokens with appropriate TTLs
3. **Audit logging** - All access is logged; act accordingly
4. **No local copies** - Don't store secrets in files or env vars long-term
5. **Rotate on compromise** - Immediately rotate any exposed secrets

---

## Secrets Architecture Decision Matrix

Use this table to choose between the ESO bridge (default) and Direct-Vault (opt-in) patterns for every new app or integration.


| Factor | ESO Bridge (default) | Direct-Vault (opt-in) |
| --- | --- | --- |
| **Use-case** | All static secrets (DB creds, API keys, signing keys, OAuth secrets) | Dynamic creds with short TTLs (DB rotation, AWS STS, PKI), per-request audit trails, or lease renewal mid-pod-lifecycle |
| **App code change** | None - reads standard env vars via `secretKeyRef` | Requires Vault client (`hvac`, `node-vault`, `vault/api`) in application code |
| **Secret rotation** | ESO re-syncs from Vault on each `refreshInterval`; pod restart or secret refresh picks up new value | App manages lease renewal or re-auth within the running process |
| **Audit granularity** | Access logged at Vault when ESO syncs; no per-request app audit | Every app request to Vault is a separate audit log entry |
| **Operational burden** | Low - ESO handles polling, sync, and k8s Secret lifecycle | Higher - app must handle auth, lease renewal, error paths, and token rotation |
| **Justification required?** | No - this is the default | Yes - document in project README under "Secrets architecture" |
| **Example use cases** | Web app DB password, OAuth client secret, JWT signing key, API token | HashiCorp DB secrets engine with 15-min TTL leases, AWS STS assume-role, Vault PKI short-lived certs |

**Decision rule:** If you are unsure, use ESO. Only justify Direct-Vault when the secret cannot be safely stored in a k8s Secret (too short-lived, per-request TTL required, or mid-lifecycle renewal needed).

---

## ESO Bridge Pattern (Default)

This is the required default for all k8s workloads. Follow this exact pattern unless a documented dynamic-secrets requirement justifies Direct-Vault.

### 1. Provision Vault path

```bash
# Write the secrets for the app (run once; use IaC/Terraform for repeatable provisioning)
vault kv put secret/k3s/ \
  db_password="..." \
  api_key="..." \
  jwt_secret="..."
```

Use the canonical path structure: `secret/k3s/` for k3s cluster workloads.

### 2. ExternalSecret manifest

Commit this to the repo's `deploy/` or `k8s/` directory:

```yaml
# deploy/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: -secrets
  namespace:
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend # ClusterSecretStore name - verify with cluster admin
    kind: ClusterSecretStore
  target:
    name: -secrets # k8s Secret name that will be created
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD # key in the k8s Secret
      remoteRef:
        key: secret/k3s/ # Vault path
        property: db_password # field within the Vault secret
    - secretKey: API_KEY
      remoteRef:
        key: secret/k3s/
        property: api_key
    - secretKey: JWT_SECRET
      remoteRef:
        key: secret/k3s/
        property: jwt_secret
```

### 3. Deployment manifest - reference synced k8s Secret

```yaml
# deploy/deployment.yaml (env section)
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: -secrets # matches ExternalSecret target.name
        key: DB_PASSWORD
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: -secrets
        key: API_KEY
  - name: JWT_SECRET
    valueFrom:
      secretKeyRef:
        name: -secrets
        key: JWT_SECRET
  - name: PORT
    value: "3000" # safe-default: non-secret, no Vault needed
```

### 4. App-side schema validation - TypeScript (zod)

Validate all required env vars at startup. Exit non-zero on missing values.

```typescript
// src/env.ts
import { z } from 'zod';

const envSchema = z.object({
  DB_PASSWORD: z.string().min(1, 'DB_PASSWORD is required'),
  API_KEY: z.string().min(1, 'API_KEY is required'),
  JWT_SECRET: z.string().min(32, 'JWT_SECRET must be at least 32 chars'),
  PORT: z.coerce.number().default(3000),
  NODE_ENV: z.enum(['development', 'production', 'test']).default('production'),
});

const result = envSchema.safeParse(process.env);
if (!result.success) {
  console.error('Missing or invalid environment variables:');
  console.error(result.error.flatten().fieldErrors);
  process.exit(1);
}

export const env = result.data;
```

### 4b. App-side schema validation - Python (pydantic)

```python
# src/config.py
import sys

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    db_password: str
    api_key: str
    jwt_secret: str
    port: int = 3000
    node_env: str = "production"

    model_config = SettingsConfigDict(env_file=None)  # no .env in prod


try:
    settings = Settings()
except Exception as e:
    print(f"Missing or invalid environment variables: {e}", file=sys.stderr)
    sys.exit(1)
```

### 4c. App-side schema validation - Go (envconfig)

```go
// config/config.go
package config

import (
	"fmt"

	"github.com/kelseyhightower/envconfig"
)

type Config struct {
	DBPassword string `envconfig:"DB_PASSWORD" required:"true"`
	APIKey     string `envconfig:"API_KEY" required:"true"`
	JWTSecret  string `envconfig:"JWT_SECRET" required:"true"`
	Port       int    `envconfig:"PORT" default:"3000"`
}

func Load() (*Config, error) {
	var cfg Config
	if err := envconfig.Process("", &cfg); err != nil {
		return nil, fmt.Errorf("invalid environment: %w", err)
	}
	return &cfg, nil
}
```

In your `main.go`:

```go
cfg, err := config.Load()
if err != nil {
	fmt.Fprintln(os.Stderr, err)
	os.Exit(1)
}
```

---

## Direct-Vault Opt-In Pattern

Use this pattern ONLY when a documented dynamic-secrets requirement applies (DB rotation with short TTLs, AWS STS, PKI, per-request audit). Document the justification in the project README under "Secrets architecture" before implementing.


### When it is justified

- Vault DB secrets engine with lease TTLs shorter than a typical pod lifecycle (< 1 hour)
- AWS STS assume-role tokens generated per-request
- Vault PKI short-lived certificates (< 24 hours) that must be renewed within a running pod
- Per-request audit trail requirement (each app call must appear separately in Vault audit log)

### Provision an AppRole for the app

```bash
# Enable AppRole auth (if not already enabled)
vault auth enable approle

# Create a Vault policy for the app
# Note: KV v2 paths require both the exact path (for the top-level secret) and the
# wildcard (for sub-paths). Always include both to avoid permission denied errors.
vault policy write -policy - <<EOF
path "secret/data/k3s/" {
  capabilities = ["read"]
}
path "secret/data/k3s//*" {
  capabilities = ["read"]
}
path "database/creds/-role" {
  capabilities = ["read"]
}
EOF

# Create the AppRole
vault write auth/approle/role/-role \
  token_policies="-policy" \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_ttl=0

# Retrieve role-id and secret-id
vault read auth/approle/role/-role/role-id
vault write -f auth/approle/role/-role/secret-id
```

### Bootstrap AppRole credentials via ESO (solving the chicken-and-egg problem)

The AppRole `role-id` and `secret-id` are themselves secrets. Store them in Vault at a bootstrap path, then use ESO to sync them into a k8s Secret. The app reads that k8s Secret at startup to authenticate with Vault directly.


```bash
# Store the bootstrap credentials in Vault
vault kv put secret/k3s/-bootstrap \
  role_id="" \
  secret_id=""
```

```yaml
# deploy/external-secret-bootstrap.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: -vault-auth
  namespace:
spec:
  refreshInterval: 24h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: -vault-auth
    creationPolicy: Owner
  data:
    - secretKey: VAULT_ROLE_ID
      remoteRef:
        key: secret/k3s/-bootstrap
        property: role_id
    - secretKey: VAULT_SECRET_ID
      remoteRef:
        key: secret/k3s/-bootstrap
        property: secret_id
```

```yaml
# deploy/deployment.yaml (env section for Direct-Vault app)
env:
  - name: VAULT_ADDR
    value: "https://vault.example.com" # safe-default: non-secret cluster address
  - name: VAULT_ROLE_ID
    valueFrom:
      secretKeyRef:
        name: -vault-auth
        key: VAULT_ROLE_ID
  - name: VAULT_SECRET_ID
    valueFrom:
      secretKeyRef:
        name: -vault-auth
        key: VAULT_SECRET_ID
```

### App-side Vault client pattern

```typescript
// src/vault-client.ts - only exists in Direct-Vault apps
import vault from 'node-vault';
import { z } from 'zod';

const bootstrapSchema = z.object({
  VAULT_ADDR: z.string().url(),
  VAULT_ROLE_ID: z.string().min(1),
  VAULT_SECRET_ID: z.string().min(1),
});

const bootstrap = bootstrapSchema.parse(process.env);

const client = vault({ endpoint: bootstrap.VAULT_ADDR });

export async function getVaultClient() {
  const { auth } = await client.approleLogin({
    role_id: bootstrap.VAULT_ROLE_ID,
    secret_id: bootstrap.VAULT_SECRET_ID,
  });
  client.token = auth.client_token;
  return client;
}
```

Document in README under "Secrets architecture": the Vault path, why Direct-Vault is required, and the lease/renewal strategy.

---

## Forbidden Patterns (CI Lint Targets)

The following patterns are forbidden in all Mosaic projects. CI lint SHOULD catch these automatically (implementation tracked separately). Agents MUST NOT introduce these patterns.

### 1. Untagged fallback defaults for required values

```yaml
# FORBIDDEN - required secret with silent fallback
environment:
  - DB_PASSWORD=${DB_PASSWORD:-changeme}
  - API_KEY=${API_KEY:-}

# REQUIRED - fast-fail on missing required values
environment:
  - DB_PASSWORD=${DB_PASSWORD:?DB_PASSWORD is required}
  - API_KEY=${API_KEY:?API_KEY is required}

# ALLOWED - true convenience default, tagged
environment:
  - PORT=${PORT:-3000} # safe-default: non-secret, app works at any port
```

This applies to: `docker-compose.yml`, k8s manifests, Helm `values.yaml`, any env file committed to git.

### 2. Vault KV calls in application source code (ESO-default projects)

```python
# FORBIDDEN in ESO-default apps - direct Vault client in app source
import hvac
client = hvac.Client(url=os.environ['VAULT_ADDR'])
secret = client.secrets.kv.v2.read_secret_version(path='myapp/db')
```

ESO-default apps read env vars only. Direct-Vault clients belong only in apps with a documented dynamic-secrets justification in README.

### 3. Hardcoded secrets or API keys in committed files

```python
# FORBIDDEN - hardcoded credential
DB_PASSWORD = "supersecret123"
API_KEY = "sk-live-abc123"
```

No exceptions. CI lint must flag any string matching common secret patterns (`password`, `secret`, `api_key`, `token` assigned a literal non-env-var value).

### 4. `.env` files in production deployment paths

```
# FORBIDDEN - .env file in a production deploy path
deploy/.env
k8s/.env
docker/.env

# ALLOWED - local dev only
.env.example   # template only, no real values
.env           # local dev, must be in .gitignore
```

`.env` files are acceptable in local-dev contexts only and MUST be in `.gitignore`. They are forbidden in any path that a CI pipeline or production deployment process reads directly.
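
The rule for untagged fallback defaults (pattern 1) lends itself to a simple line-based check. The following is only a minimal sketch of what such a lint might look like, not the CI implementation this guide tracks separately. The function name `find_untagged_fallbacks`, the regular expression, and the choice to skip full-line comments are all assumptions made for illustration; the only convention taken from this guide is the `# safe-default:` tag.

```python
import re

# Matches shell-style fallbacks such as ${VAR:-default} or ${VAR:-}.
# Fast-fail forms like ${VAR:?message} are deliberately not matched.
FALLBACK_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-[^}]*\}")

# Tag from this guide that marks a fallback as a deliberate convenience default.
SAFE_DEFAULT_TAG = "# safe-default:"


def find_untagged_fallbacks(text):
    """Return (line_number, variable_name) for each fallback lacking the tag.

    Hypothetical helper: scans text line by line, ignores full-line comments,
    and treats a line as allowed when it carries the safe-default tag.
    """
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith("#"):
            continue  # full-line comment, nothing to check
        if SAFE_DEFAULT_TAG in line:
            continue  # tagged convenience default is allowed
        for match in FALLBACK_RE.finditer(line):
            findings.append((line_number, match.group(1)))
    return findings


if __name__ == "__main__":
    sample = (
        "environment:\n"
        "  - DB_PASSWORD=${DB_PASSWORD:-changeme}\n"
        "  - API_KEY=${API_KEY:?API_KEY is required}\n"
        "  - PORT=${PORT:-3000} # safe-default: non-secret\n"
    )
    for line_number, name in find_untagged_fallbacks(sample):
        print(f"line {line_number}: untagged fallback default for {name}")
```

Run against the sample, the sketch reports only the `DB_PASSWORD` line: the `:?` form has no fallback and the `PORT` line is tagged. A real lint would likely need to parse YAML and Helm templates rather than scan raw lines, so treat this as a starting point only.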