stack/docs/plans/gateway-token-recovery.md
jason.woltje 651426cf2e
docs(plan): gateway admin token recovery flow (#401)
2026-04-05 05:11:33 +00:00

Gateway Admin Token Recovery — Implementation Plan

Mission: cli-unification-20260404
Task: CU-03-01 (planning only, no runtime code changes)
Status: Design locked (Session 1), BetterAuth cookie-based recovery


1. Problem Statement

The gateway installer strands operators when the admin user exists but the admin API token is missing. Concrete trigger:

  • ~/.config/mosaic/gateway/meta.json was deleted / regenerated.
  • The installer was re-run after a previous successful bootstrap.

Flow today (packages/mosaic/src/commands/gateway/install.ts:375-400):

  1. bootstrapFirstUser hits GET /api/bootstrap/status.
  2. Server returns needsSetup: false because users count > 0.
  3. Installer logs "Admin user already exists — skipping setup. (No admin token on file — sign in via the web UI to manage tokens.)" and returns.
  4. The operator now has:
    • No token in meta.json.
    • No CLI path to mint a new one (mosaic gateway <anything> that needs the token fails).
    • POST /api/bootstrap/setup locked out — it only runs when users count is zero (apps/gateway/src/admin/bootstrap.controller.ts:34-37).
    • POST /api/admin/tokens gated by AdminGuard — requires either a bearer token (which they don't have) or a BetterAuth session (which they don't have in the CLI).

Dead end. The web UI is the only escape hatch today, and for headless installs even that may be inaccessible.

2. Design Summary

The BetterAuth session cookie is the authority. The operator runs mosaic gateway login to sign in with email/password, which persists a session cookie via saveSession (reusing packages/mosaic/src/auth.ts). With a valid session, mosaic gateway config recover-token (the stranded-operator entry point) and mosaic gateway config rotate-token call the existing authenticated admin endpoint POST /api/admin/tokens using the cookie, then persist the returned plaintext to meta.json via writeMeta. No new server endpoints are required: AdminGuard already accepts BetterAuth session cookies via its validateSession path (apps/gateway/src/admin/admin.guard.ts:90-120).

3. Surface Contract

3.1 Server — no changes required

| Endpoint | Status | Notes |
| --- | --- | --- |
| POST /api/admin/tokens | Reuse as-is | admin-tokens.controller.ts:46-72. Returns { id, label, scope, expiresAt, lastUsedAt, createdAt, plaintext }. |
| GET /api/admin/tokens | Reuse | Useful for mosaic gateway config tokens list follow-on (out of scope for CU-03-01, but trivial once auth path exists). |
| DELETE /api/admin/tokens/:id | Reuse | Used by rotate flow for optional old-token revocation. |
| POST /api/bootstrap/setup | Unchanged | Remains first-user-only; not part of recovery. |

AdminGuard.validateSession takes BetterAuth cookies from request.raw.headers via fromNodeHeaders and calls auth.api.getSession({ headers }). It also enforces role === 'admin'. This is exactly the path the CLI will hit with Cookie: better-auth.session_token=....

Confirmed feasible during CU-03-01 investigation.
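
Seen from the CLI, the request that exercises this path could look roughly like the sketch below. It is illustrative only: buildMintRequest, mintAdminToken, FetchLike and the sessionCookieValue parameter are hypothetical names, and the only details taken from this plan are the endpoint, the cookie name, and the response field names. The fetch implementation is injected so the request shape can be checked without a running gateway.

```typescript
// Hypothetical sketch: mint an admin token using a BetterAuth session cookie.
// Function and type names are illustrative, not the project's actual code.

interface MintedToken {
  id: string;
  label: string;
  plaintext: string;
}

interface MintRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Minimal structural type so a fake can stand in for the global fetch.
type FetchLike = (
  url: string,
  init: MintRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the request only; no network access happens here.
function buildMintRequest(
  gatewayUrl: string,
  sessionCookieValue: string, // assumed to be the stored cookie value, passed through as-is
  label: string,
): MintRequest {
  return {
    url: `${gatewayUrl.replace(/\/+$/, "")}/api/admin/tokens`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The CLI sends a request Cookie header, never Set-Cookie.
        Cookie: `better-auth.session_token=${sessionCookieValue}`,
      },
      body: JSON.stringify({ label }),
    },
  };
}

// Send the request and return the parsed token on a 2xx response.
async function mintAdminToken(
  fetchImpl: FetchLike,
  gatewayUrl: string,
  sessionCookieValue: string,
  label: string,
): Promise<MintedToken> {
  const { url, init } = buildMintRequest(gatewayUrl, sessionCookieValue, label);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`Token creation failed with HTTP ${res.status}`);
  }
  return (await res.json()) as MintedToken;
}
```

In real use the global fetch could be passed as fetchImpl; keeping the request builder separate makes the header and body easy to assert on in unit tests.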

3.2 mosaic gateway login

Thin wrapper over the existing top-level mosaic login (packages/mosaic/src/cli.ts:42-76) with gateway-specific defaults pulled from readMeta().

| Aspect | Behavior |
| --- | --- |
| Default gateway URL | http://${meta.host}:${meta.port} from readMeta(), fallback http://localhost:14242. |
| Flow | Prompt email + password -> signIn() -> saveSession(). |
| Persistence | ~/.mosaic/session.json via existing saveSession (7-day expiry). |
| Decision | Thin wrapper, not alias. Rationale: defaults differ (reads meta.json), and discoverability under mosaic gateway --help. |
| Implementation | Share the sign-in logic by extracting a small runLogin(gatewayUrl, email?, password?) helper; both commands call it. |
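
The default-URL rule in the table can be sketched as a small pure function. This is a minimal sketch under assumptions: resolveGatewayUrl and the GatewayMeta shape are hypothetical (the plan only shows that meta carries host and port), and letting an explicit override win over meta.json is an assumed precedence that the plan does not specify.

```typescript
// Hypothetical sketch of gateway URL resolution for `mosaic gateway login`.

interface GatewayMeta {
  host: string;
  port: number;
}

// Fallback URL as stated in the table above.
const FALLBACK_GATEWAY_URL = "http://localhost:14242";

function resolveGatewayUrl(meta: GatewayMeta | null, override?: string): string {
  // Assumption: an explicitly supplied URL takes precedence over meta.json.
  if (override) return override;
  // Documented default: derive the URL from meta.host and meta.port.
  if (meta) return `http://${meta.host}:${meta.port}`;
  // Documented fallback when no meta is available.
  return FALLBACK_GATEWAY_URL;
}
```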

3.3 mosaic gateway config rotate-token

| Aspect | Behavior |
| --- | --- |
| Precondition | Valid session (via loadSession + validateSession). On failure, print: "Not signed in — run mosaic gateway login" and exit non-zero. |
| Request | POST ${gatewayUrl}/api/admin/tokens with header Cookie: <session>, body { label: "CLI token (rotated YYYY-MM-DD)" }. |
| On success | Read meta via readMeta(), set meta.adminToken = plaintext, writeMeta(meta). Print the token banner (reuse printAdminTokenBanner shape). |
| Old token | Optional --revoke-old flag. When set and a previous meta.adminToken existed, call DELETE /api/admin/tokens/:id after rotation. Requires listing first to find the id; punt to CU-03-02 decision. Document as nice-to-have. |
| Exit codes | 0 success; 1 network error; 2 auth error; 3 server rejection. |
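
The label format, the meta update, and the exit codes from the table could be captured as below. This is a sketch, not the planned implementation: EXIT, rotatedLabel, applyRotatedToken and the RotateMeta shape are hypothetical names, and formatting the date in UTC is an assumption (the plan only gives the YYYY-MM-DD pattern).

```typescript
// Hypothetical helpers for `mosaic gateway config rotate-token`.

// Exit codes as listed in the table above.
const EXIT = { success: 0, network: 1, auth: 2, serverRejection: 3 } as const;

// Assumed minimal shape of meta.json; the real file may carry more fields.
interface RotateMeta {
  host: string;
  port: number;
  adminToken?: string;
}

// Label in the documented "CLI token (rotated YYYY-MM-DD)" form.
// Assumption: the date is taken in UTC.
function rotatedLabel(now: Date): string {
  const day = now.toISOString().slice(0, 10);
  return `CLI token (rotated ${day})`;
}

// Return an updated copy of meta plus the token it replaced, so a later
// --revoke-old step would know whether a previous token existed.
function applyRotatedToken(
  meta: RotateMeta,
  plaintext: string,
): { meta: RotateMeta; previousToken?: string } {
  return {
    meta: { ...meta, adminToken: plaintext },
    previousToken: meta.adminToken,
  };
}
```

Returning a copy rather than mutating meta in place keeps the old token available to the caller until the new value has been written.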

3.4 mosaic gateway config recover-token

Superset of rotate-token with an inline login nudge — the "stranded operator" entry point.

| Step | Action |
| --- | --- |
| 1 | readMeta(): derive gateway URL. If meta is missing entirely, fall back to --gateway flag or default. |
| 2 | loadSession(gatewayUrl) then validateSession. If either fails, prompt inline: email + password -> signIn -> saveSession. |
| 3 | POST /api/admin/tokens with cookie, label "Recovered via CLI YYYY-MM-DDTHH:mm". |
| 4 | Persist plaintext to meta.json via writeMeta. |
| 5 | Print the token banner and next-steps hints (e.g. mosaic gateway status). |
| 6 | Exit 0. |

Key property: this command is runnable with nothing but email+password in hand. It assumes the gateway is up but assumes no prior CLI session state.
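
Steps 1 through 4 above can be sketched as one function with its collaborators injected. The helper names readMeta, loadSession, validateSession, signIn, saveSession and writeMeta come from this plan, but their signatures here are guesses; recoverToken, RecoverDeps, promptCredentials, createToken, the Session shape, and the UTC timestamp are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the recover-token flow (steps 1-4).
// All signatures are assumed; only the step order follows the plan.

interface RecoverMeta { host: string; port: number; adminToken?: string }
interface Session { cookie: string }

interface RecoverDeps {
  readMeta(): Promise<RecoverMeta | null>;
  loadSession(gatewayUrl: string): Promise<Session | null>;
  validateSession(gatewayUrl: string, session: Session): Promise<boolean>;
  promptCredentials(): Promise<{ email: string; password: string }>;
  signIn(gatewayUrl: string, email: string, password: string): Promise<Session>;
  saveSession(gatewayUrl: string, session: Session): Promise<void>;
  createToken(gatewayUrl: string, session: Session, label: string): Promise<{ plaintext: string }>;
  writeMeta(meta: RecoverMeta): Promise<void>;
  now(): Date;
}

async function recoverToken(
  deps: RecoverDeps,
  fallback: { host: string; port: number }, // from --gateway or the default
): Promise<string> {
  // Step 1: derive the gateway URL, tolerating a missing meta.json.
  const base = (await deps.readMeta()) ?? { ...fallback };
  const gatewayUrl = `http://${base.host}:${base.port}`;

  // Step 2: reuse a valid stored session, otherwise sign in inline.
  const stored = await deps.loadSession(gatewayUrl);
  let session: Session;
  if (stored && (await deps.validateSession(gatewayUrl, stored))) {
    session = stored;
  } else {
    const { email, password } = await deps.promptCredentials();
    session = await deps.signIn(gatewayUrl, email, password);
    await deps.saveSession(gatewayUrl, session);
  }

  // Step 3: mint a token labelled "Recovered via CLI YYYY-MM-DDTHH:mm" (UTC assumed).
  const stamp = deps.now().toISOString().slice(0, 16);
  const { plaintext } = await deps.createToken(gatewayUrl, session, `Recovered via CLI ${stamp}`);

  // Step 4: persist the plaintext; banner and exit code are left to the caller.
  await deps.writeMeta({ ...base, adminToken: plaintext });
  return plaintext;
}
```

Injecting the collaborators is one way to let the installer and the standalone command share the same body, and it lets tests replace prompts and network calls with fakes.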

3.5 File touch list (for CU-03-02..05 execution)

| File | Change |
| --- | --- |
| packages/mosaic/src/commands/gateway.ts | Register login, config recover-token, config rotate-token subcommands under gw. |
| packages/mosaic/src/commands/gateway/config.ts | Add runRecoverToken, runRotateToken handlers; export from module. |
| packages/mosaic/src/commands/gateway/login.ts (new) | Thin wrapper calling shared runLogin helper with meta-derived default URL. |
| packages/mosaic/src/auth.ts | No change expected. Possibly export a requireSession(gatewayUrl) helper (reuse pattern). |
| packages/mosaic/src/commands/gateway/install.ts | bootstrapFirstUser branch: "user exists, no token" -> offer recovery (see Section 4). |

4. Installer Fix (CU-03-06 preview)

Current stranding point is install.ts:388-395. The fix:

if (!status.needsSetup) {
  if (meta.adminToken) {
    // unchanged — happy path
  } else {
    // NEW: prompt "Admin exists but no token on file. Recover now? [Y/n]"
    // If yes -> call runRecoverToken(gatewayUrl) inline (interactive):
    //   - prompt email + password
    //   - signIn -> saveSession
    //   - POST /api/admin/tokens
    //   - writeMeta(meta) with returned plaintext
    //   - print banner
    // If no -> print the current stranded message but include:
    //   "Run `mosaic gateway config recover-token` when ready."
  }
}

Shape notes (actual code lands in CU-03-06):

  • Extract the recovery body so it can be called both from the standalone command and from bootstrapFirstUser without duplicating prompts.
  • Reuse the same rl readline interface already open in bootstrapFirstUser for the inline prompts.
  • Preserve non-interactive behavior: if process.stdin.isTTY is false, skip the prompt and emit the "run recover-token" hint only.
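
The branching described above, including the non-interactive case, can be isolated in a pure decision function. This is a sketch only: decideInstallerAction, InstallerAction and isYes are hypothetical names, and treating an empty answer as "yes" is inferred from the [Y/n] prompt shown in the pseudocode.

```typescript
// Hypothetical sketch of the installer's branch selection.

type InstallerAction =
  | "run-setup"            // no users yet: normal first-user bootstrap
  | "keep-existing-token"  // unchanged happy path
  | "prompt-recovery"      // admin exists, no token, interactive terminal
  | "print-recover-hint";  // admin exists, no token, non-interactive run

interface InstallerState {
  needsSetup: boolean;
  hasAdminToken: boolean;
  // Callers would pass `process.stdin.isTTY === true`; isTTY is undefined
  // rather than false when stdin is not a terminal.
  stdinIsTTY: boolean;
}

function decideInstallerAction(state: InstallerState): InstallerAction {
  if (state.needsSetup) return "run-setup";
  if (state.hasAdminToken) return "keep-existing-token";
  return state.stdinIsTTY ? "prompt-recovery" : "print-recover-hint";
}

// Interpret the answer to "Recover now? [Y/n]": empty input means yes.
function isYes(answer: string): boolean {
  const trimmed = answer.trim().toLowerCase();
  return trimmed === "" || trimmed === "y" || trimmed === "yes";
}
```

Keeping the decision separate from the prompts means the non-TTY behavior can be unit-tested without simulating a terminal.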

5. Test Strategy (CU-03-07 scope)

5.1 Happy paths

| Command | Scenario | Expected |
| --- | --- | --- |
| mosaic gateway login | Valid creds | session.json written, 7-day expiry, exit 0 |
| mosaic gateway config rotate-token | Valid session, server reachable | meta.json updated, banner printed, new token usable |
| mosaic gateway config recover-token | No session, valid creds, server reachable | Prompts for creds, writes session + meta, exit 0 |
| Installer inline recovery | Re-run after meta.json wipe, operator says yes | Meta restored, banner printed, no manual CLI step needed |

5.2 Error paths (must all produce actionable messages and non-zero exit)

| Failure | Expected handling |
| --- | --- |
| Invalid email/password | BetterAuth 401 surfaced as "Sign-in failed: ", exit 2 |
| Expired stored session | Recover command silently re-prompts; rotate command exits 2 with "run login" hint |
| Gateway down / connection refused | "Could not reach gateway at ", exit 1 |
| Server rejects token creation | Print status + body excerpt, exit 3 |
| Meta file missing (recover) | Fall back to --gateway flag or default; warn that meta will be created |
| Non-admin user | AdminGuard 403 surfaced as "User is not an admin", exit 2 |
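
One way to keep these messages and exit codes consistent across commands is a single mapping function, sketched below. The names Failure and describeFailure are hypothetical. The text that follows "Sign-in failed: " and "Could not reach gateway at " is elided in the table, so appending a detail string and the gateway URL is an assumption, as is the 200-character body excerpt. The 401 message on token creation is taken from risk 7 in Section 6.

```typescript
// Hypothetical mapping from failure kinds to operator-facing output.

type Failure =
  | { kind: "network"; gatewayUrl: string }
  | { kind: "sign-in-rejected"; detail: string }
  | { kind: "token-request-rejected"; status: number; body: string };

interface FailureReport {
  message: string;
  exitCode: 1 | 2 | 3;
}

function describeFailure(f: Failure): FailureReport {
  if (f.kind === "network") {
    // Assumption: the elided suffix is the gateway URL.
    return { message: `Could not reach gateway at ${f.gatewayUrl}`, exitCode: 1 };
  }
  if (f.kind === "sign-in-rejected") {
    // Assumption: the elided suffix is a detail string from the server.
    return { message: `Sign-in failed: ${f.detail}`, exitCode: 2 };
  }
  if (f.status === 401) {
    return { message: "Session invalid; re-run mosaic gateway login", exitCode: 2 };
  }
  if (f.status === 403) {
    return { message: "User is not an admin", exitCode: 2 };
  }
  // Any other rejection: status plus a short body excerpt (length assumed).
  return {
    message: `Gateway rejected token creation (HTTP ${f.status}): ${f.body.slice(0, 200)}`,
    exitCode: 3,
  };
}
```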

For an end-to-end check, spin up the gateway in a test harness, create an admin user via /api/bootstrap/setup, wipe meta.json, invoke mosaic gateway config recover-token programmatically, and assert that the new meta.adminToken works against GET /api/admin/tokens.

6. Risks & Open Questions

| # | Item | Severity | Mitigation |
| --- | --- | --- | --- |
| 1 | AdminGuard.validateSession calls getSession with fromNodeHeaders(request.raw.headers). CLI sends Cookie: header only. Confirm BetterAuth reads from Cookie, not Set-Cookie. | Low | Confirmed: mosaic login + mosaic tui already use this flow successfully (cli.ts:137-181). |
| 2 | Session cookie local expiry (7d) vs BetterAuth server-side expiry may drift. | Low | validateSession hits get-session; handle 401 by re-prompting. |
| 3 | Label collision / unbounded token growth if operators run recover-token repeatedly. | Low | Include ISO timestamp in label. Optional --revoke-old in CU-03-02. Add tokens list/prune later. |
| 4 | mosaic login exists at top level and mosaic gateway login is a wrapper, so there is a risk of confusion. | Low | Document that gateway login is the preferred entry for gateway operators; top-level stays for compatibility. |
| 5 | meta.json write is not atomic. Crash between token creation and writeMeta leaves an orphan token server-side with no plaintext on disk. | Medium | Accept for now: re-running recover-token mints a fresh token. Document as known limitation. |
| 6 | Non-TTY installer runs (CI, headless provisioners) cannot prompt for creds interactively. | Medium | Installer inline recovery must skip prompt when !process.stdin.isTTY; emit the recover-token hint. |
| 7 | If BETTER_AUTH_SECRET rotates between login and recover, the session cookie is invalid and the user must re-login. Acceptable but surface a clear error. | Low | Error handler maps 401 on recover -> "Session invalid; re-run mosaic gateway login". |
| 8 | No MFA today. When MFA lands, BetterAuth sign-in will return a challenge, not a cookie; recovery UX will need a second prompt step. | Future | Out of scope for this mission. Flag for future CLI work. |

7. Downstream Task Hooks

| Task | Scope |
| --- | --- |
| CU-03-02 | Implement mosaic gateway login wrapper + shared runLogin extraction. |
| CU-03-03 | Implement mosaic gateway config rotate-token. |
| CU-03-04 | Implement mosaic gateway config recover-token. |
| CU-03-05 | Wire commands into gateway.ts registration, update --help copy. |
| CU-03-06 | Installer inline recovery hook in bootstrapFirstUser. |
| CU-03-07 | Tests per Section 5. |
| CU-03-08 | Docs: update gateway install README + operator runbook with recovery flow. |