diff --git a/docs/plans/gateway-token-recovery.md b/docs/plans/gateway-token-recovery.md
new file mode 100644
index 0000000..9fc486c
--- /dev/null
+++ b/docs/plans/gateway-token-recovery.md
@@ -0,0 +1,193 @@
+# Gateway Admin Token Recovery — Implementation Plan
+
+**Mission:** `cli-unification-20260404`
+**Task:** `CU-03-01` (planning only — no runtime code changes)
+**Status:** Design locked (Session 1) — BetterAuth cookie-based recovery
+
+---
+
+## 1. Problem Statement
+
+The gateway installer strands operators when the admin user exists but the admin
+API token is missing. Concrete trigger:
+
+- `~/.config/mosaic/gateway/meta.json` was deleted / regenerated.
+- The installer was re-run after a previous successful bootstrap.
+
+Flow today (`packages/mosaic/src/commands/gateway/install.ts:375-400`):
+
+1. `bootstrapFirstUser` hits `GET /api/bootstrap/status`.
+2. Server returns `needsSetup: false` because `users` count > 0.
+3. Installer logs `Admin user already exists — skipping setup. (No admin token on file — sign in via the web UI to manage tokens.)` and returns.
+4. The operator now has:
+   - No token in `meta.json`.
+   - No CLI path to mint a new one (any `mosaic gateway` subcommand that needs the token fails).
+   - `POST /api/bootstrap/setup` locked out — it only runs when `users` count is zero (`apps/gateway/src/admin/bootstrap.controller.ts:34-37`).
+   - `POST /api/admin/tokens` gated by `AdminGuard` — requires either a bearer token (which they don't have) or a BetterAuth session (which they don't have in the CLI).
+
+Dead end. The web UI is the only escape hatch today, and for headless installs even that may be inaccessible.
+
+## 2. Design Summary
+
+The BetterAuth session cookie is the authority. The operator runs
+`mosaic gateway login` to sign in with email/password, which persists a session
+cookie via `saveSession` (reusing `packages/mosaic/src/auth.ts`).
+With a valid session, `mosaic gateway config recover-token` (stranded-operator entry point)
+and `mosaic gateway config rotate-token` call the existing authenticated admin
+endpoint `POST /api/admin/tokens` using the cookie, then persist the returned
+plaintext to `meta.json` via `writeMeta`. **No new server endpoints are
+required** — `AdminGuard` already accepts BetterAuth session cookies via its
+`validateSession` path (`apps/gateway/src/admin/admin.guard.ts:90-120`).
+
+## 3. Surface Contract
+
+### 3.1 Server — no changes required
+
+| Endpoint | Status | Notes |
+| --- | --- | --- |
+| `POST /api/admin/tokens` | **Reuse as-is** | `admin-tokens.controller.ts:46-72`. Returns `{ id, label, scope, expiresAt, lastUsedAt, createdAt, plaintext }`. |
+| `GET /api/admin/tokens` | **Reuse** | Useful for `mosaic gateway config tokens list` follow-on (out of scope for CU-03-01, but trivial once auth path exists). |
+| `DELETE /api/admin/tokens/:id` | **Reuse** | Used by rotate flow for optional old-token revocation. |
+| `POST /api/bootstrap/setup` | **Unchanged** | Remains first-user-only; not part of recovery. |
+
+`AdminGuard.validateSession` takes BetterAuth cookies from `request.raw.headers`
+via `fromNodeHeaders` and calls `auth.api.getSession({ headers })`. It also
+enforces `role === 'admin'`. This is exactly the path the CLI will hit with
+`Cookie: better-auth.session_token=...`.
+
+**Confirmed feasible** during CU-03-01 investigation.
+
+### 3.2 `mosaic gateway login`
+
+Thin wrapper over the existing top-level `mosaic login`
+(`packages/mosaic/src/cli.ts:42-76`) with gateway-specific defaults pulled from
+`readMeta()`.
+
+| Aspect | Behavior |
+| --- | --- |
+| Default gateway URL | `http://${meta.host}:${meta.port}` from `readMeta()`, fallback `http://localhost:14242`. |
+| Flow | Prompt email + password -> `signIn()` -> `saveSession()`. |
+| Persistence | `~/.mosaic/session.json` via existing `saveSession` (7-day expiry). |
+| Decision | **Thin wrapper**, not alias. Rationale: defaults differ (reads `meta.json`), and discoverability under `mosaic gateway --help`. |
+| Implementation | Share the sign-in logic by extracting a small `runLogin(gatewayUrl, email?, password?)` helper; both commands call it. |
+
+### 3.3 `mosaic gateway config rotate-token`
+
+| Aspect | Behavior |
+| --- | --- |
+| Precondition | Valid session (via `loadSession` + `validateSession`). On failure, print: "Not signed in — run `mosaic gateway login`" and exit `2`. |
+| Request | `POST ${gatewayUrl}/api/admin/tokens` with header `Cookie: better-auth.session_token=...`, body `{ label: "CLI token (rotated YYYY-MM-DD)" }`. |
+| On success | Read meta via `readMeta()`, set `meta.adminToken = plaintext`, `writeMeta(meta)`. Print the token banner (reuse `printAdminTokenBanner` shape). |
+| Old token | **Optional `--revoke-old`** flag. When set and a previous `meta.adminToken` existed, call `DELETE /api/admin/tokens/:id` after rotation. Requires listing first to find the id; punt to CU-03-02 decision. Document as nice-to-have. |
+| Exit codes | `0` success; `1` network error; `2` auth error; `3` server rejection. |
+
+### 3.4 `mosaic gateway config recover-token`
+
+Superset of `rotate-token` with an inline login nudge — the "stranded operator"
+entry point.
+
+| Step | Action |
+| --- | --- |
+| 1 | `readMeta()` — derive gateway URL. If meta is missing entirely, fall back to `--gateway` flag or default. |
+| 2 | `loadSession(gatewayUrl)` then `validateSession`. If either fails, prompt inline: email + password -> `signIn` -> `saveSession`. |
+| 3 | `POST /api/admin/tokens` with cookie, label `"Recovered via CLI YYYY-MM-DDTHH:mm"`. |
+| 4 | Persist plaintext to `meta.json` via `writeMeta`. |
+| 5 | Print the token banner and next-steps hints (e.g. `mosaic gateway status`). |
+| 6 | Exit `0`. |
+
+Key property: this command is **runnable with nothing but email+password in hand**.
+It assumes the gateway is up but assumes no prior CLI session state.
+
+### 3.5 File touch list (for CU-03-02..05 execution)
+
+| File | Change |
+| --- | --- |
+| `packages/mosaic/src/commands/gateway.ts` | Register `login`, `config recover-token`, `config rotate-token` subcommands under `gw`. |
+| `packages/mosaic/src/commands/gateway/config.ts` | Add `runRecoverToken`, `runRotateToken` handlers; export from module. |
+| `packages/mosaic/src/commands/gateway/login.ts` (new) | Thin wrapper calling shared `runLogin` helper with meta-derived default URL. |
+| `packages/mosaic/src/auth.ts` | No change expected. Possibly export a `requireSession(gatewayUrl)` helper (reuse pattern). |
+| `packages/mosaic/src/commands/gateway/install.ts` | `bootstrapFirstUser` branch: "user exists, no token" -> offer recovery (see Section 4). |
+
+## 4. Installer Fix (CU-03-06 preview)
+
+Current stranding point is `install.ts:388-395`. The fix:
+
+```
+if (!status.needsSetup) {
+  if (meta.adminToken) {
+    // unchanged — happy path
+  } else {
+    // NEW: prompt "Admin exists but no token on file. Recover now? [Y/n]"
+    // If yes -> call runRecoverToken(gatewayUrl) inline (interactive):
+    //   - prompt email + password
+    //   - signIn -> saveSession
+    //   - POST /api/admin/tokens
+    //   - writeMeta(meta) with returned plaintext
+    //   - print banner
+    // If no -> print the current stranded message but include:
+    //   "Run `mosaic gateway config recover-token` when ready."
+  }
+}
+```
+
+Shape notes (actual code lands in CU-03-06):
+
+- Extract the recovery body so it can be called **both** from the standalone
+  command and from `bootstrapFirstUser` without duplicating prompts.
+- Reuse the same `rl` readline interface already open in `bootstrapFirstUser`
+  for the inline prompts.
+- Preserve non-interactive behavior: if `process.stdin.isTTY` is false, skip the
+  prompt and emit the "run recover-token" hint only.
+
+## 5. Test Strategy (CU-03-07 scope)
+
+### 5.1 Happy paths
+
+| Command | Scenario | Expected |
+| --- | --- | --- |
+| `mosaic gateway login` | Valid creds | `session.json` written, 7-day expiry, exit 0 |
+| `mosaic gateway config rotate-token` | Valid session, server reachable | `meta.json` updated, banner printed, new token usable |
+| `mosaic gateway config recover-token` | No session, valid creds, server reachable | Prompts for creds, writes session + meta, exit 0 |
+| Installer inline recovery | Re-run after `meta.json` wipe, operator says yes | Meta restored, banner printed, no manual CLI step needed |
+
+### 5.2 Error paths (must all produce actionable messages and non-zero exit)
+
+| Failure | Expected handling |
+| --- | --- |
+| Invalid email/password | BetterAuth 401 surfaced as "Sign-in failed: ", exit 2 |
+| Expired stored session | Recover command silently re-prompts; rotate command exits 2 with "run login" hint |
+| Gateway down / connection refused | "Could not reach gateway at ", exit 1 |
+| Server rejects token creation | Print status + body excerpt, exit 3 |
+| Meta file missing (recover) | Fall back to `--gateway` flag or default; warn that meta will be created |
+| Non-admin user | `AdminGuard` 403 surfaced as "User is not an admin", exit 2 |
+
+### 5.3 Integration test (recommended)
+
+Spin up gateway in test harness, create admin user via `/api/bootstrap/setup`,
+wipe `meta.json`, invoke `mosaic gateway config recover-token` programmatically,
+assert new `meta.adminToken` works against `GET /api/admin/tokens`.
+
+## 6. Risks & Open Questions
+
+| # | Item | Severity | Mitigation |
+| --- | --- | --- | --- |
+| 1 | `AdminGuard.validateSession` calls `getSession` with `fromNodeHeaders(request.raw.headers)`. CLI sends `Cookie:` header only. Confirm BetterAuth reads from `Cookie`, not `Set-Cookie`. | Low | Confirmed — `mosaic login` + `mosaic tui` already use this flow successfully (`cli.ts:137-181`). |
+| 2 | Session cookie local expiry (7d) vs BetterAuth server-side expiry may drift. | Low | `validateSession` hits `get-session`; handle 401 by re-prompting. |
+| 3 | Label collision / unbounded token growth if operators run `recover-token` repeatedly. | Low | Include ISO timestamp in label. Optional `--revoke-old` in CU-03-02. Add `tokens list/prune` later. |
+| 4 | `mosaic login` exists at top level and `mosaic gateway login` is a wrapper — risk of confusion. | Low | Document that `gateway login` is the preferred entry for gateway operators; top-level stays for compatibility. |
+| 5 | `meta.json` write is not atomic. Crash between token creation and `writeMeta` leaves an orphan token server-side with no plaintext on disk. | Medium | Accept for now — re-running `recover-token` mints a fresh token. Document as known limitation. |
+| 6 | Non-TTY installer runs (CI, headless provisioners) cannot prompt for creds interactively. | Medium | Installer inline recovery must skip prompt when `!process.stdin.isTTY`; emit the recover-token hint. |
+| 7 | If `BETTER_AUTH_SECRET` rotates between login and recover, the session cookie is invalid — user must re-login. Acceptable but surface a clear error. | Low | Error handler maps 401 on recover -> "Session invalid; re-run `mosaic gateway login`". |
+| 8 | No MFA today. When MFA lands, BetterAuth sign-in will return a challenge, not a cookie — recovery UX will need a second prompt step. | Future | Out of scope for this mission. Flag for future CLI work. |
+
+## 7. Downstream Task Hooks
+
+| Task | Scope |
+| --- | --- |
+| CU-03-02 | Implement `mosaic gateway login` wrapper + shared `runLogin` extraction. |
+| CU-03-03 | Implement `mosaic gateway config rotate-token`. |
+| CU-03-04 | Implement `mosaic gateway config recover-token`. |
+| CU-03-05 | Wire commands into `gateway.ts` registration, update `--help` copy. |
+| CU-03-06 | Installer inline recovery hook in `bootstrapFirstUser`. |
+| CU-03-07 | Tests per Section 5. |
+| CU-03-08 | Docs: update gateway install README + operator runbook with recovery flow. |
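
Review note: the exit-code contract in Section 3.3 and the token label in Section 3.4 are small enough to pin down as pure functions before CU-03-03/04 start. The following is a minimal TypeScript sketch under stated assumptions, not the planned implementation. The names `TokenRequestOutcome`, `GatewayMeta`, `recoveryLabel`, `exitCodeFor`, and `withAdminToken` are hypothetical and do not exist in the codebase. Formatting the label in UTC and treating a 2xx response without `plaintext` as a server rejection are also assumptions; the plan specifies neither.

```typescript
// Exit codes as listed in Section 3.3:
// 0 success, 1 network error, 2 auth error, 3 server rejection.
type ExitCode = 0 | 1 | 2 | 3;

// Hypothetical description of what happened when calling POST /api/admin/tokens.
type TokenRequestOutcome =
  | { kind: "network-error" }
  | { kind: "response"; status: number; plaintext?: string };

// Hypothetical subset of meta.json. The plan only names host, port and adminToken.
interface GatewayMeta {
  host?: string;
  port?: number;
  adminToken?: string;
}

// Builds the label from Section 3.4: "Recovered via CLI YYYY-MM-DDTHH:mm".
// Assumption: UTC, because toISOString() always returns UTC.
function recoveryLabel(now: Date): string {
  // "2026-04-04T09:30:15.000Z" -> first 16 chars -> "2026-04-04T09:30"
  return `Recovered via CLI ${now.toISOString().slice(0, 16)}`;
}

// Maps a request outcome to the documented exit codes.
function exitCodeFor(outcome: TokenRequestOutcome): ExitCode {
  if (outcome.kind === "network-error") return 1;
  // 401 (invalid or expired session) and 403 (non-admin user) are auth errors.
  if (outcome.status === 401 || outcome.status === 403) return 2;
  const ok = outcome.status >= 200 && outcome.status < 300;
  // Assumption: a success status without a plaintext token counts as a rejection.
  if (ok && outcome.plaintext) return 0;
  return 3;
}

// Returns a copy of meta with the new token set, leaving the input untouched.
function withAdminToken(meta: GatewayMeta, plaintext: string): GatewayMeta {
  return { ...meta, adminToken: plaintext };
}
```

Keeping these free of I/O would let CU-03-07 cover the error-path table in Section 5.2 without a running gateway; the HTTP call, prompts, and `writeMeta` stay in the command handlers.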