stack/docs/scratchpads/362-auth-session-chain-debug.md
Jason Woltje d2cec04cba
fix(auth): preserve raw BetterAuth cookie token for session lookup
2026-02-18 23:06:37 -06:00


# 362 - Auth Session Chain Debug (Authentik -> BetterAuth -> API Guard)
## Context
- Date (UTC): 2026-02-19
- Environment under test: production domains
- Web: `https://app.mosaicstack.dev/login`
- API: `https://api.mosaicstack.dev`
- IdP: `https://auth.diversecanvas.com`
- Tooling: Playwright MCP + Chromium
## Problem Statement
Users can complete Authentik login and consent, but the Mosaic web app then redirects back to the login page and the user remains unauthenticated.
## Timeline and Evidence
1. Initial reproduction from web login:
   - `POST /auth/sign-in/oauth2` returned `200` with Authentik authorize URL.
   - Authentik login flow and consent screen loaded correctly.
2. First callback failure mode (before `jarvis` email fix):
   - Callback ended at API error redirect with `error=email_is_missing`.
   - Result URL: `https://api.mosaicstack.dev/?error=email_is_missing`.
3. User updated Authentik account:
   - `jarvis` account email set to `jarvis@mosaic.local`.
   - `email_is_missing` failure no longer occurs.
4. Current callback behavior (after email fix):
   - `GET /auth/oauth2/callback/authentik?code=...&state=...` returns `302` to `https://app.mosaicstack.dev/`.
   - Callback sets BetterAuth cookies:
     - `__Secure-better-auth.state=...; Max-Age=0; ...`
     - `__Secure-better-auth.session_token=...; Max-Age=604800; Path=/; HttpOnly; Secure; SameSite=Lax`
   - Browser cookie jar confirms session cookie present for `api.mosaicstack.dev`.
5. Session validation mismatch (critical):
   - BetterAuth direct session endpoint succeeds:
     - `GET /auth/get-session` -> `200` with session payload.
   - Guarded API session endpoint fails:
     - `GET /auth/session` -> `401` with `{"message":"Invalid or expired session", ...}`
   - Reproduced repeatedly in same browser context immediately after callback.
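The comparison in step 5 could be scripted roughly as follows. This is a minimal sketch, not code from the repo: `probeSessionEndpoints`, `isGuardMismatch`, and the `Fetcher` type are hypothetical names, and the fetch-like function is injected so the logic can be exercised with a stub instead of a live request.

```typescript
// Hypothetical probe; none of these names come from the repository.
type ProbeResult = { path: string; status: number };

// Minimal fetch-like signature so a stub (or a real fetch wrapper) can be injected.
type Fetcher = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ status: number }>;

// Send the same Cookie header to both session endpoints and record each status.
async function probeSessionEndpoints(
  fetcher: Fetcher,
  apiBase: string,
  cookieHeader: string,
): Promise<ProbeResult[]> {
  const paths = ["/auth/get-session", "/auth/session"];
  const results: ProbeResult[] = [];
  for (const path of paths) {
    const res = await fetcher(`${apiBase}${path}`, { headers: { cookie: cookieHeader } });
    results.push({ path, status: res.status });
  }
  return results;
}

// The mismatch described above: the native endpoint accepts the cookie
// while the guarded endpoint rejects the same request context.
function isGuardMismatch(results: ProbeResult[]): boolean {
  const native = results.find((r) => r.path === "/auth/get-session");
  const guarded = results.find((r) => r.path === "/auth/session");
  return native?.status === 200 && guarded !== undefined && guarded.status !== 200;
}
```

With the statuses observed above (`200` and `401`), `isGuardMismatch` returns `true`. Because the session cookie is `HttpOnly`, the cookie value would have to come from the browser context rather than from page script.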
## Config Sync Notes
User synced local files with deployed Portainer stack:
- `.env` updated with deployed values.
- `docker-compose.swarm.portainer.yml` changed:
  - Removed `BETTER_AUTH_URL` env mapping from API service.
Observed auth behavior after sync:
- Improvement: removed `email_is_missing` callback error.
- Remaining failure: `/auth/session` still returns 401 despite valid BetterAuth cookie and successful `/auth/get-session`.
## Root Cause Hypothesis (Strong)
`AuthGuard` extracts BetterAuth session cookie token correctly, but `AuthService.verifySession()` validates it using `Authorization: Bearer <token>` instead of a BetterAuth cookie/header context.
Relevant code paths:
- `apps/api/src/auth/guards/auth.guard.ts`
  - extracts `__Secure-better-auth.session_token` / `better-auth.session_token`
- `apps/api/src/auth/auth.service.ts`
  - `verifySession()` calls `auth.api.getSession({ headers: { authorization: "Bearer ..." } })`
Why this matches evidence:
- `/auth/get-session` (native BetterAuth endpoint reading request cookie) succeeds.
- `/auth/session` (custom guard + verify path) fails for same browser session.
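The hypothesis can be illustrated with a small model. The sketch below is not BetterAuth's API: `fakeGetSession` is a hypothetical stand-in that assumes only that the session lookup reads the `Cookie` header and ignores `Authorization`, which is the behavior the evidence suggests for this deployment.

```typescript
// Stand-in for a cookie-based session lookup. NOT BetterAuth's API; it only
// models "the lookup reads the Cookie header and nothing else".
type HeaderMap = Record<string, string>;

const SECURE_SESSION_COOKIE = "__Secure-better-auth.session_token";

// Pull one cookie value out of a raw Cookie header without decoding it.
function readCookie(cookieHeader: string | undefined, cookieName: string): string | null {
  if (!cookieHeader) return null;
  for (const part of cookieHeader.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === cookieName) return part.slice(eq + 1).trim();
  }
  return null;
}

// Hypothetical lookup: succeeds only when the cookie carries a known token.
function fakeGetSession(headers: HeaderMap, knownTokens: Set<string>): { token: string } | null {
  const token = readCookie(headers["cookie"], SECURE_SESSION_COOKIE);
  return token !== null && knownTokens.has(token) ? { token } : null;
}

const knownTokens = new Set(["abc123"]); // made-up token for illustration

// Same token, two transports: only the cookie form is seen by the lookup.
const viaBearer = fakeGetSession({ authorization: "Bearer abc123" }, knownTokens);
const viaCookie = fakeGetSession({ cookie: `${SECURE_SESSION_COOKIE}=abc123` }, knownTokens);
```

Under that assumption, `viaBearer` is `null` and `viaCookie` resolves, which mirrors the `401` on `/auth/session` next to the `200` on `/auth/get-session`.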
## Next Actions
1. Fix `verifySession()` to validate using BetterAuth-compatible cookie header candidates first, with bearer fallback for API clients.
2. Add/update unit tests in `auth.service.spec.ts` to cover cookie-first validation and bearer fallback.
3. Re-run targeted API auth tests.
4. Re-run Playwright auth chain to confirm:
   - callback sets cookie
   - `/auth/session` returns `200`
   - web app transitions out of `/login`.
## Implementation Update (2026-02-19)
Completed items:
1. Updated backend session verification logic:
   - File: `apps/api/src/auth/auth.service.ts`
   - `verifySession()` now tries session resolution in this order:
     - `cookie: __Secure-better-auth.session_token=<token>`
     - `cookie: better-auth.session_token=<token>`
     - `cookie: __Host-better-auth.session_token=<token>`
     - `authorization: Bearer <token>` (fallback)
   - Added helper methods:
     - `buildSessionHeaderCandidates()`
     - `isExpectedAuthError()`
2. Added/updated tests:
   - File: `apps/api/src/auth/auth.service.spec.ts`
   - Added RED->GREEN test:
     - `should validate session token using secure BetterAuth cookie header`
   - Updated fallback coverage test:
     - `should fall back to Authorization header when cookie-based lookups miss`
3. Verification:
   - Command: `pnpm --filter @mosaic/api test -- src/auth/auth.service.spec.ts`
     - Result: pass (all tests green).
   - Command: `pnpm --filter @mosaic/api lint`
     - Result: pass.
Remaining step (requires deploy):
- Redeploy API with this patch and rerun live Playwright flow on `app.mosaicstack.dev` to confirm `/auth/session` returns `200` after callback.
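The candidate ordering described in this update might look roughly like the sketch below. It is a reconstruction under assumptions: the real bodies in `auth.service.ts` are not shown in these notes, `resolveSession` and the `SessionLookup` type are hypothetical, and `isExpectedAuthError()` is left out because its behavior is not described here.

```typescript
// Hypothetical reconstruction; the actual helper bodies are not shown in these notes.
type HeaderCandidate = Record<string, string>;
type SessionData = Record<string, unknown>;
// Assumed shape of the underlying lookup: resolves a session or null for a header set.
type SessionLookup = (headers: HeaderCandidate) => Promise<SessionData | null>;

const SESSION_COOKIE_NAMES = [
  "__Secure-better-auth.session_token",
  "better-auth.session_token",
  "__Host-better-auth.session_token",
];

// Build header sets in the documented order: cookie variants first, bearer last.
function buildSessionHeaderCandidates(token: string): HeaderCandidate[] {
  const candidates: HeaderCandidate[] = SESSION_COOKIE_NAMES.map(
    (cookieName): HeaderCandidate => ({ cookie: `${cookieName}=${token}` }),
  );
  candidates.push({ authorization: `Bearer ${token}` });
  return candidates;
}

// Try each candidate in order and stop at the first one that resolves a session.
async function resolveSession(
  token: string,
  lookup: SessionLookup,
): Promise<SessionData | null> {
  for (const headers of buildSessionHeaderCandidates(token)) {
    const session = await lookup(headers);
    if (session) return session;
  }
  return null;
}
```

Putting the cookie variants first means browser sessions resolve on the first attempt, while API clients that send only a bearer token still reach the fallback after the cookie lookups miss.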
## Playwright Re-Check (2026-02-19, later run)
Live flow evidence after previous deploy attempt:
1. OAuth callback succeeds:
   - `GET https://api.mosaicstack.dev/auth/oauth2/callback/authentik?code=...&state=...` -> `302`
   - Redirect target observed: `https://app.mosaicstack.dev/`
   - Browser cookie jar includes:
     - `__Secure-better-auth.session_token` on `api.mosaicstack.dev` (HttpOnly, Secure, SameSite=Lax)
2. Session bootstrap still fails immediately:
   - `GET https://api.mosaicstack.dev/auth/session` -> `500`
   - Response body shape:
     - `{"success":false,"message":"An unexpected error occurred","errorId":"...","path":"/auth/session","statusCode":500}`
   - Web app returns to login because session fetch fails.
3. Frontend version mismatch observed:
   - Live `POST /auth/sign-in/oauth2` response from login flow still shows callback URL pointing to `/dashboard`.
   - Current repository login page uses callback URL `/`.
   - This indicates deployed web image is older than current `develop` code (or stale image tag in runtime).
## Additional Code Fix Applied Locally (pending push/deploy)
Refined cookie candidate construction in API session verification:
- File: `apps/api/src/auth/auth.service.ts`
- Removed URL-encoding of session token when constructing cookie headers.
- Cookie candidates now pass raw token value exactly as extracted from incoming cookie.
Why:
- BetterAuth cookie tokens can contain characters like `/`, `+`, and `=`.
- Re-encoding these values can mutate token bytes and cause lookup/parse failures.
Regression test added:
- File: `apps/api/src/auth/auth.service.spec.ts`
- `should preserve raw cookie token value without URL re-encoding`
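The effect of re-encoding can be seen with a made-up token. In this sketch, `buildCookieHeader` and its `encode` flag are hypothetical and exist only to contrast the two behaviors; the token value is invented and merely contains the characters named above.

```typescript
// Illustrative helper; the name and the `encode` flag are hypothetical.
function buildCookieHeader(cookieName: string, token: string, encode: boolean): string {
  const value = encode ? encodeURIComponent(token) : token;
  return `${cookieName}=${value}`;
}

const sessionCookieName = "__Secure-better-auth.session_token";
// Made-up token containing `/`, `+`, and `=`.
const sampleToken = "abc/def+ghi=";

// Re-encoded: __Secure-better-auth.session_token=abc%2Fdef%2Bghi%3D
const reEncoded = buildCookieHeader(sessionCookieName, sampleToken, true);

// Raw: __Secure-better-auth.session_token=abc/def+ghi=
const passedRaw = buildCookieHeader(sessionCookieName, sampleToken, false);
```

The two headers differ byte for byte, so any lookup that compares or parses the cookie value as received would no longer see the token that was originally issued.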