stack/docs/plans/auth-frontend-remediation.md
Jason Woltje dedc1af080
fix(auth): restore BetterAuth OIDC flow across api/web/compose
2026-02-17 23:37:49 -06:00


# Auth & Frontend Remediation Plan
**Created:** 2026-02-16
**Status:** Draft - Pending milestone/issue creation
**Scope:** Backend auth hardening + frontend OIDC-aware multi-method login
**Branch:** `develop`
---
## Executive Summary
The Mosaic Stack authentication system has critical gaps that cause silent 500 errors
in production and leave the frontend unable to adapt to backend configuration. The
frontend login UI is hardcoded for OIDC-only authentication with no fallback, no error
display, and no awareness of backend state.
This plan addresses both sides with a phased approach: fix the backend validation and
error handling first, then build a proper multi-method login UI that adapts to the
backend's advertised capabilities.
---
## Table of Contents
1. [Current State Assessment](#1-current-state-assessment)
2. [Architecture Design](#2-architecture-design)
3. [Backend Remediation](#3-backend-remediation)
4. [Frontend Remediation](#4-frontend-remediation)
5. [Security Hardening](#5-security-hardening)
6. [Implementation Phases](#6-implementation-phases)
7. [File Change Map](#7-file-change-map)
8. [Testing Strategy](#8-testing-strategy)
9. [Rollout & Rollback](#9-rollout--rollback)
10. [Open Questions](#10-open-questions)
---
## 1. Current State Assessment
### Backend
| Area | Status | Issue |
| ------------------------- | ---------------------- | ---------------------------------------------------- |
| OIDC startup validation | Incomplete | `OIDC_REDIRECT_URI` not validated |
| BetterAuth error handling | Missing | Silent 500s bypass NestJS exception filter |
| Auth config discovery | Missing | Frontend cannot learn what auth methods exist |
| Email/password backend | Enabled but incomplete | No email verification service configured |
| Docker env vars | Inconsistent | Swarm compose has no default for `OIDC_REDIRECT_URI` |
| trustedOrigins | Hardcoded | Production URLs in source code |
| PKCE | Not enabled | genericOAuth lacks `pkce: true` |
### Frontend
| Area | Grade | Issue |
| ------------------ | ----- | ------------------------------------------- |
| Auth flow | C+ | OIDC-only, no fallback path |
| OIDC awareness | D | Zero conditional logic, no env check |
| Login UI | C | Single OAuth button, no email/password form |
| Error display | D | Callback errors silently lost |
| Session management | A- | AuthProvider is solid |
| Route protection | B | Component-level only, no middleware |
| Theme storage key | Bug | Still reads `"jarvis-theme"` |
### Root Cause of Production 500
`POST /auth/sign-in/oauth2` returns 500 because:
1. `OIDC_REDIRECT_URI` may be empty in the Swarm deployment (no default value)
2. BetterAuth's genericOAuth plugin fails when constructing the authorization URL
3. The error is swallowed — `toNodeHandler()` operates outside NestJS exception handling
4. `validateOidcConfig()` checks only 3 of 4 required OIDC variables
---
## 2. Architecture Design
### Auth Discovery Pattern
The backend advertises available auth methods via `GET /auth/config`. The frontend
fetches this on the login page and renders the appropriate UI dynamically.
```
Browser                   API                       Authentik
│                         │                         │
│  GET /auth/config       │                         │
├────────────────────────►│                         │
│◄────────────────────────┤                         │
│  { providers: [...] }   │                         │
│                         │                         │
│  (render login UI       │                         │
│   based on providers)   │                         │
│                         │                         │
│  POST /auth/sign-in/... │                         │
├────────────────────────►│                         │
│             (BetterAuth handles flow)             │
│                         ├────────────────────────►│
│                         │◄────────────────────────┤
│◄────────────────────────┤                         │
│  Set-Cookie + redirect  │                         │
```
**Why backend discovery (not build-time env var):**
- Auth config can change without rebuilding the frontend Docker image
- Health-aware: backend can disable a provider if its upstream is unreachable
- Single source of truth: no risk of frontend/backend config drift
### Auth Config Response Shape
```typescript
interface AuthConfigResponse {
providers: AuthProvider[];
}
interface AuthProvider {
id: string; // "authentik", "email"
name: string; // Display name for UI
type: "oauth" | "credentials";
}
```
**What is NOT exposed:** client secrets, client IDs, issuer URLs, redirect URIs,
session expiry times, rate limit thresholds. Only capability metadata.
### Frontend Auth State Machine
```
loading ──► unauthenticated ──► authenticating ──► authenticated
                  │                    │                  │
                  │◄───── error ◄──────┘                  │
                  │                                       │
                  │◄──────────── session-expired ◄────────┘
```
States:
- `loading` — checking `/auth/session` on mount
- `unauthenticated` — no valid session, show login page
- `authenticating` — OAuth redirect or form submission in progress
- `authenticated` — valid session, user object available
- `error` — auth failed (network, credentials, OAuth, backend)
- `session-expired` — session ended mid-use, redirect to login
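The state machine above could be encoded as a small transition table. This is a minimal sketch, not the `AuthProvider` implementation: the event names and the `nextAuthState` helper are hypothetical, and the direct `loading` to `authenticated` transition is an assumption (the diagram does not draw it, but a valid session found on mount implies it).

```typescript
// The six states come from the plan; the events are illustrative assumptions.
type AuthState =
  | "loading"
  | "unauthenticated"
  | "authenticating"
  | "authenticated"
  | "error"
  | "session-expired";

type AuthEvent =
  | "SESSION_FOUND"
  | "SESSION_MISSING"
  | "SIGN_IN_STARTED"
  | "SIGN_IN_SUCCEEDED"
  | "SIGN_IN_FAILED"
  | "ERROR_ACKNOWLEDGED"
  | "SESSION_ENDED"
  | "REDIRECTED_TO_LOGIN";

// Allowed transitions per state; anything not listed is ignored.
const transitions: Record<AuthState, Partial<Record<AuthEvent, AuthState>>> = {
  loading: { SESSION_FOUND: "authenticated", SESSION_MISSING: "unauthenticated" },
  unauthenticated: { SIGN_IN_STARTED: "authenticating" },
  authenticating: { SIGN_IN_SUCCEEDED: "authenticated", SIGN_IN_FAILED: "error" },
  authenticated: { SESSION_ENDED: "session-expired" },
  error: { ERROR_ACKNOWLEDGED: "unauthenticated" },
  "session-expired": { REDIRECTED_TO_LOGIN: "unauthenticated" },
};

// Returns the next state, or the current state when the event does not apply.
function nextAuthState(state: AuthState, event: AuthEvent): AuthState {
  return transitions[state][event] ?? state;
}
```

Keeping the transitions in one table makes the "state transition tests" listed for Phase 6 straightforward to write, since every legal move is enumerable.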
---
## 3. Backend Remediation
### 3.1 Extend Startup Validation
**File:** `apps/api/src/auth/auth.config.ts`
Add `OIDC_REDIRECT_URI` to `REQUIRED_OIDC_ENV_VARS`. Add URL format validation:
- Must be a valid URL
- Path must start with `/auth/oauth2/callback`
- Warn if using `localhost` in production
**Tests to add:** Missing var, invalid URL, invalid path, valid URL.
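One possible shape for the URL checks listed above, as a sketch only. The function name `checkOidcRedirectUri`, its return type, and the message wording are hypothetical; the real change would live inside `validateOidcConfig()`.

```typescript
// Hypothetical result type: errors should fail startup, warnings are logged.
interface RedirectUriCheck {
  errors: string[];
  warnings: string[];
}

const REQUIRED_CALLBACK_PREFIX = "/auth/oauth2/callback";

function checkOidcRedirectUri(value: string | undefined, isProduction: boolean): RedirectUriCheck {
  const result: RedirectUriCheck = { errors: [], warnings: [] };

  // Missing or empty variable (the Swarm compose default is an empty string).
  if (value === undefined || value.trim() === "") {
    result.errors.push("OIDC_REDIRECT_URI is required when OIDC is enabled");
    return result;
  }

  // Must parse as an absolute URL.
  let parsed: URL;
  try {
    parsed = new URL(value);
  } catch {
    result.errors.push("OIDC_REDIRECT_URI must be a valid URL");
    return result;
  }

  // Path must start with the BetterAuth callback prefix.
  if (!parsed.pathname.startsWith(REQUIRED_CALLBACK_PREFIX)) {
    result.errors.push(`OIDC_REDIRECT_URI path must start with ${REQUIRED_CALLBACK_PREFIX}`);
  }

  // localhost in production is suspicious but not fatal.
  if (isProduction && (parsed.hostname === "localhost" || parsed.hostname === "127.0.0.1")) {
    result.warnings.push("OIDC_REDIRECT_URI points at localhost in production");
  }

  return result;
}
```

The four test cases named above (missing var, invalid URL, invalid path, valid URL) map directly onto the four branches.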
### 3.2 Auth Config Discovery Endpoint
**New endpoint:** `GET /auth/config` (public, no auth required)
Returns the list of enabled providers:
- Includes the `email` provider whenever `emailAndPassword.enabled` is true
- Includes `authentik` provider only when `OIDC_ENABLED=true` **and** the OIDC
provider is reachable (health check)
Cache: `Cache-Control: public, max-age=300` (5 minutes).
No rate limiting needed (read-only, public, low-risk).
**OIDC Health Check:** Implement `isOidcProviderReachable()` in `AuthService` that
fetches the OIDC discovery URL with a 2-second timeout. Cache the result for 30
seconds to avoid repeated network calls. When Authentik is unreachable, the
`authentik` provider is omitted from the config response, causing the frontend to
hide the OAuth button and show only email/password login.
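The timeout-plus-cache behaviour described above might look roughly like this. It is a sketch under assumptions: `createOidcHealthCheck`, `fetchProbe`, and the injectable `probe` and `now` parameters are hypothetical names chosen so the cache can be tested without a network; the plan itself places the method on `AuthService`.

```typescript
// A probe resolves to true when the discovery URL answered in time.
type Probe = (timeoutMs: number) => Promise<boolean>;

const PROBE_TIMEOUT_MS = 2000; // 2-second timeout from the plan
const CACHE_TTL_MS = 30000; // 30-second result cache from the plan

function createOidcHealthCheck(probe: Probe, now: () => number = Date.now) {
  let cached: { reachable: boolean; checkedAt: number } | undefined;

  return async function isOidcProviderReachable(): Promise<boolean> {
    // Serve the cached answer while it is younger than the TTL.
    if (cached !== undefined && now() - cached.checkedAt < CACHE_TTL_MS) {
      return cached.reachable;
    }
    let reachable: boolean;
    try {
      reachable = await probe(PROBE_TIMEOUT_MS);
    } catch {
      // Timeouts and network errors both count as unreachable.
      reachable = false;
    }
    cached = { reachable, checkedAt: now() };
    return reachable;
  };
}

// One possible probe: fetch the discovery document and abort after the timeout.
function fetchProbe(discoveryUrl: string): Probe {
  return async (timeoutMs) => {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const res = await fetch(discoveryUrl, { signal: controller.signal });
      return res.ok;
    } finally {
      clearTimeout(timer);
    }
  };
}
```

Negative results are cached as well, so an unreachable Authentik costs at most one 2-second probe every 30 seconds.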
**Secret leakage prevention:** The response must NOT contain `OIDC_CLIENT_SECRET`,
`OIDC_CLIENT_ID`, `OIDC_ISSUER`, `OIDC_REDIRECT_URI`, `BETTER_AUTH_SECRET`,
`JWT_SECRET`, `CSRF_SECRET`, session expiry times, or rate limit thresholds.
Add an explicit test that serializes the response body and asserts none of these
patterns appear.
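A helper for the leakage test could be as small as the sketch below. The name `findLeakedPatterns` is hypothetical and the pattern list is taken from the paragraph above. Matching is a case-insensitive substring search, so a camelCase key such as `clientSecret` would not be caught; passing the actual secret values as the second argument covers that gap.

```typescript
// Variable-name fragments that must never appear in the serialized response.
const FORBIDDEN_PATTERNS = [
  "CLIENT_SECRET",
  "CLIENT_ID",
  "OIDC_ISSUER",
  "REDIRECT_URI",
  "BETTER_AUTH_SECRET",
  "JWT_SECRET",
  "CSRF_SECRET",
];

// Returns every forbidden name or secret value found in the serialized body.
function findLeakedPatterns(body: unknown, secretValues: string[] = []): string[] {
  const serialized = JSON.stringify(body).toLowerCase();
  const needles = [...FORBIDDEN_PATTERNS, ...secretValues.filter((value) => value !== "")];
  return needles.filter((needle) => serialized.includes(needle.toLowerCase()));
}
```

In a test, the assertion would be that `findLeakedPatterns(responseBody, [process.env.OIDC_CLIENT_SECRET ?? ""])` returns an empty array.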
**Files:**
- `apps/api/src/auth/auth.controller.ts` — add endpoint
- `apps/api/src/auth/auth.service.ts` — add `getAuthConfig()` and `isOidcProviderReachable()`
- `packages/shared/src/types/auth.types.ts` — add `AuthProvider`, `AuthConfigResponse`
### 3.3 BetterAuth Error Handling Wrapper
**File:** `apps/api/src/auth/auth.controller.ts`
Wrap the `handler(req, res)` call in try/catch:
- Log errors with full context (method, URL, stack trace)
- If response not yet sent, throw `HttpException` to trigger `GlobalExceptionFilter`
- If response already started, log warning only (can't throw after headers sent)
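The three rules above can be sketched without NestJS or BetterAuth installed. Everything here is a stand-in: `RequestLike`, `ResponseLike`, `AuthHandlerError`, and `handleAuthSafely` are hypothetical names, and in the real controller the thrown error would be a NestJS `HttpException` so that `GlobalExceptionFilter` picks it up.

```typescript
// Minimal stand-ins for the Express request/response objects.
interface RequestLike {
  method: string;
  url: string;
}
interface ResponseLike {
  headersSent: boolean;
}
interface LoggerLike {
  warn: (...args: unknown[]) => void;
  error: (...args: unknown[]) => void;
}
type AuthHandler = (req: RequestLike, res: ResponseLike) => Promise<void>;

// Stand-in for HttpException: carries a status and a generic message only.
class AuthHandlerError extends Error {
  constructor(public readonly status: number, message: string) {
    super(message);
  }
}

async function handleAuthSafely(
  handler: AuthHandler,
  req: RequestLike,
  res: ResponseLike,
  log: LoggerLike = console,
): Promise<void> {
  try {
    await handler(req, res);
  } catch (error) {
    const stack = error instanceof Error ? error.stack : String(error);
    if (res.headersSent) {
      // Too late to change the response; record the failure and stop.
      log.warn(`BetterAuth failed after headers were sent: ${req.method} ${req.url}`, stack);
      return;
    }
    log.error(`BetterAuth handler error: ${req.method} ${req.url}`, stack);
    // Generic message so internal details never reach the client.
    throw new AuthHandlerError(500, "Unable to complete authentication");
  }
}
```

The full stack trace goes to the log while the client only sees the generic message, which also addresses the low-priority "error info leakage" finding.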
### 3.4 Docker Compose Fixes
**File:** `docker-compose.swarm.portainer.yml`
Change line 115 from:
```yaml
OIDC_REDIRECT_URI: ${OIDC_REDIRECT_URI}
```
To:
```yaml
OIDC_REDIRECT_URI: ${OIDC_REDIRECT_URI:-}
```
Empty string is intentional — startup validation catches it when OIDC is enabled.
### 3.5 Email/Password Status Decision
BetterAuth `emailAndPassword: { enabled: true }` is set but incomplete:
- No `sendVerificationEmail` callback configured
- No `sendResetPassword` callback configured
- No email service (SMTP/SendGrid) integrated
**Decision:** Keep enabled without email verification for MVP. This provides a
fallback login method when Authentik is unreachable. Users can sign in with
email/password but cannot reset passwords or verify email addresses. A future
milestone should add an email service (SMTP/SendGrid) with `sendVerificationEmail`
and `sendResetPassword` callbacks.
**Trade-off acknowledged:** The backend specialist recommended disabling until email
service exists (safer). We chose to keep enabled because: (a) it provides the only
fallback when Authentik is down, (b) the risk is limited — no public sign-up means
only admin-created accounts can use it, (c) password reset can be handled manually
by admins until the email service is added.
### 3.6 Extract trustedOrigins to Environment Variables
**File:** `apps/api/src/auth/auth.config.ts`
Replace hardcoded origins with a `getTrustedOrigins()` function that reads:
- `NEXT_PUBLIC_APP_URL` (primary frontend URL)
- `NEXT_PUBLIC_API_URL` (API's own origin)
- `TRUSTED_ORIGINS` (comma-separated additional origins)
- Development-only localhost fallbacks
Align with CORS configuration in `main.ts` to use the same origin list.
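A sketch of how `getTrustedOrigins()` might combine these sources. The environment is passed in as a parameter for testability, the specific localhost ports are an assumption (the plan only says "localhost fallbacks"), and silently dropping malformed entries is one possible policy rather than a decided one.

```typescript
type Env = Record<string, string | undefined>;

// Assumed development fallbacks; adjust to the ports the apps actually use.
const DEV_ORIGINS = ["http://localhost:3000", "http://localhost:3001"];

// Normalizes a URL to its origin; returns undefined for empty or malformed input.
function toOrigin(value: string | undefined): string | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  try {
    return new URL(value.trim()).origin;
  } catch {
    return undefined;
  }
}

function getTrustedOrigins(env: Env): string[] {
  const candidates: Array<string | undefined> = [
    env.NEXT_PUBLIC_APP_URL,
    env.NEXT_PUBLIC_API_URL,
    ...(env.TRUSTED_ORIGINS ?? "").split(","),
    // Localhost is only trusted outside production.
    ...(env.NODE_ENV === "production" ? [] : DEV_ORIGINS),
  ];

  // A Set removes duplicates while keeping first-seen order.
  const origins = new Set<string>();
  for (const candidate of candidates) {
    const origin = toOrigin(candidate);
    if (origin !== undefined) origins.add(origin);
  }
  return Array.from(origins);
}
```

Calling the same function from both the BetterAuth config and the CORS setup in `main.ts` keeps the two lists from drifting apart.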
### 3.7 Enable PKCE
**File:** `apps/api/src/auth/auth.config.ts`
Add `pkce: true` to the genericOAuth provider config. PKCE (Proof Key for Code
Exchange) prevents authorization code interception attacks. Authentik supports PKCE.
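To verify that PKCE is active, a test can inspect the authorization URL that BetterAuth builds. The helper below is a hypothetical sketch: `authorizationUrlUsesPkce` is not a BetterAuth API, and the optional `requireS256` flag is an extra assumption for teams that also want to assert the `S256` challenge method defined in RFC 7636.

```typescript
// Returns true when the authorization URL carries a PKCE code challenge.
function authorizationUrlUsesPkce(authorizationUrl: string, requireS256 = false): boolean {
  const params = new URL(authorizationUrl).searchParams;
  const challenge = params.get("code_challenge");
  if (challenge === null || challenge.length === 0) return false;
  // Optionally insist on the SHA-256 challenge method.
  return !requireS256 || params.get("code_challenge_method") === "S256";
}
```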
---
## 4. Frontend Remediation
### 4.1 Login Page Redesign
The login page adapts based on the auth config from `GET /auth/config`:
**When OIDC is enabled (OAuth + email/password):**
```
┌─────────────────────────────────┐
│     Welcome to Mosaic Stack     │
│                                 │
│ [error banner if ?error param]  │
│                                 │
│   ┌─────────────────────────┐   │
│   │ Continue with Authentik │   │  ← OAuthButton (primary)
│   └─────────────────────────┘   │
│                                 │
│ ──── or continue with email ──  │  ← AuthDivider
│                                 │
│ Email: [________________]       │
│ Password: [_____________]       │  ← LoginForm (secondary)
│          [ Continue ]           │
│                                 │
└─────────────────────────────────┘
```
**When OIDC is disabled (email/password only):**
```
┌─────────────────────────────────┐
│     Welcome to Mosaic Stack     │
│                                 │
│ [error banner if ?error param]  │
│                                 │
│ Email: [________________]       │
│ Password: [_____________]       │  ← LoginForm (primary)
│          [ Continue ]           │
│                                 │
└─────────────────────────────────┘
```
### 4.2 New Components
| Component | File | Purpose |
| ---------------------- | ------------------------------------------ | ------------------------------------------------------- |
| `OAuthButton` | `components/auth/OAuthButton.tsx` | Replaces `LoginButton`. Loading state, provider config. |
| `LoginForm` | `components/auth/LoginForm.tsx` | Email/password form with validation |
| `AuthErrorBanner` | `components/auth/AuthErrorBanner.tsx` | PDA-friendly error display |
| `AuthDivider` | `components/auth/AuthDivider.tsx` | "or continue with email" separator |
| `SessionExpiryWarning` | `components/auth/SessionExpiryWarning.tsx` | Floating banner when session nears expiry |
**Delete:** `components/auth/LoginButton.tsx` (replaced by `OAuthButton`)
### 4.3 PDA-Friendly Error Messages
All error messages follow PDA design principles. No alarming language.
| Error Source | Message |
| --------------------- | ---------------------------------------------------------------- |
| OAuth callback failed | "Authentication paused. Please try again when ready." |
| Invalid credentials | "The email and password combination wasn't recognized." |
| Network failure | "Unable to connect. Check your network and try again." |
| Backend 500 | "The service is taking a break. Please try again in a moment." |
| Backend 502/503 | "The service is temporarily unavailable. Try again in a moment." |
| Backend 504 | "The connection took longer than expected. Check your network." |
| Rate limited | "You've tried a few times. Take a moment and try again shortly." |
| Session expired | "Your session ended. Please sign in again when ready." |
**Colors:** Blue info banner (`bg-blue-50`, `border-blue-200`, `text-blue-700`).
No red. No warning icons. Info icon only.
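The table above could be encoded as a lookup, sketched here. The message strings are copied from the table; the source keys, the function names, the use of HTTP 429 for rate limiting, and the fallback to the "taking a break" message for any other status are assumptions for illustration.

```typescript
// Non-HTTP error sources from the table (key names are hypothetical).
type AuthErrorSource =
  | "oauth_callback"
  | "invalid_credentials"
  | "network"
  | "rate_limited"
  | "session_expired";

const MESSAGES_BY_SOURCE: Record<AuthErrorSource, string> = {
  oauth_callback: "Authentication paused. Please try again when ready.",
  invalid_credentials: "The email and password combination wasn't recognized.",
  network: "Unable to connect. Check your network and try again.",
  rate_limited: "You've tried a few times. Take a moment and try again shortly.",
  session_expired: "Your session ended. Please sign in again when ready.",
};

function messageForSource(source: AuthErrorSource): string {
  return MESSAGES_BY_SOURCE[source];
}

// Maps backend HTTP statuses to the matching table row.
function messageForStatus(status: number): string {
  switch (status) {
    case 429:
      return MESSAGES_BY_SOURCE.rate_limited;
    case 502:
    case 503:
      return "The service is temporarily unavailable. Try again in a moment.";
    case 504:
      return "The connection took longer than expected. Check your network.";
    default:
      // 500 and anything unexpected share the gentlest generic message.
      return "The service is taking a break. Please try again in a moment.";
  }
}
```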
### 4.4 Auth Config Fetching
The login page fetches `GET /auth/config` on mount to determine which providers
to render. If the fetch fails, fall back to showing only the email/password form
(safest default).
```typescript
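// In login page
const [authConfig, setAuthConfig] = useState<AuthConfigResponse | null>(null);

useEffect(() => {
  fetch(`${API_BASE_URL}/auth/config`)
    .then((res) => {
      // fetch() resolves on HTTP errors too, so treat non-2xx as a failure
      if (!res.ok) throw new Error(`Auth config request failed: ${res.status}`);
      return res.json() as Promise<AuthConfigResponse>;
    })
    .then(setAuthConfig)
    .catch(() => {
      // Fallback: show email/password only
      setAuthConfig({ providers: [{ id: "email", name: "Email", type: "credentials" }] });
    });
}, []);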
```
### 4.5 Loading States
- **OAuth button:** Shows spinner + "Connecting..." during redirect
- **Login form:** Inputs disabled + submit button shows spinner during API call
- **Callback page:** Already has spinner (no changes needed)
- **Session check:** Full-page spinner while AuthProvider checks `/auth/session`
### 4.6 Error Display on Login Page
The login page reads `?error=` query params from the callback redirect and displays
them in the `AuthErrorBanner`. Error codes are sanitized against an allowlist (already
implemented in callback page).
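For reference, allowlist sanitization of the `?error=` parameter can be sketched as below. The codes in `ALLOWED_ERROR_CODES` are placeholders (only `access_denied` appears elsewhere in this plan); the real allowlist is the one already implemented in the callback page, and `sanitizeErrorCode` is a hypothetical name.

```typescript
// Placeholder allowlist; the authoritative list lives in the callback page.
const ALLOWED_ERROR_CODES = new Set([
  "access_denied",
  "invalid_credentials",
  "server_error",
  "network_error",
  "rate_limited",
  "session_expired",
]);

// Returns null when no error is present, the code when it is allowlisted,
// and "unknown" for anything else so raw input is never rendered.
function sanitizeErrorCode(search: string): string | null {
  const raw = new URLSearchParams(search).get("error");
  if (raw === null) return null;
  return ALLOWED_ERROR_CODES.has(raw) ? raw : "unknown";
}
```

Because only allowlisted codes (or the literal `"unknown"`) leave this function, the banner can map them to PDA-friendly messages without ever echoing attacker-controlled text.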
### 4.7 Fix Theme Storage Key
**File:** `apps/web/src/providers/ThemeProvider.tsx`
Change `STORAGE_KEY` from `"jarvis-theme"` to `"mosaic-theme"`.
### 4.8 Accessibility Requirements
- All form inputs have associated `<label>` with `htmlFor`
- Error messages use `role="alert"` and `aria-live="polite"`
- Loading states use `role="status"` and `aria-label`
- Focus management: auto-focus first input on page load
- Keyboard navigation: Tab through all interactive elements
- Color contrast: WCAG 2.1 AA minimum (4.5:1 for text)
- Screen reader: descriptive `aria-label` on icon-only buttons
---
## 5. Security Hardening
### Priority: Critical
| Finding | Risk | Fix |
| -------------------------------------- | ---------------------------------- | ------------------------------------------------ |
| Missing `OIDC_REDIRECT_URI` validation | Open redirect / callback hijacking | Add to `REQUIRED_OIDC_ENV_VARS` + URL validation |
| PKCE not enabled | Authorization code interception | Add `pkce: true` to genericOAuth config |
### Priority: High
| Finding | Risk | Fix |
| --------------------------------------- | --------------------------------- | ----------------------------------- |
| Auth config endpoint could leak secrets | Information disclosure | Strict DTO, test for secret leakage |
| BetterAuth errors not logged | Blind spot in security monitoring | Try/catch wrapper with logging |
### Priority: Medium
| Finding | Risk | Fix |
| ------------------------------- | ----------------------------- | -------------------------------------------------------------------------- |
| `@SkipCsrf()` on auth catch-all | Potential CSRF on auth routes | Document rationale (BetterAuth handles CSRF internally via Fetch Metadata) |
| Session lacks idle timeout | Stale sessions | Set `updateAge` < `expiresIn` (e.g., 2h idle, 7d absolute) |
| trustedOrigins hardcoded | Configuration inflexibility | Extract to env vars |
| Rate limits too uniform | OAuth callbacks may hit limit | Consider dynamic rate limits per endpoint type |
### Priority: Low
| Finding | Risk | Fix |
| ---------------------------------- | ------------------------- | ------------------------------------- |
| Cross-origin cookie behavior | Safari/Firefox edge cases | Document; add `COOKIE_DOMAIN` env var |
| Error info leakage from BetterAuth | Stack trace exposure | Error wrapper returns generic message |
### Session Configuration Recommendation
```typescript
session: {
  expiresIn: 60 * 60 * 24 * 7, // 7 days absolute max
  updateAge: 60 * 60 * 2, // 2 hours idle timeout (sliding window)
},
advanced: {
  defaultCookieAttributes: {
    httpOnly: true,
    secure: process.env.NODE_ENV === "production",
    sameSite: "lax",
  },
},
```
---
## 6. Implementation Phases
### Phase 1: Critical Backend Fixes
**Scope:** Fix the production 500 error and logging blind spot.
| Story | Description |
| ----- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1.1 | Add `OIDC_REDIRECT_URI` to `REQUIRED_OIDC_ENV_VARS` with URL + path validation |
| 1.2 | Wrap BetterAuth handler in try/catch with error logging |
| 1.3 | Fix `docker-compose.swarm.portainer.yml` OIDC_REDIRECT_URI default |
| 1.4 | Enable PKCE in genericOAuth config. Verify PKCE works by checking that authorization URL includes `code_challenge` parameter. |
| 1.5 | Add inline documentation above `@SkipCsrf()` explaining: BetterAuth implements CSRF protection internally via Fetch Metadata headers (Sec-Fetch-Site, Sec-Fetch-Mode) and SameSite=Lax cookies. The `@SkipCsrf()` skips the _custom_ CSRF guard to avoid double-protection conflicts. Reference: https://www.better-auth.com/docs/reference/security |
**Tests:** Validation tests, error wrapper tests, PKCE verification (check `code_challenge` in auth URL).
**Dependencies:** None. Can deploy independently.
### Phase 2: Auth Config Discovery
**Scope:** Backend advertises available auth methods.
| Story | Description |
| ----- | -------------------------------------------------------------------------------- |
| 2.1 | Add `AuthProvider` and `AuthConfigResponse` types to `@mosaic/shared` |
| 2.2 | Implement `getAuthConfig()` in `AuthService` |
| 2.3 | Add `GET /auth/config` endpoint in `AuthController` |
| 2.4 | Add secret-leakage prevention test |
| 2.5 | Implement `isOidcProviderReachable()` health check with 2s timeout and 30s cache |
**Tests:** Unit tests for service, integration test for endpoint, health check mock tests.
**Dependencies:** None. Backward compatible (frontend doesn't use it yet).
### Phase 3: Backend Hardening
**Scope:** Security and configuration improvements.
| Story | Description |
| ----- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 3.1 | Extract `trustedOrigins` to `getTrustedOrigins()` with env vars |
| 3.2 | Align CORS config in `main.ts` to use same origin list |
| 3.3 | Update session config: change `expiresIn` to `60 * 60 * 24 * 7` (7 days absolute), `updateAge` to `60 * 60 * 2` (2h idle timeout). Add `advanced.defaultCookieAttributes` with explicit `httpOnly`, `secure`, `sameSite: "lax"`. Note: existing sessions will expire naturally under old rules; no migration needed. |
| 3.4 | Add `TRUSTED_ORIGINS`, `COOKIE_DOMAIN` to `.env.example` |
**Tests:** Origin extraction tests, session config tests.
**Dependencies:** Phase 1 (validation fixes).
### Phase 4: Frontend Foundation
**Scope:** Base components needed for the login page redesign.
| Story | Description |
| ----- | ----------------------------------------------------------------------------------- |
| 4.1 | Fix theme storage key (`"jarvis-theme"` to `"mosaic-theme"`) |
| 4.2 | Create `AuthErrorBanner` component with PDA-friendly messages |
| 4.3 | Create `AuthDivider` component |
| 4.4 | Create `OAuthButton` component (replaces `LoginButton`) |
| 4.5 | Create `LoginForm` component with email/password validation |
| 4.6 | Create `SessionExpiryWarning` component (floating banner, PDA-friendly, blue theme) |
**Tests:** Unit tests for each component.
**Dependencies:** None (components are independent).
### Phase 5: Login Page Integration
**Scope:** Wire everything together on the login page.
| Story | Description |
| ----- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 5.1 | Fetch `GET /auth/config` on login page mount |
| 5.2 | Render providers dynamically (OAuth + email/password) |
| 5.3 | Read and display `?error=` query params via `AuthErrorBanner` |
| 5.4 | Add loading states to OAuth and form flows |
| 5.5 | Delete old `LoginButton.tsx` and update imports |
| 5.6 | Responsive layout (mobile-first) |
| 5.7 | Accessibility audit per Section 4.8: run axe-core on login page, verify keyboard navigation, verify WCAG 2.1 AA color contrast (4.5:1), verify screen reader flow |
**Tests:** Integration tests, E2E tests for both auth paths.
**Dependencies:** Phase 2 (backend config endpoint), Phase 4 (components).
### Phase 6: Error Recovery & Polish
**Scope:** Robust error handling and UX polish.
| Story | Description |
| ----- | --------------------------------------------------------------------------------------------------- |
| 6.1 | Create `auth-errors.ts` with error parsing and PDA message mapping |
| 6.2 | Add retry logic for network errors (3x exponential backoff) |
| 6.3 | Enhance AuthProvider with `session-expiring` state |
| 6.4 | Integrate `SessionExpiryWarning` into authenticated layout; trigger 5 minutes before session expiry |
| 6.5 | Update `auth-client.ts` error messages to PDA-friendly language |
**Tests:** Error parsing tests, retry logic tests, state transition tests.
**Dependencies:** Phase 5 (login page working).
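Story 6.2 could be implemented along these lines. This is a sketch under assumptions: "3x" is read as up to three retries after the first attempt, the 1-second base delay is a guess, and `retryWithBackoff`, `shouldRetry`, and the injectable `sleep` are hypothetical names.

```typescript
interface RetryOptions {
  maxRetries?: number; // retries after the first attempt (assumed: 3)
  baseDelayMs?: number; // first delay; doubles on each retry (assumed: 1000)
  shouldRetry?: (error: unknown) => boolean; // e.g. only network errors
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const {
    maxRetries = 3,
    baseDelayMs = 1000,
    shouldRetry = () => true,
    sleep = defaultSleep,
  } = options;

  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      // Give up when retries are exhausted or the error is not retryable.
      if (attempt >= maxRetries || !shouldRetry(error)) throw error;
      // Delays grow as base, 2x base, 4x base.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Passing a `shouldRetry` that returns true only for network failures keeps credential errors and rate-limit responses from being retried automatically.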
---
## 7. File Change Map
### Files to Create
| File | Phase | Purpose |
| ------------------------------------------------------- | ----- | ------------------------------------- |
| `apps/web/src/components/auth/OAuthButton.tsx` | 4 | OAuth login button with loading state |
| `apps/web/src/components/auth/LoginForm.tsx` | 4 | Email/password form |
| `apps/web/src/components/auth/AuthErrorBanner.tsx` | 4 | PDA-friendly error display |
| `apps/web/src/components/auth/AuthDivider.tsx` | 4 | "or continue with email" separator |
| `apps/web/src/lib/auth/auth-errors.ts` | 6 | Error parsing and message mapping |
| `apps/web/src/components/auth/SessionExpiryWarning.tsx` | 4 | Session expiry warning banner |
### Files to Modify
| File | Phase | Changes |
| ------------------------------------------ | ----- | ------------------------------------------------------------------ |
| `apps/api/src/auth/auth.config.ts` | 1, 3 | OIDC_REDIRECT_URI validation, PKCE, trustedOrigins, session config |
| `apps/api/src/auth/auth.config.spec.ts` | 1, 3 | New validation tests |
| `apps/api/src/auth/auth.controller.ts` | 1, 2 | Error wrapper, GET /auth/config endpoint |
| `apps/api/src/auth/auth.service.ts` | 2 | getAuthConfig(), isOidcProviderReachable() |
| `apps/api/src/main.ts` | 3 | Align CORS with getTrustedOrigins() |
| `packages/shared/src/types/auth.types.ts` | 2 | AuthProvider, AuthConfigResponse types |
| `packages/shared/src/types/index.ts` | 2 | Export new types |
| `apps/web/src/app/(auth)/login/page.tsx` | 5 | Full redesign with dynamic providers |
| `apps/web/src/lib/auth-client.ts` | 6 | PDA error messages |
| `apps/web/src/lib/auth/auth-context.tsx` | 6 | Session-expiring state |
| `apps/web/src/providers/ThemeProvider.tsx` | 4 | Fix storage key |
| `docker-compose.swarm.portainer.yml` | 1 | OIDC_REDIRECT_URI default |
| `.env.example` | 3 | New env vars documented |
### Files to Delete
| File | Phase | Reason |
| --------------------------------------------------- | ----- | ----------------------------- |
| `apps/web/src/components/auth/LoginButton.tsx` | 5 | Replaced by OAuthButton |
| `apps/web/src/components/auth/LoginButton.test.tsx` | 5 | Replaced by OAuthButton tests |
---
## 8. Testing Strategy
### Unit Tests (per phase)
**Phase 1:**
- `validateOidcConfig()` with missing OIDC_REDIRECT_URI
- `validateOidcConfig()` with invalid URL
- `validateOidcConfig()` with invalid callback path
- `validateOidcConfig()` with valid config
- BetterAuth handler error wrapper (mock throw, headers sent vs not)
- PKCE: verify authorization URL includes `code_challenge` parameter
**Phase 2:**
- `getAuthConfig()` with OIDC enabled returns authentik provider
- `getAuthConfig()` with OIDC disabled returns only email provider
- `getAuthConfig()` omits authentik when OIDC provider is unreachable (mock health check)
- `GET /auth/config` returns correct response shape
- `GET /auth/config` sets Cache-Control header
- Secret leakage: serialize response body and assert it does NOT contain `CLIENT_SECRET`, `CLIENT_ID`, `JWT_SECRET`, `BETTER_AUTH_SECRET`, `CSRF_SECRET`, `REDIRECT_URI`, or `callback` strings
- Health check: `isOidcProviderReachable()` returns false on timeout, caches result for 30s
**Phase 3:**
- `getTrustedOrigins()` with various env configs
- `getTrustedOrigins()` excludes localhost in production
- Session idle timeout behavior
**Phase 4:**
- `AuthErrorBanner` renders PDA-friendly messages
- `AuthErrorBanner` is dismissible
- `LoginForm` validates email and password
- `LoginForm` shows loading state during submission
- `OAuthButton` shows loading state during redirect
**Phase 5:**
- Login page renders OAuth button when OIDC enabled
- Login page hides OAuth button when OIDC disabled
- Login page displays error from query params
- Login page falls back to email-only on config fetch failure
**Phase 6:**
- `parseAuthError()` classifies all error types
- Retry logic respects max retries
- Session expiry detection triggers warning
### E2E Tests (Playwright)
- Login with OIDC enabled: OAuth button visible, form visible, divider visible
- Login with OIDC disabled: Only email form visible
- Error display: Visit `/login?error=access_denied`, verify PDA-friendly banner
- Form validation: Submit empty form, verify error messages
### Coverage Target
Minimum 85% coverage on all new and modified files.
---
## 9. Rollout & Rollback
### Deployment Order
1. **Phase 1** (backend fixes): deploy immediately; this fixes the production 500
2. **Phase 2** (auth config endpoint): deploy next; backward compatible
3. **Phase 3** (hardening): deploy with Phase 2 or independently
4. **Phases 4-6** (frontend): deploy together as a single release
### Rollback Plan
**Backend (Phases 1-3):** All changes are backward compatible. No rollback needed
unless a bug is introduced. Standard revert-and-redeploy.
**Frontend (Phases 4-6):** Before deploying Phases 4-6, tag the current `develop`
branch (`git tag pre-auth-redesign`). If the new login page has issues:
1. Revert the frontend changes and redeploy from the tag
2. Recover the old `LoginButton` component from git history if it is needed
3. Leave the backend auth config endpoint deployed; it stays in place but unused
### Monitoring
After deployment, watch for:
- Auth error rates (should decrease after Phase 1)
- `/auth/config` latency and error rate
- Login success rate by method (OAuth vs email/password)
- BetterAuth handler error logs (new visibility from Phase 1)
---
## 10. Open Questions
### Q1: Should the auth config be polled or fetched once?
**Context:** The frontend needs to know what auth methods are available.
**Recommendation:** Fetch once on login page mount. The 5-minute cache header on the
backend handles staleness. No polling needed — auth config rarely changes at runtime.
### Q2: Rate limiting granularity
**Context:** Current rate limit is 10 req/min for all auth routes. OAuth callbacks
could hit this during normal use (3 requests per login attempt).
**Recommendation:** Keep uniform rate limit for now. If users report being rate-limited
during OAuth flows, implement dynamic rate limits per endpoint type in a follow-up.
### Q3: Feature flag for gradual rollout?
**Context:** The frontend changes are significant. A feature flag could enable gradual
rollout.
**Recommendation:** Not needed for this scope. The changes are deterministic (based on
backend config) and can be tested thoroughly in staging before production deployment.
---
## Appendix A: Data Flow Diagrams
### OAuth Sign-In Flow
```
Browser                       NestJS API                Authentik
│                             │                         │
│  1. GET /auth/config        │                         │
├────────────────────────────►│                         │
│◄────────────────────────────┤                         │
│  { providers: [authentik] } │                         │
│                             │                         │
│  2. Click "Continue with Authentik"                   │
│     signIn.oauth2({ providerId: "authentik" })        │
│                             │                         │
│  3. POST /auth/sign-in/oauth2                         │
├────────────────────────────►│                         │
│               BetterAuth constructs auth URL          │
│◄────────────────────────────┤                         │
│  302 → Authentik authorize  │                         │
│                             │                         │
│  4. Redirect to Authentik   │                         │
├──────────────────────────────────────────────────────►│
│                             │      User authenticates │
│◄──────────────────────────────────────────────────────┤
│  302 → /auth/oauth2/callback/authentik?code=X         │
│                             │                         │
│  5. GET /auth/oauth2/callback/authentik?code=X        │
├────────────────────────────►│                         │
│                  BetterAuth exchanges code            │
│                             ├────────────────────────►│
│                             │◄────────────────────────┤
│                             │         tokens          │
│                  Creates User + Session               │
│◄────────────────────────────┤                         │
│  Set-Cookie + 302 → /       │                         │
│                             │                         │
│  6. Frontend /callback page │                         │
│     refreshSession()        │                         │
│     redirect → /tasks       │                         │
```
### Email/Password Sign-In Flow
```
Browser                       NestJS API
│                             │
│  1. GET /auth/config        │
├────────────────────────────►│
│◄────────────────────────────┤
│  { providers: [email] }     │
│                             │
│  2. User enters email + password
│     Clicks "Continue"       │
│                             │
│  3. POST /auth/sign-in/email│
│     { email, password }     │
├────────────────────────────►│
│                             │  BetterAuth verifies
│                             │  Creates Session
│◄────────────────────────────┤
│  Set-Cookie + user data     │
│                             │
│  4. Frontend stores session │
│     redirect → /tasks       │
```
---
## Appendix B: PDA Error Message Reference
These messages replace standard error language throughout the auth UI.
| Standard Term | PDA-Friendly Alternative |
| ----------------- | ------------------------------------------------- |
| Error | (omit entirely or use "Notice") |
| Failed | "Paused" or "Didn't complete" |
| Invalid | "Wasn't recognized" |
| Required | "Needed to continue" |
| Try again | "Consider trying again" or "Try again when ready" |
| Unauthorized | "Session ended" |
| Forbidden | "Not available" |
| Timeout | "Took longer than expected" |
| Too many requests | "You've tried a few times" |
**Color scheme:** Blue (`bg-blue-50`), not red. Info icon, not warning icon.
---
## Appendix C: Environment Variables
New or modified environment variables introduced by this plan:
| Variable | Default | Required | Description |
| ------------------- | ------- | ----------------- | -------------------------------------------- |
| `OIDC_REDIRECT_URI` | (empty) | When OIDC enabled | OAuth callback URL |
| `TRUSTED_ORIGINS` | (empty) | No | Additional trusted origins (comma-separated) |
| `COOKIE_DOMAIN` | (empty) | No | Cookie domain for cross-subdomain sessions |
---
_End of plan. Future sessions will use this to create the proper milestone and
issues in the Gitea repository._