# Performance Optimization — P8-003 **Branch:** `feat/p8-003-performance` **Target metrics:** <200 ms TTFB, <2 s page loads --- ## What Was Profiled The following areas were reviewed through static analysis and code-path tracing (no production traffic available; findings are based on measurable code-level patterns): | Area | Findings | | ---------------------------------- | -------------------------------------------------------------------------------------------------------- | | `packages/db` | Connection pool unbounded (default 10, no idle/connect timeout) | | `apps/gateway/src/preferences` | N+1 round-trip on every pref upsert (SELECT + INSERT/UPDATE) | | `packages/brain/src/conversations` | Unbounded list queries — no `LIMIT` or `ORDER BY` | | `packages/db/src/schema` | Missing hot-path indexes: auth session lookup, OAuth callback, conversation list, agent-log tier queries | | `apps/gateway/src/gc` | Cold-start GC blocked NestJS bootstrap (synchronous `await` in `onModuleInit`) | | `apps/web/next.config.ts` | Missing `compress: true`, no `productionBrowserSourceMaps: false`, no image format config | --- ## Changes Made ### 1. DB Connection Pool — `packages/db/src/client.ts` **Problem:** `postgres()` was called with no pool config. The default max of 10 connections and no idle/connect timeouts meant the pool could hang indefinitely on a stale TCP connection. **Fix:** - `max`: 20 connections (configurable via `DB_POOL_MAX`) - `idle_timeout`: 30 s (configurable via `DB_IDLE_TIMEOUT`) — recycle stale connections - `connect_timeout`: 5 s (configurable via `DB_CONNECT_TIMEOUT`) — fail fast on unreachable DB **Expected impact:** Eliminates pool exhaustion under moderate concurrency; removes indefinite hangs when the DB is temporarily unreachable. --- ### 2. Preferences Upsert — `apps/gateway/src/preferences/preferences.service.ts` **Problem:** `upsertPref` executed two serial DB round-trips on every preference write: ``` 1. SELECT id FROM preferences WHERE user_id = ? AND key = ? (→ check exists) 2a. UPDATE preferences SET value = ? … (→ if found) 2b. INSERT INTO preferences … (→ if not found) ``` Under concurrency this also had a TOCTOU race window. **Fix:** Replaced with single-statement `INSERT … ON CONFLICT DO UPDATE`: ```sql INSERT INTO preferences (user_id, key, value, mutable) VALUES (?, ?, ?, true) ON CONFLICT (user_id, key) DO UPDATE SET value = excluded.value, updated_at = now(); ``` This required promoting `preferences_user_key_idx` from a plain index to a `UNIQUE INDEX` (see migration `0003_p8003_perf_indexes.sql`). **Expected impact:** ~50% reduction in DB round-trips for preference writes; eliminates the race window. --- ### 3. Missing DB Indexes — `packages/db/src/schema.ts` + migration The following indexes were added or replaced to cover common query patterns: | Table | Old indexes | New / changed | | --------------- | ------------------------------------------------- | --------------------------------------------------------------------------------------------------- | | `sessions` | _(none)_ | `sessions_user_id_idx(user_id)`, `sessions_expires_at_idx(expires_at)` | | `accounts` | _(none)_ | `accounts_provider_account_idx(provider_id, account_id)`, `accounts_user_id_idx(user_id)` | | `conversations` | `(user_id)`, `(archived)` separate | `conversations_user_archived_idx(user_id, archived)` compound | | `agent_logs` | `(session_id)`, `(tier)`, `(created_at)` separate | `agent_logs_session_tier_idx(session_id, tier)`, `agent_logs_tier_created_at_idx(tier, created_at)` | | `preferences` | non-unique `(user_id, key)` | **unique** `(user_id, key)` — required for `ON CONFLICT` | **Expected impact:** - Auth session validation (hot path on every request): from seq scan → index scan - OAuth callback account lookup: from seq scan → index scan - Conversation list (dashboard load): compound index covers `WHERE user_id = ? ORDER BY updated_at` - Log summarisation cron: `(tier, created_at)` index enables efficient hot→warm promotion query All changes are in `packages/db/drizzle/0003_p8003_perf_indexes.sql`. --- ### 4. Conversation Queries — `packages/brain/src/conversations.ts` **Problem:** `findAll(userId)` and `findMessages(conversationId)` were unbounded — no `LIMIT` and `findAll` had no `ORDER BY`, so the DB planner may not use the index efficiently. **Fix:** - `findAll`: `ORDER BY updated_at DESC LIMIT 200` — returns most-recent conversations first - `findMessages`: `ORDER BY created_at ASC LIMIT 500` — chronological message history **Expected impact:** Prevents accidental full-table scans on large datasets; ensures the frontend receives a usable, ordered result set regardless of table growth. --- ### 5. Cold-Start GC — `apps/gateway/src/gc/session-gc.service.ts` **Problem:** `onModuleInit()` was `async` and `await`-ed `fullCollect()`, which blocked the NestJS module initialization chain. Full GC — which calls `redis.keys('mosaic:session:*')` and a DB query — typically takes 100–500 ms. This directly added to startup TTFB. **Fix:** Made `onModuleInit()` synchronous and used `.then().catch()` to run GC in the background. The first HTTP request is no longer delayed by GC work. **Expected impact:** Removes 100–500 ms from cold-start TTFB. --- ### 6. Next.js Config — `apps/web/next.config.ts` **Problem:** `compress: true` was not set, so response payloads were uncompressed. No image format optimization or source-map suppression was configured. **Fix:** - `compress: true` — enables gzip/brotli for all Next.js responses - `productionBrowserSourceMaps: false` — reduces build output size - `images.formats: ['image/avif', 'image/webp']` — Next.js Image component will serve modern formats to browsers that support them (typically 40–60% smaller than JPEG/PNG) **Expected impact:** Typical HTML/JSON gzip savings of 60–80%; image serving cost reduced for any `` components added in the future. --- ## What Was Not Changed (Intentionally) - **Caching layer (Valkey/Redis):** The `SystemOverrideService` and GC already use Redis pipelines. `PreferencesService.getEffective()` reads all user prefs in one query — this is appropriate for the data size and doesn't warrant an additional cache layer yet. - **WebSocket backpressure:** The `ChatGateway` already drops events for disconnected clients (`client.connected` check) and cleans up listeners on disconnect. No memory leak was found. - **Plugin/skill loader startup:** `SkillLoaderService.loadForSession()` is called on first session creation, not on startup. Already non-blocking. - **Frontend React memoization:** No specific hot components were identified as causing excessive re-renders without profiling data. No speculative `memo()` calls added. --- ## How to Apply ```bash # Run the DB migration (requires a live DB) pnpm --filter @mosaic/db exec drizzle-kit migrate # Or, in Docker/Swarm — migrations run automatically on gateway startup # via runMigrations() in packages/db/src/migrate.ts ``` --- _Generated by P8-003 performance optimization task — 2026-03-18_