# Issue #189: Add Composite Database Index for job_events Table ## Objective Add an optimal composite index to the `job_events` table to improve query performance based on common access patterns identified in the codebase. ## Analysis of Query Patterns ### Current Schema (line 1193-1213 in schema.prisma) ```prisma model JobEvent { id String @id @default(uuid()) @db.Uuid jobId String @map("job_id") @db.Uuid stepId String? @map("step_id") @db.Uuid // Event details type String timestamp DateTime @db.Timestamptz actor String payload Json // Relations job RunnerJob @relation(fields: [jobId], references: [id], onDelete: Cascade) step JobStep? @relation(fields: [stepId], references: [id], onDelete: Cascade) @@index([jobId]) @@index([stepId]) @@index([timestamp]) @@index([type]) @@map("job_events") } ``` ### Identified Query Patterns #### 1. **JobEventsService.getEventsByJobId** (lines 71-106) ```typescript // WHERE clause: { jobId, [type?], [stepId?] } // ORDER BY: { timestamp: "asc" } // Pagination: skip, take ``` - **Columns used in WHERE**: `jobId`, optionally `type`, optionally `stepId` - **Columns used in ORDER BY**: `timestamp` #### 2. **JobEventsService.findByJob** (lines 202-219) ```typescript // WHERE clause: { jobId } // ORDER BY: { timestamp: "asc" } ``` - **Columns used in WHERE**: `jobId` - **Columns used in ORDER BY**: `timestamp` #### 3. **RunnerJobsService.findOne** (lines 120-144) ```typescript // events: { orderBy: { timestamp: "asc" } } ``` - Uses relation through `jobId` (implicit WHERE) - **Columns used in ORDER BY**: `timestamp` #### 4. **RunnerJobsService.streamEvents** (lines 269-275) ```typescript // WHERE clause: { jobId, timestamp: { gt: lastEventTime } } // ORDER BY: { timestamp: "asc" } ``` - **Columns used in WHERE**: `jobId`, `timestamp` (range query) - **Columns used in ORDER BY**: `timestamp` #### 5. **HeraldService.broadcastJobEvent** (lines 73-81) ```typescript // WHERE clause: { jobId, type: JOB_CREATED } // Uses findFirst ``` - **Columns used in WHERE**: `jobId`, `type` ## Composite Index Design ### Most Common Access Pattern The **dominant query pattern** across all services is: ```sql WHERE jobId = ? [AND type = ?] [AND stepId = ?] ORDER BY timestamp ASC ``` ### Recommended Composite Index ```prisma @@index([jobId, timestamp]) ``` ### Rationale 1. **Covers the most frequent query**: Filtering by `jobId` + ordering by `timestamp` 2. **Efficient for range queries**: `RunnerJobsService.streamEvents` uses `timestamp > lastEventTime` which benefits from the composite index 3. **Supports partial matching**: Queries filtering only by `jobId` can still use the index effectively 4. **Complements existing indexes**: We keep the single-column indexes for `type` and `stepId` since they're used independently in some queries ### Alternative Considered ```prisma @@index([jobId, type, timestamp]) ``` **Rejected because**: - `type` filtering is used in only 2 out of 5 query patterns - Would create a larger index with marginal benefit - Single-column `type` index is sufficient for the rare queries that filter by type alone ## Approach ### Step 1: Write Performance Tests (TDD - RED) Create test file: `apps/api/src/job-events/job-events.performance.spec.ts` - Test query performance for `getEventsByJobId` - Test query performance for `streamEvents` with timestamp range - Measure query execution time before index ### Step 2: Create Prisma Migration (TDD - GREEN) - Add composite index `@@index([jobId, timestamp])` to schema.prisma - Generate migration using `pnpm prisma:migrate dev` - Run migration against test database ### Step 3: Verify Performance Improvement - Re-run performance tests - Verify query times improved - Document results in this scratchpad ### Step 4: Commit and Update Issue - Commit with format: `fix(#189): add composite database index for job_events table` - Update issue #189 with completion status ## Progress - [x] Analyze schema and query patterns - [x] Identify optimal composite index - [x] Document rationale - [x] Write performance tests - [x] Add composite index to schema - [x] Create migration file - [ ] Apply migration (pending database schema sync) - [ ] Run performance tests - [ ] Verify performance improvement - [ ] Commit changes - [ ] Update issue ## Testing Performance tests will validate: 1. Query execution time improvement for `jobId + timestamp` queries 2. Index is used by PostgreSQL query planner (EXPLAIN ANALYZE) 3. No regression in other query patterns ## Notes - The composite index `[jobId, timestamp]` is optimal because: - `jobId` is highly selective (unique per job) - `timestamp` ordering is always required - This pattern appears in 100% of job event queries - Existing single-column indexes remain valuable for admin queries that filter by type or stepId alone - PostgreSQL can efficiently use this composite index for range queries on timestamp ### Migration Status - **Migration file created**: `20260202122655_add_job_events_composite_index/migration.sql` - **Database status**: The `job_events` table hasn't been created yet in the local database - **Pending migrations**: The database has migration history divergence. The following migrations need to be applied first: - `20260129232349_add_agent_task_model` - `20260130002000_add_knowledge_embeddings_vector_index` - `20260131115600_add_llm_provider_instance` - `20260201205935_add_job_tracking` (creates job_events table) - `20260202122655_add_job_events_composite_index` (this migration) - **Note**: The migration is ready and will be applied automatically when `prisma migrate dev` or `prisma migrate deploy` is run with synchronized migration history