stack/docs/scratchpads/2-postgresql-pgvector-schema.md
Jason Woltje 99afde4f99 feat(#2): Implement PostgreSQL 17 + pgvector database schema
Establishes multi-tenant database layer with vector similarity search for AI-powered memory features. Includes Docker infrastructure, Prisma ORM integration, NestJS services, and shared types across the monorepo.

Key changes:
- Docker: PostgreSQL 17 + pgvector v0.7.4, Valkey cache
- Schema: 8 models (User, Workspace, WorkspaceMember, Task, Event, Project, ActivityLog, MemoryEmbedding) with RLS preparation
- NestJS: PrismaModule, DatabaseModule, EmbeddingsService
- Shared: Type-safe enums, constants, and database types

Fixes #2

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-28 16:06:34 -06:00


Issue #2: PostgreSQL 17 + pgvector Schema

Objective

Design and implement the PostgreSQL 17 database schema with pgvector extension for Mosaic Stack.

Approach

  1. Docker Infrastructure - Build PostgreSQL 17 container with pgvector extension
  2. Prisma ORM - Define schema with 8 core models (User, Workspace, Task, Event, Project, etc.)
  3. Multi-tenant Design - All tenant-scoped tables carry an indexed workspace_id in preparation for RLS
  4. Vector Embeddings - pgvector integration for semantic memory with HNSW index
  5. NestJS Integration - PrismaService + EmbeddingsService for database operations

Progress

  • Plan approved
  • Phase 1: Docker Setup (5 tasks) - COMPLETED
  • Phase 2: Prisma Schema (5 tasks) - COMPLETED
  • Phase 3: NestJS Integration (5 tasks) - COMPLETED
  • Phase 4: Shared Types & Seed (5 tasks) - COMPLETED
  • Phase 5: Build & Verification (2 tasks) - COMPLETED

Completion Summary

Issue #2 was completed on 2026-01-28.

What Was Delivered

  1. Docker Infrastructure

    • PostgreSQL 17 with pgvector v0.7.4 (provides HNSW index support)
    • Valkey for caching
    • Custom Dockerfile building pgvector from source
    • Init scripts for extension setup
  2. Database Schema (Prisma)

    • 8 models: User, Workspace, WorkspaceMember, Task, Event, Project, ActivityLog, MemoryEmbedding
    • 6 enums for type safety
    • UUID primary keys throughout
    • HNSW index on memory_embeddings for vector similarity search
    • Multi-tenant scoping via an indexed workspace_id on every tenant table (RLS enforcement deferred to M2)
    • 2 migrations: init + vector index
  3. NestJS Integration

    • PrismaModule (global)
    • PrismaService with lifecycle hooks and health checks
    • EmbeddingsService for pgvector operations (raw SQL)
    • Health endpoint updated with database status
  4. Shared Types

    • Enums mirroring Prisma schema
    • Entity interfaces for type safety across monorepo
    • Exported from @mosaic/shared
  5. Development Tools

    • Seed script with sample data (user, workspace, project, tasks, event)
    • Prisma scripts in package.json
    • Turbo integration for prisma:generate
    • All builds passing with strict TypeScript
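
The raw-SQL approach used by EmbeddingsService could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual service code: the `content` and `embedding` column names, the helper names, and the choice of cosine distance (`<=>`) are hypothetical. Only the `memory_embeddings` table, the `workspace_id` column, and the 1536 dimension come from this document.

```typescript
// Dimension matches the Unsupported("vector(1536)") column described in this document.
const EMBEDDING_DIMENSIONS = 1536;

interface RawQuery {
  sql: string;
  params: (string | number)[];
}

// Serialize a number[] into pgvector's text input format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(
  embedding: number[],
  dimensions: number = EMBEDDING_DIMENSIONS,
): string {
  if (embedding.length !== dimensions) {
    throw new Error(`Expected ${dimensions} dimensions, got ${embedding.length}`);
  }
  for (const value of embedding) {
    if (!isFinite(value)) {
      throw new Error("Embedding contains a non-finite value");
    }
  }
  return "[" + embedding.join(",") + "]";
}

// Build a workspace-scoped nearest-neighbour query.
// ASSUMPTIONS: columns `content` and `embedding`; cosine distance operator `<=>`.
function buildSearchQuery(
  workspaceId: string,
  queryEmbedding: number[],
  limit: number,
  dimensions: number = EMBEDDING_DIMENSIONS,
): RawQuery {
  return {
    sql:
      "SELECT id, content, embedding <=> $1::vector AS distance " +
      "FROM memory_embeddings " +
      "WHERE workspace_id = $2::uuid " +
      // Ordering by the distance expression lets an HNSW index serve the query.
      "ORDER BY embedding <=> $1::vector " +
      "LIMIT $3::int",
    params: [toVectorLiteral(queryEmbedding, dimensions), workspaceId, limit],
  };
}
```

The resulting `sql` and `params` could then be passed to Prisma's `$queryRawUnsafe(sql, ...params)`. Binding the vector as a parameter and casting it with `::vector` keeps the embedding out of the SQL string itself.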

Database Statistics

  • Tables: 8
  • Extensions: uuid-ossp, vector (pgvector 0.7.4)
  • Indexes: 14 total (including 1 HNSW vector index)
  • Seed data: 1 user, 1 workspace, 1 project, 5 tasks, 1 event
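
For reference, the single HNSW index counted above might be created by a statement like the one this sketch generates. The index name, the `embedding` column name, and the `vector_cosine_ops` operator class are assumptions; the real migration may use a different distance metric or build parameters. The defaults shown (`m = 16`, `ef_construction = 64`) are pgvector's documented defaults.

```typescript
type VectorOpclass = "vector_l2_ops" | "vector_ip_ops" | "vector_cosine_ops";

interface HnswOptions {
  m?: number;              // max connections per graph layer (pgvector default: 16)
  efConstruction?: number; // candidate list size while building (pgvector default: 64)
}

// Reject anything that is not a plain lowercase SQL identifier,
// since identifiers cannot be bound as query parameters.
function assertIdentifier(name: string): void {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`Invalid SQL identifier: ${name}`);
  }
}

function assertPositiveInteger(label: string, n: number): void {
  if (!(n > 0) || Math.floor(n) !== n) {
    throw new Error(`${label} must be a positive integer`);
  }
}

// Generate a CREATE INDEX statement for an HNSW index.
// ASSUMPTION: the index naming scheme <table>_<column>_hnsw_idx is hypothetical.
function buildHnswIndexSql(
  table: string,
  column: string,
  opclass: VectorOpclass = "vector_cosine_ops",
  options: HnswOptions = {},
): string {
  assertIdentifier(table);
  assertIdentifier(column);
  const m = options.m === undefined ? 16 : options.m;
  const efConstruction =
    options.efConstruction === undefined ? 64 : options.efConstruction;
  assertPositiveInteger("m", m);
  assertPositiveInteger("efConstruction", efConstruction);

  const indexName = `${table}_${column}_hnsw_idx`;
  return (
    `CREATE INDEX IF NOT EXISTS ${indexName} ON ${table} ` +
    `USING hnsw (${column} ${opclass}) ` +
    `WITH (m = ${m}, ef_construction = ${efConstruction});`
  );
}
```

The operator class has to match the distance operator used at query time, otherwise the planner will not use the index for the search.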

Testing

  • Unit tests for PrismaService (connection lifecycle, health check)
  • Unit tests for EmbeddingsService (store, search, delete operations)
  • Integration test with actual PostgreSQL database
  • Seed data validation via Prisma Studio
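
A unit test for the health check can avoid a live database by substituting a fake client. The sketch below assumes the check issues a trivial `SELECT 1` through Prisma's `$queryRawUnsafe`; the `RawQueryClient` interface, the `checkDatabaseHealth` function, and the `HealthStatus` shape are hypothetical names used for illustration and may not match the real PrismaService.

```typescript
// Minimal structural stand-in for the one PrismaClient method the check needs.
interface RawQueryClient {
  $queryRawUnsafe(query: string, ...values: unknown[]): Promise<unknown>;
}

interface HealthStatus {
  database: "up" | "down";
  error?: string;
}

// Report "up" if a trivial query succeeds, "down" with the error message otherwise.
async function checkDatabaseHealth(client: RawQueryClient): Promise<HealthStatus> {
  try {
    await client.$queryRawUnsafe("SELECT 1");
    return { database: "up" };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { database: "down", error: message };
  }
}

// Fakes for unit tests: one resolves, one rejects.
const healthyClient: RawQueryClient = {
  $queryRawUnsafe: async () => [{ "?column?": 1 }],
};

const failingClient: RawQueryClient = {
  $queryRawUnsafe: async () => {
    throw new Error("connection refused");
  },
};
```

Because the function depends only on a structural interface, a real `PrismaClient` instance can be passed in production while tests pass the fakes.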

Notes

Design Decisions

  • UUID primary keys for multi-tenant scalability
  • Native Prisma enums mapped to PostgreSQL enums for type safety
  • Unsupported("vector(1536)") column type for pgvector embeddings; the generated Prisma client does not expose this type, so reads and writes use raw SQL
  • Composite PK for WorkspaceMember (workspaceId + userId)
  • Self-referencing Task model for subtasks support

Key Relations

  • User → ownedWorkspaces (1:N), workspaceMemberships (N:M via WorkspaceMember)
  • Workspace → tasks, events, projects, activityLogs, memoryEmbeddings (1:N each)
  • Task → subtasks (self-referencing), project (optional N:1)
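
To illustrate the self-referencing Task relation, the sketch below assembles flat task rows into a subtask tree. The `parentId` and `title` field names are assumptions; the actual Prisma model may name them differently.

```typescript
// ASSUMPTION: each task row points at its parent through a nullable `parentId`.
interface TaskRow {
  id: string;
  title: string;
  parentId: string | null;
}

interface TaskNode extends TaskRow {
  subtasks: TaskNode[];
}

// Build a forest of tasks from flat rows in two passes:
// first index every row by id, then attach each node to its parent.
function buildTaskTree(rows: TaskRow[]): TaskNode[] {
  const nodes = new Map<string, TaskNode>();
  for (const row of rows) {
    nodes.set(row.id, { ...row, subtasks: [] });
  }

  const roots: TaskNode[] = [];
  for (const row of rows) {
    const node = nodes.get(row.id) as TaskNode;
    const parent = row.parentId !== null ? nodes.get(row.parentId) : undefined;
    if (parent) {
      parent.subtasks.push(node);
    } else {
      // Top-level tasks, and tasks whose parent is not in the input, become roots.
      roots.push(node);
    }
  }
  return roots;
}
```

Fetching the rows for one workspace in a single query and nesting them in memory avoids issuing one query per level of the hierarchy.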

RLS Preparation (M2 Milestone)

  • All tenant tables have workspace_id with index
  • Future: PostgreSQL session variables (app.current_workspace_id, app.current_user_id)
  • Future: RLS policies for workspace isolation
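
The planned RLS setup could be sketched along these lines. This is a forward-looking illustration, not delivered code: the policy name and the helper names are hypothetical, and only the `workspace_id` column and the `app.current_workspace_id` session variable come from the notes above. `set_config` and `current_setting` are standard PostgreSQL functions.

```typescript
interface RawQuery {
  sql: string;
  params: string[];
}

// Reject anything that is not a plain lowercase SQL identifier,
// since table names cannot be bound as query parameters.
function assertIdentifier(name: string): void {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`Invalid SQL identifier: ${name}`);
  }
}

// DDL for one tenant table: enable RLS and restrict rows to the current workspace.
// ASSUMPTION: the policy name <table>_workspace_isolation is hypothetical.
function buildRlsPolicySql(table: string): string[] {
  assertIdentifier(table);
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY;`,
    `CREATE POLICY ${table}_workspace_isolation ON ${table} ` +
      // current_setting raises an error if the variable has not been set,
      // so a request that forgets to set it fails instead of seeing all rows.
      `USING (workspace_id = current_setting('app.current_workspace_id')::uuid);`,
  ];
}

// Query that sets the workspace for the current transaction only.
// The third argument `true` makes the setting local to the transaction.
function buildSetWorkspaceQuery(workspaceId: string): RawQuery {
  return {
    sql: "SELECT set_config('app.current_workspace_id', $1, true)",
    params: [workspaceId],
  };
}
```

Because the setting is transaction-local, the `set_config` call and the tenant queries would need to run inside the same transaction. Note also that table owners bypass RLS by default unless `FORCE ROW LEVEL SECURITY` is set on the table.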