Issues resolved:
- #68: pgvector Setup
* Added pgvector vector index migration for knowledge_embeddings
* Vector index uses HNSW algorithm with cosine distance
* Optimized for 1536-dimension OpenAI embeddings
- #69: Embedding Generation Pipeline
* Created EmbeddingService with OpenAI integration
* Automatic embedding generation on entry create/update
* Batch processing endpoint for existing entries
* Async generation to avoid blocking API responses
* Content preparation with title weighting
- #70: Semantic Search API
* POST /api/knowledge/search/semantic - pure vector search
* POST /api/knowledge/search/hybrid - RRF combined search
* POST /api/knowledge/embeddings/batch - batch generation
* Comprehensive test coverage
* Full documentation in docs/SEMANTIC_SEARCH.md
Technical details:
- Uses OpenAI text-embedding-3-small model (1536 dims)
- HNSW index for O(log n) similarity search
- Reciprocal Rank Fusion for hybrid search
- Graceful degradation when OpenAI not configured
- Async embedding generation for performance
Configuration:
- Added OPENAI_API_KEY to .env.example
- Optional feature - disabled if API key not set
- Falls back to keyword search in hybrid mode
- Integrated BetterAuth library for modern authentication
- Added Session, Account, and Verification database tables
- Created complete auth module with service, controller, guards, and decorators
- Implemented shared authentication types in @mosaic/shared package
- Added comprehensive test coverage (26 tests passing)
- Documented type sharing strategy for monorepo
- Updated environment configuration with OIDC and JWT settings
Key architectural decisions:
- BetterAuth over Passport.js for better TypeScript support
- Separation of User (DB entity) vs AuthUser (client-safe subset)
- Shared types package to prevent FE/BE drift
- Factory pattern for auth config to use shared Prisma instance
Ready for frontend integration (Issue #6).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes#4