feat(#69): implement embedding generation pipeline

Generate embeddings for knowledge entries using Ollama via a BullMQ job queue.

Changes:
- Created OllamaEmbeddingService for Ollama-based embedding generation
- Set up BullMQ queue and processor for async embedding jobs
- Integrated queue into knowledge entry lifecycle (create/update)
- Added rate limiting (1 job/second) and retry logic (3 attempts)
- Added OLLAMA_EMBEDDING_MODEL environment variable configuration
- Implemented dimension normalization (padding/truncating to 1536 dimensions)
- Added graceful degradation when Ollama is unavailable
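The dimension normalization listed above can be sketched as follows. This is a minimal illustration, not the commit's actual code; the function name and placement are hypothetical. The idea is that mxbai-embed-large returns 1024-dim vectors and nomic-embed-text returns 768-dim vectors, and both are brought to the schema's fixed 1536 dimensions:

```typescript
// Hypothetical sketch of the padding/truncation step: vectors shorter
// than the target are zero-padded, longer ones are truncated, so every
// embedding matches the 1536-dimension column in the schema.
export function normalizeDimensions(
  embedding: number[],
  target = 1536,
): number[] {
  if (embedding.length === target) return embedding;
  if (embedding.length > target) return embedding.slice(0, target);
  return [...embedding, ...new Array(target - embedding.length).fill(0)];
}
```

Zero-padding preserves cosine similarity between vectors from the same model, since the padded dimensions contribute nothing to the dot product.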

Test Coverage:
- All 31 embedding-related tests passing
- ollama-embedding.service.spec.ts: 13 tests
- embedding-queue.spec.ts: 6 tests
- embedding.processor.spec.ts: 5 tests
- Build and linting successful
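The graceful-degradation behavior listed in the changes can be sketched as below. This is an assumption-laden illustration, not the commit's implementation: the function name is hypothetical, and it uses Node's built-in fetch against Ollama's /api/embeddings endpoint directly rather than the service class in the repo.

```typescript
// Hypothetical sketch: if Ollama is unreachable or returns an error,
// resolve to null instead of throwing, so knowledge entry create/update
// still succeeds and the embedding can be backfilled later.
export async function generateEmbedding(
  text: string,
): Promise<number[] | null> {
  const endpoint = process.env.OLLAMA_ENDPOINT ?? 'http://localhost:11434';
  const model = process.env.OLLAMA_EMBEDDING_MODEL ?? 'mxbai-embed-large';
  try {
    const res = await fetch(`${endpoint}/api/embeddings`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt: text }),
    });
    if (!res.ok) return null; // degrade gracefully on HTTP errors
    const body = (await res.json()) as { embedding: number[] };
    return body.embedding;
  } catch {
    return null; // degrade gracefully when Ollama is unreachable
  }
}
```

Callers can then treat a null result as "entry saved without an embedding" and rely on a later retry rather than failing the whole request.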

Fixes #69

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author: Jason Woltje
Date: 2026-02-02 15:06:11 -06:00
Parent: 3cb6eb7f8b
Commit: 3dfa603a03
12 changed files with 1099 additions and 6 deletions


@@ -35,7 +35,9 @@ POSTGRES_MAX_CONNECTIONS=100
 # ======================
 # Valkey Cache (Redis-compatible)
 # ======================
 VALKEY_URL=redis://localhost:6379
+VALKEY_HOST=localhost
+VALKEY_PORT=6379
 # VALKEY_PASSWORD= # Optional: Password for Valkey authentication
 VALKEY_MAXMEMORY=256mb
 # Knowledge Module Cache Configuration
@@ -92,6 +94,13 @@ JWT_EXPIRATION=24h
 OLLAMA_ENDPOINT=http://ollama:11434
 OLLAMA_PORT=11434
 
+# Embedding Model Configuration
+# Model used for generating knowledge entry embeddings
+# Default: mxbai-embed-large (1024-dim, padded to 1536)
+# Alternative: nomic-embed-text (768-dim, padded to 1536)
+# Note: Embeddings are padded/truncated to 1536 dimensions to match schema
+OLLAMA_EMBEDDING_MODEL=mxbai-embed-large
+
 # ======================
 # OpenAI API (For Semantic Search)
 # ======================