feat(#69): implement embedding generation pipeline

Generate embeddings for knowledge entries using Ollama via a BullMQ job queue.

Changes:
- Created OllamaEmbeddingService for Ollama-based embedding generation
- Set up BullMQ queue and processor for async embedding jobs
- Integrated queue into knowledge entry lifecycle (create/update)
- Added rate limiting (1 job/second) and retry logic (3 attempts)
- Added OLLAMA_EMBEDDING_MODEL environment variable configuration
- Implemented dimension normalization (padding/truncating to 1536 dimensions)
- Added graceful degradation when Ollama is unavailable
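The dimension normalization listed above can be sketched as follows. This is a minimal illustration, not the commit's actual code; the function name and placement are hypothetical. The idea is that mxbai-embed-large returns 1024-dim vectors and nomic-embed-text returns 768-dim vectors, and both are brought to the schema's fixed 1536 dimensions:

```typescript
// Hypothetical sketch of the padding/truncation step: vectors shorter
// than the target are zero-padded, longer ones are truncated, so every
// embedding matches the 1536-dimension column in the schema.
export function normalizeDimensions(
  embedding: number[],
  target = 1536,
): number[] {
  if (embedding.length === target) return embedding;
  if (embedding.length > target) return embedding.slice(0, target);
  return [...embedding, ...new Array(target - embedding.length).fill(0)];
}
```

Zero-padding preserves cosine similarity between vectors from the same model, since the padded dimensions contribute nothing to the dot product.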

Test Coverage:
- All 31 embedding-related tests passing
- ollama-embedding.service.spec.ts: 13 tests
- embedding-queue.spec.ts: 6 tests
- embedding.processor.spec.ts: 5 tests
- Build and linting successful
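The graceful-degradation behavior listed in the changes can be sketched as below. This is an assumption-laden illustration, not the commit's implementation: the function name is hypothetical, and it uses Node's built-in fetch against Ollama's /api/embeddings endpoint directly rather than the service class in the repo.

```typescript
// Hypothetical sketch: if Ollama is unreachable or returns an error,
// resolve to null instead of throwing, so knowledge entry create/update
// still succeeds and the embedding can be backfilled later.
export async function generateEmbedding(
  text: string,
): Promise<number[] | null> {
  const endpoint = process.env.OLLAMA_ENDPOINT ?? 'http://localhost:11434';
  const model = process.env.OLLAMA_EMBEDDING_MODEL ?? 'mxbai-embed-large';
  try {
    const res = await fetch(`${endpoint}/api/embeddings`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt: text }),
    });
    if (!res.ok) return null; // degrade gracefully on HTTP errors
    const body = (await res.json()) as { embedding: number[] };
    return body.embedding;
  } catch {
    return null; // degrade gracefully when Ollama is unreachable
  }
}
```

Callers can then treat a null result as "entry saved without an embedding" and rely on a later retry rather than failing the whole request.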

Fixes #69

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author: Jason Woltje
Date: 2026-02-02 15:06:11 -06:00
Parent: 3cb6eb7f8b
Commit: 3dfa603a03
12 changed files with 1099 additions and 6 deletions


@@ -35,7 +35,9 @@ POSTGRES_MAX_CONNECTIONS=100
 # ======================
 # Valkey Cache (Redis-compatible)
 # ======================
 VALKEY_URL=redis://localhost:6379
+VALKEY_HOST=localhost
+VALKEY_PORT=6379
 # VALKEY_PASSWORD= # Optional: Password for Valkey authentication
 VALKEY_MAXMEMORY=256mb
 # Knowledge Module Cache Configuration
@@ -92,6 +94,13 @@ JWT_EXPIRATION=24h
 OLLAMA_ENDPOINT=http://ollama:11434
 OLLAMA_PORT=11434
 
+# Embedding Model Configuration
+# Model used for generating knowledge entry embeddings
+# Default: mxbai-embed-large (1024-dim, padded to 1536)
+# Alternative: nomic-embed-text (768-dim, padded to 1536)
+# Note: Embeddings are padded/truncated to 1536 dimensions to match schema
+OLLAMA_EMBEDDING_MODEL=mxbai-embed-large
+
 # ======================
 # OpenAI API (For Semantic Search)
 # ======================