stack/docs/scratchpads/69-embedding-generation.md

# Issue #69: [KNOW-017] Embedding Generation Pipeline
## Objective
Generate embeddings for knowledge entries using the LLM infrastructure (Ollama) to enable semantic search capabilities.
## Approach
1. Create an embedding service that interfaces with Ollama
2. Set up BullMQ job queue for async embedding generation
3. Create background worker to process embedding jobs
4. Hook into entry creation/update lifecycle to queue jobs
5. Handle rate limiting and error scenarios gracefully
6. Add configuration for model selection
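The rate-limiting step above ("1 job per second via queue delay") can be sketched as a small scheduler that computes the per-job `delay` to attach when enqueuing. This is an illustrative sketch, not the actual project code; the class and method names are assumptions.

```typescript
// Illustrative sketch: compute a BullMQ-style per-job delay so embedding
// jobs are spaced at least `intervalMs` apart (1 job/sec by default).
class EmbeddingJobScheduler {
  private lastScheduledAt = Number.NEGATIVE_INFINITY;

  /** Delay (ms) for the next job so jobs run at least intervalMs apart. */
  nextDelay(nowMs: number, intervalMs = 1000): number {
    const earliest = Math.max(nowMs, this.lastScheduledAt + intervalMs);
    this.lastScheduledAt = earliest;
    return earliest - nowMs;
  }
}
```

An idle queue schedules the next job immediately (delay 0); a burst of enqueues spreads out at one-second intervals.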
## Progress
- [x] Create scratchpad
- [x] Review existing schema for embedding column
- [x] Review existing Ollama integration
- [x] Set up BullMQ infrastructure
- [x] Write tests for embedding service (TDD)
- [x] Implement embedding service (OllamaEmbeddingService)
- [x] Create job queue and worker
- [x] Hook into entry lifecycle
- [x] Add rate limiting (1 job per second via queue delay)
- [x] Add configuration (OLLAMA_EMBEDDING_MODEL env var)
- [x] Build and verify (all tests passing, build successful)
- [x] Commit changes (commit 3dfa603)
- [x] Close issue #69
## Summary
Successfully implemented the embedding generation pipeline for knowledge entries using Ollama.
### Files Created
1. `apps/api/src/knowledge/services/ollama-embedding.service.ts` - Ollama-based embedding service
2. `apps/api/src/knowledge/services/ollama-embedding.service.spec.ts` - Tests (13 tests)
3. `apps/api/src/knowledge/queues/embedding-queue.service.ts` - BullMQ queue service
4. `apps/api/src/knowledge/queues/embedding-queue.spec.ts` - Tests (6 tests)
5. `apps/api/src/knowledge/queues/embedding.processor.ts` - Background worker processor
6. `apps/api/src/knowledge/queues/embedding.processor.spec.ts` - Tests (5 tests)
7. `apps/api/src/knowledge/queues/index.ts` - Export index
### Files Modified
1. `apps/api/src/knowledge/knowledge.module.ts` - Added BullMQ queue registration and new services
2. `apps/api/src/knowledge/knowledge.service.ts` - Updated to use queue for async embedding generation
3. `apps/api/src/app.module.ts` - Added BullModule.forRoot() configuration
4. `.env.example` - Added OLLAMA_EMBEDDING_MODEL configuration
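The `BullModule.forRoot()` registration added to `app.module.ts` likely looks something like the following. This is a hedged sketch: the env var names (`REDIS_HOST`, `REDIS_PORT`) and defaults are assumptions, not taken from the actual file.

```typescript
// Sketch of the root BullMQ registration (connection details assumed).
import { Module } from '@nestjs/common';
import { BullModule } from '@nestjs/bullmq';

@Module({
  imports: [
    BullModule.forRoot({
      connection: {
        host: process.env.REDIS_HOST ?? 'localhost',
        // Valkey is wire-compatible, so the same connection config works.
        port: Number(process.env.REDIS_PORT ?? 6379),
      },
    }),
  ],
})
export class AppModule {}
```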
### Key Features
- Async embedding generation using BullMQ job queue
- Automatic queuing on entry create/update
- Rate limiting: 1 job per second to prevent overwhelming Ollama
- Retry logic: 3 attempts with exponential backoff
- Configurable embedding model via OLLAMA_EMBEDDING_MODEL env var
- Dimension normalization (padding/truncating to 1536 dimensions)
- Graceful degradation if Ollama is unavailable
- Job cleanup: auto-remove completed jobs after 24h, failed after 7 days
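The retry and cleanup behavior listed above maps directly onto BullMQ's `JobsOptions`. The fragment below is a sketch of what those options would look like, not a copy of the project's code; the variable name is illustrative.

```typescript
// Sketch: BullMQ job options matching the features above.
import { JobsOptions } from 'bullmq';

const embeddingJobOptions: JobsOptions = {
  attempts: 3,                                    // retry up to 3 times
  backoff: { type: 'exponential', delay: 1000 },  // 1s, 2s, 4s between attempts
  removeOnComplete: { age: 24 * 60 * 60 },        // seconds: drop after 24h
  removeOnFail: { age: 7 * 24 * 60 * 60 },        // drop failed jobs after 7 days
};
```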
### Test Coverage
- All 31 embedding-related tests passing
- Build successful
- Linting clean
- TypeScript compilation successful
## Testing
- Unit tests for embedding service
- Integration tests for job queue
- E2E tests for entry creation with embedding generation
- Target: 85% coverage minimum
## Notes
- Using Ollama for embedding generation (local/remote)
- BullMQ for job queue (Redis-compatible, works with Valkey)
- Embeddings stored in pgvector column from schema (knowledge_embeddings table)
- Need to ensure graceful degradation if Ollama unavailable
- BullMQ is already installed (@nestjs/bullmq: ^11.0.4, bullmq: ^5.67.2)
- Existing EmbeddingService uses OpenAI - need to refactor to use Ollama
- OllamaService already has embed() method for generating embeddings
- Default embedding model for Ollama: "nomic-embed-text" (produces 768-dim vectors)
- Schema expects 1536-dim vectors; resolved by normalizing (padding/truncating) Ollama's 768-dim output to 1536 dimensions rather than changing the schema
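The dimension mismatch noted above (768-dim model output vs. a 1536-dim pgvector column) is handled by the normalization step mentioned in Key Features. A minimal sketch of that step, with an assumed function name:

```typescript
// Sketch: pad short vectors with zeros, truncate long ones, so every
// embedding matches the pgvector column's fixed dimensionality.
function normalizeDimensions(vector: number[], target = 1536): number[] {
  if (vector.length === target) return vector;
  if (vector.length > target) return vector.slice(0, target); // truncate
  return [...vector, ...new Array(target - vector.length).fill(0)]; // zero-pad
}
```

Zero-padding preserves cosine similarity between vectors from the same model, since the appended dimensions contribute nothing to the dot product.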
## Technical Decisions
1. Refactor existing EmbeddingService to use Ollama instead of OpenAI
2. Keep the same public API for EmbeddingService to minimize changes
3. Add BullMQ queue module for async processing
4. Create a consumer/processor for embedding jobs
5. Hook into knowledge entry lifecycle (onCreate, onUpdate) to queue jobs
6. Add configuration for embedding model selection
7. Implement rate limiting using delays between jobs
8. Add retry logic for failed embedding generation
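Decisions 4, 5, and 8 above can be sketched together: the entry lifecycle hook enqueues a job rather than embedding inline, and a queue failure must not fail the write (graceful degradation). All names here are illustrative, not the project's actual identifiers.

```typescript
// Sketch: queue an embedding job on entry create/update; tolerate queue failure.
interface EmbeddingQueue {
  enqueue(job: { entryId: string; text: string }): Promise<void>;
}

interface KnowledgeEntry {
  id: string;
  title: string;
  content: string;
}

class KnowledgeServiceSketch {
  constructor(private readonly queue: EmbeddingQueue) {}

  async create(entry: KnowledgeEntry): Promise<KnowledgeEntry> {
    // ...persist the entry here (omitted)...
    try {
      // Async embedding generation: the worker picks this up later.
      await this.queue.enqueue({
        entryId: entry.id,
        text: `${entry.title}\n${entry.content}`,
      });
    } catch {
      // Degrade gracefully: the write succeeds even if the queue is down;
      // embeddings can be backfilled later.
    }
    return entry;
  }
}
```

Keeping the queue behind a narrow interface also makes the lifecycle hook trivially testable with an in-memory stub.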