Semantic Search Implementation
This document describes the semantic search implementation for the Mosaic Stack Knowledge Module using OpenAI embeddings and PostgreSQL pgvector.
Overview
The semantic search feature enables AI-powered similarity search across knowledge entries using vector embeddings. It complements the existing full-text search with semantic understanding, allowing users to find relevant content even when exact keywords don't match.
Architecture
Components
- EmbeddingService - Generates and manages OpenAI embeddings
- SearchService - Enhanced with semantic and hybrid search methods
- KnowledgeService - Automatically generates embeddings on entry create/update
- pgvector - PostgreSQL extension for vector similarity search
Database Schema
Knowledge Embeddings Table
model KnowledgeEmbedding {
id String @id @default(uuid()) @db.Uuid
entryId String @unique @map("entry_id") @db.Uuid
entry KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)
embedding Unsupported("vector(1536)")
model String
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
updatedAt DateTime @updatedAt @map("updated_at") @db.Timestamptz
@@index([entryId])
@@map("knowledge_embeddings")
}
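Because the embedding column is declared as Unsupported("vector(1536)"), Prisma Client cannot read or write it directly; raw SQL with pgvector's bracketed array literal is the usual workaround. A minimal sketch (the helper name is illustrative, not part of the actual codebase):

```typescript
// Hypothetical helper: pgvector accepts literals of the form '[v1,v2,...]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Usage with Prisma raw SQL (sketch, assuming a `prisma` client instance):
// await prisma.$executeRaw`
//   UPDATE knowledge_embeddings
//   SET embedding = ${toVectorLiteral(vec)}::vector
//   WHERE entry_id = ${entryId}::uuid`;
```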
Vector Index
An HNSW (Hierarchical Navigable Small World) index is created for fast similarity search:
CREATE INDEX knowledge_embeddings_embedding_idx
ON knowledge_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Configuration
Environment Variables
Add to your .env file:
# Optional overall, but required for semantic search
OPENAI_API_KEY=sk-...
Get your API key from: https://platform.openai.com/api-keys
OpenAI Model
The default embedding model is text-embedding-3-small (1536 dimensions). This provides:
- High quality embeddings
- Cost-effective pricing
- Fast generation speed
API Endpoints
1. Semantic Search
POST /api/knowledge/search/semantic
Search using vector similarity only.
Request:
{
"query": "database performance optimization",
"status": "PUBLISHED"
}
Query Parameters:
- page (optional): Page number (default: 1)
- limit (optional): Results per page (default: 20)
Response:
{
"data": [
{
"id": "uuid",
"slug": "postgres-indexing",
"title": "PostgreSQL Indexing Strategies",
"content": "...",
"rank": 0.87,
"tags": [...],
...
}
],
"pagination": {
"page": 1,
"limit": 20,
"total": 15,
"totalPages": 1
},
"query": "database performance optimization"
}
2. Hybrid Search (Recommended)
POST /api/knowledge/search/hybrid
Combines vector similarity and full-text search using Reciprocal Rank Fusion (RRF).
Request:
{
"query": "indexing strategies",
"status": "PUBLISHED"
}
Benefits of Hybrid Search:
- Best of both worlds: semantic understanding + keyword matching
- Better ranking for exact matches
- Improved recall and precision
- Resilient to edge cases
3. Batch Embedding Generation
POST /api/knowledge/embeddings/batch
Generate embeddings for all existing entries. Useful for:
- Initial setup after enabling semantic search
- Regenerating embeddings after model updates
Request:
{
"status": "PUBLISHED"
}
Response:
{
"message": "Generated 42 embeddings out of 45 entries",
"total": 45,
"success": 42
}
Permissions: Requires ADMIN role
Automatic Embedding Generation
Embeddings are automatically generated when:
- Creating an entry - Embedding generated asynchronously after creation
- Updating an entry - Embedding regenerated if title or content changes
The generation happens asynchronously to avoid blocking API responses.
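The fire-and-forget pattern described above can be sketched as follows. The names are illustrative stand-ins for KnowledgeService and EmbeddingService, not the actual implementation:

```typescript
type Entry = { id: string; title: string; content: string };

// Stand-in for the embedding generation call (normally 100-300ms of OpenAI latency).
async function generateEmbedding(entry: Entry): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 50)); // simulate the API call
}

// Create returns immediately; the embedding is generated in the background.
async function createEntry(entry: Entry): Promise<Entry> {
  // Fire-and-forget: deliberately NOT awaited, so the API response isn't blocked.
  generateEmbedding(entry).catch((err) => {
    console.error(`embedding generation failed for ${entry.id}`, err);
  });
  return entry; // the persisted entry would be returned here
}
```

The .catch is important: a rejected, un-awaited promise would otherwise surface as an unhandled rejection instead of a logged warning.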
Content Preparation
Before generating embeddings, content is prepared by:
- Combining title and content
- Weighting title more heavily (appears twice)
- This improves semantic matching on titles
prepareContentForEmbedding(title: string, content: string): string {
  // Title appears twice so it carries more weight in the embedding.
  return `${title}\n\n${title}\n\n${content}`.trim();
}
Search Algorithms
Vector Similarity Search
Uses cosine distance to find semantically similar entries:
SELECT *
FROM knowledge_entries e
INNER JOIN knowledge_embeddings emb ON e.id = emb.entry_id
ORDER BY emb.embedding <=> query_embedding
LIMIT 20
- <=> operator: cosine distance
- Lower distance = higher similarity
- Efficient with HNSW index
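For intuition, the <=> operator computes cosine distance, which can be reimplemented in a few lines (an illustrative sketch; in practice pgvector does this inside the database):

```typescript
// Cosine distance: 1 - (a . b) / (|a| * |b|). Lower means more similar.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors have distance 0; orthogonal vectors have distance 1.
```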
Hybrid Search (RRF Algorithm)
Reciprocal Rank Fusion combines rankings from multiple sources:
RRF(d) = sum(1 / (k + rank_i))
Where:
- d = document
- k = constant (60 is standard)
- rank_i = rank from source i
Example:
Document ranks in two searches:
- Vector search: rank 3
- Keyword search: rank 1
RRF score = 1/(60+3) + 1/(60+1) = 0.0159 + 0.0164 = 0.0323
Higher RRF score = better combined ranking.
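The RRF computation above can be expressed directly in TypeScript (an illustrative sketch, not the actual SearchService code):

```typescript
// Reciprocal Rank Fusion over multiple ranked result lists.
// rankings: each map is documentId -> 1-based rank from one search source.
function rrfScores(
  rankings: Map<string, number>[],
  k = 60,
): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    for (const [id, rank] of ranking) {
      // Each source contributes 1 / (k + rank); missing documents contribute 0.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    }
  }
  return scores;
}

// The worked example: rank 3 in vector search, rank 1 in keyword search.
const scores = rrfScores([new Map([["d", 3]]), new Map([["d", 1]])]);
// scores.get("d") = 1/63 + 1/61, approximately 0.0323
```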
Performance Considerations
Index Parameters
The HNSW index uses:
- m = 16: Max connections per layer (balances accuracy/memory)
- ef_construction = 64: Build quality (higher = more accurate, slower build)
Query Performance
- Typical query time: 10-50ms (with index)
- Without index: 1000ms+ (not recommended)
- Embedding generation: 100-300ms per entry
Cost (OpenAI API)
Using text-embedding-3-small:
- ~$0.00002 per 1000 tokens
- Average entry (~500 tokens): $0.00001
- 10,000 entries: ~$0.10
Very cost-effective for most use cases.
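As a sanity check, the figures above follow from simple arithmetic (the rate is the quoted estimate and may change):

```typescript
// Back-of-envelope cost estimate for text-embedding-3-small
// at the quoted rate of ~$0.00002 per 1,000 tokens.
function estimateEmbeddingCostUSD(
  entries: number,
  avgTokensPerEntry = 500,
): number {
  const ratePer1kTokens = 0.00002;
  return ((entries * avgTokensPerEntry) / 1000) * ratePer1kTokens;
}

// 10,000 entries at ~500 tokens each comes to roughly $0.10.
```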
Migration Guide
1. Run Migrations
cd apps/api
pnpm prisma migrate deploy
This creates:
- knowledge_embeddings table
- Vector index on embeddings
2. Configure OpenAI API Key
# Add to .env
OPENAI_API_KEY=sk-...
3. Generate Embeddings for Existing Entries
curl -X POST http://localhost:3001/api/knowledge/embeddings/batch \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"status": "PUBLISHED"}'
Or use the web UI (Admin dashboard → Knowledge → Generate Embeddings).
4. Test Semantic Search
curl -X POST http://localhost:3001/api/knowledge/search/hybrid \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "your search query"}'
Troubleshooting
"OpenAI API key not configured"
Cause: OPENAI_API_KEY environment variable not set
Solution: Add the API key to your .env file and restart the API server
Semantic search returns no results
Possible causes:
- No embeddings generated
  - Run the batch generation endpoint
  - Check the knowledge_embeddings table
- Query too specific
  - Try broader terms
  - Use hybrid search for better recall
- Index not created
  - Check migration status
  - Verify the index exists: \di knowledge_embeddings_embedding_idx in psql
Slow query performance
Solutions:
- Verify the index exists and is being used:
  EXPLAIN ANALYZE
  SELECT * FROM knowledge_embeddings
  ORDER BY embedding <=> '[...]'::vector
  LIMIT 20;
- Adjust index parameters (requires recreation):
  DROP INDEX knowledge_embeddings_embedding_idx;
  CREATE INDEX knowledge_embeddings_embedding_idx
  ON knowledge_embeddings
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 32, ef_construction = 128); -- Higher values improve accuracy at the cost of memory and build time
Future Enhancements
Potential improvements:
- Custom embeddings: Support for local embedding models (Ollama, etc.)
- Chunking: Split large entries into chunks for better granularity
- Reranking: Add cross-encoder reranking for top results
- Caching: Cache query embeddings for repeated searches
- Multi-modal: Support image/file embeddings