# Semantic Search Implementation

This document describes the semantic search implementation for the Mosaic Stack Knowledge Module using OpenAI embeddings and PostgreSQL pgvector.

## Overview

The semantic search feature enables AI-powered similarity search across knowledge entries using vector embeddings. It complements the existing full-text search with semantic understanding, allowing users to find relevant content even when exact keywords don't match.

## Architecture

### Components

1. **EmbeddingService** - Generates and manages OpenAI embeddings
2. **SearchService** - Enhanced with semantic and hybrid search methods
3. **KnowledgeService** - Automatically generates embeddings on entry create/update
4. **pgvector** - PostgreSQL extension for vector similarity search

### Database Schema

#### Knowledge Embeddings Table
```prisma
model KnowledgeEmbedding {
  id        String         @id @default(uuid()) @db.Uuid
  entryId   String         @unique @map("entry_id") @db.Uuid
  entry     KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)

  embedding Unsupported("vector(1536)")
  model     String

  createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
  updatedAt DateTime @updatedAt @map("updated_at") @db.Timestamptz

  @@index([entryId])
  @@map("knowledge_embeddings")
}
```
#### Vector Index

An HNSW (Hierarchical Navigable Small World) index is created for fast similarity search:
```sql
CREATE INDEX knowledge_embeddings_embedding_idx
ON knowledge_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```
## Configuration

### Environment Variables

Add to your `.env` file:
```env
# Optional overall; required for semantic search
OPENAI_API_KEY=sk-...
```
Get your API key from: https://platform.openai.com/api-keys

### OpenAI Model

The default embedding model is `text-embedding-3-small` (1536 dimensions). This provides:

- High quality embeddings
- Cost-effective pricing
- Fast generation speed

## API Endpoints

### 1. Semantic Search

**POST** `/api/knowledge/search/semantic`

Search using vector similarity only.

**Request:**
```json
{
  "query": "database performance optimization",
  "status": "PUBLISHED"
}
```
**Query Parameters:**

- `page` (optional): Page number (default: 1)
- `limit` (optional): Results per page (default: 20)

**Response:**
```json
{
  "data": [
    {
      "id": "uuid",
      "slug": "postgres-indexing",
      "title": "PostgreSQL Indexing Strategies",
      "content": "...",
      "rank": 0.87,
      "tags": [...],
      ...
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total": 15,
    "totalPages": 1
  },
  "query": "database performance optimization"
}
```
### 2. Hybrid Search (Recommended)

**POST** `/api/knowledge/search/hybrid`

Combines vector similarity and full-text search using Reciprocal Rank Fusion (RRF).

**Request:**
```json
{
  "query": "indexing strategies",
  "status": "PUBLISHED"
}
```
**Benefits of Hybrid Search:**

- Best of both worlds: semantic understanding + keyword matching
- Better ranking for exact matches
- Improved recall and precision
- Resilient to edge cases

### 3. Batch Embedding Generation

**POST** `/api/knowledge/embeddings/batch`

Generate embeddings for all existing entries. Useful for:

- Initial setup after enabling semantic search
- Regenerating embeddings after model updates

**Request:**
```json
{
  "status": "PUBLISHED"
}
```
**Response:**
```json
{
  "message": "Generated 42 embeddings out of 45 entries",
  "total": 45,
  "success": 42
}
```
**Permissions:** Requires ADMIN role

## Automatic Embedding Generation

Embeddings are automatically generated when:

1. **Creating an entry** - Embedding generated asynchronously after creation
2. **Updating an entry** - Embedding regenerated if title or content changes

The generation happens asynchronously to avoid blocking API responses.
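A minimal sketch of this fire-and-forget pattern (the class and method names here are illustrative, not the actual KnowledgeService/EmbeddingService API):

```typescript
// Hypothetical sketch: names are illustrative; the real services differ.
type Entry = { id: string; title: string; content: string };

class EmbeddingScheduler {
  public generated: string[] = [];

  // Stand-in for the real OpenAI call + knowledge_embeddings upsert.
  private async generateForEntry(entry: Entry): Promise<void> {
    this.generated.push(entry.id);
  }

  // Called from create/update handlers without awaiting, so the API
  // response is not blocked; failures are logged, not propagated.
  schedule(entry: Entry): void {
    this.generateForEntry(entry).catch((err) =>
      console.error(`Embedding generation failed for ${entry.id}:`, err),
    );
  }
}
```

The promise is intentionally not awaited; the `.catch` prevents an unhandled rejection from surfacing if the embedding call fails.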
### Content Preparation

Before generating embeddings, content is prepared by:

1. Combining the title and content
2. Weighting the title more heavily: it appears twice, which improves semantic matching on titles
```typescript
prepareContentForEmbedding(title: string, content: string): string {
  // The title appears twice so it is weighted more heavily in the embedding
  return `${title}\n\n${title}\n\n${content}`.trim();
}
```
## Search Algorithms

### Vector Similarity Search

Uses cosine distance to find semantically similar entries:
```sql
SELECT *
FROM knowledge_entries e
INNER JOIN knowledge_embeddings emb ON e.id = emb.entry_id
ORDER BY emb.embedding <=> query_embedding
LIMIT 20
```
- `<=>` operator: cosine distance
- Lower distance = higher similarity
- Efficient with HNSW index
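For intuition, the quantity computed by `<=>` can be sketched in TypeScript (illustrative only; pgvector computes this natively in the database):

```typescript
// Cosine distance = 1 - cosine similarity: 0 means identical direction,
// 1 means orthogonal, 2 means opposite. Illustrative only; pgvector's
// `<=>` operator computes this natively.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineDistance([1, 0], [2, 0]); // 0 (same direction)
cosineDistance([1, 0], [0, 1]); // 1 (orthogonal)
```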
### Hybrid Search (RRF Algorithm)

Reciprocal Rank Fusion combines rankings from multiple sources:

```
RRF(d) = sum(1 / (k + rank_i))
```

Where:

- `d` = document
- `k` = constant (60 is standard)
- `rank_i` = rank from source i
**Example:**

Document ranks in two searches:

- Vector search: rank 3
- Keyword search: rank 1

RRF score = 1/(60+3) + 1/(60+1) = 0.0159 + 0.0164 = 0.0323

Higher RRF score = better combined ranking.
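The fusion step can be sketched in TypeScript, using the same `k = 60` constant and 1-based ranks as above:

```typescript
// Combine per-source ranks (1-based) into an RRF score. A document
// missing from a source contributes nothing for that source.
function rrfScore(ranks: Array<number | undefined>, k = 60): number {
  return ranks.reduce<number>(
    (sum, rank) => (rank === undefined ? sum : sum + 1 / (k + rank)),
    0,
  );
}

// The worked example above: vector rank 3, keyword rank 1.
rrfScore([3, 1]); // ≈ 0.0323
```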
## Performance Considerations

### Index Parameters

The HNSW index uses:

- `m = 16`: Max connections per layer (balances accuracy/memory)
- `ef_construction = 64`: Build quality (higher = more accurate, slower build)
### Query Performance

- **Typical query time:** 10-50ms (with index)
- **Without index:** 1000ms+ (not recommended)
- **Embedding generation:** 100-300ms per entry
### Cost (OpenAI API)

Using `text-embedding-3-small`:

- ~$0.00002 per 1000 tokens
- Average entry (~500 tokens): $0.00001
- 10,000 entries: ~$0.10

Very cost-effective for most use cases.
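The arithmetic behind these figures can be checked with a quick sketch (the per-token price is the one quoted above; consult OpenAI's pricing page for current rates):

```typescript
// ~$0.00002 per 1000 tokens for text-embedding-3-small, as quoted above.
const COST_PER_1K_TOKENS = 0.00002;

// Estimated one-time embedding cost in USD for a batch of entries.
function estimateEmbeddingCost(entryCount: number, avgTokensPerEntry = 500): number {
  return entryCount * (avgTokensPerEntry / 1000) * COST_PER_1K_TOKENS;
}

estimateEmbeddingCost(10_000); // ≈ 0.1 → ~$0.10 for 10,000 entries
```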
## Migration Guide

### 1. Run Migrations
```bash
cd apps/api
pnpm prisma migrate deploy
```
This creates:

- `knowledge_embeddings` table
- Vector index on embeddings

### 2. Configure OpenAI API Key
```bash
# Add to .env
OPENAI_API_KEY=sk-...
```
### 3. Generate Embeddings for Existing Entries
```bash
curl -X POST http://localhost:3001/api/knowledge/embeddings/batch \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "PUBLISHED"}'
```
Or use the web UI (Admin dashboard → Knowledge → Generate Embeddings).

### 4. Test Semantic Search
```bash
curl -X POST http://localhost:3001/api/knowledge/search/hybrid \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "your search query"}'
```
## Troubleshooting

### "OpenAI API key not configured"

**Cause:** `OPENAI_API_KEY` environment variable not set

**Solution:** Add the API key to your `.env` file and restart the API server

### Semantic search returns no results

**Possible causes:**

1. **No embeddings generated**
   - Run the batch generation endpoint
   - Check the `knowledge_embeddings` table

2. **Query too specific**
   - Try broader terms
   - Use hybrid search for better recall

3. **Index not created**
   - Check migration status
   - Verify the index exists: `\di knowledge_embeddings_embedding_idx` in psql

### Slow query performance

**Solutions:**

1. Verify the index exists and is being used:
```sql
EXPLAIN ANALYZE
SELECT * FROM knowledge_embeddings
ORDER BY embedding <=> '[...]'::vector
LIMIT 20;
```
2. Adjust index parameters (requires recreation):
```sql
DROP INDEX knowledge_embeddings_embedding_idx;
CREATE INDEX knowledge_embeddings_embedding_idx
ON knowledge_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128); -- Higher values
```
## Future Enhancements

Potential improvements:

1. **Custom embeddings**: Support for local embedding models (Ollama, etc.)
2. **Chunking**: Split large entries into chunks for better granularity
3. **Reranking**: Add cross-encoder reranking for top results
4. **Caching**: Cache query embeddings for repeated searches
5. **Multi-modal**: Support image/file embeddings
## References

- [OpenAI Embeddings Guide](https://platform.openai.com/docs/guides/embeddings)
- [pgvector Documentation](https://github.com/pgvector/pgvector)
- [HNSW Algorithm Paper](https://arxiv.org/abs/1603.09320)
- [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf)