Semantic Search Implementation

This document describes the semantic search implementation for the Mosaic Stack Knowledge Module using OpenAI embeddings and PostgreSQL pgvector.

Overview

The semantic search feature enables AI-powered similarity search across knowledge entries using vector embeddings. It complements the existing full-text search with semantic understanding, allowing users to find relevant content even when exact keywords don't match.

Architecture

Components

  1. EmbeddingService - Generates and manages OpenAI embeddings
  2. SearchService - Enhanced with semantic and hybrid search methods
  3. KnowledgeService - Automatically generates embeddings on entry create/update
  4. pgvector - PostgreSQL extension for vector similarity search

Database Schema

Knowledge Embeddings Table

model KnowledgeEmbedding {
  id      String         @id @default(uuid()) @db.Uuid
  entryId String         @unique @map("entry_id") @db.Uuid
  entry   KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)

  embedding Unsupported("vector(1536)")
  model     String

  createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
  updatedAt DateTime @updatedAt @map("updated_at") @db.Timestamptz

  @@index([entryId])
  @@map("knowledge_embeddings")
}

Vector Index

An HNSW (Hierarchical Navigable Small World) index is created for fast similarity search:

CREATE INDEX knowledge_embeddings_embedding_idx
ON knowledge_embeddings
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Configuration

Environment Variables

Add to your .env file:

# Required for semantic search (the feature is disabled without it)
OPENAI_API_KEY=sk-...

Get your API key from: https://platform.openai.com/api-keys

OpenAI Model

The default embedding model is text-embedding-3-small (1536 dimensions). This provides:

  • High quality embeddings
  • Cost-effective pricing
  • Fast generation speed

API Endpoints

POST /api/knowledge/search/semantic

Search using vector similarity only.

Request:

{
  "query": "database performance optimization",
  "status": "PUBLISHED"
}

Query Parameters:

  • page (optional): Page number (default: 1)
  • limit (optional): Results per page (default: 20)
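Pagination goes in the query string while the search query itself is sent in the JSON body. A small illustrative helper for building the request URL (the function name is hypothetical, not part of the API):

```typescript
// Build the semantic search URL with pagination query parameters.
function semanticSearchUrl(base: string, page = 1, limit = 20): string {
  const url = new URL("/api/knowledge/search/semantic", base);
  url.searchParams.set("page", String(page));
  url.searchParams.set("limit", String(limit));
  return url.toString();
}
```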

Response:

{
  "data": [
    {
      "id": "uuid",
      "slug": "postgres-indexing",
      "title": "PostgreSQL Indexing Strategies",
      "content": "...",
      "rank": 0.87,
      "tags": [...],
      ...
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total": 15,
    "totalPages": 1
  },
  "query": "database performance optimization"
}

POST /api/knowledge/search/hybrid

Combines vector similarity and full-text search using Reciprocal Rank Fusion (RRF).

Request:

{
  "query": "indexing strategies",
  "status": "PUBLISHED"
}

Benefits of Hybrid Search:

  • Best of both worlds: semantic understanding + keyword matching
  • Better ranking for exact matches
  • Improved recall and precision
  • Resilient to edge cases

POST /api/knowledge/embeddings/batch

Generate embeddings for all existing entries. Useful for:

  • Initial setup after enabling semantic search
  • Regenerating embeddings after model updates

Request:

{
  "status": "PUBLISHED"
}

Response:

{
  "message": "Generated 42 embeddings out of 45 entries",
  "total": 45,
  "success": 42
}

Permissions: Requires ADMIN role

Automatic Embedding Generation

Embeddings are automatically generated when:

  1. Creating an entry - Embedding generated asynchronously after creation
  2. Updating an entry - Embedding regenerated if title or content changes

The generation happens asynchronously to avoid blocking API responses.
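The fire-and-forget pattern described above can be sketched as follows. This is a minimal illustration, not the actual service code; the function names and the injectable `embed` callback are hypothetical:

```typescript
type EmbedFn = (entryId: string) => Promise<void>;

// Start embedding generation without awaiting it, so the create/update
// response returns immediately. Failures are logged, never thrown back
// to the request handler.
function scheduleEmbeddingGeneration(entryId: string, embed: EmbedFn): void {
  embed(entryId).catch((err) => {
    console.error(`Embedding generation failed for entry ${entryId}:`, err);
  });
}
```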

Content Preparation

Before generating embeddings, content is prepared by:

  1. Combining the title and content
  2. Weighting the title more heavily (it appears twice), which improves semantic matching on titles

prepareContentForEmbedding(title, content) {
  return `${title}\n\n${title}\n\n${content}`.trim();
}

Search Algorithms

Semantic Search (Vector Similarity)

Semantic search uses cosine distance to find semantically similar entries:

SELECT *
FROM knowledge_entries e
INNER JOIN knowledge_embeddings emb ON e.id = emb.entry_id
ORDER BY emb.embedding <=> query_embedding
LIMIT 20

  • <=> operator: cosine distance
  • Lower distance = higher similarity
  • Efficient with HNSW index
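For intuition, pgvector's <=> operator computes 1 minus the cosine similarity of the two vectors. An equivalent plain TypeScript sketch:

```typescript
// Cosine distance, as computed by pgvector's <=> operator:
// 1 - (a . b) / (|a| * |b|). 0 = identical direction, 1 = orthogonal.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Note that magnitude does not matter: a vector and a scaled copy of it have distance 0.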

Hybrid Search (RRF Algorithm)

Reciprocal Rank Fusion combines rankings from multiple sources:

RRF(d) = sum(1 / (k + rank_i))

Where:

  • d = document
  • k = constant (60 is standard)
  • rank_i = rank from source i

Example:

Document ranks in two searches:

  • Vector search: rank 3
  • Keyword search: rank 1

RRF score = 1/(60+3) + 1/(60+1) = 0.0159 + 0.0164 = 0.0323

Higher RRF score = better combined ranking.
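The fusion step above can be sketched in a few lines. This is an illustrative implementation of the RRF formula, not the service's actual code; input is one rank map per search source:

```typescript
// Reciprocal Rank Fusion: each source contributes 1 / (k + rank) per
// document; scores are summed across sources. k = 60 is the standard constant.
function rrfScores(
  rankings: Array<Map<string, number>>,
  k = 60,
): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    for (const [id, rank] of ranking) {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    }
  }
  return scores;
}
```

Running it on the worked example (vector rank 3, keyword rank 1) reproduces the combined score of roughly 0.0323.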

Performance Considerations

Index Parameters

The HNSW index uses:

  • m = 16: Max connections per layer (balances accuracy/memory)
  • ef_construction = 64: Build quality (higher = more accurate, slower build)

Query Performance

  • Typical query time: 10-50ms (with index)
  • Without index: 1000ms+ (not recommended)
  • Embedding generation: 100-300ms per entry

Cost (OpenAI API)

Using text-embedding-3-small:

  • ~$0.00002 per 1000 tokens
  • Average entry (~500 tokens): $0.00001
  • 10,000 entries: ~$0.10

Very cost-effective for most use cases.
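The arithmetic behind these estimates, as a throwaway helper (the $0.02 per million tokens rate is the text-embedding-3-small price at the time of writing; check current OpenAI pricing before relying on it):

```typescript
// Rough embedding cost estimate: total tokens / 1M * price per 1M tokens.
function estimateEmbeddingCostUsd(
  entries: number,
  avgTokensPerEntry: number,
  usdPerMillionTokens = 0.02,
): number {
  return (entries * avgTokensPerEntry / 1_000_000) * usdPerMillionTokens;
}
```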

Migration Guide

1. Run Migrations

cd apps/api
pnpm prisma migrate deploy

This creates:

  • knowledge_embeddings table
  • Vector index on embeddings

2. Configure OpenAI API Key

# Add to .env
OPENAI_API_KEY=sk-...

3. Generate Embeddings for Existing Entries

curl -X POST http://localhost:3001/api/knowledge/embeddings/batch \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "PUBLISHED"}'

Or use the web UI (Admin dashboard → Knowledge → Generate Embeddings).

4. Verify the Search

curl -X POST http://localhost:3001/api/knowledge/search/hybrid \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "your search query"}'

Troubleshooting

"OpenAI API key not configured"

Cause: OPENAI_API_KEY environment variable not set

Solution: Add the API key to your .env file and restart the API server

Semantic search returns no results

Possible causes:

  1. No embeddings generated

    • Run batch generation endpoint
    • Check knowledge_embeddings table
  2. Query too specific

    • Try broader terms
    • Use hybrid search for better recall
  3. Index not created

    • Check migration status
    • Verify index exists: \di knowledge_embeddings_embedding_idx in psql

Slow query performance

Solutions:

  1. Verify index exists and is being used:

    EXPLAIN ANALYZE
    SELECT * FROM knowledge_embeddings
    ORDER BY embedding <=> '[...]'::vector
    LIMIT 20;
    
  2. Adjust index parameters (requires recreation):

    DROP INDEX knowledge_embeddings_embedding_idx;
    CREATE INDEX knowledge_embeddings_embedding_idx
    ON knowledge_embeddings
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 32, ef_construction = 128); -- Higher values
    

Future Enhancements

Potential improvements:

  1. Custom embeddings: Support for local embedding models (Ollama, etc.)
  2. Chunking: Split large entries into chunks for better granularity
  3. Reranking: Add cross-encoder reranking for top results
  4. Caching: Cache query embeddings for repeated searches
  5. Multi-modal: Support image/file embeddings

References