stack/.env.example
Jason Woltje 3ec2059470
feat: add semantic search with pgvector (closes #68, #69, #70)
Issues resolved:
- #68: pgvector Setup
  * Added pgvector vector index migration for knowledge_embeddings
  * Vector index uses HNSW algorithm with cosine distance
  * Optimized for 1536-dimension OpenAI embeddings

- #69: Embedding Generation Pipeline
  * Created EmbeddingService with OpenAI integration
  * Automatic embedding generation on entry create/update
  * Batch processing endpoint for existing entries
  * Async generation to avoid blocking API responses
  * Content preparation with title weighting

- #70: Semantic Search API
  * POST /api/knowledge/search/semantic - pure vector search
  * POST /api/knowledge/search/hybrid - RRF combined search
  * POST /api/knowledge/embeddings/batch - batch generation
  * Comprehensive test coverage
  * Full documentation in docs/SEMANTIC_SEARCH.md

Technical details:
- Uses OpenAI text-embedding-3-small model (1536 dims)
- HNSW index for approximate nearest-neighbor search (roughly O(log n) per query)
- Reciprocal Rank Fusion for hybrid search
- Graceful degradation when OpenAI not configured
- Async embedding generation for performance
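
The Reciprocal Rank Fusion step named above can be illustrated with a minimal sketch. It is not the code from this commit: the function name `reciprocalRankFusion`, the constant `k = 60` (a commonly used default) and the sample entry IDs are assumptions made for illustration only.

```typescript
// Minimal sketch of Reciprocal Rank Fusion (RRF), under assumed names.
// Each entry scores sum(1 / (k + rank)) over every list it appears in,
// where rank is 1-based and k dampens the influence of top positions.
type RankedList = string[]; // entry IDs, best match first

function reciprocalRankFusion(
  lists: RankedList[],
  k = 60, // assumed default; the commit does not state the value it uses
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Hypothetical result lists from a keyword search and a vector search.
const keywordHits: RankedList = ["a", "b", "c"];
const vectorHits: RankedList = ["b", "d", "a"];
const fused = reciprocalRankFusion([keywordHits, vectorHits]);
console.log(fused.map((r) => r.id).join(","));
```

Entries that rank well in both lists rise to the top, and the raw scores of the two searches never need to be on a comparable scale, which is the usual reason for choosing RRF in hybrid search.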

Configuration:
- Added OPENAI_API_KEY to .env.example
- Optional feature - disabled if API key not set
- Falls back to keyword search in hybrid mode
2026-01-30 15:19:13 -06:00
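
The optional-feature behaviour described under "Configuration" (semantic search disabled without a key, hybrid mode falling back to keyword search) could look roughly like the sketch below. The names `resolveSearchConfig`, `SearchConfig` and `SearchMode` are hypothetical and do not come from the commit; only the `OPENAI_API_KEY` variable is taken from the source.

```typescript
// Hypothetical sketch: derive search behaviour from the environment.
// Only OPENAI_API_KEY comes from the source; all other names are assumed.
type SearchMode = "hybrid" | "keyword";

interface SearchConfig {
  semanticEnabled: boolean; // false when no API key is configured
  hybridMode: SearchMode; // what the hybrid endpoint effectively runs
}

function resolveSearchConfig(
  env: Record<string, string | undefined>,
): SearchConfig {
  // Treat a missing, empty, or whitespace-only key as "not configured".
  const key = (env["OPENAI_API_KEY"] ?? "").trim();
  const semanticEnabled = key.length > 0;
  return {
    semanticEnabled,
    hybridMode: semanticEnabled ? "hybrid" : "keyword",
  };
}

// Example inputs; in a Node.js service the argument would be process.env.
const withoutKey = resolveSearchConfig({});
const withKey = resolveSearchConfig({ OPENAI_API_KEY: "sk-example" });
console.log(withoutKey.hybridMode, withKey.hybridMode);
```

Passing the environment in as a plain object keeps the function easy to test and avoids reading `process.env` at import time.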

# ==============================================
# Mosaic Stack Environment Configuration
# ==============================================
# Copy this file to .env and customize for your environment
# ======================
# Application Ports
# ======================
API_PORT=3001
API_HOST=0.0.0.0
WEB_PORT=3000
# ======================
# Web Configuration
# ======================
NEXT_PUBLIC_API_URL=http://localhost:3001
# ======================
# PostgreSQL Database
# ======================
# SECURITY: In production, set POSTGRES_PASSWORD to a strong random password
# and use the same password in DATABASE_URL
DATABASE_URL=postgresql://mosaic:REPLACE_WITH_SECURE_PASSWORD@localhost:5432/mosaic
POSTGRES_USER=mosaic
POSTGRES_PASSWORD=REPLACE_WITH_SECURE_PASSWORD
POSTGRES_DB=mosaic
POSTGRES_PORT=5432
# PostgreSQL Performance Tuning (Optional)
POSTGRES_SHARED_BUFFERS=256MB
POSTGRES_EFFECTIVE_CACHE_SIZE=1GB
POSTGRES_MAX_CONNECTIONS=100
# ======================
# Valkey Cache (Redis-compatible)
# ======================
VALKEY_URL=redis://localhost:6379
VALKEY_PORT=6379
VALKEY_MAXMEMORY=256mb
# Knowledge Module Cache Configuration
# Set KNOWLEDGE_CACHE_ENABLED=false to disable caching (useful for development)
KNOWLEDGE_CACHE_ENABLED=true
# Cache TTL in seconds (default: 300 = 5 minutes)
KNOWLEDGE_CACHE_TTL=300
# ======================
# Authentication (Authentik OIDC)
# ======================
# Authentik Server URLs
OIDC_ISSUER=https://auth.example.com/application/o/mosaic-stack/
OIDC_CLIENT_ID=your-client-id-here
OIDC_CLIENT_SECRET=your-client-secret-here
OIDC_REDIRECT_URI=http://localhost:3001/auth/callback
# Authentik PostgreSQL Database
AUTHENTIK_POSTGRES_USER=authentik
AUTHENTIK_POSTGRES_PASSWORD=REPLACE_WITH_SECURE_PASSWORD
AUTHENTIK_POSTGRES_DB=authentik
# Authentik Configuration
# CRITICAL: Generate a random secret key with at least 50 characters
# Example: openssl rand -base64 50 | tr -d '\n'
# (openssl wraps base64 output at 64 characters; tr joins it into a single line)
AUTHENTIK_SECRET_KEY=REPLACE_WITH_RANDOM_SECRET_MINIMUM_50_CHARS
AUTHENTIK_ERROR_REPORTING=false
# SECURITY: Change bootstrap password immediately after first login
AUTHENTIK_BOOTSTRAP_PASSWORD=REPLACE_WITH_SECURE_PASSWORD
AUTHENTIK_BOOTSTRAP_EMAIL=admin@localhost
AUTHENTIK_COOKIE_DOMAIN=.localhost
# Authentik Ports
AUTHENTIK_PORT_HTTP=9000
AUTHENTIK_PORT_HTTPS=9443
# ======================
# JWT Configuration
# ======================
# CRITICAL: Generate a random secret key with at least 32 characters
# Example: openssl rand -base64 32
JWT_SECRET=REPLACE_WITH_RANDOM_SECRET_MINIMUM_32_CHARS
JWT_EXPIRATION=24h
# ======================
# Ollama (Optional AI Service)
# ======================
# Set OLLAMA_ENDPOINT to use local or remote Ollama
# For bundled Docker service: http://ollama:11434
# For external service: http://your-ollama-server:11434
OLLAMA_ENDPOINT=http://ollama:11434
OLLAMA_PORT=11434
# ======================
# OpenAI API (For Semantic Search)
# ======================
# OPTIONAL: Only needed for semantic search, which requires an OpenAI API key
# Get your API key from: https://platform.openai.com/api-keys
# If not configured, the semantic search endpoint returns an error
# and hybrid search falls back to keyword search
# OPENAI_API_KEY=sk-...
# ======================
# Application Environment
# ======================
NODE_ENV=development
# ======================
# Docker Compose Profiles
# ======================
# Uncomment to enable optional services:
# COMPOSE_PROFILES=authentik,ollama # Enable both Authentik and Ollama
# COMPOSE_PROFILES=full # Enable all optional services
# COMPOSE_PROFILES=authentik # Enable only Authentik
# COMPOSE_PROFILES=ollama # Enable only Ollama
# COMPOSE_PROFILES=traefik-bundled # Enable bundled Traefik reverse proxy
# ======================
# Traefik Reverse Proxy
# ======================
# TRAEFIK_MODE options:
# - bundled: Use bundled Traefik (requires traefik-bundled profile)
# - upstream: Connect to external Traefik instance
# - none: Direct port exposure without reverse proxy (default)
TRAEFIK_MODE=none
# Domain configuration for Traefik routing
MOSAIC_API_DOMAIN=api.mosaic.local
MOSAIC_WEB_DOMAIN=mosaic.local
MOSAIC_AUTH_DOMAIN=auth.mosaic.local
# External Traefik network name (for upstream mode)
# Must match the network name of your existing Traefik instance
TRAEFIK_NETWORK=traefik-public
# TLS/SSL Configuration
TRAEFIK_TLS_ENABLED=true
# For Let's Encrypt (production), set TRAEFIK_ACME_EMAIL to a valid address
# For self-signed certificates (development), leave TRAEFIK_ACME_EMAIL empty
TRAEFIK_ACME_EMAIL=admin@example.com
# Traefik Dashboard (bundled mode only)
TRAEFIK_DASHBOARD_ENABLED=true
TRAEFIK_DASHBOARD_PORT=8080
# ======================
# Logging & Debugging
# ======================
LOG_LEVEL=info
DEBUG=false