stack/docs/DOCKER-SWARM.md
Jason Woltje f8477d5052
docs(swarm): comprehensive Docker Swarm deployment documentation
- Update docker-compose.swarm.yml with external Authentik configuration
  - Comment out Authentik services (using external OIDC provider)
  - Comment out Authentik volumes
  - Add header with deployment instructions and current configuration

- Create comprehensive SWARM-DEPLOYMENT.md guide
  - Prerequisites and swarm initialization
  - Manual OpenBao initialization (critical - no auto-init in swarm)
  - External service configuration examples
  - Scaling, updates, rollbacks
  - Troubleshooting and maintenance procedures
  - Backup and restore instructions

- Update .env.swarm.example
  - Add note about external vs internal Authentik
  - Update default OIDC_ISSUER to use https
  - Clarify which variables are needed for internal Authentik

- Update README.md Docker Swarm section
  - Fix deploy script path (./scripts/deploy-swarm.sh)
  - Add note about manual OpenBao initialization
  - Add warning about no profile support in swarm
  - Update documentation references to docs/ directory

- Update documentation cross-references
  - Add deprecation notice to old DOCKER-SWARM.md
  - Add deployment guide reference to SWARM-QUICKREF.md
  - Update DOCKER-COMPOSE-GUIDE.md See Also section

Key changes for swarm deployment:
- Swarm does NOT support docker-compose profiles
- External services must be manually commented out
- OpenBao requires manual initialization (no sidecar)
- All documentation updated with correct paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 17:12:49 -06:00


Mosaic Stack - Docker Swarm Deployment

⚠️ This guide has been superseded. Please see SWARM-DEPLOYMENT.md for the complete, up-to-date deployment guide.

This guide covers deploying Mosaic Stack to a Docker Swarm cluster with Traefik reverse proxy integration.

Prerequisites

  1. Docker Swarm initialized:

    docker swarm init
    
  2. Traefik running on the swarm with a network named traefik-public

  3. DNS or /etc/hosts configured with your domain names:

    • mosaic.mosaicstack.dev → Web UI
    • api.mosaicstack.dev → API
    • auth.mosaicstack.dev → Authentik SSO

Quick Start

1. Configure Environment

Copy the swarm environment template:

cp .env.swarm.example .env

Edit .env and set the following critical values:

# Database passwords
POSTGRES_PASSWORD=your-secure-password-here
AUTHENTIK_POSTGRES_PASSWORD=your-secure-password-here

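# Secrets: generate each value once and paste the literal output here.
# A literal $(...) in .env is either left unexpanded (Docker Compose) or
# re-evaluated on every load (shell sourcing). openssl wraps base64 output
# longer than 64 characters, so strip newlines from the 50-byte key.
AUTHENTIK_SECRET_KEY=<output of: openssl rand -base64 50 | tr -d '\n'>
JWT_SECRET=<output of: openssl rand -base64 32>
ENCRYPTION_KEY=<output of: openssl rand -hex 32>
ORCHESTRATOR_API_KEY=<output of: openssl rand -base64 32>
COORDINATOR_API_KEY=<output of: openssl rand -base64 32>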

# Claude API Key
CLAUDE_API_KEY=your-claude-api-key

# Authentik Bootstrap
AUTHENTIK_BOOTSTRAP_PASSWORD=your-admin-password
AUTHENTIK_BOOTSTRAP_EMAIL=admin@yourdomain.com
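
The secret values can be generated in one pass and pasted into .env. The following is a minimal sketch, assuming openssl is installed; the variable names come from the template above, while `rand_b64` and `gen_secrets` are illustrative helper names that are not part of the repository.

```shell
#!/bin/sh
# Sketch: print secret assignments to paste into .env.
# rand_b64 and gen_secrets are hypothetical helpers, not project scripts.

# Base64-encode N random bytes on a single line.
# openssl wraps base64 output at 64 characters, so newlines are removed.
rand_b64() {
  openssl rand -base64 "$1" | tr -d '\n'
}

gen_secrets() {
  printf 'AUTHENTIK_SECRET_KEY=%s\n' "$(rand_b64 50)"
  printf 'JWT_SECRET=%s\n' "$(rand_b64 32)"
  printf 'ENCRYPTION_KEY=%s\n' "$(openssl rand -hex 32)"
  printf 'ORCHESTRATOR_API_KEY=%s\n' "$(rand_b64 32)"
  printf 'COORDINATOR_API_KEY=%s\n' "$(rand_b64 32)"
}

gen_secrets
```

Review the output before copying it into .env; running the script again produces new values, which would invalidate secrets already in use.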

2. Create Traefik Network (if it does not exist)

docker network create --driver=overlay traefik-public

3. Deploy the Stack

./scripts/deploy-swarm.sh mosaic

Or manually:

docker stack deploy -c docker-compose.swarm.yml mosaic
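
When deploying manually, keep in mind that `docker stack deploy` does not read .env on its own; variables referenced in the compose file are taken from the calling shell's environment. The sketch below exports the file's assignments first. `load_env_file` and `deploy_stack` are illustrative names, and scripts/deploy-swarm.sh may already do something equivalent.

```shell
#!/bin/sh
# Sketch: export every assignment from an env file into the current shell so
# that ${VAR} references in the compose file resolve during deployment.
# load_env_file and deploy_stack are hypothetical helpers.

load_env_file() {
  [ -f "$1" ] || { echo "missing env file: $1" >&2; return 1; }
  set -a          # auto-export every variable assigned while sourcing
  . "$1"
  set +a
}

deploy_stack() {
  # $1 = stack name, $2 = env file (defaults to ./.env)
  load_env_file "${2:-./.env}" || return 1
  docker stack deploy -c docker-compose.swarm.yml "$1"
}

# Usage: deploy_stack mosaic
```

Sourcing treats the file as shell code, so values containing spaces or shell metacharacters need quoting in .env.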

4. Verify Deployment

Check stack status:

docker stack services mosaic
docker stack ps mosaic

Check service logs:

docker service logs mosaic_api
docker service logs mosaic_web
docker service logs mosaic_postgres
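
To spot services that have not reached their desired replica count without reading the whole table, the output of `docker stack services` can be filtered. This is a sketch only; `unconverged` is an illustrative helper name, and it assumes the `{{.Name}} {{.Replicas}}` format shown in the comment.

```shell
#!/bin/sh
# Sketch: list services whose running replica count differs from the desired
# count. Reads "name running/desired" lines on stdin and exits non-zero if
# any service is listed. unconverged is a hypothetical helper.

unconverged() {
  awk '{
    split($2, r, "/")
    if (r[1] != r[2]) { print $1; bad = 1 }
  } END { exit bad }'
}

# Usage:
#   docker stack services mosaic --format '{{.Name}} {{.Replicas}}' | unconverged
```

An empty result with exit status 0 means every listed service is running the number of replicas it asked for.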

Stack Services

The following services will be deployed:

| Service | Internal Port | Traefik Domain | Description |
|---|---|---|---|
| web | 3000 | mosaic.mosaicstack.dev | Next.js Web UI |
| api | 3001 | api.mosaicstack.dev | NestJS API |
| authentik-server | 9000 | auth.mosaicstack.dev | Authentik SSO |
| postgres | 5432 | - | PostgreSQL 17 + pgvector |
| valkey | 6379 | - | Redis-compatible cache |
| openbao | 8200 | - | Secrets vault |
| ollama | 11434 | - | LLM service (optional) |
| orchestrator | 3001 | - | Agent orchestrator |

Traefik Integration

Services are automatically registered with Traefik using labels defined in deploy.labels:

deploy:
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.mosaic-web.rule=Host(`mosaic.mosaicstack.dev`)"
    - "traefik.http.routers.mosaic-web.entrypoints=web"
    - "traefik.http.services.mosaic-web.loadbalancer.server.port=3000"

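Important: Traefik labels MUST be under deploy.labels for Docker Swarm. In Swarm mode Traefik reads labels from the service, so labels set at the service level of the compose file (which end up on containers) are ignored.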
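
One way to confirm where a label ended up is to compare the service's own labels with its container labels. The sketch below is an illustration under that assumption; `label_location` is a hypothetical helper that takes the two JSON strings produced by the `docker service inspect` calls in the usage comment.

```shell
#!/bin/sh
# Sketch: report whether traefik.enable=true sits on the service (what Swarm
# needs) or only on the containers. label_location is a hypothetical helper.
# $1 = JSON of service labels, $2 = JSON of container labels

label_location() {
  needle='"traefik.enable":"true"'
  if printf '%s' "$1" | grep -Fq "$needle"; then
    echo service
  elif printf '%s' "$2" | grep -Fq "$needle"; then
    echo container
  else
    echo none
  fi
}

# Usage:
#   svc=$(docker service inspect mosaic_web --format '{{json .Spec.Labels}}')
#   ctr=$(docker service inspect mosaic_web \
#     --format '{{json .Spec.TaskTemplate.ContainerSpec.Labels}}')
#   label_location "$svc" "$ctr"
```

A result of `container` suggests the labels were declared outside the deploy key.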

Accessing Services

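Once deployed and Traefik is configured, the services are reachable at the domains listed under Prerequisites:

    • mosaic.mosaicstack.dev → Web UI
    • api.mosaicstack.dev → API
    • auth.mosaicstack.dev → Authentik SSO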

Scaling Services

Scale specific services:

# Scale web frontend to 3 replicas
docker service scale mosaic_web=3

# Scale API to 2 replicas
docker service scale mosaic_api=2

Note: Stateful services (postgres, valkey) should NOT be scaled; keep them at 1 replica.
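
A small wrapper can guard against scaling a stateful service by accident. This is a sketch, not part of the project's scripts: `scale_service` is a hypothetical name, and the `DOCKER` variable exists only so the command can be swapped out for a dry run.

```shell
#!/bin/sh
# Sketch: refuse to scale services that must stay at one replica.
# scale_service is a hypothetical helper; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

scale_service() {
  # $1 = full service name (for example mosaic_web), $2 = replica count
  case "$1" in
    *_postgres|*_valkey)
      echo "refusing to scale stateful service: $1" >&2
      return 1
      ;;
  esac
  "$DOCKER" service scale "$1=$2"
}

# Usage: scale_service mosaic_web 3
```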

Updating Services

Update a specific service:

# Rebuild image
docker compose -f docker-compose.swarm.yml build api

# Update the service
docker service update --image mosaic-stack-api:latest mosaic_api

Or redeploy the entire stack:

./scripts/deploy-swarm.sh mosaic

Rolling Updates

Docker Swarm supports rolling updates. To configure them, set update_config and rollback_config under the service's deploy key:

deploy:
  update_config:
    parallelism: 1
    delay: 10s
    order: start-first
  rollback_config:
    parallelism: 1
    delay: 10s
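
The same policy can be applied to an already running service from the command line, and a bad update can be reverted with `docker service rollback`. The sketch below mirrors the values from the YAML above; the function names are illustrative, and the `DOCKER` variable is only there to allow a dry run.

```shell
#!/bin/sh
# Sketch: apply the update/rollback policy shown above to a running service.
# apply_update_policy and rollback_service are hypothetical helpers.
DOCKER="${DOCKER:-docker}"

apply_update_policy() {
  # $1 = full service name, for example mosaic_api
  "$DOCKER" service update \
    --update-parallelism 1 --update-delay 10s --update-order start-first \
    --rollback-parallelism 1 --rollback-delay 10s \
    "$1"
}

rollback_service() {
  # Revert the service to its previous specification.
  "$DOCKER" service rollback "$1"
}

# Usage: apply_update_policy mosaic_api
```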

Troubleshooting

Service Won't Start

Check service logs:

docker service logs mosaic_api --tail 100 --follow

Check service tasks:

docker service ps mosaic_api --no-trunc

Traefik Not Routing

  1. Verify service is on traefik-public network:

    docker service inspect mosaic_web | grep -A 10 Networks
    
  2. Check the Traefik dashboard for registered routes.

  3. Verify domain DNS/hosts resolution:

    ping mosaic.mosaicstack.dev
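
A successful ping only confirms name resolution. To test the router rule itself, independent of DNS, a request can be sent straight to a node with the expected Host header. This sketch assumes Traefik's `web` entrypoint serves plain HTTP on port 80 at the given address; `probe_route` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print the HTTP status Traefik returns for a given Host header.
# probe_route is a hypothetical helper; port 80 for the "web" entrypoint is
# an assumption about the Traefik setup.
CURL="${CURL:-curl}"

probe_route() {
  # $1 = hostname from the router rule, $2 = address of a node running Traefik
  "$CURL" -sS -o /dev/null -w '%{http_code}\n' -H "Host: $1" "http://$2/"
}

# Usage: probe_route mosaic.mosaicstack.dev 127.0.0.1
```

A 404 from Traefik usually means no router matched the host, while a 502 or 504 points at the backend service rather than the routing rule.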
    

Database Connection Issues

Check postgres is healthy:

docker service logs mosaic_postgres --tail 50

Verify DATABASE_URL in API service:

docker service inspect mosaic_api --format '{{json .Spec.TaskTemplate.ContainerSpec.Env}}' | jq

Volume Permissions

If volume permission errors occur, check service user:

# Orchestrator runs as user 1000:1000
docker service inspect mosaic_orchestrator | grep -A 5 User

Backup & Restore

Backup Volumes

# Backup postgres data (for a consistent copy, stop the service first:
# docker service scale mosaic_postgres=0)
docker run --rm -v mosaic_postgres_data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/postgres-backup-$(date +%Y%m%d).tar.gz -C /data .

# Backup authentik data
docker run --rm -v mosaic_authentik_postgres_data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/authentik-backup-$(date +%Y%m%d).tar.gz -C /data .
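
Archiving a PostgreSQL data directory while the database is running can produce an inconsistent copy. The sketch below wraps the backup in a scale-down and scale-up. `backup_volume_offline` is a hypothetical helper, it assumes the service normally runs 1 replica, and on a multi-node swarm it must run on the node that holds the volume.

```shell
#!/bin/sh
# Sketch: stop a service, archive its volume read-only, then start it again.
# backup_volume_offline is a hypothetical helper; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

backup_volume_offline() {
  # $1 = service name, $2 = volume name, $3 = archive name prefix
  "$DOCKER" service scale "$1=0" || return 1
  "$DOCKER" run --rm -v "$2:/data:ro" -v "$(pwd):/backup" alpine \
    tar czf "/backup/$3-$(date +%Y%m%d).tar.gz" -C /data .
  status=$?
  # Restart the service even if the archive step failed.
  "$DOCKER" service scale "$1=1"
  return "$status"
}

# Usage: backup_volume_offline mosaic_postgres mosaic_postgres_data postgres-backup
```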

Restore Volumes

# Restore postgres data (stop the service before restoring)
docker run --rm -v mosaic_postgres_data:/data -v "$(pwd)":/backup alpine \
  tar xzf /backup/postgres-backup-20260208.tar.gz -C /data

# Restore authentik data
docker run --rm -v mosaic_authentik_postgres_data:/data -v "$(pwd)":/backup alpine \
  tar xzf /backup/authentik-backup-20260208.tar.gz -C /data
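
Because a restore overwrites the volume's contents, it can help to check that the archive is readable first. A minimal sketch, with `verify_archive` as a hypothetical helper:

```shell
#!/bin/sh
# Sketch: succeed only if the file is a non-empty, readable gzip'd tar archive.
# verify_archive is a hypothetical helper.

verify_archive() {
  # $1 = path to a .tar.gz backup
  [ -s "$1" ] && tar tzf "$1" >/dev/null 2>&1
}

# Usage:
#   verify_archive postgres-backup-20260208.tar.gz && echo "archive looks readable"
```

This only confirms that the archive can be listed; it does not validate the database files inside it.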

Removing the Stack

Remove all services and networks (volumes are preserved):

docker stack rm mosaic

Remove volumes (⚠️ DATA WILL BE LOST):

docker volume rm mosaic_postgres_data
docker volume rm mosaic_valkey_data
docker volume rm mosaic_authentik_postgres_data
# ... etc

Security Considerations

  1. Change default passwords in .env before deploying
  2. Use secrets management for production:
    printf '%s' "my-db-password" | docker secret create postgres_password -
    
  3. Enable TLS in Traefik (Let's Encrypt)
  4. Restrict network access: attach only the services Traefik must reach to traefik-public and keep the rest on internal overlay networks
  5. Run services as non-root (orchestrator already does this)
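
When creating a secret from a pipe, note that `echo` appends a trailing newline, which becomes part of the stored secret, while `printf '%s'` does not. The sketch below compares the byte counts for the sample value used in item 2; it needs no Docker daemon.

```shell
#!/bin/sh
# Sketch: show the extra byte that echo adds to a piped value.
with_echo=$(echo "my-db-password" | wc -c)
with_printf=$(printf '%s' "my-db-password" | wc -c)

# Arithmetic expansion strips any padding that wc adds on some systems.
echo "echo:   $((with_echo)) bytes"    # 15 bytes (14 characters + newline)
echo "printf: $((with_printf)) bytes"  # 14 bytes
```

An application that compares the secret byte for byte may reject a password that carries the extra newline.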

Differences from Docker Compose

Key differences when running in Swarm mode:

| Feature | Docker Compose | Docker Swarm |
|---|---|---|
| Container names | container_name: foo | Auto-generated |
| Restart policy | restart: unless-stopped | deploy.restart_policy |
| Labels (Traefik) | Service level | deploy.labels |
| Networks | bridge driver | overlay driver |
| Scaling | Manual (docker compose up --scale) | docker service scale |
| Updates | Stop/start containers | Rolling updates |

Reference

  • Compose file: docker-compose.swarm.yml
  • Environment: .env.swarm.example
  • Deployment script: scripts/deploy-swarm.sh
  • Traefik example: ../mosaic-telemetry/docker-compose.yml