Commit Graph

16 Commits

Author SHA1 Message Date
24c21f45b3 feat(#374): add telemetry config to docker-compose and .env
- Add MOSAIC_TELEMETRY_* variables to .env.example with descriptions
- Pass telemetry env vars to api service in production compose
- Pass telemetry env vars to coordinator service in dev and swarm composes
- Swarm composes default to production URL (https://tel-api.mosaicstack.dev)
- Dev compose includes commented-out telemetry-api service placeholder
- All compose files default MOSAIC_TELEMETRY_ENABLED to false for safety
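
A minimal sketch of the "default to false" behaviour described above, written in Python for illustration only. The commit does not show how the services read the flag; the set of accepted truthy spellings and the helper name telemetry_enabled are assumptions.

```python
import os

# Assumed set of spellings treated as "on"; the real services may accept others.
TRUE_VALUES = {"1", "true", "yes", "on"}


def telemetry_enabled(env=None):
    """Return True only when MOSAIC_TELEMETRY_ENABLED is explicitly truthy.

    A missing or unrecognised value counts as disabled, mirroring the
    safe default the compose files use.
    """
    env = os.environ if env is None else env
    raw = env.get("MOSAIC_TELEMETRY_ENABLED", "false")
    return raw.strip().lower() in TRUE_VALUES
```

Treating anything unrecognised as disabled means a typo in .env cannot switch telemetry on by accident.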

Refs #374

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:10:22 -06:00
f4e759c07a fix(devops): bypass OpenBao base entrypoint to prevent dev-mode flags
Some checks failed
ci/woodpecker/push/infra Pipeline failed
The base openbao image's docker-entrypoint.sh injects -dev-root-token-id
and -dev-listen-address flags when it sees 'server' as $1, causing the
server to exit immediately (code 0). Override entrypoint with dumb-init
and call bao directly to avoid the dev-mode flag injection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:13:57 -06:00
b6d272992a fix(devops): fix OpenBao healthcheck URL truncation with CMD-SHELL
Some checks failed
ci/woodpecker/push/infra Pipeline failed
The CMD exec form drops everything after & in the healthcheck URL,
causing uninitcode=200 and sealedcode=200 params to be lost. Without
them, OpenBao returns 501 when uninitialized, healthcheck fails, and
Swarm kills the container before the init sidecar can reach it.

Switch to CMD-SHELL with single-quoted URL to preserve query params.
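
The quoting concern can be illustrated with a short Python sketch. When a command line passes through a shell, an unquoted & ends the command, so the URL has to be quoted to reach the HTTP client as one argument. The host, the wget flags, and the helper name healthcheck_command are assumptions; only the /v1/sys/health path, port 8200, and the two query parameters come from the commits in this log.

```python
import shlex

# Host and scheme are assumed for illustration.
HEALTH_URL = "http://127.0.0.1:8200/v1/sys/health?uninitcode=200&sealedcode=200"


def healthcheck_command(url):
    """Build a shell command line with the URL quoted as a single word."""
    return "wget -q -O /dev/null " + shlex.quote(url)


command = healthcheck_command(HEALTH_URL)

# shlex.split undoes the quoting, confirming that the URL round-trips
# intact as one argument, query parameters included.
args = shlex.split(command)
```

shlex.quote wraps the URL in single quotes because ? and & are shell-significant characters, which is the same fix the commit applies by hand.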

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:08:12 -06:00
899faba7e2 fix(devops): set Valkey maxmemory-policy to noeviction for BullMQ
Some checks failed
ci/woodpecker/push/infra Pipeline was successful
ci/woodpecker/manual/infra Pipeline was successful
ci/woodpecker/manual/coordinator Pipeline failed
ci/woodpecker/manual/web Pipeline failed
ci/woodpecker/manual/orchestrator Pipeline failed
ci/woodpecker/manual/api Pipeline failed
BullMQ requires noeviction to prevent silent job data loss. With
allkeys-lru, Valkey could evict keys BullMQ depends on for job tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 16:51:42 -06:00
0ca3945061 fix(api): resolve Docker startup failures (secrets, Redis, Prisma)
- Pass BETTER_AUTH_SECRET through all 6 docker-compose files to API container
- Fix BullModule to parse VALKEY_URL instead of VALKEY_HOST/VALKEY_PORT,
  matching all other Redis consumers in the codebase
- Migrate Prisma encryption from the removed $use() middleware to $extends()
  client extensions (Prisma 6.x compatibility), keeping the "extends PrismaClient"
  pattern and overriding only the account and llmProviderInstance getters
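
As a language-neutral illustration of the VALKEY_URL change, the sketch below derives connection settings from a single URL instead of separate host and port variables. It is written in Python rather than the API's own code, and the function name, the returned keys, and the fallback values are assumptions.

```python
from urllib.parse import urlparse


def parse_valkey_url(url):
    """Split a redis:// or rediss:// URL into connection settings."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme in VALKEY_URL: {parsed.scheme!r}")
    return {
        "host": parsed.hostname or "localhost",  # assumed fallback
        "port": parsed.port or 6379,             # conventional Redis/Valkey port
        "password": parsed.password,             # None when the URL has no credentials
        "tls": parsed.scheme == "rediss",
    }
```

Reading one URL keeps every consumer on the same source of truth, which is the consistency the commit is after.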

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 11:04:04 -06:00
Jason Woltje
429cf85f87 fix(#363): rebuild gosu from source with Go 1.26 to fix CRITICAL CVEs
The gosu 1.19 binary bundled in the postgres base image was compiled
with Go 1.24.6, which contains CVE-2025-68121 (CRITICAL) and 5 HIGH
severity Go stdlib vulnerabilities. Since upstream gosu has not released
a version built with patched Go (1.24.13+ / 1.25.7+), this adds a
multi-stage Docker build that recompiles gosu from source using Go 1.26.

Changes:
- Pin postgres base image to 17.7-alpine3.22 for reproducibility
- Add golang:1.26-alpine3.22 builder stage to compile gosu v1.19
- Replace bundled gosu binary with freshly built version
- Pin all postgres:17-alpine references across compose files and CI

CVEs fixed:
- CVE-2025-68121 (CRITICAL): Go crypto/tls vulnerability
- CVE-2025-58183 (HIGH): Go archive/tar unbounded allocation
- CVE-2025-61726 (HIGH): Go net/url memory exhaustion
- CVE-2025-61728 (HIGH): Go archive/zip CPU exhaustion
- CVE-2025-61729 (HIGH): Go crypto/x509 DoS
- CVE-2025-61730 (HIGH): Go TLS 1.3 handshake vulnerability
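
The version reasoning above (patched in 1.24.13+ and 1.25.7+, with the rebuild using 1.26) can be sketched as a small check. This is an illustrative Python helper, not part of the repository; it assumes that any release line newer than 1.25 already carries the fixes, as the commit's choice of Go 1.26 implies.

```python
# Minimum patch level per release line, taken from the commit message.
PATCHED_MINIMUMS = {(1, 24): 13, (1, 25): 7}


def parse_go_version(text):
    """Turn 'go1.24.6' or '1.26' into a (major, minor, patch) tuple."""
    parts = text.removeprefix("go").split(".")
    major, minor = int(parts[0]), int(parts[1])
    patch = int(parts[2]) if len(parts) > 2 else 0
    return major, minor, patch


def is_patched(text):
    """Return True when the Go version is at or beyond a patched release."""
    major, minor, patch = parse_go_version(text)
    if (major, minor) in PATCHED_MINIMUMS:
        return patch >= PATCHED_MINIMUMS[(major, minor)]
    # Assumption: release lines newer than the listed ones include the fixes,
    # and older lines do not.
    return (major, minor) > max(PATCHED_MINIMUMS)
```

Under these assumptions the bundled binary's toolchain (1.24.6) is flagged as unpatched, while the 1.26 builder stage passes.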

Fixes #363

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 12:38:33 -06:00
b3c0f51dc9 fix(devops): enable OpenBao in Swarm and fix healthchecks
- Enable OpenBao + init sidecar in Swarm compose (was commented out)
- Fix healthcheck to accept uninitialized/sealed vault states
  (add ?uninitcode=200&sealedcode=200 to /v1/sys/health)
- Replace nc-based healthcheck with wget in dev compose
- Add ORCHESTRATOR_URL env var to API service in Swarm compose
- Uncomment OpenBao volumes in Swarm compose

The healthcheck was returning HTTP 501 for uninitialized vault,
causing Swarm to restart OpenBao before init sidecar could run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 19:38:34 -06:00
f3694592cc feat(swarm): add coordinator service and reorganize compose files
- Add coordinator service to docker-compose.swarm.portainer.yml and
  docker-compose.swarm.yml with full environment config and healthcheck
- Add ANTHROPIC_API_KEY and coordinator settings to .env.swarm.example
- Move docker-compose.override.yml.example and docker-compose.prod.yml
  into docker/ directory
- Add *.bak to .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 22:04:55 -06:00
3485ab7883 fix(swarm): remove postgres init-scripts bind mount for Portainer
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Remove ./docker/postgres/init-scripts bind mount from postgres service
- Fixes: 'bind source path does not exist' error in Portainer
- Init scripts are already baked into postgres image at build time

Portainer can't access repository files when deploying stacks,
so bind mounts to local paths don't work. The postgres image
already includes init scripts via Dockerfile COPY.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 20:29:25 -06:00
c195b8c8fd feat(openbao): add standalone deployment for swarm compatibility
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Create docker-compose.openbao.yml for standalone OpenBao deployment
  - Includes openbao and openbao-init services
  - Auto-initialization on first run
  - Connects to swarm's mosaic_internal network
  - Binds to localhost:8200 for security

- Update docker-compose.swarm.yml
  - Comment out OpenBao service (cannot run in swarm)
  - Add clear note about standalone requirement
  - Update volumes section
  - Update header with current config

- Create docs/OPENBAO-DEPLOYMENT.md
  - Comprehensive deployment guide
  - 4 deployment options: standalone, bundled, external, fallback
  - Clear explanation why OpenBao can't run in swarm
  - Deployment workflows for each scenario
  - Troubleshooting section

- Update docs/SWARM-DEPLOYMENT.md
  - Add Step 1: Deploy OpenBao standalone FIRST
  - Remove manual initialization (now automatic)
  - Update expected services list
  - Reference OpenBao deployment guide

- Update README.md
  - Clarify OpenBao standalone requirement for swarm
  - Update deployment steps
  - Highlight critical requirement at top of notes

Key changes:
- OpenBao MUST be deployed standalone when using swarm
- Automatic initialization via openbao-init sidecar
- Clear documentation for all deployment options
- Swarm stack no longer includes OpenBao

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 17:30:30 -06:00
dac735af56 fix(swarm): move docker-compose.swarm.yml back to root directory
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Move docker/docker-compose.swarm.yml to root
- Update documentation references
- Simplifies deployment: swarm file in root, standalone file in root
- Deploy script already expects file in root

Rationale: Keep it simple - two compose files for two deployment methods:
  - docker-compose.yml → standalone (docker compose up -d)
  - docker-compose.swarm.yml → swarm (docker stack deploy)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 17:22:20 -06:00
8b78ffe4a0 refactor(ci): Rename images to stack-* prefix for clarity
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Renamed all Docker images from generic names to stack-* prefix:
- api → stack-api
- web → stack-web
- postgres → stack-postgres
- openbao → stack-openbao
- orchestrator → stack-orchestrator

This prevents confusion with other repositories in the mosaic/
organization on git.mosaicstack.dev.

Registry images:
  git.mosaicstack.dev/mosaic/stack-api
  git.mosaicstack.dev/mosaic/stack-web
  git.mosaicstack.dev/mosaic/stack-postgres
  git.mosaicstack.dev/mosaic/stack-openbao
  git.mosaicstack.dev/mosaic/stack-orchestrator

Local images:
  stack-api:latest
  stack-web:latest
  stack-postgres:latest
  stack-openbao:latest
  stack-orchestrator:latest

Updated files:
- .woodpecker.yml (all build steps + package linking)
- docker-compose.swarm.yml (all image references)
- build-images.sh (local image names)
- deploy-swarm.sh (image validation)
2026-02-08 02:03:31 -06:00
0e3baae415 feat(ci): Add OpenBao and Orchestrator image builds to Woodpecker CI
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Add missing Docker image builds for swarm deployment.

Changes:
- Added docker-build-openbao step to .woodpecker.yml
- Added docker-build-orchestrator step to .woodpecker.yml
- Updated docker-compose.swarm.yml to use registry images
  (git.mosaicstack.dev/mosaic/*)
- Added IMAGE_TAG variable support for versioned deployments
- Updated deploy-swarm.sh to support both registry and local images

Image tagging strategy:
- All commits: SHA tag (e.g., 658ec077)
- main branch: latest + SHA
- develop branch: dev + SHA
- git tags: version tag + SHA
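
The tagging strategy above can be sketched as a small function. This is an illustrative Python version, not the Woodpecker configuration itself; the 8-character short SHA is inferred from the example tag, and giving a git tag precedence over the branch name is an assumption.

```python
def image_tags(sha, branch=None, git_tag=None):
    """Return the list of tags an image would receive for one CI run."""
    tags = []
    if git_tag:
        tags.append(git_tag)       # version tag for tagged releases
    elif branch == "main":
        tags.append("latest")
    elif branch == "develop":
        tags.append("dev")
    tags.append(sha[:8])           # every commit also gets a short SHA tag
    return tags
```

A feature branch therefore produces only the SHA tag, so it never overwrites latest or dev.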

Registry images:
- git.mosaicstack.dev/mosaic/postgres
- git.mosaicstack.dev/mosaic/openbao
- git.mosaicstack.dev/mosaic/api
- git.mosaicstack.dev/mosaic/orchestrator
- git.mosaicstack.dev/mosaic/web

Deployment modes:
- IMAGE_TAG=latest (default, use registry latest)
- IMAGE_TAG=dev (use registry dev tag)
- IMAGE_TAG=local (use local builds via build-images.sh)
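
A minimal sketch of how the three deployment modes might resolve to an image reference. The registry prefix comes from the list above; the naming of local images as "<service>:latest" and the helper name image_reference are assumptions, since this commit does not spell them out.

```python
REGISTRY = "git.mosaicstack.dev/mosaic"


def image_reference(service, image_tag="latest"):
    """Map a service name and IMAGE_TAG value to an image reference."""
    if image_tag == "local":
        # Assumed local naming; the build script may tag images differently.
        return f"{service}:latest"
    return f"{REGISTRY}/{service}:{image_tag}"
```

Any other IMAGE_TAG value, such as a commit SHA, falls through to the registry path, which matches the versioned-deployment use case.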
2026-02-08 01:33:36 -06:00
7f3499b1f2 fix(swarm): Remove build directives and unsupported options for swarm
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Docker Swarm doesn't support build directives or security_opt.
Images must be pre-built before deployment.

Changes:
- Created build-images.sh script to build all images
- Updated deploy-swarm.sh to check for images and offer to build
- Removed build: sections from docker-compose.swarm.yml
- Removed security_opt: (not supported in swarm)
- Services now reference pre-built images only

Deployment workflow:
1. ./build-images.sh (build all images)
2. ./deploy-swarm.sh mosaic (deploy to swarm)
2026-02-08 01:31:29 -06:00
2a9a1f1367 fix(swarm): Convert boolean env vars to strings in orchestrator service
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Docker Compose/Swarm requires environment variable values to be strings, numbers, or null; YAML booleans are rejected.

Changes:
- KILLSWITCH_ENABLED: true -> "true"
- SANDBOX_ENABLED: true -> "true"

Fixes deployment error: 'must be a string, number or null'
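
The same conversion can be sketched programmatically, for example when generating a compose environment block from typed settings. This is an illustrative Python helper; the function name and the PORT entry in the usage below are hypothetical.

```python
def stringify_environment(environment):
    """Replace boolean values with the strings "true"/"false".

    Strings, numbers, and None pass through unchanged, since those
    are the value types the deployment error says are accepted.
    """
    result = {}
    for key, value in environment.items():
        if isinstance(value, bool):
            result[key] = "true" if value else "false"
        else:
            result[key] = value
    return result


fixed = stringify_environment(
    {"KILLSWITCH_ENABLED": True, "SANDBOX_ENABLED": True, "PORT": 3001}
)
```

The bool check comes first because bool is a subclass of int in Python, so a plain number test would let True slip through.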
2026-02-08 01:30:07 -06:00
ed92bb5402 feat(#swarm): Add Docker Swarm deployment with AI provider configuration
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add setup-wizard.sh for interactive configuration
- Add docker-compose.swarm.yml optimized for swarm deployment
- Make CLAUDE_API_KEY optional based on AI_PROVIDER setting
- Support multiple AI providers: Ollama, Claude API, OpenAI
- Add BETTER_AUTH_SECRET to .env.example
- Update deploy-swarm.sh to validate AI provider config
- Add comprehensive documentation (DOCKER-SWARM.md, SWARM-QUICKREF.md)

Changes:
- AI_PROVIDER env var controls which AI backend to use
- Ollama is default (no API key required)
- Claude API and OpenAI require respective API keys
- Deployment script validates based on selected provider
- Removed Authentik services from swarm compose (using external)
- Configured for upstream Traefik integration
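
The provider validation described above might look like the following. This is a Python sketch, not the deploy script itself: the provider identifiers ("ollama", "claude", "openai") and the OPENAI_API_KEY variable name are assumptions, while AI_PROVIDER and CLAUDE_API_KEY come from the commit message.

```python
# Which API key, if any, each provider needs. Identifiers are assumed.
REQUIRED_KEYS = {
    "ollama": None,               # default provider, no key required
    "claude": "CLAUDE_API_KEY",
    "openai": "OPENAI_API_KEY",   # assumed variable name
}


def validate_ai_provider(env):
    """Return the selected provider, or raise ValueError if misconfigured."""
    provider = env.get("AI_PROVIDER", "ollama").lower()
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown AI_PROVIDER: {provider!r}")
    key = REQUIRED_KEYS[provider]
    if key and not env.get(key):
        raise ValueError(f"{key} must be set when AI_PROVIDER={provider}")
    return provider
```

Failing before deployment, rather than at first request, is the point of validating in the deploy step.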
2026-02-08 01:18:04 -06:00