Kaniko fundamentally cannot run apt-get update on bookworm (Debian 12)
due to GPG signature verification failures during filesystem snapshots.
Neither --snapshot-mode=redo nor clearing /var/lib/apt/lists/* resolves
this.
Changes:
- Replace apt-get install dumb-init with ADD from GitHub releases
(static x86_64 binary) in api, web, and orchestrator Dockerfiles
- Switch coordinator builder from python:3.11-slim to python:3.11
(full image includes build tools, avoids 336MB build-essential)
- Replace wget healthcheck with node-based check in orchestrator
(wget no longer installed)
- Exclude telemetry lifecycle integration tests in CI (fail due to
runner disk pressure on PostgreSQL, not code issues)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The genericOAuth plugin is conditionally loaded based on OIDC_ENABLED
env var. Without it, BetterAuth has no /sign-in/oauth2 route, causing
404 when the login button is clicked.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Alpine (musl libc) is incompatible with matrix-sdk-crypto-nodejs native binary
which requires glibc's ld-linux-x86-64.so.2. Switched all Node.js Dockerfiles
to node:24-slim (Debian/glibc). Also fixed docker-compose.matrix.yml network
naming from undefined mosaic-network to mosaic-internal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep both Mosaic Telemetry section (from develop) and Matrix Dev
Environment section (from feature branch) in .env.example.
Regenerate pnpm-lock.yaml with both dependency trees merged.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix sendThreadMessage room mismatch: use channelId from options instead of hardcoded controlRoomId
- Add .catch() to fire-and-forget handleRoomMessage to prevent silent error swallowing
- Wrap dispatchJob in try-catch for user-visible error reporting in handleFixCommand
- Add MATRIX_BOT_USER_ID validation in connect() to prevent infinite message loops
- Fix streamResponse error masking: wrap finally/catch side-effects in try-catch
- Replace unsafe type assertion with public getClient() in MatrixRoomService
- Add orphaned room warning in provisionRoom on DB failure
- Add provider identity to Herald error logs
- Add channelId to ThreadMessageOptions interface and all callers
- Add missing env var warnings in BridgeModule factory
- Fix JSON injection in setup-bot.sh: use jq for safe JSON construction
Fixes#377
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add MOSAIC_TELEMETRY_* variables to .env.example with descriptions
- Pass telemetry env vars to api service in production compose
- Pass telemetry env vars to coordinator service in dev and swarm composes
- Swarm composes default to production URL (https://tel-api.mosaicstack.dev)
- Dev compose includes commented-out telemetry-api service placeholder
- All compose files default MOSAIC_TELEMETRY_ENABLED to false for safety
Refs #374
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added CSRF_SECRET to docker-compose.swarm.portainer.yml (the active
Portainer deployment) and both example compose files. Also added
ENCRYPTION_KEY to the example files where it was missing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both env vars were missing from the API service environment in
docker-compose.prod.yml and docker-compose.build.yml, causing the
CSRF_SECRET check to fail at startup even when set in .env.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Standalone Synapse + Element Web deployment for Docker Swarm/Portainer.
Separate infrastructure from Mosaic Stack (same pattern as Authentik).
Includes: Synapse, Element Web, dedicated PostgreSQL, optional coturn.
Traefik labels match existing Stack conventions.
Refs #387
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The base openbao image's docker-entrypoint.sh injects -dev-root-token-id
and -dev-listen-address flags when it sees 'server' as $1, causing the
server to exit immediately (code 0). Override entrypoint with dumb-init
and call bao directly to avoid the dev-mode flag injection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CMD exec form drops everything after & in the healthcheck URL,
causing uninitcode=200 and sealedcode=200 params to be lost. Without
them, OpenBao returns 501 when uninitialized, healthcheck fails, and
Swarm kills the container before the init sidecar can reach it.
Switch to CMD-SHELL with single-quoted URL to preserve query params.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BullMQ requires noeviction to prevent silent job data loss. With
allkeys-lru, Valkey could evict keys BullMQ depends on for job tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Pass BETTER_AUTH_SECRET through all 6 docker-compose files to API container
- Fix BullModule to parse VALKEY_URL instead of VALKEY_HOST/VALKEY_PORT,
matching all other Redis consumers in the codebase
- Migrate Prisma encryption from removed $use() middleware to $extends()
client extensions (Prisma 6.x compatibility), keeping extends PrismaClient
pattern with only account and llmProviderInstance getters overridden
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- docker/postgres/Dockerfile: build gosu from source with Go 1.26 via
multi-stage build (eliminates 1 CRITICAL + 5 HIGH Go stdlib CVEs)
- apps/{api,web,orchestrator}/Dockerfile: remove npm from production
images (eliminates 5 HIGH CVEs in npm's bundled cross-spawn/glob/tar)
- .trivyignore: trimmed from 16 to 5 CVEs (OpenBao only — 4 false
positives from Go pseudo-version + 1 real Go stdlib waiting on upstream)
Fixes#363
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gosu doesn't publish proper Go module semver tags, so
`go install github.com/tianon/gosu@v1.19` fails with "no matching
versions". Replace the multi-stage golang builder with
`COPY --from=tianon/gosu /gosu /usr/local/bin/gosu`, which pulls the
pre-built binary from the official tianon/gosu Docker image. This image
is rebuilt with recent Go toolchains, so it still addresses the Go
stdlib CVEs documented in the Dockerfile comments.
Fixes#363
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The gosu 1.19 binary bundled in the postgres base image was compiled
with Go 1.24.6, which contains CVE-2025-68121 (CRITICAL) and 5 HIGH
severity Go stdlib vulnerabilities. Since upstream gosu has not released
a version built with patched Go (1.24.13+ / 1.25.7+), this adds a
multi-stage Docker build that recompiles gosu from source using Go 1.26.
Changes:
- Pin postgres base image to 17.7-alpine3.22 for reproducibility
- Add golang:1.26-alpine3.22 builder stage to compile gosu v1.19
- Replace bundled gosu binary with freshly built version
- Pin all postgres:17-alpine references across compose files and CI
CVEs fixed:
- CVE-2025-68121 (CRITICAL): Go crypto/tls vulnerability
- CVE-2025-58183 (HIGH): Go archive/tar unbounded allocation
- CVE-2025-61726 (HIGH): Go net/url memory exhaustion
- CVE-2025-61728 (HIGH): Go archive/zip CPU exhaustion
- CVE-2025-61729 (HIGH): Go crypto/x509 DoS
- CVE-2025-61730 (HIGH): Go TLS 1.3 handshake vulnerability
Fixes#363
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pin OpenBao base image from unpinned :2 tag to :2.5.0 (latest stable,
released 2026-02-04) in both the Dockerfile and the dev docker-compose.
CVEs resolved:
- CVE-2025-68121 (CRITICAL): Go stdlib crypto/tls session resumption
- CVE-2024-8185 (HIGH): DoS via Raft join requests
- CVE-2024-9180 (HIGH): Root namespace privilege escalation
- CVE-2025-59043 (HIGH): DoS via malicious JSON
- CVE-2025-64761 (HIGH): Identity group root escalation
All fixed in OpenBao >= 2.4.4; v2.5.0 includes all patches plus new
features (horizontal read scalability, OCI plugin distribution).
Files changed:
- docker/openbao/Dockerfile: FROM tag 2 -> 2.5.0
- docker/docker-compose.yml: openbao + openbao-init image tags 2 -> 2.5.0
The production/swarm compose files use the custom-built
git.mosaicstack.dev/mosaic/stack-openbao image which is built FROM
this Dockerfile, so they inherit the fix on next CI build.
Fixes#363
- Enable OpenBao + init sidecar in Swarm compose (was commented out)
- Fix healthcheck to accept uninitialized/sealed vault states
(add ?uninitcode=200&sealedcode=200 to /v1/sys/health)
- Replace nc-based healthcheck with wget in dev compose
- Add ORCHESTRATOR_URL env var to API service in Swarm compose
- Uncomment OpenBao volumes in Swarm compose
The healthcheck was returning HTTP 501 for uninitialized vault,
causing Swarm to restart OpenBao before init sidecar could run.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add coordinator service to docker-compose.swarm.portainer.yml and
docker-compose.swarm.yml with full environment config and healthcheck
- Add ANTHROPIC_API_KEY and coordinator settings to .env.swarm.example
- Move docker-compose.override.yml.example and docker-compose.prod.yml
into docker/ directory
- Add *.bak to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update docker-compose.swarm.yml with external Authentik configuration
- Comment out Authentik services (using external OIDC provider)
- Comment out Authentik volumes
- Add header with deployment instructions and current configuration
- Create comprehensive SWARM-DEPLOYMENT.md guide
- Prerequisites and swarm initialization
- Manual OpenBao initialization (critical - no auto-init in swarm)
- External service configuration examples
- Scaling, updates, rollbacks
- Troubleshooting and maintenance procedures
- Backup and restore instructions
- Update .env.swarm.example
- Add note about external vs internal Authentik
- Update default OIDC_ISSUER to use https
- Clarify which variables are needed for internal Authentik
- Update README.md Docker Swarm section
- Fix deploy script path (./scripts/deploy-swarm.sh)
- Add note about manual OpenBao initialization
- Add warning about no profile support in swarm
- Update documentation references to docs/ directory
- Update documentation cross-references
- Add deprecation notice to old DOCKER-SWARM.md
- Add deployment guide reference to SWARM-QUICKREF.md
- Update DOCKER-COMPOSE-GUIDE.md See Also section
Key changes for swarm deployment:
- Swarm does NOT support docker-compose profiles
- External services must be manually commented out
- OpenBao requires manual initialization (no sidecar)
- All documentation updated with correct paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add OpenBao services to docker-compose.yml with profiles (openbao, full)
- Add docker-compose.build.yml for local builds vs registry pulls
- Make PostgreSQL and Valkey optional via profiles (database, cache)
- Create example compose files for common deployment scenarios:
- docker/docker-compose.example.turnkey.yml (all bundled)
- docker/docker-compose.example.external.yml (all external)
- docker/docker.example.hybrid.yml (mixed deployment)
- Update documentation:
- Enhance .env.example with profiles and external service examples
- Update README.md with deployment mode quick starts
- Add deployment scenarios to docs/OPENBAO.md
- Create docker/DOCKER-COMPOSE-GUIDE.md with comprehensive guide
- Clean up repository structure:
- Move shell scripts to scripts/ directory
- Move documentation to docs/ directory
- Move docker compose examples to docker/ directory
- Configure for external Authentik with internal services:
- Comment out Authentik services (using external OIDC)
- Comment out unused volumes for disabled services
- Keep postgres, valkey, openbao as internal services
This provides a flexible deployment architecture supporting turnkey,
production (all external), and hybrid configurations via Docker Compose
profiles.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The docker-build-openbao pipeline step was failing because the Dockerfile
was missing from docker/openbao/.
Created a minimal Dockerfile that:
- Uses official quay.io/openbao/openbao:2 as base
- Copies config.hcl and init.sh into the image
- Exposes port 8200
- Preserves the default entrypoint from base image
This allows Kaniko to build the stack-openbao image for Swarm deployment.
Fixes pipeline #325 docker-build-openbao failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements secure credential storage using OpenBao Transit encryption.
Features:
- Auto-initialization on first run (1-of-1 Shamir key for dev)
- Auto-unseal on container restart with verification and retry logic
- Transit secrets engine with 4 named encryption keys
- AppRole authentication with Transit-only policy
- Localhost-only API binding for security
- Comprehensive integration test suite (22 tests, all passing)
Security:
- API bound to 127.0.0.1 (localhost only, no external access)
- Unseal verification with 3-attempt retry logic
- Sanitized error messages in tests (no secret leakage)
- Volume-based secret reading (doesn't require running container)
Files:
- docker/openbao/config.hcl: Server configuration
- docker/openbao/init.sh: Auto-init/unseal script
- docker/docker-compose.yml: OpenBao and init services
- tests/integration/openbao.test.ts: Full test coverage
- .env.example: OpenBao configuration variables
Closes#357
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added explicit package update/upgrade step to patch CVE-2025-58183, CVE-2025-61726, CVE-2025-61728, and CVE-2025-61729 in Go stdlib components from Alpine Linux packages (likely LLVM or transitive dependencies).
The fix ensures all base image packages are up-to-date before pgvector build, capturing any security patches released for Alpine components.
Fixes#181
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>