Files
stack/apps/orchestrator/Dockerfile
Jason Woltje d2ed1f2817
Some checks failed
ci/woodpecker/push/infra Pipeline was successful
ci/woodpecker/push/orchestrator Pipeline failed
ci/woodpecker/push/api Pipeline failed
ci/woodpecker/push/coordinator Pipeline was successful
ci/woodpecker/push/web Pipeline was successful
fix: eliminate apt-get from Kaniko builds, use static dumb-init binary
Kaniko fundamentally cannot run apt-get update on bookworm (Debian 12)
due to GPG signature verification failures during filesystem snapshots.
Neither --snapshot-mode=redo nor clearing /var/lib/apt/lists/* resolves
this.

Changes:
- Replace apt-get install dumb-init with ADD from GitHub releases
  (static x86_64 binary) in api, web, and orchestrator Dockerfiles
- Switch coordinator builder from python:3.11-slim to python:3.11
  (full image includes build tools, avoids 336MB build-essential)
- Replace wget healthcheck with node-based check in orchestrator
  (wget no longer installed)
- Exclude telemetry lifecycle integration tests in CI (fail due to
  runner disk pressure on PostgreSQL, not code issues)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:06:06 -06:00

116 lines
4.3 KiB
Docker

# syntax=docker/dockerfile:1
# Enable BuildKit features for cache mounts
# Base image for all stages
# Uses Debian slim (glibc) instead of Alpine (musl) for native addon compatibility.
FROM node:24-slim AS base
# Install pnpm globally
RUN corepack enable && corepack prepare pnpm@10.27.0 --activate
# Set working directory
WORKDIR /app
# Copy monorepo configuration files
COPY pnpm-workspace.yaml package.json pnpm-lock.yaml ./
COPY turbo.json ./
# ======================
# Dependencies stage
# ======================
FROM base AS deps
# Copy all package.json files for workspace resolution
COPY packages/shared/package.json ./packages/shared/
COPY packages/config/package.json ./packages/config/
COPY apps/orchestrator/package.json ./apps/orchestrator/
# Install ALL dependencies (not just production)
# This ensures NestJS packages and other required deps are available
RUN --mount=type=cache,id=pnpm-store,target=/root/.local/share/pnpm/store \
pnpm install --frozen-lockfile
# ======================
# Builder stage
# ======================
FROM base AS builder
# Copy root node_modules from deps
COPY --from=deps /app/node_modules ./node_modules
# Copy all source code
COPY packages ./packages
COPY apps/orchestrator ./apps/orchestrator
# Copy workspace node_modules from deps
COPY --from=deps /app/packages/shared/node_modules ./packages/shared/node_modules
COPY --from=deps /app/packages/config/node_modules ./packages/config/node_modules
COPY --from=deps /app/apps/orchestrator/node_modules ./apps/orchestrator/node_modules
# Build the orchestrator app using TurboRepo
RUN pnpm turbo build --filter=@mosaic/orchestrator
# Remove compiled test/spec files from dist BEFORE copying to production.
# These contain test fixture secrets (fake AWS keys, RSA keys) that trigger Trivy.
# Must happen in builder stage so they never appear in any production image layer.
RUN find ./apps/orchestrator/dist \( -name '*.spec.js' -o -name '*.spec.js.map' -o -name '*.test.js' -o -name '*.test.js.map' \) -delete
# ======================
# Production stage
# ======================
FROM node:24-slim AS production
# Add metadata labels
LABEL maintainer="mosaic-team@mosaicstack.dev"
LABEL version="0.0.6"
LABEL description="Mosaic Orchestrator - Agent orchestration service"
LABEL org.opencontainers.image.source="https://git.mosaicstack.dev/mosaic/stack"
LABEL org.opencontainers.image.vendor="Mosaic Stack"
LABEL org.opencontainers.image.title="Mosaic Orchestrator"
LABEL org.opencontainers.image.description="Agent orchestration service for Mosaic Stack"
# Remove npm (unused in production — we use pnpm) to reduce attack surface
RUN rm -rf /usr/local/lib/node_modules/npm /usr/local/bin/npm /usr/local/bin/npx
# Install dumb-init for proper signal handling (static binary from GitHub,
# avoids apt-get which fails under Kaniko with bookworm GPG signature errors)
ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64 /usr/local/bin/dumb-init
RUN chmod 755 /usr/local/bin/dumb-init
# Create non-root user
RUN groupadd -g 1001 nodejs && useradd -m -u 1001 -g nodejs nestjs
WORKDIR /app
# Copy node_modules from builder (includes all dependencies)
COPY --from=builder --chown=nestjs:nodejs /app/node_modules ./node_modules
# Copy built packages (includes dist/ directories)
COPY --from=builder --chown=nestjs:nodejs /app/packages ./packages
# Copy built orchestrator application (spec/test files already removed in builder stage)
COPY --from=builder --chown=nestjs:nodejs /app/apps/orchestrator/dist ./apps/orchestrator/dist
COPY --from=builder --chown=nestjs:nodejs /app/apps/orchestrator/package.json ./apps/orchestrator/
# Copy app's node_modules which contains symlinks to root node_modules
COPY --from=builder --chown=nestjs:nodejs /app/apps/orchestrator/node_modules ./apps/orchestrator/node_modules
# Set working directory to orchestrator app
WORKDIR /app/apps/orchestrator
# Switch to non-root user
USER nestjs
# Expose orchestrator port
EXPOSE 3001
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD node -e "require('http').get('http://localhost:3001/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
# Use dumb-init to handle signals properly
ENTRYPOINT ["dumb-init", "--"]
# Start the application
CMD ["node", "dist/main.js"]