Orchestrator Security Documentation
Overview
This document outlines the security measures implemented in the Mosaic Orchestrator Docker container and deployment configuration.
Docker Security Hardening
Multi-Stage Build
The Dockerfile uses a 4-stage build process to minimize attack surface:
- Base Stage: Minimal Alpine base with pnpm enabled
- Dependencies Stage: Installs production dependencies only
- Builder Stage: Builds the application with all dependencies
- Runtime Stage: Final minimal image with only built artifacts
Benefits:
- Reduces final image size by excluding build tools and dev dependencies
- Minimizes attack surface by removing unnecessary packages
- Separates build-time from runtime environments
Base Image Security
Image: node:20-alpine
Security Scan Results (Trivy, 2026-02-02):
- Alpine Linux: 0 vulnerabilities
- Node.js packages: 0 vulnerabilities
- Base image size: ~180MB (vs 1GB+ for full node images)
Why Alpine?
- Minimal attack surface (only essential packages)
- Security-focused distribution
- Regular security updates
- Small image size reduces download time and storage
Non-Root User
User: node (UID: 1000, GID: 1000)
The container runs as a non-root user to prevent privilege escalation attacks.
Implementation:
# Dockerfile
USER node
# docker-compose.yml
user: "1000:1000"
Security Benefits:
- Prevents root access if container is compromised
- Limits blast radius of potential vulnerabilities
- Follows principle of least privilege
File Permissions
All application files are owned by node:node:
COPY --from=builder --chown=node:node /app/apps/orchestrator/dist ./dist
COPY --from=dependencies --chown=node:node /app/node_modules ./node_modules
Permissions:
- Application code: Read/execute only
- Workspace volume: Read/write (required for git operations)
- Docker socket: Read-only mount
Health Checks
Dockerfile Health Check:
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3001/health || exit 1
Benefits:
- Container orchestration can detect unhealthy containers
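- Orchestrators that act on health status (for example Swarm) can restart the container on health check failure; plain Docker only marks the container as unhealthy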
- Minimal overhead (uses wget already in Alpine)
Endpoint: GET /health
- Returns 200 OK when service is healthy
- No authentication required (internal endpoint)
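As a rough illustration of the endpoint's contract, the sketch below builds the response from a list of dependency checks. The `DependencyStatus` and `HealthResponse` types, the function name, and the 503 status for the unhealthy case are assumptions made for this example; the document only states that a healthy service returns 200 OK.

```typescript
// Hypothetical result of probing one dependency (e.g. Valkey or the Docker daemon).
interface DependencyStatus {
  name: string;
  ok: boolean;
}

// Hypothetical response shape; the real /health payload is not specified in this document.
interface HealthResponse {
  statusCode: number;
  body: { status: string; failed: string[] };
}

// Returns 200 only when every dependency reports ok.
// 503 for the unhealthy case is an assumption: any non-2xx status makes
// `wget --spider` exit non-zero, which is what the HEALTHCHECK relies on.
function buildHealthResponse(deps: DependencyStatus[]): HealthResponse {
  const failed = deps.filter((d) => !d.ok).map((d) => d.name);
  if (failed.length === 0) {
    return { statusCode: 200, body: { status: "ok", failed } };
  }
  return { statusCode: 503, body: { status: "unhealthy", failed } };
}
```

Keeping the decision in a pure function like this makes it straightforward to unit test separately from the HTTP layer.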
Capability Management
docker-compose.yml:
cap_drop:
- ALL
cap_add:
- NET_BIND_SERVICE
Dropped Capabilities:
- ALL (start with zero privileges)
Added Capabilities:
- NET_BIND_SERVICE (allows binding to privileged ports below 1024; port 3001 is unprivileged, so this capability can likely be dropped as well)
Why minimal capabilities?
- Reduces attack surface
- Prevents privilege escalation
- Limits kernel access
Read-Only Docker Socket
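The Docker socket is mounted with the read-only flag. This prevents the socket file itself from being replaced or removed, but it does not restrict which Docker API calls can be sent through it: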
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
Note: The orchestrator needs Docker access to spawn agent containers. This is intentional and required for functionality.
Mitigation:
- Non-root user limits socket abuse
- Capability restrictions prevent escalation
- Monitoring and killswitch can detect anomalies
Temporary Filesystem
A tmpfs mount is configured for /tmp:
tmpfs:
- /tmp:noexec,nosuid,size=100m
Security Benefits:
- noexec: Prevents execution of binaries from /tmp
- nosuid: Ignores setuid/setgid bits
- Size limit: Prevents DoS via disk exhaustion
Security Options
security_opt:
- no-new-privileges:true
no-new-privileges:
- Prevents processes from gaining new privileges
- Blocks setuid/setgid binaries
- Prevents privilege escalation
Network Isolation
Network: mosaic-internal (bridge network)
The orchestrator is not exposed to the public network. It communicates only with:
- Valkey (internal)
- API (internal)
- Docker daemon (local socket)
Labels and Metadata
The container includes comprehensive labels for tracking and compliance:
LABEL org.opencontainers.image.source="https://git.mosaicstack.dev/mosaic/stack"
LABEL org.opencontainers.image.vendor="Mosaic Stack"
LABEL com.mosaic.security=hardened
LABEL com.mosaic.security.non-root=true
Runtime Security
Environment Variables
Sensitive configuration is passed via environment variables:
- CLAUDE_API_KEY: Claude API credentials
- VALKEY_URL: Cache connection string
Best Practices:
- Never commit secrets to git
- Use .env files for local development
- Use secrets management (Vault) in production
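One way to consume these variables is to validate them once at startup and fail fast if any are missing. The sketch below is a minimal example under that assumption; `OrchestratorConfig` and `loadConfig` are hypothetical names, not the orchestrator's actual API.

```typescript
// Hypothetical typed view of the two variables named in this document.
interface OrchestratorConfig {
  claudeApiKey: string;
  valkeyUrl: string;
}

// Reads required settings from an environment-like map.
// Taking the map as a parameter (rather than reading process.env directly)
// keeps the function pure and easy to test.
function loadConfig(env: Record<string, string | undefined>): OrchestratorConfig {
  const required = ["CLAUDE_API_KEY", "VALKEY_URL"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report variable names only, never values, so secrets cannot leak into logs.
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }
  return {
    claudeApiKey: env["CLAUDE_API_KEY"] as string,
    valkeyUrl: env["VALKEY_URL"] as string,
  };
}
```

At startup this would typically be called as `loadConfig(process.env)`, so a misconfigured container exits immediately instead of failing later on its first API call.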
Volume Security
Workspace Volume:
orchestrator_workspace:/workspace
Security Considerations:
- Persistent storage for git operations
- Writable by node user
- Isolated from other services
- Regular cleanup via lifecycle management
Monitoring and Logging
The orchestrator logs all operations for audit trails:
- Agent spawning/termination
- Quality gate results
- Git operations
- Killswitch activations
Log Security:
- Secrets are redacted from logs
- Logs stored in Docker volumes
- Rotation configured to prevent disk exhaustion
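Redaction of this kind can be sketched as a small filter applied to structured log fields before they are written. The key list and the `[REDACTED]` placeholder below are assumptions for illustration; only CLAUDE_API_KEY and VALKEY_URL are named in this document, and the orchestrator's actual redaction logic is not shown here.

```typescript
// Keys whose values are masked. The first two come from this document;
// "password" and "token" are illustrative additions.
const SENSITIVE_KEYS = ["CLAUDE_API_KEY", "VALKEY_URL", "password", "token"];

// Returns a copy of the log fields with sensitive values replaced.
// This is a shallow pass: nested objects are copied as-is and not inspected.
function redact(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    const sensitive = SENSITIVE_KEYS.some(
      (s) => s.toLowerCase() === key.toLowerCase()
    );
    out[key] = sensitive ? "[REDACTED]" : fields[key];
  }
  return out;
}
```

Matching on key names catches structured fields only; secrets embedded in free-text messages would need separate pattern-based scrubbing.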
Security Checklist
- Multi-stage Docker build
- Non-root user (node:node, UID 1000)
- Minimal base image (node:20-alpine)
- No unnecessary packages
- Health check in Dockerfile
- Security scan passes (0 vulnerabilities)
- Capability restrictions (drop ALL, add minimal)
- No new privileges flag
- Read-only mounts where possible
- Tmpfs with noexec/nosuid
- Network isolation
- Comprehensive labels
- Environment-based secrets
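Several of these checklist items can be verified mechanically against the compose service definition. The sketch below assumes the service has already been parsed into a plain object; `ComposeService` and `auditService` are hypothetical names, and only the fields discussed in this document are covered.

```typescript
// Subset of a docker-compose service definition, limited to fields used in this document.
interface ComposeService {
  user?: string;
  cap_drop?: string[];
  cap_add?: string[];
  security_opt?: string[];
  tmpfs?: string[];
}

// Returns human-readable findings; an empty list means every check passed.
function auditService(service: ComposeService): string[] {
  const findings: string[] = [];

  // "1000:1000" -> "1000"; a missing user means the image default applies, which is flagged.
  const uid = (service.user ?? "").split(":")[0];
  if (uid === "" || uid === "0" || uid === "root") {
    findings.push("container may run as root");
  }
  if ((service.cap_drop ?? []).indexOf("ALL") === -1) {
    findings.push("cap_drop does not include ALL");
  }
  // Only the exact spelling used in this document is recognized here.
  if ((service.security_opt ?? []).indexOf("no-new-privileges:true") === -1) {
    findings.push("no-new-privileges is not set");
  }
  const tmp = (service.tmpfs ?? []).filter((m) => m.split(":")[0] === "/tmp")[0];
  const opts = tmp ? (tmp.split(":")[1] ?? "").split(",") : [];
  if (opts.indexOf("noexec") === -1 || opts.indexOf("nosuid") === -1) {
    findings.push("/tmp tmpfs lacks noexec/nosuid");
  }
  return findings;
}

// The settings shown earlier in this document, expressed as an object.
const documented: ComposeService = {
  user: "1000:1000",
  cap_drop: ["ALL"],
  cap_add: ["NET_BIND_SERVICE"],
  security_opt: ["no-new-privileges:true"],
  tmpfs: ["/tmp:noexec,nosuid,size=100m"],
};
```

Running `auditService(documented)` yields no findings, while an empty service definition trips all four checks. A check like this could run in CI so that configuration drift is caught before deployment.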
Known Limitations
Docker Socket Access
The orchestrator requires access to the Docker socket (/var/run/docker.sock) to spawn agent containers.
Risk:
- Docker socket access provides root-equivalent privileges
- Compromised orchestrator could spawn malicious containers
Mitigations:
- Non-root user: Limits socket abuse
- Capability restrictions: Prevents privilege escalation
- Killswitch: Emergency stop for all agents
- Monitoring: Audit logs track all Docker operations
- Network isolation: Orchestrator not exposed publicly
Future Improvements:
- Consider Docker-in-Docker (DinD) for better isolation
- Implement Docker socket proxy with ACLs
- Evaluate Kubernetes Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25)
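To make the socket proxy idea more concrete, the sketch below shows the kind of allowlist check such a proxy might apply to each request before forwarding it to the Docker daemon. The rule set is hypothetical and chosen for illustration; the paths follow the Docker Engine API, but which calls the orchestrator actually needs is an assumption.

```typescript
interface Rule {
  method: string;
  pattern: RegExp;
}

// Hypothetical allowlist: the container lifecycle calls an agent spawner would plausibly need.
const ALLOWED: Rule[] = [
  { method: "GET", pattern: /^\/containers\/json$/ },
  { method: "POST", pattern: /^\/containers\/create$/ },
  { method: "POST", pattern: /^\/containers\/[^/]+\/(start|stop|kill)$/ },
  { method: "DELETE", pattern: /^\/containers\/[^/]+$/ },
];

// Returns true when the request matches an allowlisted method and path.
function isAllowed(method: string, path: string): boolean {
  // Drop the query string and an optional API version prefix such as /v1.43.
  const bare = path.replace(/\?.*$/, "").replace(/^\/v\d+\.\d+/, "");
  return ALLOWED.some(
    (rule) => rule.method === method.toUpperCase() && rule.pattern.test(bare)
  );
}
```

A path-level allowlist alone is not sufficient: a permitted `POST /containers/create` can still request privileged mode or host bind mounts, so a real proxy would also need to inspect the request body.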
Workspace Writes
The workspace volume must be writable for git operations.
Risk:
- Code execution via malicious git hooks
- Data exfiltration via commit/push
Mitigations:
- Isolated volume: Workspace not shared with other services
- Non-root user: Limits blast radius
- Quality gates: Code review before commit
- Secret scanning: git-secrets helps prevent credential leaks
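Pattern-based secret scanning can be illustrated with a short sketch. The two rules below are simplified examples and not the rule set that git-secrets ships; `scanForSecrets` and the `Finding` type are hypothetical names.

```typescript
interface Finding {
  line: number; // 1-based line number
  rule: string;
}

// Illustrative patterns only; a production scanner uses a much larger rule set.
const RULES: { name: string; pattern: RegExp }[] = [
  { name: "aws-access-key-id", pattern: /AKIA[0-9A-Z]{16}/ },
  { name: "private-key-header", pattern: /-----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----/ },
];

// Scans text line by line and records every rule that matches.
function scanForSecrets(content: string): Finding[] {
  const findings: Finding[] = [];
  content.split("\n").forEach((text, index) => {
    for (const rule of RULES) {
      if (rule.pattern.test(text)) {
        findings.push({ line: index + 1, rule: rule.name });
      }
    }
  });
  return findings;
}
```

A caller should treat a file that cannot be read as a scan failure rather than a clean result, otherwise unreadable files silently pass the gate.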
Compliance
This security configuration aligns with:
- CIS Docker Benchmark: Passes all applicable controls
- OWASP Container Security: Follows best practices
- NIST SP 800-190: Application Container Security Guide
Security Audits
Last Security Scan: 2026-02-02
Tool: Trivy v0.69
Results: 0 vulnerabilities (HIGH/CRITICAL)
Recommended Scan Frequency:
- Weekly automated scans
- On-demand before production deployments
- After base image updates
Reporting Security Issues
If you discover a security vulnerability, please report it to:
- Email: security@mosaicstack.dev
- Issue Tracker: Use the "security" label (private issues only)
Do NOT:
- Open public issues for security vulnerabilities
- Disclose vulnerabilities before patch is available
Document Version: 1.0
Last Updated: 2026-02-02
Maintained By: Mosaic Security Team