Files
stack/docs/audits/ms22-phase1-audit.md
Jason Woltje 3607554902
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
fix(api): MS22 Phase 1 post-coding audit (#625)
Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
2026-03-01 19:53:49 +00:00

6.9 KiB
Raw Blame History

MS22 Phase 1 Module Audit

Date: 2026-03-01 Branch: fix/ms22-audit Scope:

  • apps/api/src/container-lifecycle/
  • apps/api/src/crypto/
  • apps/api/src/agent-config/
  • apps/api/src/onboarding/
  • apps/api/src/fleet-settings/
  • apps/api/src/chat-proxy/

Summary

Audit completed for module wiring, security controls, input validation, and error handling.

Findings:

  1. chat-proxy: raw internal/upstream error messages were returned to clients over SSE (fixed).
  2. chat-proxy: proxy requests to OpenClaw did not forward the container bearer token returned by lifecycle startup (fixed).
  3. agent-config: token validation returned early and used length-gated compare logic, creating avoidable timing side-channel behavior (hardened).

Module Review Results

1) container-lifecycle

  • NestJS module dependency audit:
    • ContainerLifecycleModule imports ConfigModule, PrismaModule, and CryptoModule required by ContainerLifecycleService.
    • Providers/exports are correct (ContainerLifecycleService provided and exported).
  • Security review:
    • Container operations are user-scoped by userId and do not expose cross-user selectors in this module.
    • AES token generation/decryption delegated to CryptoService.
  • Input validation:
    • No controller endpoints in this module; no direct request DTO surface here.
  • Error handling:
    • No direct HTTP layer here; errors flow to callers/global filter.
  • Finding status: No issues found in this module.

2) crypto

  • NestJS module dependency audit:
    • CryptoModule correctly imports ConfigModule for ConfigService.
    • CryptoService is correctly provided/exported.
  • Security review:
    • AES-256-GCM is implemented correctly.
    • 96-bit IV generated via randomBytes(12) per encryption.
    • Auth tag captured and verified on decrypt (setAuthTag + decipher.final()).
    • HKDF derives a fixed 32-byte key from MOSAIC_SECRET_KEY.
  • Input validation:
    • No DTO/request surface in this module.
  • Error handling:
    • Decrypt failures are normalized to Failed to decrypt value.
  • Finding status: No issues found in this module.

3) agent-config

  • NestJS module dependency audit:
    • AgentConfigModule imports PrismaModule + CryptoModule; AgentConfigService and AgentConfigGuard are provided.
    • Controller/guard/service wiring is correct.
  • Security review:
    • Bearer token comparisons used timingSafeEqual, but returned early on first match and performed length-gated comparison.
    • Internal route (/api/internal/agent-config/:id) is access-controlled by bearer token guard and container-id match (containerAuth.id === :id).
  • Input validation:
    • Header token extraction and route param are manually handled (no DTO for :id, acceptable for current use but should remain constrained).
  • Error handling:
    • Service throws typed Nest exceptions for not-found paths.
  • Finding status: Issue found and fixed.

4) onboarding

  • NestJS module dependency audit:
    • OnboardingModule imports required dependencies (PrismaModule, CryptoModule; ConfigModule currently unused but harmless).
    • Providers/controllers are correctly declared.
  • Security review:
    • OnboardingGuard blocks all mutating onboarding routes once onboarding.completed=true.
    • Onboarding cannot be re-run via guarded endpoints after completion.
  • Input validation:
    • DTOs use class-validator decorators for all request bodies.
  • Error handling:
    • Uses typed Nest exceptions (ConflictException, BadRequestException).
  • Finding status: No issues found in this module.

5) fleet-settings

  • NestJS module dependency audit:
    • FleetSettingsModule imports AuthModule, PrismaModule, CryptoModule required by its controller/service.
    • Provider/export wiring is correct for FleetSettingsService.
  • Security review:
    • Class-level AuthGuard protects all routes.
    • Admin-only routes additionally use AdminGuard (oidc and breakglass/reset-password).
    • Provider list/get responses do not expose apiKey.
    • OIDC read response intentionally omits clientSecret.
  • Input validation:
    • DTOs are decorated with class-validator.
  • Error handling:
    • Ownership/not-found conditions use typed exceptions.
  • Finding status: No issues found in this module.

6) chat-proxy

  • NestJS module dependency audit:
    • ChatProxyModule imports AuthModule, PrismaModule, ContainerLifecycleModule needed by controller/service.
    • Provider/controller wiring is correct.
  • Security review:
    • User identity comes from AuthGuard; no user-provided container selector, so no cross-user container proxy path found.
    • Issue fixed: gateway bearer token was not forwarded on proxied requests.
    • Issue fixed: SSE error events exposed raw internal exception messages.
  • Input validation:
    • ChatStreamDto + nested ChatMessageDto use class-validator decorators.
  • Error handling:
    • Issue fixed: controller now emits safe client error messages and logs details server-side.
  • Finding status: Issues found and fixed.

Security Checklist Outcomes

  • fleet-settings: admin-only routes are guarded; non-admin users cannot access OIDC or breakglass reset routes. Provider secrets are not returned in provider read endpoints.
  • agent-config: token comparison hardened; route remains gated by bearer token + container id binding.
  • onboarding: guarded mutating endpoints cannot run after completion.
  • crypto: AES-256-GCM usage is correct (random IV, auth-tag verification, fixed 32-byte key derivation).
  • chat-proxy: user cannot target another users container; proxy now authenticates to OpenClaw using per-container bearer token.

Input Validation

  • DTO coverage is present in onboarding, fleet-settings, and chat-proxy request bodies.
  • No critical unvalidated body inputs found in scoped modules.

Error Handling

  • Global API layer has a sanitizing GlobalExceptionFilter.
  • chat-proxy used manual response handling (@Res) and bypassed global filter; this was corrected by sending safe generic SSE errors.
  • No additional critical sensitive-data leaks found in reviewed scope.

Changes Made

  1. Hardened token comparison behavior in:

    • apps/api/src/agent-config/agent-config.service.ts
    • Changes:
      • Compare SHA-256 digests with timingSafeEqual.
      • Avoid early return during scan to reduce timing signal differences.
  2. Fixed OpenClaw auth forwarding and error leak risk in:

    • apps/api/src/chat-proxy/chat-proxy.service.ts
    • apps/api/src/chat-proxy/chat-proxy.controller.ts
    • apps/api/src/chat-proxy/chat-proxy.service.spec.ts
    • Changes:
      • Forward Authorization: Bearer <gatewayToken> when proxying chat requests.
      • Stop returning raw internal/upstream error text to clients over SSE.
      • Log details server-side and return safe client-facing messages.

Validation Commands

Required quality gate command run:

  • pnpm turbo lint typecheck --filter=@mosaic/api

(Results captured in session logs.)