All checks were successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>
158 lines
6.9 KiB
Markdown
158 lines
6.9 KiB
Markdown
# MS22 Phase 1 Module Audit
|
||
|
||
Date: 2026-03-01
|
||
Branch: `fix/ms22-audit`
|
||
Scope:
|
||
|
||
- `apps/api/src/container-lifecycle/`
|
||
- `apps/api/src/crypto/`
|
||
- `apps/api/src/agent-config/`
|
||
- `apps/api/src/onboarding/`
|
||
- `apps/api/src/fleet-settings/`
|
||
- `apps/api/src/chat-proxy/`
|
||
|
||
## Summary
|
||
|
||
Audit completed for module wiring, security controls, input validation, and error handling.
|
||
|
||
Findings:
|
||
|
||
1. `chat-proxy`: raw internal/upstream error messages were returned to clients over SSE (fixed).
|
||
2. `chat-proxy`: proxy requests to OpenClaw did not forward the container bearer token returned by lifecycle startup (fixed).
|
||
3. `agent-config`: token validation returned early and used length-gated compare logic, creating avoidable timing side-channel behavior (hardened).
|
||
|
||
## Module Review Results
|
||
|
||
### 1) `container-lifecycle`
|
||
|
||
- NestJS module dependency audit:
|
||
- `ContainerLifecycleModule` imports `ConfigModule`, `PrismaModule`, and `CryptoModule` required by `ContainerLifecycleService`.
|
||
- Providers/exports are correct (`ContainerLifecycleService` provided and exported).
|
||
- Security review:
|
||
- Container operations are user-scoped by `userId` and do not expose cross-user selectors in this module.
|
||
- AES token generation/decryption delegated to `CryptoService`.
|
||
- Input validation:
|
||
- No controller endpoints in this module; no direct request DTO surface here.
|
||
- Error handling:
|
||
- No direct HTTP layer here; errors flow to callers/global filter.
|
||
- Finding status: **No issues found in this module**.
|
||
|
||
### 2) `crypto`
|
||
|
||
- NestJS module dependency audit:
|
||
- `CryptoModule` correctly imports `ConfigModule` for `ConfigService`.
|
||
- `CryptoService` is correctly provided/exported.
|
||
- Security review:
|
||
- AES-256-GCM is implemented correctly.
|
||
- 96-bit IV generated via `randomBytes(12)` per encryption.
|
||
- Auth tag captured and verified on decrypt (`setAuthTag` + `decipher.final()`).
|
||
- HKDF derives a fixed 32-byte key from `MOSAIC_SECRET_KEY`.
|
||
- Input validation:
|
||
- No DTO/request surface in this module.
|
||
- Error handling:
|
||
- Decrypt failures are normalized to `Failed to decrypt value`.
|
||
- Finding status: **No issues found in this module**.
|
||
|
||
### 3) `agent-config`
|
||
|
||
- NestJS module dependency audit:
|
||
- `AgentConfigModule` imports `PrismaModule` + `CryptoModule`; `AgentConfigService` and `AgentConfigGuard` are provided.
|
||
- Controller/guard/service wiring is correct.
|
||
- Security review:
|
||
- Bearer token comparisons used `timingSafeEqual`, but returned early on first match and performed length-gated comparison.
|
||
- Internal route (`/api/internal/agent-config/:id`) is access-controlled by bearer token guard and container-id match (`containerAuth.id === :id`).
|
||
- Input validation:
|
||
- Header token extraction and route param are manually handled (no DTO for `:id`, acceptable for current use but should remain constrained).
|
||
- Error handling:
|
||
- Service throws typed Nest exceptions for not-found paths.
|
||
- Finding status: **Issue found and fixed**.
|
||
|
||
### 4) `onboarding`
|
||
|
||
- NestJS module dependency audit:
|
||
- `OnboardingModule` imports required dependencies (`PrismaModule`, `CryptoModule`; `ConfigModule` currently unused but harmless).
|
||
- Providers/controllers are correctly declared.
|
||
- Security review:
|
||
- `OnboardingGuard` blocks all mutating onboarding routes once `onboarding.completed=true`.
|
||
- Onboarding cannot be re-run via guarded endpoints after completion.
|
||
- Input validation:
|
||
- DTOs use `class-validator` decorators for all request bodies.
|
||
- Error handling:
|
||
- Uses typed Nest exceptions (`ConflictException`, `BadRequestException`).
|
||
- Finding status: **No issues found in this module**.
|
||
|
||
### 5) `fleet-settings`
|
||
|
||
- NestJS module dependency audit:
|
||
- `FleetSettingsModule` imports `AuthModule`, `PrismaModule`, `CryptoModule` required by its controller/service.
|
||
- Provider/export wiring is correct for `FleetSettingsService`.
|
||
- Security review:
|
||
- Class-level `AuthGuard` protects all routes.
|
||
- Admin-only routes additionally use `AdminGuard` (`oidc` and `breakglass/reset-password`).
|
||
- Provider list/get responses do not expose `apiKey`.
|
||
- OIDC read response intentionally omits `clientSecret`.
|
||
- Input validation:
|
||
- DTOs are decorated with `class-validator`.
|
||
- Error handling:
|
||
- Ownership/not-found conditions use typed exceptions.
|
||
- Finding status: **No issues found in this module**.
|
||
|
||
### 6) `chat-proxy`
|
||
|
||
- NestJS module dependency audit:
|
||
- `ChatProxyModule` imports `AuthModule`, `PrismaModule`, `ContainerLifecycleModule` needed by controller/service.
|
||
- Provider/controller wiring is correct.
|
||
- Security review:
|
||
- User identity comes from `AuthGuard`; no user-provided container selector, so no cross-user container proxy path found.
|
||
- **Issue fixed:** gateway bearer token was not forwarded on proxied requests.
|
||
- **Issue fixed:** SSE error events exposed raw internal exception messages.
|
||
- Input validation:
|
||
- `ChatStreamDto` + nested `ChatMessageDto` use `class-validator` decorators.
|
||
- Error handling:
|
||
- **Issue fixed:** controller now emits safe client error messages and logs details server-side.
|
||
- Finding status: **Issues found and fixed**.
|
||
|
||
## Security Checklist Outcomes
|
||
|
||
- `fleet-settings`: admin-only routes are guarded; non-admin users cannot access OIDC or breakglass reset routes. Provider secrets are not returned in provider read endpoints.
|
||
- `agent-config`: token comparison hardened; route remains gated by bearer token + container id binding.
|
||
- `onboarding`: guarded mutating endpoints cannot run after completion.
|
||
- `crypto`: AES-256-GCM usage is correct (random IV, auth-tag verification, fixed 32-byte key derivation).
|
||
- `chat-proxy`: user cannot target another user’s container; proxy now authenticates to OpenClaw using per-container bearer token.
|
||
|
||
## Input Validation
|
||
|
||
- DTO coverage is present in onboarding, fleet-settings, and chat-proxy request bodies.
|
||
- No critical unvalidated body inputs found in scoped modules.
|
||
|
||
## Error Handling
|
||
|
||
- Global API layer has a sanitizing `GlobalExceptionFilter`.
|
||
- `chat-proxy` used manual response handling (`@Res`) and bypassed global filter; this was corrected by sending safe generic SSE errors.
|
||
- No additional critical sensitive-data leaks found in reviewed scope.
|
||
|
||
## Changes Made
|
||
|
||
1. Hardened token comparison behavior in:
|
||
- `apps/api/src/agent-config/agent-config.service.ts`
|
||
- Changes:
|
||
- Compare SHA-256 digests with `timingSafeEqual`.
|
||
- Avoid early return during scan to reduce timing signal differences.
|
||
|
||
2. Fixed OpenClaw auth forwarding and error leak risk in:
|
||
- `apps/api/src/chat-proxy/chat-proxy.service.ts`
|
||
- `apps/api/src/chat-proxy/chat-proxy.controller.ts`
|
||
- `apps/api/src/chat-proxy/chat-proxy.service.spec.ts`
|
||
- Changes:
|
||
- Forward `Authorization: Bearer <gatewayToken>` when proxying chat requests.
|
||
- Stop returning raw internal/upstream error text to clients over SSE.
|
||
- Log details server-side and return safe client-facing messages.
|
||
|
||
## Validation Commands
|
||
|
||
Required quality gate command run:
|
||
|
||
- `pnpm turbo lint typecheck --filter=@mosaic/api`
|
||
|
||
(Results captured in session logs.)
|