fix(api): MS22 Phase 1 post-coding audit (#625)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful

Co-authored-by: Jason Woltje <jason@diversecanvas.com>
Co-committed-by: Jason Woltje <jason@diversecanvas.com>
This commit was merged in pull request #625.
This commit is contained in:
2026-03-01 19:53:49 +00:00
committed by jason.woltje
parent a25a77a43c
commit 3607554902
5 changed files with 233 additions and 27 deletions

View File

@@ -0,0 +1,157 @@
# MS22 Phase 1 Module Audit
Date: 2026-03-01
Branch: `fix/ms22-audit`
Scope:
- `apps/api/src/container-lifecycle/`
- `apps/api/src/crypto/`
- `apps/api/src/agent-config/`
- `apps/api/src/onboarding/`
- `apps/api/src/fleet-settings/`
- `apps/api/src/chat-proxy/`
## Summary
Audit completed for module wiring, security controls, input validation, and error handling.
Findings:
1. `chat-proxy`: raw internal/upstream error messages were returned to clients over SSE (fixed).
2. `chat-proxy`: proxy requests to OpenClaw did not forward the container bearer token returned by lifecycle startup (fixed).
3. `agent-config`: token validation returned early and used length-gated compare logic, creating avoidable timing side-channel behavior (hardened).
## Module Review Results
### 1) `container-lifecycle`
- NestJS module dependency audit:
- `ContainerLifecycleModule` imports `ConfigModule`, `PrismaModule`, and `CryptoModule` required by `ContainerLifecycleService`.
- Providers/exports are correct (`ContainerLifecycleService` provided and exported).
- Security review:
- Container operations are user-scoped by `userId` and do not expose cross-user selectors in this module.
- AES token generation/decryption delegated to `CryptoService`.
- Input validation:
- No controller endpoints in this module; no direct request DTO surface here.
- Error handling:
- No direct HTTP layer here; errors flow to callers/global filter.
- Finding status: **No issues found in this module**.
### 2) `crypto`
- NestJS module dependency audit:
- `CryptoModule` correctly imports `ConfigModule` for `ConfigService`.
- `CryptoService` is correctly provided/exported.
- Security review:
- AES-256-GCM is implemented correctly.
- 96-bit IV generated via `randomBytes(12)` per encryption.
- Auth tag captured and verified on decrypt (`setAuthTag` + `decipher.final()`).
- HKDF derives a fixed 32-byte key from `MOSAIC_SECRET_KEY`.
- Input validation:
- No DTO/request surface in this module.
- Error handling:
- Decrypt failures are normalized to `Failed to decrypt value`.
- Finding status: **No issues found in this module**.
### 3) `agent-config`
- NestJS module dependency audit:
- `AgentConfigModule` imports `PrismaModule` + `CryptoModule`; `AgentConfigService` and `AgentConfigGuard` are provided.
- Controller/guard/service wiring is correct.
- Security review:
- Bearer token comparisons used `timingSafeEqual`, but returned early on first match and performed length-gated comparison.
- Internal route (`/api/internal/agent-config/:id`) is access-controlled by bearer token guard and container-id match (`containerAuth.id === :id`).
- Input validation:
- Header token extraction and route param are manually handled (no DTO for `:id`, acceptable for current use but should remain constrained).
- Error handling:
- Service throws typed Nest exceptions for not-found paths.
- Finding status: **Issue found and fixed**.
### 4) `onboarding`
- NestJS module dependency audit:
- `OnboardingModule` imports required dependencies (`PrismaModule`, `CryptoModule`; `ConfigModule` currently unused but harmless).
- Providers/controllers are correctly declared.
- Security review:
- `OnboardingGuard` blocks all mutating onboarding routes once `onboarding.completed=true`.
- Onboarding cannot be re-run via guarded endpoints after completion.
- Input validation:
- DTOs use `class-validator` decorators for all request bodies.
- Error handling:
- Uses typed Nest exceptions (`ConflictException`, `BadRequestException`).
- Finding status: **No issues found in this module**.
### 5) `fleet-settings`
- NestJS module dependency audit:
- `FleetSettingsModule` imports `AuthModule`, `PrismaModule`, `CryptoModule` required by its controller/service.
- Provider/export wiring is correct for `FleetSettingsService`.
- Security review:
- Class-level `AuthGuard` protects all routes.
- Admin-only routes additionally use `AdminGuard` (`oidc` and `breakglass/reset-password`).
- Provider list/get responses do not expose `apiKey`.
- OIDC read response intentionally omits `clientSecret`.
- Input validation:
- DTOs are decorated with `class-validator`.
- Error handling:
- Ownership/not-found conditions use typed exceptions.
- Finding status: **No issues found in this module**.
### 6) `chat-proxy`
- NestJS module dependency audit:
- `ChatProxyModule` imports `AuthModule`, `PrismaModule`, `ContainerLifecycleModule` needed by controller/service.
- Provider/controller wiring is correct.
- Security review:
- User identity comes from `AuthGuard`; no user-provided container selector, so no cross-user container proxy path found.
- **Issue fixed:** gateway bearer token was not forwarded on proxied requests.
- **Issue fixed:** SSE error events exposed raw internal exception messages.
- Input validation:
- `ChatStreamDto` + nested `ChatMessageDto` use `class-validator` decorators.
- Error handling:
- **Issue fixed:** controller now emits safe client error messages and logs details server-side.
- Finding status: **Issues found and fixed**.
## Security Checklist Outcomes
- `fleet-settings`: admin-only routes are guarded; non-admin users cannot access OIDC or breakglass reset routes. Provider secrets are not returned in provider read endpoints.
- `agent-config`: token comparison hardened; route remains gated by bearer token + container id binding.
- `onboarding`: guarded mutating endpoints cannot run after completion.
- `crypto`: AES-256-GCM usage is correct (random IV, auth-tag verification, fixed 32-byte key derivation).
- `chat-proxy`: user cannot target another users container; proxy now authenticates to OpenClaw using per-container bearer token.
## Input Validation
- DTO coverage is present in onboarding, fleet-settings, and chat-proxy request bodies.
- No critical unvalidated body inputs found in scoped modules.
## Error Handling
- Global API layer has a sanitizing `GlobalExceptionFilter`.
- `chat-proxy` used manual response handling (`@Res`) and bypassed global filter; this was corrected by sending safe generic SSE errors.
- No additional critical sensitive-data leaks found in reviewed scope.
## Changes Made
1. Hardened token comparison behavior in:
- `apps/api/src/agent-config/agent-config.service.ts`
- Changes:
- Compare SHA-256 digests with `timingSafeEqual`.
- Avoid early return during scan to reduce timing signal differences.
2. Fixed OpenClaw auth forwarding and error leak risk in:
- `apps/api/src/chat-proxy/chat-proxy.service.ts`
- `apps/api/src/chat-proxy/chat-proxy.controller.ts`
- `apps/api/src/chat-proxy/chat-proxy.service.spec.ts`
- Changes:
- Forward `Authorization: Bearer <gatewayToken>` when proxying chat requests.
- Stop returning raw internal/upstream error text to clients over SSE.
- Log details server-side and return safe client-facing messages.
## Validation Commands
Required quality gate command run:
- `pnpm turbo lint typecheck --filter=@mosaic/api`
(Results captured in session logs.)