Files
stack/docs/reports/codebase-review-2026-02-05/00-executive-summary.md
Jason Woltje 9dfbf8cf61
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
chore: Remove pre-created task files, add review reports
- Delete docs/tasks.md (let orchestrator bootstrap from scratch)
- Delete docs/claude/task-tracking.md (superseded by universal guide)
- Add codebase review reports for orchestrator to parse

Tests orchestrator's autonomous bootstrap capability.
2026-02-05 15:08:02 -06:00

7.4 KiB

Mosaic Stack - Full Codebase Review: Executive Summary

Date: 2026-02-05 Scope: Full codebase (apps/api, apps/web, apps/orchestrator, apps/coordinator, packages/*) Method: 7 parallel review agents covering security, code quality, and QA/test coverage


At a Glance

Dimension Findings Critical High Medium Low
Security - API 28 4 7 13 4
Security - Web 37 2 9 14 12
Security - Orch/Coord 30 6 9 12 3
Code Quality - API 7 3 2 2 0
Code Quality - Web 12 5 2 3 2
Code Quality - Orchestrator 10 3 3 3 1
Totals 124 23 32 47 22

QA/Test Coverage: Separate grading per workspace (see report 03).


Top 10 Most Urgent Findings

These are the findings that pose the greatest risk and should be addressed first:

1. No Authentication on Orchestrator API (SEC-ORCH-2)

Severity: CRITICAL | The entire orchestrator API (spawn, kill, kill-all, status) has zero authentication. Any network client can drain API credits or kill all agents.

2. RLS Context Never Applied in Service Layer (SEC-API-4)

Severity: CRITICAL | The RLS infrastructure exists (withWorkspaceContext, setWorkspaceContext) but no service actually uses it. Tenant isolation relies entirely on application-level WHERE clauses.

3. WikiLinkRenderer Stored XSS (SEC-WEB-2)

Severity: CRITICAL | WikiLinkRenderer passes HTML through dangerouslySetInnerHTML. DOMPurify only sanitizes wiki-link display text; surrounding HTML is rendered raw. Stored XSS.

4. Secret Scanner Returns "Clean" on Error (SEC-ORCH-1)

Severity: CRITICAL | When scanFile() fails for any reason, it returns { hasSecrets: false }. A file that couldn't be read is indistinguishable from a clean file.

5. Docker Sandbox Disabled by Default (SEC-ORCH-3)

Severity: CRITICAL | Agents run directly on the host without container isolation unless SANDBOX_ENABLED=true is explicitly set.

6. Unauthenticated Inter-Service Communication (SEC-ORCH-4)

Severity: CRITICAL | Orchestrator communicates with coordinator over plain HTTP with no authentication. Quality gate responses can be spoofed.

7. Guards Swallow DB Errors as Auth Failures (SEC-API-2, SEC-API-3)

Severity: CRITICAL | WorkspaceGuard and PermissionGuard catch all database errors and return false/null, converting infrastructure failures into misleading "access denied" responses.

8. OIDC Config Silently Degrades to Empty Strings (SEC-API-1)

Severity: CRITICAL | Missing OIDC env vars default to empty strings instead of failing fast. The discovery URL becomes a relative path.

9. Redis KEYS Command in Production (CQ-ORCH-3, SEC-ORCH-5)

Severity: CRITICAL | listTasks() and listAgents() use the blocking KEYS command. This can freeze the entire Redis instance under load.

10. OAuth Callback Open Redirect (SEC-WEB-1)

Severity: CRITICAL | The OAuth error parameter is passed unsanitized into router.push(), enabling phishing attacks.


Summary by Workspace

apps/api (NestJS)

  • Security: Strong Prisma query safety, timing-safe comparisons, good guard chain. Weak: RLS never applied, guards swallow DB errors, OIDC silent degradation, admin guard conflates workspace ownership with system admin.
  • Code Quality: Good architecture overall. Memory leaks in WebSocket gateway and runner-jobs. Hardcoded TODO values in federation OIDC.
  • Test Grade: B- (2,174 tests, but 21 untested services including the 916-line knowledge.service.ts)

apps/web (Next.js)

  • Security: API client has good CSRF handling, but multiple components bypass it with raw fetch(). WikiLinkRenderer XSS. Mock data shipped in production paths. Dual auth mechanisms.
  • Code Quality: Stale closures in hooks, boolean logic bug in mindmap editor, race conditions in autocomplete, no React.memo usage, no CSP headers.
  • Test Grade: C+ (555 tests, 76 untested components/pages, 23 skipped tests)

apps/orchestrator (NestJS)

  • Security: No API authentication, no rate limiting, sandbox disabled by default, unauthenticated inter-service communication, YOLO mode bypasses all quality gates.
  • Code Quality: Unbounded session Map, KEYS command, TOCTOU race conditions in state transitions, N+1 queries, force-remove Docker containers.
  • Test Grade: A (452 tests, near-complete coverage, well-structured)

apps/coordinator (Python/FastAPI)

  • Security: Prompt injection via issue body, queue file corruption silently discarded, orchestration loops swallow all exceptions, Mock import in production code.
  • Test Grade: D (504 tests exist but only 16% line coverage reported -- likely test configuration issue)

packages/*

  • shared: Adequate (1 test file, covers utilities)
  • ui: 9 of 10 components untested (Grade: D+)
  • config/skills: No tests needed

Remediation Approach

Findings are organized into three reports for detailed remediation planning:

Report File Contents
Security Review 01-security-review.md All 95 security findings across API, Web, Orch/Coord
Code Quality Review 02-code-quality-review.md All 29 code quality findings
QA & Test Coverage 03-qa-test-coverage.md Test gaps, anti-patterns, coverage grades

Suggested Remediation Phases

Phase 1 - Critical Security (1-2 days)

  • Add authentication to orchestrator API
  • Fix WikiLinkRenderer XSS
  • Fix secret scanner error handling
  • Fix guards swallowing DB errors
  • Validate OIDC configuration at startup

Phase 2 - High Security + Infrastructure (2-3 days)

  • Enable Docker sandbox by default
  • Add inter-service authentication
  • Replace KEYS with SCAN in Valkey
  • Sanitize OAuth callback parameters
  • Route all fetch() through API client (CSRF)
  • Remove/gate mock data in production paths

Phase 3 - Code Quality + Memory Leaks (1-2 days)

  • Fix memory leaks (WebSocket, runner-jobs, Valkey listeners, session Map)
  • Fix stale closures in React hooks
  • Fix boolean logic bug in mindmap editor
  • Add runtime validation for deserialized data
  • Fix N+1 queries

Phase 4 - Test Coverage (2-3 days)

  • Investigate coordinator 16% coverage
  • Add tests for knowledge.service.ts, admin.guard.ts, embeddings.service.ts
  • Re-enable 23 skipped widget tests
  • Add tests for untested UI components
  • Replace placeholder assertions

Positive Observations

The codebase demonstrates strong engineering in several areas:

  • Prisma parameterized queries throughout (no raw SQL injection)
  • Timing-safe API key comparison
  • Federation private keys encrypted at rest (AES-256-GCM)
  • Comprehensive zip bomb protection
  • Guard chain consistency (Auth -> Workspace -> Permission)
  • Orchestrator test coverage is excellent (A grade)
  • State machine validation prevents invalid agent transitions
  • Webhook signature verification uses hmac.compare_digest()
  • Good separation of concerns across NestJS modules