- Delete docs/tasks.md (let orchestrator bootstrap from scratch) - Delete docs/claude/task-tracking.md (superseded by universal guide) - Add codebase review reports for orchestrator to parse Tests orchestrator's autonomous bootstrap capability.
7.4 KiB
Mosaic Stack - Full Codebase Review: Executive Summary
Date: 2026-02-05 Scope: Full codebase (apps/api, apps/web, apps/orchestrator, apps/coordinator, packages/*) Method: 7 parallel review agents covering security, code quality, and QA/test coverage
At a Glance
| Dimension | Findings | Critical | High | Medium | Low |
|---|---|---|---|---|---|
| Security - API | 28 | 4 | 7 | 13 | 4 |
| Security - Web | 37 | 2 | 9 | 14 | 12 |
| Security - Orch/Coord | 30 | 6 | 9 | 12 | 3 |
| Code Quality - API | 7 | 3 | 2 | 2 | 0 |
| Code Quality - Web | 12 | 5 | 2 | 3 | 2 |
| Code Quality - Orchestrator | 10 | 3 | 3 | 3 | 1 |
| Totals | 124 | 23 | 32 | 47 | 22 |
QA/Test Coverage: Separate grading per workspace (see report 03).
Top 10 Most Urgent Findings
These are the findings that pose the greatest risk and should be addressed first:
1. No Authentication on Orchestrator API (SEC-ORCH-2)
Severity: CRITICAL | The entire orchestrator API (spawn, kill, kill-all, status) has zero authentication. Any network client can drain API credits or kill all agents.
2. RLS Context Never Applied in Service Layer (SEC-API-4)
Severity: CRITICAL | The RLS infrastructure exists (withWorkspaceContext, setWorkspaceContext) but no service actually uses it. Tenant isolation relies entirely on application-level WHERE clauses.
3. WikiLinkRenderer Stored XSS (SEC-WEB-2)
Severity: CRITICAL | WikiLinkRenderer passes HTML through dangerouslySetInnerHTML. DOMPurify only sanitizes wiki-link display text; surrounding HTML is rendered raw. Stored XSS.
4. Secret Scanner Returns "Clean" on Error (SEC-ORCH-1)
Severity: CRITICAL | When scanFile() fails for any reason, it returns { hasSecrets: false }. A file that couldn't be read is indistinguishable from a clean file.
5. Docker Sandbox Disabled by Default (SEC-ORCH-3)
Severity: CRITICAL | Agents run directly on the host without container isolation unless SANDBOX_ENABLED=true is explicitly set.
6. Unauthenticated Inter-Service Communication (SEC-ORCH-4)
Severity: CRITICAL | Orchestrator communicates with coordinator over plain HTTP with no authentication. Quality gate responses can be spoofed.
7. Guards Swallow DB Errors as Auth Failures (SEC-API-2, SEC-API-3)
Severity: CRITICAL | WorkspaceGuard and PermissionGuard catch all database errors and return false/null, converting infrastructure failures into misleading "access denied" responses.
8. OIDC Config Silently Degrades to Empty Strings (SEC-API-1)
Severity: CRITICAL | Missing OIDC env vars default to empty strings instead of failing fast. The discovery URL becomes a relative path.
9. Redis KEYS Command in Production (CQ-ORCH-3, SEC-ORCH-5)
Severity: CRITICAL | listTasks() and listAgents() use the blocking KEYS command. This can freeze the entire Redis instance under load.
10. OAuth Callback Open Redirect (SEC-WEB-1)
Severity: CRITICAL | The OAuth error parameter is passed unsanitized into router.push(), enabling phishing attacks.
Summary by Workspace
apps/api (NestJS)
- Security: Strong Prisma query safety, timing-safe comparisons, good guard chain. Weak: RLS never applied, guards swallow DB errors, OIDC silent degradation, admin guard conflates workspace ownership with system admin.
- Code Quality: Good architecture overall. Memory leaks in WebSocket gateway and runner-jobs. Hardcoded TODO values in federation OIDC.
- Test Grade: B- (2,174 tests, but 21 untested services including the 916-line knowledge.service.ts)
apps/web (Next.js)
- Security: API client has good CSRF handling, but multiple components bypass it with raw fetch(). WikiLinkRenderer XSS. Mock data shipped in production paths. Dual auth mechanisms.
- Code Quality: Stale closures in hooks, boolean logic bug in mindmap editor, race conditions in autocomplete, no React.memo usage, no CSP headers.
- Test Grade: C+ (555 tests, 76 untested components/pages, 23 skipped tests)
apps/orchestrator (NestJS)
- Security: No API authentication, no rate limiting, sandbox disabled by default, unauthenticated inter-service communication, YOLO mode bypasses all quality gates.
- Code Quality: Unbounded session Map, KEYS command, TOCTOU race conditions in state transitions, N+1 queries, force-remove Docker containers.
- Test Grade: A (452 tests, near-complete coverage, well-structured)
apps/coordinator (Python/FastAPI)
- Security: Prompt injection via issue body, queue file corruption silently discarded, orchestration loops swallow all exceptions, Mock import in production code.
- Test Grade: D (504 tests exist but only 16% line coverage reported -- likely test configuration issue)
packages/*
- shared: Adequate (1 test file, covers utilities)
- ui: 9 of 10 components untested (Grade: D+)
- config/skills: No tests needed
Remediation Approach
Findings are organized into three reports for detailed remediation planning:
| Report | File | Contents |
|---|---|---|
| Security Review | 01-security-review.md |
All 95 security findings across API, Web, Orch/Coord |
| Code Quality Review | 02-code-quality-review.md |
All 29 code quality findings |
| QA & Test Coverage | 03-qa-test-coverage.md |
Test gaps, anti-patterns, coverage grades |
Suggested Remediation Phases
Phase 1 - Critical Security (1-2 days)
- Add authentication to orchestrator API
- Fix WikiLinkRenderer XSS
- Fix secret scanner error handling
- Fix guards swallowing DB errors
- Validate OIDC configuration at startup
Phase 2 - High Security + Infrastructure (2-3 days)
- Enable Docker sandbox by default
- Add inter-service authentication
- Replace KEYS with SCAN in Valkey
- Sanitize OAuth callback parameters
- Route all fetch() through API client (CSRF)
- Remove/gate mock data in production paths
Phase 3 - Code Quality + Memory Leaks (1-2 days)
- Fix memory leaks (WebSocket, runner-jobs, Valkey listeners, session Map)
- Fix stale closures in React hooks
- Fix boolean logic bug in mindmap editor
- Add runtime validation for deserialized data
- Fix N+1 queries
Phase 4 - Test Coverage (2-3 days)
- Investigate coordinator 16% coverage
- Add tests for knowledge.service.ts, admin.guard.ts, embeddings.service.ts
- Re-enable 23 skipped widget tests
- Add tests for untested UI components
- Replace placeholder assertions
Positive Observations
The codebase demonstrates strong engineering in several areas:
- Prisma parameterized queries throughout (no raw SQL injection)
- Timing-safe API key comparison
- Federation private keys encrypted at rest (AES-256-GCM)
- Comprehensive zip bomb protection
- Guard chain consistency (Auth -> Workspace -> Permission)
- Orchestrator test coverage is excellent (A grade)
- State machine validation prevents invalid agent transitions
- Webhook signature verification uses hmac.compare_digest()
- Good separation of concerns across NestJS modules