stack/docs/reports/codebase-review-2026-02-05/00-executive-summary.md

# Mosaic Stack - Full Codebase Review: Executive Summary

**Date:** 2026-02-05
**Scope:** Full codebase (apps/api, apps/web, apps/orchestrator, apps/coordinator, packages/\*)
**Method:** 7 parallel review agents covering security, code quality, and QA/test coverage

---

## At a Glance

| Dimension                   | Findings | Critical | High   | Medium | Low    |
| --------------------------- | -------- | -------- | ------ | ------ | ------ |
| Security - API              | 28       | 4        | 7      | 13     | 4      |
| Security - Web              | 37       | 2        | 9      | 14     | 12     |
| Security - Orch/Coord       | 30       | 6        | 9      | 12     | 3      |
| Code Quality - API          | 7        | 3        | 2      | 2      | 0      |
| Code Quality - Web          | 12       | 5        | 2      | 3      | 2      |
| Code Quality - Orchestrator | 10       | 3        | 3      | 3      | 1      |
| **Totals**                  | **124**  | **23**   | **32** | **47** | **22** |

**QA/Test Coverage:** Separate grading per workspace (see report 03).

---

## Top 10 Most Urgent Findings

These are the findings that pose the greatest risk and should be addressed first:

### 1. No Authentication on Orchestrator API (SEC-ORCH-2)

**Severity: CRITICAL** | The entire orchestrator API (spawn, kill, kill-all, status) has zero authentication. Any network client can drain API credits or kill all agents.

### 2. RLS Context Never Applied in Service Layer (SEC-API-4)

**Severity: CRITICAL** | The RLS infrastructure exists (withWorkspaceContext, setWorkspaceContext) but no service actually uses it. Tenant isolation relies entirely on application-level WHERE clauses.

### 3. WikiLinkRenderer Stored XSS (SEC-WEB-2)

**Severity: CRITICAL** | WikiLinkRenderer passes HTML through dangerouslySetInnerHTML. DOMPurify only sanitizes wiki-link display text; surrounding HTML is rendered raw. Stored XSS.

### 4. Secret Scanner Returns "Clean" on Error (SEC-ORCH-1)

**Severity: CRITICAL** | When scanFile() fails for any reason, it returns `{ hasSecrets: false }`. A file that couldn't be read is indistinguishable from a clean file.

### 5. Docker Sandbox Disabled by Default (SEC-ORCH-3)

**Severity: CRITICAL** | Agents run directly on the host without container isolation unless `SANDBOX_ENABLED=true` is explicitly set.

### 6. Unauthenticated Inter-Service Communication (SEC-ORCH-4)

**Severity: CRITICAL** | Orchestrator communicates with coordinator over plain HTTP with no authentication. Quality gate responses can be spoofed.

### 7. Guards Swallow DB Errors as Auth Failures (SEC-API-2, SEC-API-3)

**Severity: CRITICAL** | WorkspaceGuard and PermissionGuard catch all database errors and return false/null, converting infrastructure failures into misleading "access denied" responses.

### 8. OIDC Config Silently Degrades to Empty Strings (SEC-API-1)

**Severity: CRITICAL** | Missing OIDC env vars default to empty strings instead of failing fast. The discovery URL becomes a relative path.

### 9. Redis KEYS Command in Production (CQ-ORCH-3, SEC-ORCH-5)

**Severity: CRITICAL** | listTasks() and listAgents() use the blocking `KEYS` command. This can freeze the entire Redis instance under load.

### 10. OAuth Callback Open Redirect (SEC-WEB-1)

**Severity: CRITICAL** | The OAuth error parameter is passed unsanitized into router.push(), enabling phishing attacks.

---

## Summary by Workspace

### apps/api (NestJS)

- **Security:** Strong Prisma query safety, timing-safe comparisons, good guard chain. Weak: RLS never applied, guards swallow DB errors, OIDC silent degradation, admin guard conflates workspace ownership with system admin.
- **Code Quality:** Good architecture overall. Memory leaks in WebSocket gateway and runner-jobs. Hardcoded TODO values in federation OIDC.
- **Test Grade: B-** (2,174 tests, but 21 untested services including the 916-line knowledge.service.ts)

### apps/web (Next.js)

- **Security:** API client has good CSRF handling, but multiple components bypass it with raw fetch(). WikiLinkRenderer XSS. Mock data shipped in production paths. Dual auth mechanisms.
- **Code Quality:** Stale closures in hooks, boolean logic bug in mindmap editor, race conditions in autocomplete, no React.memo usage, no CSP headers.
- **Test Grade: C+** (555 tests, 76 untested components/pages, 23 skipped tests)

### apps/orchestrator (NestJS)

- **Security:** No API authentication, no rate limiting, sandbox disabled by default, unauthenticated inter-service communication, YOLO mode bypasses all quality gates.
- **Code Quality:** Unbounded session Map, KEYS command, TOCTOU race conditions in state transitions, N+1 queries, force-remove Docker containers.
- **Test Grade: A** (452 tests, near-complete coverage, well-structured)

### apps/coordinator (Python/FastAPI)

- **Security:** Prompt injection via issue body, queue file corruption silently discarded, orchestration loops swallow all exceptions, Mock import in production code.
- **Test Grade: D** (504 tests exist but only 16% line coverage reported -- likely test configuration issue)

### packages/\*

- **shared:** Adequate (1 test file, covers utilities)
- **ui:** 9 of 10 components untested (Grade: D+)
- **config/skills:** No tests needed

---

## Remediation Approach

Findings are organized into three reports for detailed remediation planning:

| Report              | File                        | Contents                                             |
| ------------------- | --------------------------- | ---------------------------------------------------- |
| Security Review     | `01-security-review.md`     | All 95 security findings across API, Web, Orch/Coord |
| Code Quality Review | `02-code-quality-review.md` | All 29 code quality findings                         |
| QA & Test Coverage  | `03-qa-test-coverage.md`    | Test gaps, anti-patterns, coverage grades            |

### Suggested Remediation Phases

**Phase 1 - Critical Security (1-2 days)**

- Add authentication to orchestrator API
- Fix WikiLinkRenderer XSS
- Fix secret scanner error handling
- Fix guards swallowing DB errors
- Validate OIDC configuration at startup

**Phase 2 - High Security + Infrastructure (2-3 days)**

- Enable Docker sandbox by default
- Add inter-service authentication
- Replace KEYS with SCAN in Valkey
- Sanitize OAuth callback parameters
- Route all fetch() through API client (CSRF)
- Remove/gate mock data in production paths

**Phase 3 - Code Quality + Memory Leaks (1-2 days)**

- Fix memory leaks (WebSocket, runner-jobs, Valkey listeners, session Map)
- Fix stale closures in React hooks
- Fix boolean logic bug in mindmap editor
- Add runtime validation for deserialized data
- Fix N+1 queries

**Phase 4 - Test Coverage (2-3 days)**

- Investigate coordinator 16% coverage
- Add tests for knowledge.service.ts, admin.guard.ts, embeddings.service.ts
- Re-enable 23 skipped widget tests
- Add tests for untested UI components
- Replace placeholder assertions

---

## Positive Observations

The codebase demonstrates strong engineering in several areas:

- Prisma parameterized queries throughout (no raw SQL injection)
- Timing-safe API key comparison
- Federation private keys encrypted at rest (AES-256-GCM)
- Comprehensive zip bomb protection
- Guard chain consistency (Auth -> Workspace -> Permission)
- Orchestrator test coverage is excellent (A grade)
- State machine validation prevents invalid agent transitions
- Webhook signature verification uses hmac.compare_digest()
- Good separation of concerns across NestJS modules