# Mosaic Stack - QA & Test Coverage Report

**Date:** 2026-02-05
**Scope:** All workspaces (api, web, orchestrator, coordinator, packages)
**Total Test Files:** 552 | **Total Test Cases:** ~3,685

---

## Overall Test Health

| Workspace         | Tests  | Files | Coverage              | Grade  | Key Issue                             |
| ----------------- | ------ | ----- | --------------------- | ------ | ------------------------------------- |
| apps/orchestrator | ~452   | 19    | 85% enforced          | **A**  | Near-complete, well-structured        |
| apps/api          | ~2,174 | 143   | Not enforced          | **B-** | 21 untested services, weak assertions |
| apps/web          | ~555   | 51    | 85% on components/lib | **C+** | 76 untested components, 23 skipped    |
| apps/coordinator  | ~504   | 23    | **16% reported**      | **D**  | Coverage crisis despite test files    |
| packages/shared   | ~25    | 1     | N/A                   | **B+** | Adequate for scope                    |
| packages/ui       | ~15    | 1     | N/A                   | **D+** | 9 of 10 components untested           |

---

## Critical Coverage Gaps

### GAP-1: Coordinator 16% Line Coverage [CRITICAL - Priority 10/10]

Despite having 23 test files and ~504 test cases, the coordinator reports only 16% line coverage with 14 of 22 source files at 0% execution. Files at 0% include the core `coordinator.py`, `queue.py`, `webhook.py`, `security.py`, `parser.py`, and `metrics.py`.

**Root Cause (likely):** Either the tests import types/models but mock everything, so the actual source code never executes, or the coverage run only executes a subset of the tests.

**Action:** Run `cd apps/coordinator && python -m pytest tests/ -v --cov=src --cov-report=term-missing` and use the per-file missing-line output to determine which of the two causes applies.

### GAP-2: knowledge.service.ts - 916 Lines, No Tests [CRITICAL - Priority 9/10]

The largest service file in the API has no direct unit tests. Core CRUD operations, pagination, filtering, slug generation, cache invalidation, and embedding queue integration are all untested. Only version-specific tests exist.

**Regressions at risk:** Pagination off-by-one, slug collision handling, stale cache after updates, embedding queue not triggered.
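The pagination and slug regressions above can be pinned down with small boundary-case tests. The following is a minimal sketch, assuming 1-based page numbers and numeric suffixes on slug collision. `pageWindow` and `uniqueSlug` are hypothetical helpers for illustration, not the actual API of `knowledge.service.ts`.

```typescript
// Hypothetical helper: converts a 1-based page number into skip/take values.
function pageWindow(page: number, limit: number): { skip: number; take: number } {
  if (!Number.isInteger(page) || page < 1) throw new RangeError("page must be >= 1");
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError("limit must be >= 1");
  return { skip: (page - 1) * limit, take: limit };
}

// Hypothetical helper: derives a slug and appends -2, -3, ... on collision.
function uniqueSlug(title: string, taken: ReadonlySet<string>): string {
  const base = title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, ""); // strip leading/trailing dashes
  if (!taken.has(base)) return base;
  let suffix = 2;
  while (taken.has(`${base}-${suffix}`)) suffix += 1;
  return `${base}-${suffix}`;
}

// Boundary cases a unit test would assert on.
const firstPage = pageWindow(1, 20); // { skip: 0, take: 20 }
const thirdPage = pageWindow(3, 20); // { skip: 40, take: 20 }
const freeSlug = uniqueSlug("Hello, World!", new Set<string>()); // "hello-world"
const collidingSlug = uniqueSlug(
  "Hello World",
  new Set(["hello-world", "hello-world-2"]),
); // "hello-world-3"
```

In a real spec these values would be checked with the test runner's `expect`, and the same boundary cases (first page, page zero, repeated collisions) would be exercised against the service itself.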

### GAP-3: admin.guard.ts - Security Guard, No Tests [CRITICAL - Priority 9/10]

This guard determines system admin access by checking workspace ownership. No tests verify that it correctly grants or denies admin access.

**Regressions at risk:** Non-admin users gaining admin access, valid admins locked out, missing ForbiddenException.
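The three regressions above map onto three test cases: an owner is allowed, a non-owner is rejected, and a missing user is rejected. The sketch below illustrates them under stated assumptions: the rule "owns at least one workspace" is inferred from the description above, and `assertSystemAdmin`, `Workspace`, `RequestUser`, and `ForbiddenError` are hypothetical stand-ins rather than the real guard or NestJS's `ForbiddenException`.

```typescript
// Hypothetical shapes: the real guard's request/user types are not shown in this report.
interface Workspace {
  id: string;
  ownerId: string;
}
interface RequestUser {
  id: string;
}

// Stand-in for NestJS's ForbiddenException so the sketch runs without dependencies.
class ForbiddenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ForbiddenError";
  }
}

// Assumed rule: a user is a system admin if they own at least one workspace.
function assertSystemAdmin(
  user: RequestUser | undefined,
  ownedCandidates: readonly Workspace[],
): true {
  if (!user) throw new ForbiddenError("Authentication required");
  const ownsWorkspace = ownedCandidates.some((w) => w.ownerId === user.id);
  if (!ownsWorkspace) throw new ForbiddenError("Admin access required");
  return true;
}

// Returns true only if fn throws the expected error type.
function throwsForbidden(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch (err) {
    return err instanceof Error && err.name === "ForbiddenError";
  }
}

const workspaces: Workspace[] = [{ id: "ws-1", ownerId: "alice" }];
const ownerAllowed = assertSystemAdmin({ id: "alice" }, workspaces); // true
const memberDenied = throwsForbidden(() => assertSystemAdmin({ id: "bob" }, workspaces)); // true
const anonymousDenied = throwsForbidden(() => assertSystemAdmin(undefined, workspaces)); // true
```

A spec for the actual guard would build a mocked execution context instead of calling a plain function, but the allow/deny/exception matrix stays the same.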

### GAP-4: embeddings.service.ts - 249 Lines, Raw SQL, No Tests [CRITICAL - Priority 9/10]

Uses raw SQL for pgvector operations. No tests exist for embedding validation, vector SQL construction, or similarity search.

**Regressions at risk:** SQL injection through embedding data, invalid vector dimensions, wrong search results.
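The injection and dimension risks above are usually addressed by validating the embedding before it reaches SQL and by passing it as a bound parameter. The following is a sketch of that idea only: `toVectorLiteral`, `buildSimilarityQuery`, the `entries` table, the dimension of 3, and the `$1`/`$2` placeholder style are all assumptions for illustration, not the contents of `embeddings.service.ts`. The `[1,2,3]` text form and the `<->` distance operator are standard pgvector syntax.

```typescript
// Assumed dimension for illustration only; the real column width is not stated in this report.
const EXPECTED_DIM = 3;

// Hypothetical helper: validates an embedding and serializes it to pgvector's
// text form ("[1,2,3]") so it can be passed as a bound parameter.
function toVectorLiteral(embedding: readonly number[], dim: number = EXPECTED_DIM): string {
  if (embedding.length !== dim) {
    throw new RangeError(`expected ${dim} dimensions, got ${embedding.length}`);
  }
  for (const value of embedding) {
    // The typeof check guards against untyped input (e.g. JSON from a request body).
    if (typeof value !== "number" || !Number.isFinite(value)) {
      throw new TypeError("embedding must contain only finite numbers");
    }
  }
  return `[${embedding.join(",")}]`;
}

// Hypothetical query builder: the SQL text is constant and values travel separately,
// so embedding data is never concatenated into the statement.
function buildSimilarityQuery(embedding: readonly number[], limit: number) {
  return {
    text: "SELECT id FROM entries ORDER BY embedding <-> $1::vector LIMIT $2",
    values: [toVectorLiteral(embedding), limit] as const,
  };
}

const query = buildSimilarityQuery([0.1, 0.2, 0.3], 5);
// query.values[0] is "[0.1,0.2,0.3]"; query.text contains no embedding data.
```

Unit tests for the real service would cover the same cases: wrong dimension, `NaN`/`Infinity`, non-numeric input, and a check that the SQL text does not change with the embedding.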

### GAP-5: widget-data.service.ts - 695 Lines, No Tests [HIGH - Priority 8/10]

Second-largest untested file. Fetches data from multiple sources for dashboard widgets.

### GAP-6: ideas.service.ts - 321 Lines, No Tests [HIGH - Priority 8/10]

User-facing CRUD feature with domain/project associations and activity logging.

---

## Untested Files by Workspace

### apps/api - 21 Untested Service/Controller Files

| File                                          | Lines | Risk     |
| --------------------------------------------- | ----- | -------- |
| knowledge/knowledge.service.ts                | 916   | HIGH     |
| widgets/widget-data.service.ts                | 695   | HIGH     |
| ideas/ideas.service.ts                        | 321   | HIGH     |
| database/embeddings.service.ts                | 249   | HIGH     |
| ideas/ideas.controller.ts                     | 123   | MEDIUM   |
| widgets/widgets.controller.ts                 | 129   | MEDIUM   |
| widgets/widgets.service.ts                    | 59    | MEDIUM   |
| users/preferences.service.ts                  | 99    | MEDIUM   |
| users/preferences.controller.ts               | 56    | MEDIUM   |
| common/throttler/throttler-storage.service.ts | 80+   | MEDIUM   |
| auth/guards/admin.guard.ts                    | 46    | SECURITY |
| federation/audit.service.ts                   | 80+   | LOW      |
| common/throttler/throttler-api-key.guard.ts   | -     | MEDIUM   |
| knowledge/import-export.controller.ts         | -     | MEDIUM   |
| knowledge/knowledge.controller.ts             | -     | MEDIUM   |
| knowledge/stats.controller.ts                 | -     | LOW      |
| knowledge/queues/embedding-queue.service.ts   | -     | MEDIUM   |
| layouts/layouts.controller.ts                 | -     | LOW      |
| cron/cron.controller.ts                       | -     | LOW      |
| bridge/parser/command-parser.service.ts       | -     | MEDIUM   |
| app.service.ts                                | -     | LOW      |

Additionally, 22 DTO directories lack validation tests.

### apps/web - 76 Untested Component/Page Files

**Critical pages (user-facing routes):**

- Main dashboard page.tsx
- Calendar page
- Knowledge page + 5 sub-pages
- Federation connections + 2 sub-pages
- Settings (4 sub-pages)

**Critical components:**

- Chat system: Chat.tsx, ChatInput.tsx, MessageList.tsx, ConversationSidebar.tsx, BackendStatusBanner.tsx
- Dashboard widgets: DomainOverview, QuickCapture, RecentTasks, UpcomingEvents
- HUD system: HUD.tsx, WidgetGrid.tsx, WidgetRenderer.tsx, WidgetWrapper.tsx
- Knowledge: EntryCard, EntryList, EntryViewer, EntryFilters, VersionHistory, ImportExportActions, StatsDashboard
- Navigation.tsx
- Workspace: WorkspaceCard, WorkspaceSettings, MemberList, InviteMember
- Team: TeamCard, TeamMemberList, TeamSettings

**Untested API client modules (11 files reported; 10 named below):**

- chat.ts, domains.ts, events.ts, federation.ts, ideas.ts, knowledge.ts, personalities.ts, teams.ts, api.ts, auth-client.ts

**Untested hooks:** useChat.ts, useLayout.ts

### apps/orchestrator - 1 Untested File

- health/health.service.ts (minimal risk)

### packages/ui - 9 Untested Components

- Avatar, Badge, Card, Input, Modal, Select, Textarea, Toast (8 of the 9 named here; only Button is tested)

---

## 23 Skipped Tests (apps/web)

| File                        | Count | Reason                                                   |
| --------------------------- | ----- | -------------------------------------------------------- |
| CalendarWidget.test.tsx     | 5     | Component migrated from setTimeout mock data to real API |
| TasksWidget.test.tsx        | 6     | Same - setTimeout mock data mismatch                     |
| QuickCaptureWidget.test.tsx | 3     | Submit and keyboard shortcut tests                       |
| LinkAutocomplete.test.tsx   | 9     | Debounce search, keyboard nav, link insertion, dropdown  |

**Action:** Re-enable and update tests to match current component implementations.

---

## Test Anti-Patterns Found

### Placeholder Assertions (`expect(true).toBe(true)`)

| File                                | Line     | Context                 |
| ----------------------------------- | -------- | ----------------------- |
| ChatOverlay.test.tsx                | 259, 267 | Responsive design tests |
| rejection-handler.service.spec.ts   | 307      | Notification sending    |
| semantic-search.integration.spec.ts | 122      | Conditional branch      |

**Impact:** These tests always pass, so they provide no regression protection.
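A placeholder can usually be replaced by an assertion on the observable side effect. The sketch below shows the idea for the notification case, using a hand-rolled recording fake. `handleRejection` and `Notifier` are hypothetical names, not taken from the rejection handler service. In a Vitest or Jest spec, `vi.fn()` or `jest.fn()` would play the role of the fake.

```typescript
// Hypothetical notifier and handler, for illustration only.
interface Notifier {
  send(recipient: string, message: string): void;
}

function handleRejection(jobId: string, reason: string, notifier: Notifier): void {
  notifier.send("ops", `Job ${jobId} rejected: ${reason}`);
}

// A recording fake: every call is captured so the test can inspect it.
const sentNotifications: Array<[string, string]> = [];
const fakeNotifier: Notifier = {
  send: (recipient, message) => {
    sentNotifications.push([recipient, message]);
  },
};

handleRejection("job-42", "lint failed", fakeNotifier);

// Instead of expect(true).toBe(true), assert on what actually happened:
// exactly one notification, with the expected recipient and content.
const notificationSent =
  sentNotifications.length === 1 &&
  sentNotifications[0][0] === "ops" &&
  sentNotifications[0][1] === "Job job-42 rejected: lint failed";

if (!notificationSent) {
  throw new Error("rejection did not trigger the expected notification");
}
```

If the handler stops sending the notification, or sends it to the wrong recipient, this check fails, whereas the placeholder would keep passing.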

### Sole `toBeDefined()` Assertions (30+ instances)

Most concerning in:

- `llm-telemetry.decorator.spec.ts` -- 6 tests verify decorator doesn't throw but never check span attributes
- `federation/query.service.spec.ts` -- 8 tests
- `federation/query.controller.spec.ts` -- 3 tests
- `layouts.service.spec.ts` -- 2 tests
- `workspace-settings.service.spec.ts` -- 1 test

**Impact:** Tests verify existence but not correctness. Regressions slip through.
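To make the gap concrete, the sketch below runs a deliberately buggy function through an existence-only check and a content check. `runQuery` and `QueryResult` are hypothetical and unrelated to the federation query service. The plain comparisons stand in for `expect(result).toBeDefined()` and `expect(result.items).toEqual([...])`.

```typescript
interface QueryResult {
  status: "ok" | "error";
  items: string[];
}

// Deliberately buggy hypothetical implementation: it drops the last item.
function runQuery(source: readonly string[]): QueryResult {
  return { status: "ok", items: source.slice(0, source.length - 1) };
}

const queryResult = runQuery(["a", "b", "c"]);

// Weak check, equivalent to expect(queryResult).toBeDefined(): passes despite the bug.
const weakCheckPasses = queryResult !== undefined;

// Stronger check, equivalent to expect(queryResult.items).toEqual(["a", "b", "c"]):
// fails, which is exactly the regression signal the weak check hides.
const strongCheckPasses =
  queryResult.status === "ok" &&
  JSON.stringify(queryResult.items) === JSON.stringify(["a", "b", "c"]);
```

Here `weakCheckPasses` is `true` and `strongCheckPasses` is `false`, so only the second style of assertion would have caught the dropped item.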

### Testing Implementation Details Instead of Behavior

- `cors.spec.ts` -- Tests CORS by asserting on JS objects, not actual HTTP headers/middleware
- `Button.test.tsx` -- Asserts on CSS class names (`bg-blue-600`) instead of behavior

**Impact:** Tests break on implementation changes even when behavior is correct.

### Potential Flaky Patterns

setTimeout-based timing in 5 test files:

- `runner-jobs.service.spec.ts:620,833`
- `semantic-search.integration.spec.ts:153`
- `mcp/stdio-transport.spec.ts` (6 instances)
- `coordinator-integration.service.concurrency.spec.ts:170`
- `health.controller.spec.ts:63` (1100ms wait)
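One common way to remove real waits such as the 1100ms delay is to inject the clock, so the test advances time instantly. The following is a minimal sketch of that technique with a hypothetical `HeartbeatMonitor`, not the actual health controller. Vitest's `vi.useFakeTimers()` and Jest's `jest.useFakeTimers()` offer the same control when the code under test calls `setTimeout` directly.

```typescript
// A clock is any function returning the current time in milliseconds.
type Clock = () => number;

// Hypothetical monitor that reports "stale" when no heartbeat arrived recently.
class HeartbeatMonitor {
  private readonly clock: Clock;
  private readonly maxAgeMs: number;
  private lastBeat: number;

  constructor(clock: Clock, maxAgeMs: number) {
    this.clock = clock;
    this.maxAgeMs = maxAgeMs;
    this.lastBeat = clock();
  }

  beat(): void {
    this.lastBeat = this.clock();
  }

  isStale(): boolean {
    return this.clock() - this.lastBeat > this.maxAgeMs;
  }
}

// A controllable clock: the test moves time forward instead of sleeping.
let fakeNow = 0;
const monitor = new HeartbeatMonitor(() => fakeNow, 1000);

fakeNow += 1000;
const staleAtLimit = monitor.isStale(); // false: exactly at the limit
fakeNow += 100;
const staleAfterLimit = monitor.isStale(); // true: 1100 ms without a heartbeat
monitor.beat();
const staleAfterBeat = monitor.isStale(); // false: heartbeat just recorded
```

In production the same class would receive `Date.now` as its clock. The test runs in microseconds and cannot fail because of scheduler jitter.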

---

## Missing Test Categories

### No Playwright E2E Tests

The project documents Playwright as the E2E framework, but neither a `playwright.config.ts` nor any E2E test files exist.

### No DTO Validation Tests

22 DTO directories lack validation testing. DTOs define input validation rules via class-validator decorators, but these are never tested in isolation.

### Limited Integration Tests

Only 8 integration test files exist across the entire codebase. Most module interactions are untested.

---

## Recommended Test Additions (Priority Order)

| Priority | Item                                 | Effort | Impact                           |
| -------- | ------------------------------------ | ------ | -------------------------------- |
| P0       | Investigate coordinator 16% coverage | 2hr    | Unblocks all coordinator testing |
| P0       | knowledge.service.ts unit tests      | 4hr    | Covers largest untested service  |
| P0       | admin.guard.ts unit tests            | 1hr    | Security-critical                |
| P0       | embeddings.service.ts unit tests     | 2hr    | Raw SQL validation               |
| P1       | widget-data.service.ts unit tests    | 3hr    | Dashboard reliability            |
| P1       | ideas.service.ts unit tests          | 2hr    | User-facing CRUD                 |
| P1       | Re-enable 23 skipped widget tests    | 2hr    | Immediate coverage gain          |
| P1       | Replace placeholder assertions       | 1hr    | Fix false-positive tests         |
| P2       | Chat system component tests          | 3hr    | Core user interaction            |
| P2       | API client module tests (11 files)   | 3hr    | Request/response validation      |
| P2       | Throttler storage tests              | 2hr    | Security infrastructure          |
| P2       | Preferences service tests            | 1hr    | User settings                    |
| P3       | Strengthen toBeDefined-only tests    | 2hr    | Better regression detection      |
| P3       | UI package component tests           | 3hr    | Design system reliability        |
| P3       | Playwright E2E setup + smoke tests   | 4hr    | End-to-end confidence            |

**Estimated total effort: ~17 hours (about 2-3 days) for P0+P1 items; ~35 hours (about 5-6 days) for all items in the table**

---

## Positive Test Observations

1. **Orchestrator is exemplary** -- ~452 tests, near-complete coverage, behavioral testing, good mocking
2. **Federation security tests are thorough** -- Crypto, signature, timeout, workspace access, capability guard
3. **API client test (web) is comprehensive** -- 721 lines covering error handling, retries, auth
4. **Sanitization utilities well-tested** -- XSS prevention, log sanitization, query builder
5. **Coverage thresholds enforced** -- 85% on orchestrator and web components/lib
6. **Concurrency tests exist** -- coordinator-integration and runner-jobs
7. **Good test infrastructure** -- Shared fixtures, proper NestJS testing module usage