# Mosaic Stack - QA & Test Coverage Report

**Date:** 2026-02-05
**Scope:** All workspaces (api, web, orchestrator, coordinator, packages)
**Total Test Files:** 552 | **Total Test Cases:** ~3,685

---
## Overall Test Health

| Workspace         | Tests  | Files | Coverage              | Grade  | Key Issue                             |
| ----------------- | ------ | ----- | --------------------- | ------ | ------------------------------------- |
| apps/orchestrator | ~452   | 19    | 85% enforced          | **A**  | Near-complete, well-structured        |
| apps/api          | ~2,174 | 143   | Not enforced          | **B-** | 21 untested services, weak assertions |
| apps/web          | ~555   | 51    | 85% on components/lib | **C+** | 76 untested components, 23 skipped    |
| apps/coordinator  | ~504   | 23    | **16% reported**      | **D**  | Coverage crisis despite test files    |
| packages/shared   | ~25    | 1     | N/A                   | **B+** | Adequate for scope                    |
| packages/ui       | ~15    | 1     | N/A                   | **D+** | 9 of 10 components untested           |

---
## Critical Coverage Gaps

### GAP-1: Coordinator 16% Line Coverage [CRITICAL - Priority 10/10]

Despite having 23 test files and ~504 test cases, the coordinator reports only 16% line coverage, with 14 of 22 source files at 0% execution. Files at 0% include the core `coordinator.py`, `queue.py`, `webhook.py`, `security.py`, `parser.py`, and `metrics.py`.

**Root Cause (likely):** Either the tests import types/models but mock everything else, so the actual source code never executes, or the coverage run executes only a subset of the tests.

**Action:** Run `cd apps/coordinator && python -m pytest tests/ -v --cov=src --cov-report=term-missing` and diagnose.
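
The first suspected cause can be illustrated with a small sketch. The coordinator itself is Python; TypeScript is used here only to show the pattern, and `parseIssueRef`, `overMockedCheck`, and `realCheck` are hypothetical names, not code from the repository. The `realExecutions` counter stands in for line coverage: a test that replaces the unit under test with a stub can pass without ever executing the real source.

```typescript
// Hypothetical stand-in for a real source function.
// The counter plays the role of line coverage for this function.
let realExecutions = 0;
function parseIssueRef(text: string): number | null {
  realExecutions += 1;
  const match = /#(\d+)/.exec(text);
  return match ? Number(match[1]) : null;
}

// Over-mocked "test": the unit under test is replaced by a stub,
// so the real parseIssueRef never runs and contributes 0% coverage.
function overMockedCheck(): boolean {
  const mockedParse = (_text: string): number | null => 42;
  return mockedParse("fixes #42") === 42;
}

// Behavioral test: calls the real function and checks its results.
function realCheck(): boolean {
  return parseIssueRef("fixes #42") === 42 && parseIssueRef("no ref") === null;
}

const overMockedPassed = overMockedCheck();
const executionsAfterMockedTest = realExecutions; // still 0: the test passed without touching the source
const realPassed = realCheck();
const executionsAfterRealTest = realExecutions; // 2: both calls executed the real code
```

If the `--cov-report=term-missing` output shows files at 0% while their test files pass, checking whether those tests stub out the module under test is a reasonable first step.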
### GAP-2: knowledge.service.ts - 916 Lines, No Tests [CRITICAL - Priority 9/10]

The largest service file in the API has no direct unit tests. Core CRUD operations, pagination, filtering, slug generation, cache invalidation, and embedding queue integration are all untested. Only version-specific tests exist.

**Regressions at risk:** Pagination off-by-one errors, mishandled slug collisions, stale cache after updates, embedding queue not triggered.
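
To make the first two regression risks concrete, here is a minimal sketch of the boundary cases such unit tests would pin down. The helpers `pageToOffset`, `totalPages`, and `uniqueSlug` are hypothetical and written for illustration; they are not taken from `knowledge.service.ts`, whose actual pagination and slug logic may differ.

```typescript
// All helpers below are hypothetical sketches, not code from knowledge.service.ts.

// Converts a 1-based page number into a zero-based row offset.
function pageToOffset(page: number, pageSize: number): number {
  if (Math.floor(page) !== page || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (Math.floor(pageSize) !== pageSize || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return (page - 1) * pageSize;
}

// Number of pages needed for totalItems results (0 results -> 0 pages).
function totalPages(totalItems: number, pageSize: number): number {
  return Math.ceil(totalItems / pageSize);
}

// Slugifies a title and appends -2, -3, ... until the slug is unused.
function uniqueSlug(title: string, takenSlugs: readonly string[]): string {
  const base = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (takenSlugs.indexOf(base) === -1) return base;
  let suffix = 2;
  while (takenSlugs.indexOf(base + "-" + suffix) !== -1) suffix += 1;
  return base + "-" + suffix;
}

// Boundary cases a unit test would pin down.
const firstPageOffset = pageToOffset(1, 20); // 0, not 20
const thirdPageOffset = pageToOffset(3, 20); // 40
const pagesFor41Items = totalPages(41, 20); // 3
const collidingSlug = uniqueSlug("Hello, World!", ["hello-world", "hello-world-2"]); // "hello-world-3"
```

The cases worth asserting are the edges: page 1, a partially filled last page, an empty result set, and a slug whose first numbered variant is also taken.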
### GAP-3: admin.guard.ts - Security Guard, No Tests [CRITICAL - Priority 9/10]

This guard determines system admin access by checking workspace ownership. No tests verify that it grants and denies admin access correctly.

**Regressions at risk:** Non-admin users gaining admin access, valid admins locked out, missing `ForbiddenException`.
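
The grant and deny paths listed above can be sketched as follows. This is a dependency-free illustration under stated assumptions: `assertSystemAdmin`, `WorkspaceMembership`, and the role names are hypothetical, and a plain `Error` stands in for NestJS's `ForbiddenException`. The real guard's inputs and ownership lookup are not shown in this report.

```typescript
// Hypothetical shapes; the real guard's types and role names may differ.
interface WorkspaceMembership {
  userId: string;
  role: "OWNER" | "ADMIN" | "MEMBER";
}

// Hypothetical core of the guard: only a workspace owner counts as system admin.
// A plain Error stands in for NestJS's ForbiddenException to keep this dependency-free.
function assertSystemAdmin(
  userId: string | undefined,
  memberships: readonly WorkspaceMembership[],
): true {
  if (!userId) throw new Error("Forbidden: user not authenticated");
  const ownsWorkspace = memberships.some(
    (m) => m.userId === userId && m.role === "OWNER",
  );
  if (!ownsWorkspace) throw new Error("Forbidden: system admin access required");
  return true;
}

// Test helper: reports whether the guard rejected the request.
function isDenied(
  userId: string | undefined,
  memberships: readonly WorkspaceMembership[],
): boolean {
  try {
    assertSystemAdmin(userId, memberships);
    return false;
  } catch (_err) {
    return true;
  }
}

const sampleMemberships: WorkspaceMembership[] = [
  { userId: "owner-1", role: "OWNER" },
  { userId: "member-1", role: "MEMBER" },
];

const ownerAllowed = !isDenied("owner-1", sampleMemberships); // grant path
const memberDenied = isDenied("member-1", sampleMemberships); // deny path
const anonymousDenied = isDenied(undefined, sampleMemberships); // missing user
const strangerDenied = isDenied("unknown-user", sampleMemberships); // no membership at all
```

A real spec would exercise the guard through its `canActivate` entry point and assert on the exception type, but the four cases above are the ones that map to the listed regressions.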
### GAP-4: embeddings.service.ts - 249 Lines, Raw SQL, No Tests [CRITICAL - Priority 9/10]

This service uses raw SQL for pgvector operations. No tests exist for embedding validation, vector SQL construction, or similarity search.

**Regressions at risk:** SQL injection through embedding data, invalid vector dimensions, wrong search results.
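
As a sketch of what tests for this service could assert, the example below validates an embedding and builds a parameterized similarity query. Everything here is an assumption for illustration: `validateEmbedding`, `toVectorLiteral`, `buildSimilarityQuery`, the table name `knowledge_embeddings`, and the dimension of 3 are hypothetical and not taken from `embeddings.service.ts`. The `[x,y,z]` text format and the `<=>` cosine-distance operator are pgvector's own.

```typescript
// Assumed dimension for illustration; the real column dimension is not stated in this report.
const EXPECTED_DIMENSIONS = 3;

// Rejects wrong-length vectors and non-numeric or non-finite values before they reach SQL.
function validateEmbedding(vector: readonly number[], dimensions: number): void {
  if (vector.length !== dimensions) {
    throw new Error("expected " + dimensions + " dimensions, got " + vector.length);
  }
  const allFinite = vector.every((v) => typeof v === "number" && isFinite(v));
  if (!allFinite) throw new Error("embedding contains a non-finite or non-numeric value");
}

// Serializes to the pgvector text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(vector: readonly number[]): string {
  return "[" + vector.join(",") + "]";
}

// Builds a parameterized query: the vector travels as bound parameter $1
// and is never interpolated into the SQL text. The table name is hypothetical.
function buildSimilarityQuery(
  vector: readonly number[],
  limit: number,
): { text: string; values: [string, number] } {
  validateEmbedding(vector, EXPECTED_DIMENSIONS);
  return {
    text: "SELECT id FROM knowledge_embeddings ORDER BY embedding <=> $1::vector LIMIT $2",
    values: [toVectorLiteral(vector), limit],
  };
}

const sampleQuery = buildSimilarityQuery([0.1, 0.2, 0.3], 5);
// sampleQuery.values[0] is "[0.1,0.2,0.3]"; sampleQuery.text contains only placeholders.
```

Tests along these lines would cover the three listed risks: injected strings are rejected by validation, wrong dimensions fail fast, and the SQL text can be asserted to contain no user data.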
### GAP-5: widget-data.service.ts - 695 Lines, No Tests [HIGH - Priority 8/10]

The second-largest untested file. It fetches data from multiple sources for dashboard widgets.
### GAP-6: ideas.service.ts - 321 Lines, No Tests [HIGH - Priority 8/10]

A user-facing CRUD feature with domain/project associations and activity logging.

---
## Untested Files by Workspace

### apps/api - 21 Untested Service/Controller Files

| File                                          | Lines | Risk     |
| --------------------------------------------- | ----- | -------- |
| knowledge/knowledge.service.ts                | 916   | HIGH     |
| widgets/widget-data.service.ts                | 695   | HIGH     |
| ideas/ideas.service.ts                        | 321   | HIGH     |
| database/embeddings.service.ts                | 249   | HIGH     |
| ideas/ideas.controller.ts                     | 123   | MEDIUM   |
| widgets/widgets.controller.ts                 | 129   | MEDIUM   |
| widgets/widgets.service.ts                    | 59    | MEDIUM   |
| users/preferences.service.ts                  | 99    | MEDIUM   |
| users/preferences.controller.ts               | 56    | MEDIUM   |
| common/throttler/throttler-storage.service.ts | 80+   | MEDIUM   |
| auth/guards/admin.guard.ts                    | 46    | SECURITY |
| federation/audit.service.ts                   | 80+   | LOW      |
| common/throttler/throttler-api-key.guard.ts   | -     | MEDIUM   |
| knowledge/import-export.controller.ts         | -     | MEDIUM   |
| knowledge/knowledge.controller.ts             | -     | MEDIUM   |
| knowledge/stats.controller.ts                 | -     | LOW      |
| knowledge/queues/embedding-queue.service.ts   | -     | MEDIUM   |
| layouts/layouts.controller.ts                 | -     | LOW      |
| cron/cron.controller.ts                       | -     | LOW      |
| bridge/parser/command-parser.service.ts       | -     | MEDIUM   |
| app.service.ts                                | -     | LOW      |

Additionally, 22 DTO directories lack validation tests.
### apps/web - 76 Untested Component/Page Files

**Critical pages (user-facing routes):**

- Main dashboard page.tsx
- Calendar page
- Knowledge page + 5 sub-pages
- Federation connections + 2 sub-pages
- Settings (4 sub-pages)

**Critical components:**

- Chat system: Chat.tsx, ChatInput.tsx, MessageList.tsx, ConversationSidebar.tsx, BackendStatusBanner.tsx
- Dashboard widgets: DomainOverview, QuickCapture, RecentTasks, UpcomingEvents
- HUD system: HUD.tsx, WidgetGrid.tsx, WidgetRenderer.tsx, WidgetWrapper.tsx
- Knowledge: EntryCard, EntryList, EntryViewer, EntryFilters, VersionHistory, ImportExportActions, StatsDashboard
- Navigation.tsx
- Workspace: WorkspaceCard, WorkspaceSettings, MemberList, InviteMember
- Team: TeamCard, TeamMemberList, TeamSettings

**Untested API client modules (11 files):**

- chat.ts, domains.ts, events.ts, federation.ts, ideas.ts, knowledge.ts, personalities.ts, teams.ts, api.ts, auth-client.ts

**Untested hooks:** useChat.ts, useLayout.ts
### apps/orchestrator - 1 Untested File

- health/health.service.ts (minimal risk)
### packages/ui - 9 Untested Components

- Avatar, Badge, Card, Input, Modal, Select, Textarea, Toast (only Button tested)

---
## 23 Skipped Tests (apps/web)

| File                        | Count | Reason                                                   |
| --------------------------- | ----- | -------------------------------------------------------- |
| CalendarWidget.test.tsx     | 5     | Component migrated from setTimeout mock data to real API |
| TasksWidget.test.tsx        | 6     | Same - setTimeout mock data mismatch                     |
| QuickCaptureWidget.test.tsx | 3     | Submit and keyboard shortcut tests                       |
| LinkAutocomplete.test.tsx   | 9     | Debounce search, keyboard nav, link insertion, dropdown  |

**Action:** Re-enable and update tests to match current component implementations.

---
## Test Anti-Patterns Found

### Placeholder Assertions (`expect(true).toBe(true)`)

| File                                | Line     | Context                 |
| ----------------------------------- | -------- | ----------------------- |
| ChatOverlay.test.tsx                | 259, 267 | Responsive design tests |
| rejection-handler.service.spec.ts   | 307      | Notification sending    |
| semantic-search.integration.spec.ts | 122      | Conditional branch      |

**Impact:** These tests always pass and provide zero regression protection.
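
The sketch below shows why a placeholder assertion gives no protection. It is dependency-free TypeScript rather than a real spec file, and `sendRejectionNotice` is a hypothetical function with a deliberate bug, not code from `rejection-handler.service.ts`. The placeholder-style check passes regardless of the bug, while a check on the observable result fails and exposes it.

```typescript
// Hypothetical function under test with a deliberate bug:
// it reports success even when no recipients were given.
function sendRejectionNotice(recipients: readonly string[]): { sent: number; ok: boolean } {
  return { sent: recipients.length, ok: true }; // bug: ok should be false when sent is 0
}

// Placeholder-style check: runs the code but asserts nothing about it,
// like a test whose only assertion is expect(true).toBe(true).
function placeholderCheck(): boolean {
  sendRejectionNotice([]);
  return true;
}

// Behavioral check: asserts on the observable result.
function behavioralCheck(): boolean {
  const outcome = sendRejectionNotice([]);
  return outcome.sent === 0 && outcome.ok === false;
}

const placeholderPasses = placeholderCheck(); // true even though the function is buggy
const behavioralPasses = behavioralCheck(); // false, which is what exposes the bug
```

Replacing each placeholder with an assertion on a return value, a mock call, or rendered output turns these from always-green tests into ones that can fail.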
### Sole `toBeDefined()` Assertions (30+ instances)

Most concerning in:

- `llm-telemetry.decorator.spec.ts` -- 6 tests verify the decorator doesn't throw but never check span attributes
- `federation/query.service.spec.ts` -- 8 tests
- `federation/query.controller.spec.ts` -- 3 tests
- `layouts.service.spec.ts` -- 2 tests
- `workspace-settings.service.spec.ts` -- 1 test

**Impact:** These tests verify existence but not correctness, so regressions slip through.
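
The difference between checking existence and checking correctness can be sketched with the telemetry case. The `RecordedSpan` shape and the attribute keys `llm.model` and `llm.prompt_tokens` are assumptions made for illustration; the decorator's real span name and attributes are not given in this report.

```typescript
// Hypothetical in-memory representation of a recorded telemetry span.
interface RecordedSpan {
  spanName: string;
  attributes: Record<string, string | number>;
}

// Weak check: the equivalent of a sole toBeDefined() assertion.
function weakCheck(recorded: RecordedSpan | undefined): boolean {
  return recorded !== undefined;
}

// Stronger check: pins down the values a regression would change.
function strongCheck(
  recorded: RecordedSpan | undefined,
  model: string,
  promptTokens: number,
): boolean {
  return (
    recorded !== undefined &&
    recorded.spanName === "llm.call" &&
    recorded.attributes["llm.model"] === model &&
    recorded.attributes["llm.prompt_tokens"] === promptTokens
  );
}

// A correct span and a regressed one that lost its attributes.
const goodSpan: RecordedSpan = {
  spanName: "llm.call",
  attributes: { "llm.model": "example-model", "llm.prompt_tokens": 128 },
};
const regressedSpan: RecordedSpan = { spanName: "llm.call", attributes: {} };

const weakOnRegressed = weakCheck(regressedSpan); // true: the regression slips through
const strongOnRegressed = strongCheck(regressedSpan, "example-model", 128); // false: regression caught
const strongOnGood = strongCheck(goodSpan, "example-model", 128); // true
```

In the spec files themselves, the same idea means following `toBeDefined()` with value-level matchers on the fields that matter.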
### Testing Implementation Details Instead of Behavior

- `cors.spec.ts` -- Tests CORS by asserting on JS objects, not actual HTTP headers/middleware
- `Button.test.tsx` -- Asserts on CSS class names (`bg-blue-600`) instead of behavior

**Impact:** Tests break on implementation changes even when behavior is correct.
### Potential Flaky Patterns

`setTimeout`-based timing in 5 test files:

- `runner-jobs.service.spec.ts:620,833`
- `semantic-search.integration.spec.ts:153`
- `mcp/stdio-transport.spec.ts` (6 instances)
- `coordinator-integration.service.concurrency.spec.ts:170`
- `health.controller.spec.ts:63` (1100ms wait)
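
One common way to remove real waits such as the 1100ms delay is to inject the clock, so a test advances time explicitly instead of sleeping. The sketch below assumes a hypothetical `isHeartbeatStale` helper and `makeFakeClock` factory; neither is taken from the listed spec files. If the suites run on Vitest or Jest, the runner's fake-timer API (`vi.useFakeTimers()` or `jest.useFakeTimers()`) is another option that avoids changing production signatures.

```typescript
// Hypothetical clock abstraction; production code would pass () => Date.now().
type NowFn = () => number;

// Hypothetical health-check helper: a heartbeat is stale once it is older than ttlMs.
function isHeartbeatStale(lastBeatMs: number, ttlMs: number, now: NowFn): boolean {
  return now() - lastBeatMs > ttlMs;
}

// Test-side fake clock that is advanced manually instead of sleeping.
function makeFakeClock(startMs: number): { now: NowFn; advance: (ms: number) => void } {
  let current = startMs;
  return {
    now: () => current,
    advance: (ms: number) => {
      current += ms;
    },
  };
}

const fakeClock = makeFakeClock(0);
const freshAtStart = !isHeartbeatStale(0, 1000, fakeClock.now); // true: no time has passed
fakeClock.advance(1100); // replaces a real 1100ms wait and runs instantly
const staleAfterAdvance = isHeartbeatStale(0, 1000, fakeClock.now); // true: 1100ms > 1000ms TTL
```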
---

## Missing Test Categories

### No Playwright E2E Tests

The project documents Playwright as the E2E framework, but no `playwright.config.ts` or E2E test files exist.

### No DTO Validation Tests

22 DTO directories lack validation testing. DTOs define input validation rules via class-validator decorators, but these rules are never tested in isolation.

### Limited Integration Tests

Only 8 integration test files exist across the entire codebase. Most module interactions are untested.

---
## Recommended Test Additions (Priority Order)

| Priority | Item                                 | Effort | Impact                           |
| -------- | ------------------------------------ | ------ | -------------------------------- |
| P0       | Investigate coordinator 16% coverage | 2hr    | Unblocks all coordinator testing |
| P0       | knowledge.service.ts unit tests      | 4hr    | Covers largest untested service  |
| P0       | admin.guard.ts unit tests            | 1hr    | Security-critical                |
| P0       | embeddings.service.ts unit tests     | 2hr    | Raw SQL validation               |
| P1       | widget-data.service.ts unit tests    | 3hr    | Dashboard reliability            |
| P1       | ideas.service.ts unit tests          | 2hr    | User-facing CRUD                 |
| P1       | Re-enable 23 skipped widget tests    | 2hr    | Immediate coverage gain          |
| P1       | Replace placeholder assertions       | 1hr    | Fix false-positive tests         |
| P2       | Chat system component tests          | 3hr    | Core user interaction            |
| P2       | API client module tests (11 files)   | 3hr    | Request/response validation      |
| P2       | Throttler storage tests              | 2hr    | Security infrastructure          |
| P2       | Preferences service tests            | 1hr    | User settings                    |
| P3       | Strengthen toBeDefined-only tests    | 2hr    | Better regression detection      |
| P3       | UI package component tests           | 3hr    | Design system reliability        |
| P3       | Playwright E2E setup + smoke tests   | 4hr    | End-to-end confidence            |

**Estimated total effort: ~5-6 days for P0+P1 items.** Note that the per-item estimates above sum to about 17 hours for P0+P1 and about 35 hours across all priorities, so the 5-6 day figure matches the full list more closely than P0+P1 alone.

---
## Positive Test Observations

1. **Orchestrator is exemplary** -- 452 tests, near-complete coverage, behavioral testing, good mocking
2. **Federation security tests are thorough** -- Crypto, signature, timeout, workspace access, capability guard
3. **API client test (web) is comprehensive** -- 721 lines covering error handling, retries, auth
4. **Sanitization utilities well-tested** -- XSS prevention, log sanitization, query builder
5. **Coverage thresholds enforced** -- 85% on orchestrator and web components/lib
6. **Concurrency tests exist** -- coordinator-integration and runner-jobs
7. **Good test infrastructure** -- Shared fixtures, proper NestJS testing module usage