chore(orchestrator): bootstrap MS20 Site Stabilization mission

PRD updated with FR-020 (site stabilization scope) and 6 new assumptions.
Mission manifest, scratchpad, and TASKS.md created with 13 tasks across
workspace context, orchestrator connectivity, missing APIs, and UI gaps.
Milestone MS20-SiteStabilization created. Issue #534 tracks the epic.

Estimated total: ~215K tokens across 4 execution waves.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-02-27 04:09:05 -06:00
parent 11f22a7e96
commit 44062e6882
4 changed files with 223 additions and 112 deletions

View File

@@ -134,21 +134,28 @@ This is the active mission scope. MS16 (Pages) and MS17 (Backend Integration) ar
13. Global terminal: project/orchestrator level, smart (MS19)
14. Project-level orchestrator chat (MS19)
15. Master chat session: collapsible sidebar/slideout, always available (MS19)
16. Settings page for ALL environment variables, dynamically configurable via webUI (MS20)
17. Multi-tenant configuration with admin user management (MS20)
18. Team management with shared data spaces and chat rooms (MS20)
19. RBAC for file access, resources, models (MS20)
20. Federation: master-master and master-slave with key exchange (MS21)
21. Federation testing: 3 instances on Portainer (woltje.com domain) (MS21)
22. Agent task mapping configuration: system-level defaults, user-level overrides (MS22)
23. Telemetry: opt-out, customizable endpoint, sanitized data (MS22)
24. File manager with WYSIWYG editing: system/user/project levels (MS18)
25. User-level and project-level Kanban with filtering (MS18)
26. Break-glass authentication user (MS20)
27. Playwright E2E tests for all pages (MS23)
28. API documentation via Swagger (MS23)
29. Backend endpoints for all dashboard data (MS17 — already complete for existing modules)
30. Profile page linked from user card (MS16)
16. Site stabilization: workspace context propagation for mutations (MS20)
17. Site stabilization: personalities API + UI (MS20)
18. Site stabilization: user preferences API endpoint (MS20)
19. Site stabilization: orchestrator 502 and WebSocket connectivity (MS20)
20. Site stabilization: credential management UI (MS20)
21. Site stabilization: terminal page route (MS20)
22. Site stabilization: favicon, dark mode dropdown fix (MS20)
23. Settings page for ALL environment variables, dynamically configurable via webUI (MS21)
24. Multi-tenant configuration with admin user management (MS21)
25. Team management with shared data spaces and chat rooms (MS21)
26. RBAC for file access, resources, models (MS21)
27. Federation: master-master and master-slave with key exchange (MS22)
28. Federation testing: 3 instances on Portainer (woltje.com domain) (MS22)
29. Agent task mapping configuration: system-level defaults, user-level overrides (MS23)
30. Telemetry: opt-out, customizable endpoint, sanitized data (MS23)
31. File manager with WYSIWYG editing: system/user/project levels (MS18)
32. User-level and project-level Kanban with filtering (MS18)
33. Break-glass authentication user (MS20)
34. Playwright E2E tests for all pages (MS23)
35. API documentation via Swagger (MS23)
36. Backend endpoints for all dashboard data (MS17 — already complete for existing modules)
37. Profile page linked from user card (MS16)
### Out of Scope
@@ -334,7 +341,46 @@ This is the active mission scope. MS16 (Pages) and MS17 (Backend Integration) ar
- ASSUMPTION: Orchestrator commands route through existing web proxy (/api/orchestrator/\*) to orchestrator service. Rationale: Proxy routes already exist and handle auth.
- **Status: COMPLETE (MS19) — PRs #521 (commands), #522 (agent terminal). /status, /agents, /jobs, /pause, /resume, /help commands. Agent output streaming via SSE. 113 web tests.**
### FR-020: Settings Configuration (Future — MS20)
### FR-020: Site Stabilization & Feature Gaps (MS20) — IN PROGRESS
Runtime bugs and feature gaps discovered during live testing of mosaic.woltje.com.
**Workspace Context Propagation:**
- Domains page: "Workspace ID is required" when creating domains
- Projects page: "Workspace ID is required" when creating projects
- Credentials page: unable to add credentials (button disabled, feature stub)
- ASSUMPTION: The `useWorkspaceId()` hook + auto-detect in `apiRequest` from PR #532 handles reads, but mutation endpoints on some pages don't pass workspace ID correctly. Rationale: GET requests work after PR #532 but POST/mutation requests still fail on domains and projects pages.
**Missing API Endpoints:**
- `/api/personalities` — no controller/service exists; frontend expects GET/POST/PATCH/DELETE
- `/users/me/preferences` — listed in PRD API table but returns 404; frontend profile page depends on it
- ASSUMPTION: Personalities API follows existing NestJS module patterns (controller + service + DTO + Prisma model). Rationale: Consistent with all other API modules in the codebase.
- ASSUMPTION: User preferences endpoint is part of the existing users module but route is not registered. Rationale: PRD lists it as an existing endpoint.
**Orchestrator Connectivity:**
- All orchestrator-proxied endpoints return HTTP 502
- Orchestrator WebSocket connection fails ("Reconnecting to server...")
- Dashboard widgets: Agent Status, Task Progress, Orchestrator Events all error
- ASSUMPTION: The orchestrator service container runs but the Next.js API proxy cannot reach it. Root cause is likely environment variable or network configuration in Docker Swarm. Rationale: The orchestrator container exists in the compose file and has Traefik labels.
**UI/UX Issues:**
- Dark mode theming on Formality Level dropdown in Personalities page incorrect
- favicon.ico missing (404)
- Terminal sidebar link uses `#terminal` anchor instead of page route
- `useWorkspaceId` warning in console: no workspace ID in localStorage on fresh sessions
- ASSUMPTION: Terminal should have a dedicated page route `/terminal` that renders the terminal panel full-screen. Rationale: The sidebar has a Terminal link in the Operations section alongside Logs, implying it should be a navigable page.
**Credential Management:**
- "Add Credential" button is `disabled` in code — feature was stubbed as "coming soon"
- Need to implement credential creation UI and wire to existing `/api/credentials` CRUD endpoints
- ASSUMPTION: Credential CRUD frontend can use the existing `/api/credentials` API which was built during M7-CredentialSecurity. Rationale: Backend endpoints exist per audit.
### FR-021: Settings Configuration (Future — MS21)
- All environment variables configurable via UI
- Minimal launch env vars, rest configurable dynamically
@@ -496,10 +542,11 @@ These 19 NestJS modules are already implemented with Prisma and available for fr
| MS16+MS17-PagesDataIntegration | 0.0.17 | All pages built + wired to real API data | COMPLETE |
| MS18-ThemeWidgets | 0.0.18 | Theme package system, widget registry, WYSIWYG, Kanban filtering | COMPLETE |
| MS19-ChatTerminal | 0.0.19 | Global terminal, project chat, master chat session | COMPLETE |
| MS20-MultiTenant | 0.0.20 | Multi-tenant, teams, RBAC, RLS enforcement, break-glass auth | NOT STARTED |
| MS21-Federation | 0.0.21 | Federation (M-M, M-S), 3 instances, key exchange, data separation | NOT STARTED |
| MS22-AgentTelemetry | 0.0.22 | Agent task mapping, telemetry, wide-event logging | NOT STARTED |
| MS23-Testing | 0.0.23 | Playwright E2E, federation tests, documentation finalization | NOT STARTED |
| MS20-SiteStabilization | 0.0.20 | Runtime bug fixes, missing endpoints, orchestrator connectivity | IN PROGRESS |
| MS21-MultiTenant | 0.0.21 | Multi-tenant, teams, RBAC, RLS enforcement, break-glass auth | NOT STARTED |
| MS22-Federation | 0.0.22 | Federation (M-M, M-S), 3 instances, key exchange, data separation | NOT STARTED |
| MS23-AgentTelemetry | 0.0.23 | Agent task mapping, telemetry, wide-event logging | NOT STARTED |
| MS24-Testing | 0.0.24 | Playwright E2E, federation tests, documentation finalization | NOT STARTED |
## Assumptions
@@ -511,3 +558,9 @@ These 19 NestJS modules are already implemented with Prisma and available for fr
6. ASSUMPTION: Theme packages are code-level TypeScript files (not runtime-installable npm packages). Each theme exports CSS variable overrides. Rationale: Keeps the system simple for MS18; runtime package loading can be added in a future milestone.
7. ASSUMPTION: WYSIWYG editor uses Tiptap (ProseMirror-based, headless). Rationale: Headless approach integrates naturally with the CSS variable design system, excellent markdown import/export, TypeScript-first, battle-tested.
8. ASSUMPTION: MS18 includes WYSIWYG editing for knowledge entries and Kanban filtering enhancements in addition to themes and widgets. These were originally listed separately but are grouped into MS18 per PRD scope items 24-25. Rationale: All are frontend-focused enhancements that build on the existing page infrastructure.
9. ASSUMPTION: The `useWorkspaceId()` hook + auto-detect in `apiRequest` from PR #532 handles reads, but mutation endpoints on some pages don't pass workspace ID correctly. Rationale: GET requests work after PR #532 but POST/mutation requests still fail on domains and projects pages.
10. ASSUMPTION: Personalities API follows existing NestJS module patterns (controller + service + DTO + Prisma model). Rationale: Consistent with all other API modules in the codebase.
11. ASSUMPTION: User preferences endpoint is part of the existing users module but route is not registered. Rationale: PRD lists it as an existing endpoint.
12. ASSUMPTION: The orchestrator service container runs but the Next.js API proxy cannot reach it. Root cause is likely environment variable or network configuration in Docker Swarm. Rationale: The orchestrator container exists in the compose file and has Traefik labels.
13. ASSUMPTION: Terminal should have a dedicated page route `/terminal` that renders the terminal panel full-screen. Rationale: The sidebar has a Terminal link in the Operations section alongside Logs, implying it should be a navigable page.
14. ASSUMPTION: Credential CRUD frontend can use the existing `/api/credentials` API which was built during M7-CredentialSecurity. Rationale: Backend endpoints exist per audit.