stack/docs/scratchpads/mvp-20260312.md
Jarvis 7383380f64 feat(gateway): tier-detector with fail-fast PG/Valkey/pgvector probes (FED-M1-04)
Implements `apps/gateway/src/bootstrap/tier-detector.ts` invoked from
`main.ts` before NestJS bootstraps. For each tier:

- `local`: no-op (PGlite is in-process)
- `standalone`: probe Postgres + Valkey
- `federated`: probe Postgres + Valkey + pgvector extension; reject
  config upfront if `queue.type !== 'bullmq'`

Each probe has a 5-second hard cap and emits a structured
`TierDetectionError` with service / host / port / remediation. The
remediation field discriminates pgvector failure modes ("library not
available" vs "permission denied") so operators get actionable hints
without leaking credentials.

Adds `postgres` and `ioredis` as direct gateway deps; previously only
transitive. 12 unit tests cover happy paths and each fail-fast branch.

Refs #460
2026-04-19 19:02:12 -05:00


Mission Scratchpad — MVP

Append-only log. NEVER delete entries. NEVER overwrite sections. This is the orchestrator's working memory across sessions.

Original Mission Prompt

Active mission detected: MVP. Read the mission state files and report status.
User confirmed: start the planning gate.

Planning Decisions

2026-03-13 — Milestone and task breakdown

  • PRD defines 8 phases (Phases 0-7), mapped 1:1 to Gitea milestones
  • 59 issues created on git.mosaicstack.dev/mosaic/mosaic-stack (#1-#59)
  • Each phase has a verification task as the final issue
  • Task IDs use P{phase}-{seq} format (P0-001 through P7-008)
  • Repo created as mosaic/mosaic-stack (private) on Gitea
  • Milestones: ms-157 (Phase 0) through ms-164 (Phase 7)
  • Total: 59 tasks across 8 milestones
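
The P{phase}-{seq} convention above can be checked mechanically. The following is a minimal sketch, assuming the sequence is always zero-padded to three digits as in P0-001 and P7-008. The helper names `parseTaskId` and `formatTaskId` are hypothetical and do not come from the repo.

```typescript
// Hypothetical helpers for the P{phase}-{seq} task ID convention.
// Assumption: seq is zero-padded to three digits, as in "P0-001".
interface TaskId {
  phase: number;
  seq: number;
}

const TASK_ID_PATTERN = /^P(\d+)-(\d{3})$/;

// Returns null for anything that is not a phase task ID (for example "FIX-01").
function parseTaskId(id: string): TaskId | null {
  const match = TASK_ID_PATTERN.exec(id);
  if (!match) return null;
  return { phase: Number(match[1]), seq: Number(match[2]) };
}

// Inverse of parseTaskId for sequences below 1000.
function formatTaskId(task: TaskId): string {
  return "P" + task.phase + "-" + ("000" + task.seq).slice(-3);
}
```

Backlog IDs such as FIX-01 deliberately fall outside the pattern, so a caller can tell phase tasks from fixes by the null result.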

Phase structure

| Phase | Version | Tasks | Focus |
| --- | --- | --- | --- |
| 0 | v0.0.1 | 9 | Foundation: monorepo, types, db, auth, OTEL, Docker, CI |
| 1 | v0.0.2 | 9 | Core API: gateway, brain, queue, routes, WebSocket |
| 2 | v0.0.3 | 7 | Agent Layer: Pi SDK, multi-provider, routing, coord |
| 3 | v0.0.4 | 8 | Web Dashboard: Next.js, chat, tasks, projects, admin |
| 4 | v0.0.5 | 7 | Memory & Intelligence: memory, log, summarization, skills |
| 5 | v0.0.6 | 5 | Remote Control: Discord, Telegram, SSO |
| 6 | v0.0.7 | 6 | CLI & Tools: CLI, prdy, quality-rails, installer, TUI |
| 7 | v0.1.0 | 8 | Polish & Beta: MCP, providers, E2E, docs, release |

Session Log

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 1 | 2026-03-13 | Planning | Planning gate | Milestones created, 59 issues created, TASKS.md populated, manifest updated |
| 2 | 2026-03-13 | Vertical slice | P1-001, P1-007, P1-008, P2-001, P5-002, P6-005 | Communication spine built and merged (PR #61). Gateway + TUI + Discord. 3-agent gatekeeper review, 10/16 issues remediated, 4 deferred. |
| 3 | 2026-03-13 | Foundation | P0-002, P0-005, P0-006 | Foundation layer merged (PR #65). Docker Compose (PG+pgvector, Valkey, OTEL Collector, Jaeger), OTEL auto-instrumentation in gateway, @mosaicstack/types with DTOs + Socket.IO typed event maps. |

Session 4 — Docker Compose fix

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 4 | 2026-03-12 | Foundation (fix) | | Fixed Jaeger tag (2→2.6.0), remapped PG/Valkey ports (5433/6380) to avoid host conflicts. PR #66 merged to main. |

Verification evidence:

  • All 4 containers healthy (PG, Valkey, OTEL Collector, Jaeger)
  • OTEL pipeline proven: mosaic-gateway service visible in Jaeger UI
  • Gateway traces flow through Collector → Jaeger

Session 5 — Phase 0-1 completion

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 5 | 2026-03-12 | Phase 0, Phase 1 | P0-003, P0-004, P0-007, P0-008, P0-009, P1-002 through P1-006, P1-009 | Foundation + Core API complete. DB, auth, CI, brain, queue, CRUD routes all merged and green. |

Session 6 — Phase 2 agent layer

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 6 | 2026-03-12 | Phase 2 | P2-002, P2-003, P2-004, P2-005, P2-006, FIX-01 | Multi-provider routing, tool registration, coord migration, session management, dispose() fix. PRs #74-#78. |

Session 7-8 — Phase 2 verification + completion

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 7-8 | 2026-03-12 | Phase 2 | P2-007 | 19 unit tests (routing + coord). PR #79 merged, issue #25 closed. Phase 2 complete. |

Session 11 — Phase 5 completion

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 11 | 2026-03-14 | Phase 5 | P5-005 | Wired Telegram plugin into gateway (was stubbed). Updated .env.example with all P5 env vars. PR #99 merged, issue #45 closed. Phase 5 complete. |

Findings during verification:

  • Telegram plugin was built but not wired into gateway (stub warning in plugin.module.ts)
  • Discord plugin was fully wired
  • SSO/Authentik OIDC adapter was fully wired
  • All three quality gates passing

Session 11 (continued) — Phase 6 completion

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 11 | 2026-03-14 | Phase 6 | P6-002, P6-003, P6-004, P6-001, P6-006 | Full CLI & Tools migration. PRs #100-#104 merged. Also fixed 2 gateway startup bugs (PR #102). Phase 6 complete. |

Phase 6 details:

  • P6-002: @mosaicstack/prdy migrated from v0 (~400 LOC). PR #101.
  • P6-003: @mosaicstack/quality-rails migrated from v0 (~500 LOC). PR #100.
  • P6-004: @mosaicstack/mosaic wizard migrated from v0 (2272 LOC, 28 files). PR #103.
  • P6-001: CLI subcommands wired — tui, prdy, quality-rails, wizard all working. PR #104.
  • BUG-1: PLUGIN_REGISTRY circular import fixed via plugin.tokens.ts. PR #102.
  • BUG-2: AuthStorage.create() → .inMemory() to prevent silent exit. PR #102.

Session 11 (continued) — E2E testing + bug fixes + Phase 7 rescope

Bug fixes merged during E2E testing (PRs #107-#117):

  • CI: from_secret syntax for Woodpecker v2 (#107)
  • Gateway: dotenv loading from monorepo root (#108)
  • Gateway: missing @Inject() decorators causing silent hang (#109)
  • Gateway: CORS + memory userId + pgvector auto-init (#110)
  • Auth: BetterAuth trustedOrigins for web dashboard (#111)
  • Auth: CORS headers on raw BetterAuth HTTP handler (#112)
  • Husky: removed deprecated v9 shim lines (#113)
  • CLI: login command + authenticated TUI sessions (#114)
  • CLI: Origin header on auth requests (#115)
  • Agent: Ollama provider registration with openai-completions API (#116, #117)

E2E testing results:

  • Web UI: login works, projects list, chats list (but chat doesn't function)
  • TUI: authenticated connection works, agent responds via Ollama llama3.2
  • Agent tools: brain, coord, memory tools confirmed working
  • Gateway: all routes mapped, providers register correctly

Phase 7 rescoped (Jason directed):

  • Phase 7 renamed from "Polish & Beta" to "Feature Completion (v0.0.8)"
  • Added 13 new tasks (P7-009 through P7-021): web UI, agent tools, CLI, coord architecture
  • P7-002 (extra SSO), P7-003 (extra LLM), P7-005 (perf), P7-008 (v0.1.0 tag) moved to Phase 8
  • Phase 8 added as "Polish & Beta (v0.1.0)"
  • Reason: platform isn't feature-complete enough for beta — web UI is scaffolded but non-functional for real use, agent tooling is minimal, CLI needs model switching

Open Questions

(none at this time)

Corrections

2026-03-13 — Vertical slice reorder (Jason directed)

Original plan: Linear Phase 0 → 1 → 2 → ... execution.

Correction: Vertical slice first. Scaffold monorepo, then build the Pi TUI → Gateway → Discord communication spine end-to-end before backfilling auth, brain, memory, CRUD, etc.

Why: Validate the architecture's core message flow before investing in horizontal layers. If the communication channels don't work, nothing else matters.

Revised execution sequence:

| Step | Tasks (cross-phase) | What it proves |
| --- | --- | --- |
| 1 | P0-001: Scaffold monorepo | Build system works |
| 2 | P0-005: Docker Compose (PG + Valkey) | Infrastructure runs |
| 3 | P0-002: @mosaicstack/types (minimal: gateway, agent, chat types) | Shared contracts |
| 4 | P1-001: Gateway scaffold (minimal NestJS + Fastify) | API surface boots |
| 5 | P1-007: WebSocket server (chat streaming) | Real-time channel works |
| 6 | P1-008: Basic agent dispatch (single provider) | LLM responds |
| 7 | P2-001: @mosaicstack/agent, Pi SDK integration (minimal) | Pi sessions work |
| 8 | P6-005: Pi TUI integration (mosaic tui → gateway) | TUI ↔ Gateway proven |
| 9 | P5-001: Plugin host (channel plugin interface) | Plugin arch works |
| 10 | P5-002: Discord plugin (bot + channel) | Discord ↔ Gateway proven |

Then backfill: auth, brain, db, queue, OTEL, CI, web dashboard, etc.

Session 9 — Phase 3 Web Dashboard (P3-001 through P3-007)

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 9 | 2026-03-12 | Phase 3 | P3-001 through P3-007 | Full web dashboard: Next.js 16 scaffold, auth pages, chat UI, tasks (list+kanban), projects, settings, admin. PRs #82-#89 merged. |

Session 10 — Phase 3 verification (P3-008)

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 10 | 2026-03-13 | Phase 3 | P3-008 | Phase 3 verification: typecheck 18/18, lint 18/18, format clean, build green (10 routes), 10 tests pass. Phase 3 complete. |

Session 10 (continued) — Phase 4 Memory & Intelligence

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 10 | 2026-03-13 | Phase 4 | P4-001 through P4-007 | Full memory + log system: DB schema (preferences, insights w/ pgvector, agent_logs, skills, summarization_jobs), @mosaicstack/memory + @mosaicstack/log packages, embedding service, summarization pipeline w/ cron, memory tools in agent sessions, skill management CRUD. All gates green. |

Session 12 — Phase 7 planning + execution start

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 12 | 2026-03-15 | Phase 7 | Planning | Merged rescope PR #119. Created 15 Gitea issues (#120-#134) for P7-009 through P7-021 + FIX-02/FIX-03. Planned 10-wave execution order with 2-worker parallelism. |

Phase 7 execution plan (10 waves, max 2 parallel workers):

| Wave | Task A | Task B |
| --- | --- | --- |
| 1 | P7-009 Web chat WS (#120) | P7-001 MCP hardening (#52) |
| 2 | P7-010 Conversation mgmt (#121) | P7-015 Agent tools (#126) |
| 3 | P7-011 Project views (#122) | P7-016 MCP client (#127) |
| 4 | P7-012 Provider UI (#123) | P7-017 Skill invocation (#128) |
| 5 | P7-013 Settings persist (#124) | P7-018 CLI model switch (#129) |
| 6 | P7-014 Admin panel (#125) | P7-019 CLI sessions (#130) |
| 7 | P7-020 Coord DB (#131) | |
| 8 | FIX-02 TUI state (#133) | FIX-03 Agent sandbox (#134) |
| 9 | P7-004 E2E Playwright (#55) | P7-006 Docs (#57) + P7-007 Deploy docs (#58) |
| 10 | P7-021 Verify Phase 7 (#132) | |
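
The wave plan amounts to a simple scheduling rule: waves run strictly in order, and the tasks inside one wave run concurrently, two at most. The sketch below illustrates that rule only. `WaveTask`, `runWaves` and `MAX_PARALLEL_WORKERS` are hypothetical names, and the orchestrator's real dispatch mechanism is not described in this log.

```typescript
// Hypothetical wave runner: waves execute in order, tasks within a wave concurrently.
interface WaveTask {
  id: string;
  run: () => Promise<void>;
}

const MAX_PARALLEL_WORKERS = 2; // from the plan: max 2 parallel workers

async function runWaves(waves: WaveTask[][]): Promise<string[]> {
  const completed: string[] = [];
  for (const wave of waves) {
    if (wave.length > MAX_PARALLEL_WORKERS) {
      throw new Error(`wave has ${wave.length} tasks, limit is ${MAX_PARALLEL_WORKERS}`);
    }
    // Promise.all rejects on the first failure, so later waves never start.
    await Promise.all(
      wave.map(async (task) => {
        await task.run();
        completed.push(task.id);
      }),
    );
  }
  return completed;
}
```

Because a failed task rejects the whole wave, a dependent wave such as the final verification can never run on top of a broken predecessor.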

Session 12 — Phase 7 completion summary

All 17 Phase 7 tasks + 2 backlog fixes completed in a single session.

PRs merged: #136, #137, #138, #139, #140, #141, #142, #143, #144, #145, #146, #147, #148, #149, #150, #151, #152, #153

Issues closed: #52, #55, #57, #58, #120-#134

Verification evidence:

  • Typecheck: 32/32 tasks green
  • Lint: 18/18 packages green
  • Format: All files clean
  • 19 PRs squash-merged to main, all quality gates passed

Phase 7 delivered:

  • Web: functional chat (WS streaming), conversation management, project detail views, provider UI, settings persistence, admin panel
  • Agent: 7 new tools (file/git/shell/web), MCP server (14 tools), MCP client (external server bridge), skill invocation
  • CLI: model/provider switching, session management
  • Infrastructure: coord DB migration, agent sandbox hardening
  • Quality: E2E Playwright suite (~35 tests), comprehensive docs (user/admin/dev/deployment)
  • Fixes: TUI state updater, agent session sandboxing

Session 13 — CLI Command Architecture (P8-005, P8-006)

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 13 | 2026-03-15 | Phase 8 | P8-005, P8-006 | CLI command architecture implemented. DB schema, brain repo, gateway endpoints, CLI commands. PR #158 merged. |

Changes delivered:

  • DB: Extended agents table (projectId, ownerId, systemPrompt, allowedTools, skills, isSystem). Added agentId to conversations.
  • Brain: New agents repository with findAccessible (owner's + system agents).
  • Gateway: /api/agents CRUD, consolidated /api/missions with user-scoped CRUD + /tasks sub-routes, coord slimmed to file-based only, agentConfigId wired into session creation.
  • CLI: mosaic agent (--list, --new, --show, --update, --delete), mosaic mission (--list, --init, --plan, --update, task subcommand), mosaic prdy (gateway-aware), shared with-auth + select-dialog utilities.
  • TUI: --agent and --project flags, agent name display in top bar, agentId in socket payload.
  • Types: agentId added to ChatMessagePayload.
  • Tests: 23/23 gateway tests pass (updated ownership test for user-scoped missions).
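
The access rule behind `findAccessible` (a user sees their own agents plus system agents) can be illustrated in memory. This is a sketch of the rule only: the real repository presumably issues a database query, and the `AgentRow` shape here is an assumption built from the column names listed above.

```typescript
// Assumed row shape, based on the ownerId and isSystem columns named in the log.
interface AgentRow {
  id: string;
  ownerId: string | null;
  isSystem: boolean;
}

// In-memory stand-in for the repository query: owner's agents plus system agents.
function findAccessible(agents: AgentRow[], userId: string): AgentRow[] {
  return agents.filter((agent) => agent.isSystem || agent.ownerId === userId);
}
```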

Session 14 — Platform Architecture Plan Augmentation + Task Breakdown

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 14 | 2026-03-15 | Phase 8 | P8-018 | Augmented plan, created 13 issues, created Phase 8 milestone. |

Decisions made:

  • This plan is Phase 7 feature extension work, not Phase 8 beta scope. P8-001 through P8-004 (SSO, LLM, perf, release gate) are deferred to the far future.
  • /provider OAuth in TUI: URL-to-clipboard + Valkey poll token pattern (same as Pi agent)
  • Add mutable column to preferences now (P8-007 DB migration)
  • Teams architecture: teams + team_members tables, teamId/ownerType on projects. Workspace path branches on owner type: users/<uid>/ vs teams/<tid>/.
  • Phase dependency chain decided: Wave 1 (DB+Types) → Wave 2 (TUI + tool hardening) → Wave 3 (gateway registry, gating) → Wave 4 (prefs+commands) → Wave 5 (reload+GC) → Wave 6 (workspaces) → Wave 7 (autocomplete) → Wave 8 (verify).
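
The workspace path decision in the teams architecture can be sketched as a discriminated union. This is an illustration under assumptions: the `WorkspaceOwner` type, its field names, and `workspacePath` are hypothetical, and only the `users/<uid>/` and `teams/<tid>/` layouts come from the decision itself.

```typescript
// Assumption: ownerType is a two-value discriminator; field names are illustrative.
type WorkspaceOwner =
  | { ownerType: "user"; userId: string }
  | { ownerType: "team"; teamId: string };

// Branch the workspace root on owner type: users/<uid>/ vs teams/<tid>/.
function workspacePath(owner: WorkspaceOwner): string {
  switch (owner.ownerType) {
    case "user":
      return `users/${owner.userId}/`;
    case "team":
      return `teams/${owner.teamId}/`;
  }
}
```

Modelling the owner as a union means a third owner type would fail to compile until the switch handles it.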

Plan augmentations added:

  • Teams Architecture section (DB schema, workspace paths, RBAC)
  • REST Route Specifications table
  • /provider OAuth flow (URL+clipboard+polling)
  • Preferences mutable migration spec
  • Test Strategy (per-task test files + key test cases)
  • Phase Execution Order (dependency graph + wave plan)

Issues created: #160-#172 (Gitea milestone ms-165)

P8-018 closed: Spin-off stubs created (gatekeeper-service.md, task-queue-unification.md, chroot-sandboxing.md)

Next: Begin execution at Wave 1 — P8-007 (DB migrations) + P8-008 (Types) in parallel.


Session 15 — 2026-04-19 — MVP Rollup Manifest Authored

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 15 | 2026-04-19 | (rollup-level) | MVP-T01 (manifest), MVP-T02 (archive iuv-v2), MVP-T03 (land FED planning) | Authored MVP rollup manifest at docs/MISSION-MANIFEST.md. Federation v1 planning merged to main (PR #468 / commit 66512550). Install-ux-v2 archived as complete. |

Gap context: The MVP scratchpad was last updated at Session 14 (2026-03-15). In the intervening month, two sub-missions ran outside the MVP framework: install-ux-hardening (complete, mosaic-v0.0.25) and install-ux-v2 (complete on 2026-04-19, 0.0.27 → 0.0.29). Both are archived under docs/archive/missions/. The phase-based execution from Sessions 1-14 (Phases 0-8, issues #1-#172) substantially shipped during this window via those sub-missions and standalone PRs; the MVP mission was nominally active but had no rollup manifest tracking it.

User reframe (this session):

There will be more in the MVP. This will inevitably become scope creep. I need a solution that works via webUI, TUI, CLI, and just works for MVP. Federation is required because I need it to work NOW, so my disparate jarvis-brain usage can be consolidated properly.

Decisions:

  1. MVP is the rollup mission, not a single-purpose mission. Federation v1 is one workstream of MVP, not MVP itself. Phase 0-8 work is preserved as historical context but is no longer the primary control plane.
  2. Three-surface parity (webUI / TUI / CLI) is a cross-cutting MVP requirement (MVP-X1), not a workstream. Encoded explicitly so it can't be silently dropped.
  3. Scope creep is named and accommodated. Manifest has explicit "Likely Additional Workstreams" section listing PRD-derived candidates without committing execution capacity to them.
  4. Workstream isolation — each workstream gets its own manifest under docs/{workstream}/MISSION-MANIFEST.md. MVP manifest is rollup only.
  5. Archive-don't-delete — install-ux-v2 manifest moved to docs/archive/missions/install-ux-v2-20260405/ with status corrected to complete (IUV-M03 closeout note added pointing at PR #446 + releases 0.0.27 → 0.0.29).
  6. Federation planning landed first — PR #468 merged before MVP manifest authored, so the manifest references real on-main artifacts.

Open items:

  • .mosaic/orchestrator/mission.json MVP slot remains empty (zero milestones). Tracked as MVP-T04. Defer until next session — does not block W1 kickoff. Open question: hand-edit vs. mosaic coord init reinit.
  • Additional workstreams (web dashboard parity, TUI/CLI completion, remote control, multi-user/SSO, LLM provider expansion, MCP, brain) anticipated per PRD but not declared. Pre-staged in manifest's "Likely Additional Workstreams" list.

Artifacts this session:

| Artifact | Status |
| --- | --- |
| PR #468 (docs(federation): PRD, milestones, mission manifest, and M1 task breakdown) | merged 2026-04-19 → main (commit 66512550) |
| docs/MISSION-MANIFEST.md (MVP rollup, replaces install-ux-v2 manifest) | authored on docs/mvp-mission-manifest branch |
| docs/TASKS.md (MVP rollup, points at workstream task files) | authored |
| Install-ux-v2 manifest + tasks + scratchpad + iuv-m03-design | moved to docs/archive/missions/install-ux-v2-20260405/ with status corrected to complete |

Next: PR docs/mvp-mission-manifest → merge to main → next session begins W1 / FED-M1 from clean state.


Session 16 — 2026-04-19 — claude

Mode: Delivery (W1 / FED-M1 execution)
Branch: feat/federation-m1-tier-config
Context budget: 200K, currently ~45% used (compaction-aware)

Goal: FED-M1-01 — extend mosaic.config.json schema: add "federated" to tier enum.

Critical reconciliation surfaced during pre-flight:

The federation PRD (docs/federation/PRD.md line 247) defines three tiers: local | standalone | federated. The existing code (packages/config/src/mosaic-config.ts, packages/mosaic/src/types.ts, packages/mosaic/src/stages/gateway-config.ts) uses local | team.

team is the same conceptual tier as PRD standalone (Postgres + Valkey, no pgvector). Rather than carrying a confusing alias forever, FED-M1-01 will rename team → standalone and add federated as a third value, so all downstream federation work has a coherent vocabulary.
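
A minimal sketch of the resulting three-value vocabulary follows. It assumes a string-literal union, which matches the `StorageTier` type name listed below but not necessarily its real definition. `parseStorageTier` and the hint for the retired `team` value are hypothetical.

```typescript
// Sketch of the tier vocabulary after the rename: local | standalone | federated.
const STORAGE_TIERS = ["local", "standalone", "federated"] as const;
type StorageTier = (typeof STORAGE_TIERS)[number];

// Hypothetical validator: rejects unknown values instead of aliasing them.
function parseStorageTier(value: string): StorageTier {
  if ((STORAGE_TIERS as readonly string[]).indexOf(value) !== -1) {
    return value as StorageTier;
  }
  // Assumption: a hint for the legacy "team" value is useful to operators.
  const hint = value === "team" ? ' ("team" was renamed to "standalone")' : "";
  throw new Error(`Invalid storage tier "${value}"${hint}`);
}
```

Rejecting `team` outright, rather than mapping it silently, keeps the alias from lingering while still telling the operator what to change.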

Affected files (storage-tier semantics only — Team/workspace usages unaffected):

  • packages/config/src/mosaic-config.ts (StorageTier type, validator enum, defaults)
  • packages/mosaic/src/types.ts (GatewayStorageTier)
  • packages/mosaic/src/stages/gateway-config.ts (~10 references)
  • packages/mosaic/src/stages/gateway-config.spec.ts (test references)
  • Possibly tools/e2e-install-test.sh (referenced grep) and headless env hint string

Worker plan:

  1. Spawn sonnet subagent with explicit task spec + the reconciliation context above.
  2. Worker delivers diff; orchestrator runs pnpm typecheck && pnpm lint && pnpm format:check.
  3. Independent feature-dev:code-reviewer subagent reviews diff.
  4. Second independent verification subagent (general-purpose, sonnet) verifies reviewer's claims and confirms all 'team' storage-tier references migrated, no Team/workspace bleed.
  5. Open PR via tea CLI; wait for CI; queue-guard; squash merge; record actuals.

Open items:

  • MVP-T04 (sync .mosaic/orchestrator/mission.json) still deferred.
  • team tier rename touches install wizard headless env vars (MOSAIC_STORAGE_TIER=team); will need 0.0.x deprecation note in scratchpad if release notes are written this milestone.

Session 17 — 2026-04-19 — claude

Mode: Delivery (W1 / FED-M1 execution; resumed after compaction)
Branches landed this run: feat/federation-m1-tier-config (PR #470), feat/federation-m1-compose (PR #471), feat/federation-m1-pgvector (PR #472)
Branch active at end: feat/federation-m1-detector (FED-M1-04, ready to push)

Tasks closed: FED-M1-01, FED-M1-02, FED-M1-03 (all merged to main via squash, CI green, issue #460 still open as milestone).

FED-M1-04 — tier-detector: Worker delivered apps/gateway/src/bootstrap/tier-detector.ts (~210 lines) + tier-detector.spec.ts (12 tests). Independent code review (sonnet) returned changes-required with 3 issues:

  1. CRITICAL: probeValkey missing connectTimeout: 5000 on the ioredis Redis client (defaulted to 10s, violated fail-fast spec).
  2. IMPORTANT: probePgvector catch block did not discriminate "library not installed" (use pgvector/pgvector:pg17) from permission errors.
  3. IMPORTANT: Federated tier silently skipped Valkey probe when queue.type !== 'bullmq' (computed Valkey URL conditionally).

Worker fix-up round addressed all three:

  • L147: connectTimeout: 5000 added to Redis options
  • L113-117: catch block branches on extension "vector" is not available substring → distinct remediation per failure mode
  • L206-215: federated branch fails fast with service: 'config' if queue.type !== 'bullmq', then probes Valkey unconditionally
  • 4 new tests (8 → 12 total) cover each fix specifically
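
The three fixes share one shape: cap every probe, classify the failure, and reject bad config before touching the network. The sketch below shows that shape without the real clients. Only the `TierDetectionError` name, its service / host / port / remediation fields, the `extension "vector" is not available` substring, and the `service: 'config'` rejection come from this log. `withTimeout`, `pgvectorRemediation`, `requiredProbes`, the `permission denied` match and all message strings are assumptions, and plain promises stand in for the `postgres` and `ioredis` clients.

```typescript
type Tier = "local" | "standalone" | "federated";

class TierDetectionError extends Error {
  service: string;
  remediation: string;
  host?: string;
  port?: number;
  constructor(service: string, remediation: string, host?: string, port?: number) {
    super(`${service} probe failed: ${remediation}`);
    this.name = "TierDetectionError";
    this.service = service;
    this.remediation = remediation;
    this.host = host;
    this.port = port;
  }
}

// Reject with a TierDetectionError if the probe does not settle within capMs.
function withTimeout<T>(probe: Promise<T>, capMs: number, service: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new TierDetectionError(service, `no response within ${capMs} ms`)),
      capMs,
    );
    probe.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Map a pgvector failure message to an operator hint without echoing credentials.
function pgvectorRemediation(message: string): string {
  if (message.includes('extension "vector" is not available')) {
    return "pgvector library not installed; use the pgvector/pgvector:pg17 image";
  }
  if (message.includes("permission denied")) {
    return "database role lacks permission to create the vector extension";
  }
  return "pgvector probe failed; check the Postgres logs";
}

interface TierConfig {
  tier: Tier;
  queue: { type: string };
}

// Decide which services to probe; federated rejects a non-BullMQ queue up front.
function requiredProbes(config: TierConfig): string[] {
  switch (config.tier) {
    case "local":
      return []; // PGlite is in-process, nothing to probe
    case "standalone":
      return ["postgres", "valkey"];
    case "federated":
      if (config.queue.type !== "bullmq") {
        throw new TierDetectionError("config", "federated tier requires queue.type 'bullmq'");
      }
      return ["postgres", "valkey", "pgvector"];
  }
}
```

Running the config check before any probe list is returned means a federated misconfiguration fails immediately instead of skipping the Valkey probe, which is the behavior the third review issue asked for.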

Independent verifier (haiku) confirmed all 6 verification claims (line numbers, test presence, suite green: 12/12 PASS).

Process note — review pipeline working as designed:

Initial verifier (haiku) on the first delivery returned "OK to ship" but missed the 3 deeper issues that the sonnet code-reviewer caught. This validates the user's "always verify subagent claims independently with another subagent" rule — but specifically with the right tier for the task: code review needs sonnet-level reasoning, while haiku is fine for verifying surface claims (line counts, file existence) once review issues are known. Going forward: code review uses sonnet (feature-dev:code-reviewer), claim verification uses haiku.

Followup tasks tracked but deferred:

  • #7: tier=local hardcoded in gateway-config resume branches (~262, ~317) — pre-existing bug, fix during M1-06 (doctor) or M1-09 (regression).
  • #8: confirm packages/config/dist not git-tracked.

Next: PR for FED-M1-04 → CI wait → merge. Then FED-M1-05 (migration script, codex/sonnet, 10K).