Files
stack/docs/scratchpads/install-ux-v2-20260405.md
jason.woltje 43667d7349
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
docs: orchestrator close-out IUV-M01 — mark tasks done, append session 2 (#443)
2026-04-05 22:40:08 +00:00

149 lines
10 KiB
Markdown

# Install UX v2 — Orchestrator Scratchpad
## Session 1 — 2026-04-05 (orchestrator scaffold)
### Trigger
Real-run testing of `@mosaicstack/mosaic@0.0.25` (fresh install of the release we just shipped from the parent mission `install-ux-hardening-20260405`) surfaced a critical regression and a cluster of UX failings. User feedback verbatim:
> The skill/additional feature installation section of install.sh is unsable
> The "quick-start" is asking way too many questions. This process should be much faster to get a quick start.
> The installater should have a main menu that allows for a drill-down install approach.
> "Plugins" — Install Recommended Plugins / Custom
> "Providers" — …
> The gateway port is not prefilling with 14242 for default
> What is the CORS origin for? Is that the webUI that isn't working yet? Maybe we should ask for the fqdn/hostname instead? There must be a better way to handle this.
Plus the critical bug, reproduced verbatim:
```
◇ Admin email
│ jason@woltje.com
Admin password (min 8 chars): ****************
Confirm password: ****************
▲ Bootstrap failed (400): {"message":["property email should not exist","property password should not exist"],"error":"Bad Request","statusCode":400}
✔ Wizard complete.
✔ Install manifest written: /home/jarvis/.config/mosaic/.install-manifest.json
✔ Done.
```
Note the `✔ Wizard complete` and `✔ Done` lines **after** the 400. That's a second bug — failure didn't propagate in interactive mode.
### Diagnosis — orchestrator pre-scope
To avoid handing workers a vague prompt, pre-identified the concrete fix sites:
**Bug 1 (critical) — DTO class erasure.** `apps/gateway/src/admin/bootstrap.controller.ts:16`:
```ts
import type { BootstrapSetupDto, BootstrapStatusDto, BootstrapResultDto } from './bootstrap.dto.js';
```
`import type` erases the class at runtime. `@Body() dto: BootstrapSetupDto` then has no runtime metatype — `design:paramtypes` reflects `Object`. Nest's `ValidationPipe` with `whitelist: true` + `forbidNonWhitelisted: true` receives a plain Object metatype, treats every incoming property as non-whitelisted, and 400s with `"property email should not exist", "property password should not exist"`.
**One-character fix:** drop the `type` keyword on the `BootstrapSetupDto` import. `BootstrapStatusDto` and `BootstrapResultDto` are fine as type-only imports because they're used only in return type positions, not as `@Body()` metatypes.
Must be covered by an **integration test that binds through Nest**, not a controller unit test that imports the DTO directly — the unit test path would pass even with `import type` because it constructs the pipe manually. An e2e test with `@nestjs/testing` + `supertest` against the real `/api/bootstrap/setup` endpoint is the right guard.
**Bug 2 — interactive silent failure.** `packages/mosaic/src/wizard.ts:147-150`:
```ts
if (!bootstrapResult.completed && headlessRun) {
prompter.warn('Admin bootstrap failed in headless mode — aborting wizard.');
process.exit(1);
}
```
The guard is `&& headlessRun`. In interactive mode, `completed: false` is silently swallowed and the wizard continues to the success lines. Fix: propagate failure in both modes. Decision for the worker — either `throw` or `process.exit(1)` with a clear error.
**Bug 3 — port prefill.** `packages/mosaic/src/stages/gateway-config.ts:77-88`:
```ts
const raw = await p.text({
message: 'Gateway port',
defaultValue: defaultPort.toString(),
...
});
```
The stage is passing `defaultValue`. Either the `WizardPrompter.text` adapter is dropping it, or the underlying `@clack/prompts` call expects `initialValue` (which actually prefills the buffer) vs `defaultValue` (which is used only if the user submits an empty string). Worker should verify the adapter and likely switch to `initialValue` semantics so the user sees `14242` in the field.
**Bug 4 — Pi SDK copy gap.** The `"What is Mosaic?"` intro text enumerates Claude Code, Codex, and OpenCode but never mentions Pi SDK, which is the actual agent runtime behind those frontends. Purely a copy edit — find the string, add Pi SDK.
### Mission shape
Three milestones, three tracks, different tiers:
1. **IUV-M01 Hotfix** (sonnet) — the four bugs above + release `mosaic-v0.0.26`. Small, fast, unblocks the 0.0.25 happy path.
2. **IUV-M02 UX polish** (sonnet) — CORS origin → FQDN/hostname abstraction; diagnose and rework the skill installer section. Diagnostic-heavy.
3. **IUV-M03 Provider-first intelligent flow** (opus) — the big one: drill-down main menu, Quick Start path that's actually quick, provider-first natural-language intake with agent self-naming (OpenClaw-style). Architectural.
Sequencing: strict. M01 ships first as a hotfix release (mosaic-v0.0.26). M02 is diagnostic-heavy and can share groundwork with M03 but ships separately for clean release notes. M03 is the architectural anchor and lands last as `mosaic-v0.0.27`.
### Open design questions (to be resolved by workers, not pre-decided)
- M01: does `process.exit(1)` vs `throw` matter for how `tools/install.sh` surfaces the error? Worker should check the install.sh call site and pick the behavior that surfaces cleanly.
- M03: what LLM call powers the intent intake, and what's the offline fallback? Options: (a) reuse the provider the user is configuring (chicken-and-egg — provider setup hasn't happened yet), (b) a bundled deterministic "advisor" that hard-codes common intents, (c) require a provider key up-front before intake. Design doc (IUV-03-01) must resolve.
- M03: is the "agent self-naming" persistent across all future `mosaic` invocations, or a per-session nickname? Probably persistent — lives in `~/.config/mosaic/agent.json` or similar. Worker to decide + document.
### Non-goals for this mission
- No GUI / web UI
- No registry / pipeline migration
- No multi-user / multi-tenant onboarding
- No rework of `mosaic uninstall` (stable from parent mission)
### Known tooling caveats (carry forward from parent mission)
- `issue-create.sh` / `pr-create.sh` wrappers have an `eval` bug with multiline bodies — use Gitea REST API fallback with `load_credentials gitea-mosaicstack`
- `pr-ci-wait.sh` reports `state=unknown` against Woodpecker (combined-status endpoint gap) — use `tea pr` glyphs or poll the commit status endpoint directly
- Protected `main`, squash-merge only, PR-required
- CI queue guard before push/merge: `~/.config/mosaic/tools/git/ci-queue-wait.sh --purpose push|merge`
### Next action
1. Create Gitea issues for M01, M02, M03
2. Open the mission-scaffold docs PR (same pattern as parent mission's PR #430)
3. After merge, delegate IUV-M01 to a sonnet subagent in an isolated worktree with the concrete fix-site pointers above
## Session 2 — 2026-04-05 (IUV-M01 delivery + close-out)
### Outcome
IUV-M01 shipped. `mosaic-v0.0.26` released and registry latest confirmed `0.0.26`.
### PRs merged
| PR | Title | Merge |
| ---- | ------------------------------------------------------------------------ | -------- |
| #440 | fix: bootstrap hotfix — DTO erasure, wizard failure, port prefill, copy | 0ae932ab |
| #441 | fix: add vitest.config.ts to eslint allowDefaultProject (#440 build fix) | c08aa6fa |
| #442 | docs: mark IUV-M01 complete — mosaic-v0.0.26 released | 78388437 |
### Bugs fixed (all 4 in worker's PR #440)
1. **DTO class erasure**`apps/gateway/src/admin/bootstrap.controller.ts:16` — dropped `type` from `import { BootstrapSetupDto }`. Guarded by new e2e test `bootstrap.e2e.spec.ts` (4 cases) that binds through a real Nest app with `ValidationPipe { whitelist, forbidNonWhitelisted }`. Test suite needed `unplugin-swc` in `apps/gateway/vitest.config.ts` to emit `decoratorMetadata` (tsx/esbuild can't).
2. **Wizard silent failure**`packages/mosaic/src/wizard.ts` — removed the `&& headlessRun` guard so `!bootstrapResult.completed` now aborts in both modes.
3. **Port prefill** — root cause was clack's `defaultValue` vs `initialValue` semantics (`defaultValue` only fills on empty submit, `initialValue` prefills the buffer). Added an `initialValue` field to `WizardPrompter.text()` interface, threaded through clack and headless prompters, switched `gateway-config.ts` port/url prompts to use it.
4. **Pi SDK copy**`packages/mosaic/src/stages/welcome.ts` — intro copy now lists Pi SDK.
### Mid-delivery hiccup — tsconfig/eslint cross-contamination
Worker's initial approach added `vitest.config.ts` to `apps/gateway/tsconfig.json`'s `include` to appease the eslint parser. That broke `pnpm --filter @mosaicstack/gateway build` with TS6059 (`vitest.config.ts` outside `rootDir: "src"`). The publish pipeline on the `#440` merge commit failed.
**Correct fix** (worker's PR #441): leave `tsconfig.json` clean (`include: ["src/**/*"]`) and instead add the file to `allowDefaultProject` in the root `eslint.config.mjs`. This keeps the tsc program strict while letting eslint resolve a parser project for the standalone config file.
**Pattern to remember**: when adding root-level `.ts` config files (vitest, build scripts) to a package with `rootDir: "src"`, the eslint parser project conflict is solved with `allowDefaultProject`, NEVER by widening tsconfig include. I had independently arrived at the same fix on a branch before the worker shipped #441 — deleted the duplicate.
### Residual follow-ups carried forward
1. Headless prompter fallback order: worker set `initialValue > defaultValue` in the headless path. Correct semantic, but any future headless test that explicitly depends on `defaultValue` precedence will need review.
2. Vitest + SWC decorator metadata pattern is now the blessed approach for NestJS e2e tests in this monorepo. Any other package that adds NestJS e2e tests should mirror `apps/gateway/vitest.config.ts`.
### Next action
- Close out orchestrator doc sync (this commit): mark M01 subtasks done in `TASKS.md`, update manifest phase to Execution, commit scratchpad session 2, PR to main.
- After merge, delegate IUV-M02 (sonnet, isolated worktree). Dependencies: IUV-02-01 (CORS→FQDN) starts unblocked since M01 is released; first real task for the M02 worker is diagnosing the skill installer failure modes (IUV-02-02) against the fresh 0.0.26 install.