# Mission Scratchpad — Harness Foundation
> Append-only log. NEVER delete entries. NEVER overwrite sections.
> This is the orchestrator's working memory across sessions.

## Original Mission Prompt

```
Jason wants to get the gateway and TUI working as a real daily-driver harness.
The system needs: multi-provider LLM access, task-aware agent routing, conversation persistence,
security isolation, session hardening, job queue foundation, and channel protocol design for
future Matrix/remote integration.

Provider decisions: Anthropic (Sonnet 4.6, Opus 4.6), OpenAI (Codex gpt-5.4), Z.ai (GLM-5),
OpenRouter, Ollama. Embeddings via Ollama local models.

Pi SDK stays as agent runtime. Build with Matrix integration in mind but foundation first.
Agent routing per task with granular specification is required.
```

## Planning Decisions

### 2026-03-21 — Phase 9 PRD and mission setup

- PRD created as `docs/PRD-Harness_Foundation.md` with canonical Mosaic template format
- 7 milestones, 71 tasks total
- Milestone order: M1 (persistence) → M2 (security) → M3 (providers) → M4 (routing) → M5 (sessions) → M6 (jobs) → M7 (channel design)
- M1 and M2 are hard prerequisites — no provider or routing work until conversations persist and data is user-scoped
- Pi SDK kept as agent runtime; providers plug in via adapter pattern underneath
- Embeddings migrated from OpenAI to Ollama local (nomic-embed-text or mxbai-embed-large)
- BullMQ chosen for job queue (Valkey-compatible, TypeScript-native)
- Channel protocol is design-only in this phase; Matrix implementation deferred to Phase 10
- Models confirmed: Claude Sonnet 4.6, Opus 4.6, Haiku 4.5, Codex gpt-5.4, GLM-5, Ollama locals
- Routing engine: rule-based classification first, LLM-assisted later
- Default routing: coding-complex → Opus, coding-moderate → Sonnet, coding-simple → Codex, research → Codex, summarization → GLM-5, conversation → Sonnet, cheap/general → Haiku, offline → Ollama
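The default routing rules above can be written down as plain data plus a lookup. A minimal sketch, assuming illustrative task-class and model-ID strings (these placeholders are not the actual Mosaic configuration):

```typescript
// Sketch only: task classes mirror the default routing list above;
// the model ID strings are placeholders, not real provider model IDs.
type TaskClass =
  | "coding-complex" | "coding-moderate" | "coding-simple"
  | "research" | "summarization" | "conversation"
  | "cheap-general" | "offline";

const DEFAULT_ROUTES: Record<TaskClass, string> = {
  "coding-complex": "anthropic/opus",
  "coding-moderate": "anthropic/sonnet",
  "coding-simple": "openai/codex",
  "research": "openai/codex",
  "summarization": "zai/glm-5",
  "conversation": "anthropic/sonnet",
  "cheap-general": "anthropic/haiku",
  "offline": "ollama/local",
};

// Rule-based lookup; an LLM-assisted classifier would replace the caller
// that produces the TaskClass, not this table.
function routeTask(task: TaskClass): string {
  return DEFAULT_ROUTES[task];
}
```

Keeping the table as data (rather than branching logic) makes the later LLM-assisted classifier a drop-in: it only has to emit a `TaskClass`.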

### Architecture decisions

- Provider adapter pattern: each provider implements IProviderAdapter, registered in Pi SDK's provider registry
- Routing flow: classify message → match rules by priority → check provider health → fallback chain → dispatch
- Context window management: summarize older messages when history exceeds 80% of model context
- OAuth pattern: URL-display + clipboard + Valkey poll token (same as P8-012 design)
- Embedding dimension: migration from 1536 (OpenAI) to 768/1024 (Ollama) — may require re-embedding existing insights
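The health-check → fallback-chain → dispatch step can be sketched against a hypothetical `IProviderAdapter` shape. The interface below is an assumption for illustration; the real Pi SDK adapter surface still needs verification (M3-001):

```typescript
// Hypothetical adapter shape — the actual Pi SDK interface may differ (M3-001).
interface IProviderAdapter {
  readonly id: string;
  healthy(): Promise<boolean>;
  complete(prompt: string): Promise<string>;
}

// Walk the fallback chain (primary first), skipping unhealthy providers,
// and dispatch to the first one that passes its health check.
async function dispatch(
  chain: IProviderAdapter[],
  prompt: string,
): Promise<string> {
  for (const provider of chain) {
    if (await provider.healthy()) {
      return provider.complete(prompt);
    }
  }
  throw new Error("all providers in fallback chain are unhealthy");
}
```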

## Session Log

| Session | Date       | Milestone | Tasks Done                       | Outcome                                        |
| ------- | ---------- | --------- | -------------------------------- | ---------------------------------------------- |
| 1       | 2026-03-21 | Planning  | PRD, manifest, tasks, scratchpad | Mission initialized, planning gate in progress |

## Open Questions

1. Z.ai GLM-5 API format — OpenAI-compatible or custom? (Research in M3-005)
2. Which Ollama embedding model: nomic-embed-text (768-dim) vs mxbai-embed-large (1024-dim)? (Test in M3-009)
3. Provider credentials: env vars for system defaults + DB for per-user overrides? (ASSUMPTION: hybrid)
4. Pi SDK provider adapter support — needs verification in M3-001 before committing to adapter pattern
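If the hybrid assumption in question 3 holds, credential resolution could look like this sketch. The store shape, key format, and env-var naming convention are all assumptions, not settled design:

```typescript
// Assumed hybrid lookup: a per-user DB override wins, the system env var
// is the fallback. The Map stands in for a real per-user credentials table.
function resolveApiKey(
  userOverrides: Map<string, string>, // keyed "userId:provider" (assumed format)
  userId: string,
  provider: string,
  env: Record<string, string | undefined>,
): string | undefined {
  return (
    userOverrides.get(`${userId}:${provider}`) ??
    env[`${provider.toUpperCase()}_API_KEY`] // assumed naming convention
  );
}
```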

## Corrections

<!-- Record any corrections to earlier decisions or assumptions. -->