# 544: Agent Reflection Loop — durable kernel

**Issue:** [#544](http://git.mosaicstack.dev/mosaicstack/stack/issues/544)
**PRD:** [`docs/plans/agent-reflection-loop-PRD.md`](../plans/agent-reflection-loop-PRD.md)
**Branch:** `feat/agent-reflection-loop`

## Context

Build the **durable kernel** of the agent reflection loop: passive end-of-run
capture of the doer's end-state as structured `reflection.v1` data, plus a
deterministic diff **review risk-floor**. The closed calibration / skill-synthesis
loop (design §7–§8) stays **gated** behind Phase-0 experiments P1/P2/P3 and is
explicitly out of scope here. Source design: jarvis-brain
`docs/planning/AGENT-REFLECTION-LOOP.md` (debate-hardened v2).

Scope rule, non-goals, the full `reflection.v1` field list, and acceptance
criteria live in the PRD. This file is the task breakdown + status.

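To make "deterministic diff risk-floor" concrete, here is a minimal sketch of a pure function that maps changed file paths to a verdict. It is not the contents of `risk-floor.ts`: the surface table entries, the level names, and the names `riskFloorSketch`, `SurfaceRule`, and `RiskFloorVerdict` are all hypothetical. The only behaviour taken from this document is that the evaluation is pure and deterministic and that `.mosaic/` is excluded from the change surface.

```typescript
// Hypothetical risk levels, ordered from least to most sensitive.
type RiskLevel = "low" | "medium" | "high";

interface SurfaceRule {
  pattern: RegExp; // matched against a repo-relative path
  level: RiskLevel;
  reason: string;
}

// Illustrative surface table only; the real table is not shown in this file.
const SURFACE_TABLE: readonly SurfaceRule[] = [
  { pattern: /(^|\/)auth\//, level: "high", reason: "auth surface" },
  { pattern: /\.sql$/, level: "high", reason: "database migration" },
  { pattern: /(^|\/)package\.json$/, level: "medium", reason: "dependency manifest" },
];

const RANK: Record<RiskLevel, number> = { low: 0, medium: 1, high: 2 };

interface RiskFloorVerdict {
  level: RiskLevel;
  reasons: string[];
  filesConsidered: string[];
}

// Pure function: no I/O, no clock, no randomness. Sorting the inputs and
// the reasons makes the verdict independent of the order of the diff.
function riskFloorSketch(filesChanged: readonly string[]): RiskFloorVerdict {
  // Agent scratch under .mosaic/ is not part of the diff under review.
  const filesConsidered = filesChanged
    .filter((file) => !/^\.mosaic\//.test(file))
    .sort();

  let level: RiskLevel = "low";
  const reasons: string[] = [];
  for (const file of filesConsidered) {
    for (const rule of SURFACE_TABLE) {
      if (!rule.pattern.test(file)) continue;
      if (reasons.indexOf(rule.reason) === -1) reasons.push(rule.reason);
      // The floor only ever rises: keep the highest level seen so far.
      if (RANK[rule.level] > RANK[level]) level = rule.level;
    }
  }
  return { level, reasons: reasons.sort(), filesConsidered };
}
```

Because the verdict depends only on the set of paths, a bash port of the same table can be checked against the TypeScript version by feeding both the same file list.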
## Work items

| #   | Item                                                  | Path                                                      | Status |
| --- | ----------------------------------------------------- | --------------------------------------------------------- | ------ |
| 1   | Diff risk-floor (pure, deterministic) + unit tests    | `packages/macp/src/risk-floor.ts`, `risk-floor.spec.ts`   | done   |
| 2   | `reflection.v1` JSON Schema (documented contract)     | `packages/macp/src/schemas/reflection.v1.schema.json`     | done   |
| 3   | `reflection.v1` zod schemas + self-report DTO + tests | `packages/types/src/reflection/*`                         | done   |
| 4   | Stop hook (fail-closed capture)                       | `packages/mosaic/framework/tools/qa/reflect-stop-hook.sh` | done   |
| 5   | Hook registration (`hooks.Stop`)                      | `packages/mosaic/framework/runtime/claude/settings.json`  | done   |
| 6   | Phase-0 experiment harnesses (P1/P2/P3)               | `scripts/analysis/reflect-*.sh`                           | done   |

## Design decisions (this implementation)

- **Mechanical vs self-reported split.** A bash Stop hook cannot author the
  agent's self-assessment, so it writes the mechanical fields (risk-floor verdict,
  `files_changed`, ids, provenance) and merges an optional agent-supplied
  `$REFLECTION_INPUT` self-report. If that input is absent or unreadable, the
  self-reported fields are `null` and `provenance.degraded = true`.
- **Risk-floor authority.** `evaluateRiskFloor` (TS, tested) is the source of
  truth. The hook ports the same surface table inline to avoid a node/build
  dependency on the hook path; the two are documented as kept in sync.
- **Hook registration deviation.** `settings-overlays/` has no merge mechanism
  (docs-only), so a hooks overlay there would be inert. The Stop hook is
  registered in the canonical `runtime/claude/settings.json`, the same file the
  `mosaic` launcher reflects into `~/.claude/settings.json`. Still vendored in-repo.
- **DTO without class-transformer.** `reflection.dto.ts` uses class-validator only
  (no `@Type`), matching `chat.dto.ts`, so the module imports without a
  `reflect-metadata` shim in the types-package test env. Deep nested validation is
  owned by the zod `ReflectionSelfReportSchema` (the runtime authority the hook uses).
- **`.mosaic/` excluded** from the change surface: it is agent scratch
  (reflections, locks, self-report input), not part of the diff under review.

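The mechanical vs self-reported split can be sketched as a merge that never fails: mechanical fields are always written, and the self-report either contributes its fields or leaves them `null` and marks the record degraded. The real hook is bash and the real field list lives in the PRD, so treat this as an illustration only. The self-report fields `summary` and `confidence`, the id field `run_id`, and the function names are hypothetical, and the sketch takes the self-report as already-read text (`undefined` meaning absent or unreadable).

```typescript
// Hypothetical self-report shape; the real field list lives in the PRD.
interface SelfReport {
  summary: string;
  confidence: number;
}

// Fields the hook can compute without the agent's help.
interface MechanicalFields {
  run_id: string; // hypothetical id field
  files_changed: string[];
  risk_floor: string; // verdict produced by the risk-floor step
}

interface ReflectionRecord extends MechanicalFields {
  summary: string | null;
  confidence: number | null;
  provenance: { degraded: boolean };
}

// Returns null for anything that is missing, not JSON, or the wrong shape.
function parseSelfReport(raw: string | undefined): SelfReport | null {
  if (raw === undefined) return null;
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const fields = data as Record<string, unknown>;
    const summary = fields["summary"];
    const confidence = fields["confidence"];
    if (typeof summary !== "string" || typeof confidence !== "number") return null;
    return { summary, confidence };
  } catch {
    return null; // unparseable input is treated the same as absent input
  }
}

// Mechanical fields are always present; self-reported fields fall back to
// null, and provenance.degraded records that the fallback happened.
function buildReflection(
  mechanical: MechanicalFields,
  rawSelfReport: string | undefined,
): ReflectionRecord {
  const report = parseSelfReport(rawSelfReport);
  return {
    ...mechanical,
    summary: report ? report.summary : null,
    confidence: report ? report.confidence : null,
    provenance: { degraded: report === null },
  };
}
```

Keeping the fallback inside the merge means a missing self-report downgrades the record instead of aborting capture, which matches the solo capture (degraded) and self-report merge (degraded=false) smoke cases listed under Verification.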
## Verification

- `pnpm --filter @mosaicstack/macp test` → 88 passed (15 new risk-floor).
- `pnpm --filter @mosaicstack/types test` → 64 passed (10 new reflection).
- Root `pnpm typecheck`, `pnpm lint`, `pnpm format:check`, `pnpm build` → green.
- Stop hook smoke: fail-closed no-op (mode unset), solo capture (degraded),
  self-report merge (degraded=false), re-fire lock guard — all pass.
- All bash (hook + 3 Phase-0 scripts) shellcheck-clean; Phase-0 scripts emit
  structured JSON/markdown and print their pre-registered kill conditions.

## Activation (post-merge, deployment concern — not a blocker)

The Stop hook only activates when a launcher/profile sets
`REFLECTION_MODE=solo|orchestrated`; unset/`off` is a strict no-op, so global
registration is safe. `framework/install.sh` rsyncs the hook into
`~/.config/mosaic/tools/qa/`, and the `mosaic` launcher reflects the updated
`settings.json` (`hooks.Stop`) into `~/.claude/settings.json`.
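
The activation gate above is small enough to sketch directly. The hook itself is bash; this TypeScript version only illustrates the logic, and the names `resolveReflectionMode` and `shouldCapture` are hypothetical. Treating an unrecognized value the same as `off` is an assumption of this sketch: the text only specifies the unset and `off` cases.

```typescript
type ReflectionMode = "off" | "solo" | "orchestrated";

// Only the two explicit opt-in values enable capture. Unset and "off" are
// no-ops per the text; mapping any other value to "off" is an assumption.
function resolveReflectionMode(
  env: Record<string, string | undefined>,
): ReflectionMode {
  const raw = env["REFLECTION_MODE"];
  if (raw === "solo" || raw === "orchestrated") return raw;
  return "off";
}

// The hook would exit before doing any work when this returns false.
function shouldCapture(env: Record<string, string | undefined>): boolean {
  return resolveReflectionMode(env) !== "off";
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the gate testable without touching the real process environment.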