feat(agent-reflection): durable kernel (reflection.v1 capture + risk-floor + Phase-0 scripts) #544

Closed
opened 2026-06-16 20:29:11 +00:00 by jason.woltje · 0 comments
Owner

Agent Reflection Loop — durable kernel + Phase-0 gate

Bakes the durable kernel of the agent reflection/calibration design into the stack. The closed calibration/skill-synthesis loop stays gated behind three Phase-0 experiments (it must be earned, not assumed) — per the debate-hardened design in jarvis-brain docs/planning/AGENT-REFLECTION-LOOP.md.

Scope (this issue)

  • (a) End-of-run reflection capture: a non-blocking, sentinel-guarded Claude Code Stop hook that writes a structured reflection.v1 JSON sidecar. Fail-closed: it is a no-op unless REFLECTION_MODE is enabled. It emits no decision and always exits 0.
    • packages/mosaic/framework/tools/qa/reflect-stop-hook.sh + registration in runtime/claude/settings.json.
  • (b) reflection.v1 schema/types — JSON Schema (packages/macp/src/schemas/) + zod/DTO (packages/types/src/reflection/).
  • (c) Diff-triggered review risk-floor — pure function mapping change-surface → { needs_review, reason, score }, subordinate to CI (precedence: CI/tests > human merge > reviewer > self-reflection). packages/macp/src/risk-floor.ts.
  • (d) Phase-0 experiment scripts — offline analysis of real git/board history (scripts/analysis/): P1 confidence-signal, P2 only-self-reflection bucket, P3 outcome-detectability base-rate. Each carries its pre-registered kill condition.
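
To make item (c) concrete, the following is a minimal sketch of what a pure risk-floor function could look like. Only the verdict shape `{ needs_review, reason, score }` and the surface names (auth/data/infra/ui/build/none) come from this issue. The path patterns, scores, review threshold, and the names `SURFACE_RULES`, `classifyPath`, and `riskFloor` are assumptions made for illustration, not the contents of packages/macp/src/risk-floor.ts.

```typescript
// Hypothetical sketch of a diff-triggered risk floor.
// Patterns, scores, and the threshold below are illustrative assumptions.
type Surface = "auth" | "data" | "infra" | "ui" | "build" | "none";

interface RiskVerdict {
  needs_review: boolean;
  reason: string;
  score: number;
}

interface SurfaceRule {
  surface: Exclude<Surface, "none">;
  pattern: RegExp;
  score: number;
}

// Ordered from highest to lowest assumed risk; the first matching rule wins per path.
const SURFACE_RULES: readonly SurfaceRule[] = [
  { surface: "auth", pattern: /(^|\/)auth(\/|\.|-)/i, score: 0.9 },
  { surface: "data", pattern: /(^|\/)(migrations|schemas?)\//i, score: 0.8 },
  { surface: "infra", pattern: /(^|\/)(infra|docker|\.github)\//i, score: 0.7 },
  { surface: "build", pattern: /(^|\/)(package\.json|pnpm-lock\.yaml|tsconfig[^/]*\.json)$/i, score: 0.4 },
  { surface: "ui", pattern: /\.(tsx|css)$/i, score: 0.2 },
];

// Assumed cut-off: scores at or above this value request a review.
const REVIEW_THRESHOLD = 0.5;

function classifyPath(path: string): { surface: Surface; score: number } {
  for (const rule of SURFACE_RULES) {
    if (rule.pattern.test(path)) {
      return { surface: rule.surface, score: rule.score };
    }
  }
  return { surface: "none", score: 0 };
}

// Pure function: no I/O, no clock, no randomness. The verdict depends only on
// the set of changed paths, not on their order, because it keeps the maximum score.
function riskFloor(changedPaths: readonly string[]): RiskVerdict {
  let top: { surface: Surface; score: number } = { surface: "none", score: 0 };
  for (const path of changedPaths) {
    const hit = classifyPath(path);
    if (hit.score > top.score) {
      top = hit;
    }
  }
  return {
    needs_review: top.score >= REVIEW_THRESHOLD,
    reason:
      top.surface === "none"
        ? "no risk-bearing surface touched"
        : `change touches ${top.surface} surface`,
    score: top.score,
  };
}
```

Under these assumptions, a change set containing both an auth file and a UI file is scored by the auth rule alone. Because the verdict is a floor, it can only request a review; it never overrides CI, a human merge decision, or a reviewer, in line with the precedence stated in (c).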

Out of scope (gated on Phase-0 P1/P2/P3)

  • The closed calibration loop and skill-synthesis (design §7–§8). Not built until the experiments pass.

Acceptance criteria

  • Hook writes a schema-valid reflection.v1 sidecar and is a strict no-op when REFLECTION_MODE is unset.
  • reflection.v1 types exported from @mosaicstack/types; JSON Schema validates the sidecar.
  • risk-floor returns a deterministic verdict; unit-tested across surfaces (auth/data/infra/ui/build/none).
  • Phase-0 scripts run offline and emit structured results.
  • pnpm typecheck && pnpm lint && pnpm format:check && pnpm test green.
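
The first criterion (strict no-op when REFLECTION_MODE is unset, exit 0 in every case) can be modelled as a small pure function. This is a sketch of the gate logic only: the real hook is a shell script, and the accepted values for REFLECTION_MODE, the `HookResult` shape, and the names `reflectionEnabled` and `runStopHook` are all assumptions made for illustration.

```typescript
// Hypothetical model of the fail-closed gate; the real hook is reflect-stop-hook.sh.
// The issue does not list the accepted values of REFLECTION_MODE, so these are assumed.
const ENABLED_VALUES: ReadonlySet<string> = new Set(["1", "true", "on"]);

type Env = Readonly<Record<string, string | undefined>>;

interface HookResult {
  exitCode: 0;            // the hook always exits 0 and never blocks the run
  sidecar: string | null; // serialized reflection.v1 payload, or null when nothing is written
}

// Fail-closed: anything other than an explicitly enabled value means "off".
function reflectionEnabled(env: Env): boolean {
  const raw = env["REFLECTION_MODE"];
  if (raw === undefined) {
    return false;
  }
  return ENABLED_VALUES.has(raw.trim().toLowerCase());
}

function runStopHook(env: Env, buildSidecar: () => string): HookResult {
  if (!reflectionEnabled(env)) {
    // Strict no-op: the sidecar builder is never invoked.
    return { exitCode: 0, sidecar: null };
  }
  try {
    return { exitCode: 0, sidecar: buildSidecar() };
  } catch {
    // A capture failure must not surface as a non-zero exit.
    return { exitCode: 0, sidecar: null };
  }
}
```

A unit test for the criterion could then assert that the builder is not called when the variable is missing, and that a throwing builder still yields exit code 0.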

Source design: jarvis-brain docs/planning/AGENT-REFLECTION-LOOP.md (commit df6576fc).

Reference: mosaicstack/stack#544