feat(agent-reflection): durable kernel — reflection.v1 capture + risk-floor + Phase-0 (#544)
Build the durable kernel of the agent reflection loop. Passive end-of-run capture of the doer's end-state as structured `reflection.v1` data, plus a deterministic diff review risk-floor. The closed calibration/skill-synthesis loop (design §7–§8) stays gated behind Phase-0 experiments P1/P2/P3. - packages/macp: evaluateRiskFloor (pure, deterministic surface classifier) + reflection.v1 JSON Schema; 15 unit tests. - packages/types: reflection.v1 zod schemas + self-report DTO; 10 unit tests. - framework: fail-closed Stop hook (reflect-stop-hook.sh) writing the sidecar, registered as hooks.Stop in runtime/claude/settings.json. Strict no-op unless REFLECTION_MODE=solo|orchestrated; never blocks or fails a session. - scripts/analysis: P1/P2/P3 experiment harnesses with pre-registered kill conditions and structured output. Mechanical fields (risk, files_changed, ids, provenance) are written by the hook; self-report fields (confidence, most_likely_wrong, known_not_in_diff) are merged from an optional $REFLECTION_INPUT, else null + provenance.degraded=true. Independent review remediations: empty/all-.mosaic diff still writes a sidecar (grep no-match no longer aborts); session_id sanitized before path use. Refs #544 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
173
docs/plans/agent-reflection-loop-PRD.md
Normal file
173
docs/plans/agent-reflection-loop-PRD.md
Normal file
@@ -0,0 +1,173 @@
|
||||
# PRD — Agent Reflection Loop (durable kernel)
|
||||
|
||||
**Issue:** [#544](http://git.mosaicstack.dev/mosaicstack/stack/issues/544)
|
||||
**Source design:** jarvis-brain `docs/planning/AGENT-REFLECTION-LOOP.md` (commit df6576fc, debate-hardened v2)
|
||||
**Status:** in-progress
|
||||
**Scope rule:** Build the **durable kernel** only. The closed calibration/skill-synthesis loop
|
||||
(design §7–§8) is **gated** behind Phase-0 experiments P1/P2/P3 and is explicitly out of scope here.
|
||||
|
||||
---
|
||||
|
||||
## 1. Problem
|
||||
|
||||
At end-of-run an agent holds context that never reaches the diff or the "done" message —
|
||||
assumptions, shortcuts, untested paths, the single most-likely way the work is wrong. That context
|
||||
is what a lead/human needs to judge trust, and it evaporates when the session ends. Capture it
|
||||
mechanically as **structured data** (`reflection.v1`), and derive a **review risk-floor** from the
|
||||
change surface so risky diffs are flagged for independent review.
|
||||
|
||||
## 2. Non-goals (gated on Phase-0)
|
||||
|
||||
- No closed calibration loop (predicted-vs-actual scoring as a routing input).
|
||||
- No skill synthesis.
|
||||
- No automated reviewer routing/dispatch. The kernel **writes** the sidecar; pickup is future work.
|
||||
|
||||
## 3. Components & exact placement (main-branch truth)
|
||||
|
||||
| # | Component | Path | Mirror |
|
||||
| --- | -------------------- | ------------------------------------------------------------------------------------------------ | ----------------------------------- |
|
||||
| a | Stop hook (capture) | `packages/mosaic/framework/tools/qa/reflect-stop-hook.sh` | `tools/qa/prevent-memory-write.sh` |
|
||||
| a | Hook registration | `packages/mosaic/framework/runtime/claude/settings.json` (`hooks.Stop`) | existing `PreToolUse`/`PostToolUse` |
|
||||
| b | JSON Schema | `packages/macp/src/schemas/reflection.v1.schema.json` | `schemas/task.schema.json` |
|
||||
| b | TS types (zod) + DTO | `packages/types/src/reflection/{index.ts,reflection.dto.ts}` + re-export from `src/index.ts` | `packages/types/src/federation/*` |
|
||||
| c | Diff risk-floor | `packages/macp/src/risk-floor.ts` (+ `__tests__/risk-floor.test.ts`, export from `src/index.ts`) | `packages/macp/src/gate-runner.ts` |
|
||||
| d | Phase-0 scripts | `scripts/analysis/reflect-{git-history,board-history,calibration}.sh` | `scripts/publish-npmjs.sh` |
|
||||
|
||||
**Activation note (deliberate deviation):** the `settings-overlays/` directory has **no merge
|
||||
mechanism** (referenced only in docs), so a hooks overlay there would be inert. The Stop hook is
|
||||
registered in the canonical `runtime/claude/settings.json` — the same file the `mosaic` launcher
|
||||
reflects into `~/.claude/settings.json` (verified byte-identical hooks live there). Still fully
|
||||
vendored in-repo.
|
||||
|
||||
## 4. `reflection.v1` schema (authoritative field list)
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"schema": "reflection.v1", // literal
|
||||
"task_ref": "string", // canonical task ref; kernel derives from REFLECTION_TASK_REF or repo+branch
|
||||
"agent": "string", // persona/runtime id (REFLECTION_AGENT or "unknown")
|
||||
"session_id": "string", // from Stop payload session_id, else "unknown"
|
||||
"timestamp": "string", // ISO-8601 UTC
|
||||
"repo": "string", // repo root basename
|
||||
"confidence": 0.0, // FLOAT [0,1] — SELF-REPORTED (optional; null if not supplied)
|
||||
"most_likely_wrong": {
|
||||
// SELF-REPORTED (optional)
|
||||
"surface": "auth|data|infra|ui|build|test|docs|none",
|
||||
"description": "string",
|
||||
},
|
||||
"known_not_in_diff": "string|null", // SELF-REPORTED: "what I know that isn't visible in the diff"
|
||||
"risk": {
|
||||
// MECHANICAL — from risk-floor
|
||||
"needs_review": true,
|
||||
"score": 0.0, // [0,1]
|
||||
"surface": "auth|data|infra|ui|build|test|docs|none",
|
||||
"reason": "string",
|
||||
},
|
||||
"files_changed": ["string"], // MECHANICAL — git diff name-only
|
||||
"provenance": {
|
||||
"source": "stop-hook",
|
||||
"reflection_attempt": 1,
|
||||
"degraded": false, // true if self-report inputs missing/unreadable
|
||||
"reflection_mode": "off|solo|orchestrated",
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
**Mechanical vs self-reported.** A bash Stop hook cannot author the agent's self-assessment. The
|
||||
hook populates the **mechanical** fields deterministically (risk, files_changed, provenance, ids).
|
||||
The **self-reported** fields are read from an optional agent-supplied input file
|
||||
(`$REFLECTION_INPUT`, default `<repo>/.mosaic/reflection-input.json`) and merged if present;
|
||||
absent/unreadable → those fields null and `provenance.degraded=true`. This realizes the design's
|
||||
"hook is a pre-seed, not the asker" (§4).
|
||||
|
||||
## 5. Stop hook behavior (fail-closed, non-blocking)
|
||||
|
||||
1. Read Stop payload JSON from stdin.
|
||||
2. **Fail-closed:** if `REFLECTION_MODE` is unset or `off` → `exit 0` immediately (strict no-op). This
|
||||
is the global-registration safety guarantee.
|
||||
3. **Sentinel guard:** if `<sidecar>.lock` exists → `exit 0` (prevents re-fire loops). Create it,
|
||||
`trap` cleanup.
|
||||
4. Determine output dir: `$REFLECTION_DIR` else `<repo>/.mosaic/reflections/`. `mkdir -p`.
|
||||
5. Compute mechanical fields: `git diff --name-only` (HEAD + staged + worktree, best-effort),
|
||||
call risk-floor logic (inline bash port OR `node -e` into `@mosaicstack/macp` — see §6), session
|
||||
ids from payload + env.
|
||||
6. Merge optional `$REFLECTION_INPUT` self-report if readable JSON.
|
||||
7. Write `reflection.v1` to a temp file, `mv` (atomic) to `<dir>/<session>-<ts>.reflection.json`.
|
||||
8. Always `exit 0`. **Never** emit a `decision` field (Stop hooks are observational).
|
||||
|
||||
Hook must never fail the session: wrap risky steps, default to `degraded:true` on any error, exit 0.
|
||||
|
||||
## 6. Risk-floor (`packages/macp/src/risk-floor.ts`)
|
||||
|
||||
Pure, deterministic, no IO. Single source of truth for the verdict; the hook calls it via
|
||||
`node --input-type=module -e` (importing the built package) **or**, to avoid a node dependency in the
|
||||
hook path, the hook ports the same surface table. **Decision:** implement the canonical logic in TS
|
||||
(tested), and have the hook shell out to node when available, else fall back to a minimal inline
|
||||
classifier flagged `degraded:true`. (Keep the TS the authority; the inline path is a safety net.)
|
||||
|
||||
```ts
|
||||
export type ReviewSurface = 'auth' | 'data' | 'infra' | 'ui' | 'build' | 'test' | 'docs' | 'none';
|
||||
export interface RiskFloorInput {
|
||||
filesChanged: string[];
|
||||
insertions?: number;
|
||||
deletions?: number;
|
||||
}
|
||||
export interface RiskFloorVerdict {
|
||||
needs_review: boolean;
|
||||
score: number;
|
||||
surface: ReviewSurface;
|
||||
reason: string;
|
||||
}
|
||||
export function evaluateRiskFloor(input: RiskFloorInput): RiskFloorVerdict;
|
||||
```
|
||||
|
||||
Surface classification by path regex (first match wins, highest-risk surface dominates):
|
||||
|
||||
- `auth` (weight 1.0): `auth`, `login`, `session`, `token`, `permission`, `rbac`, `credential`, `secret`
|
||||
- `data` (0.9): `migration`, `prisma`, `schema`, `\.sql`, `entity`, `repository`, `seed`
|
||||
- `infra` (0.85): `docker`, `\.woodpecker`, `compose`, `traefik`, `deploy`, `helm`, `k8s`, `terraform`
|
||||
- `build` (0.6): `package.json`, `tsconfig`, `turbo.json`, `pnpm-`, `\.config\.`, `eslint`, `vite`
|
||||
- `ui` (0.4): `\.tsx`, `\.css`, `components/`, `apps/web/`
|
||||
- `test` (0.2): `\.spec\.`, `\.test\.`, `__tests__/`
|
||||
- `docs` (0.1): `\.md`, `docs/`
|
||||
- `none` (0.0): anything else
|
||||
|
||||
`needs_review = score >= THRESHOLD` (default `0.5`, overridable). `reason` names the files+surface
|
||||
that tripped it. **Subordinate to CI:** this is a _floor_ (minimum review requirement) only;
|
||||
consumers MUST treat CI/tests as authoritative above the floor (precedence: CI/tests > human merge >
|
||||
reviewer verdict > self-reflection). Documented in the module header.
|
||||
|
||||
## 7. Phase-0 experiment scripts (`scripts/analysis/`)
|
||||
|
||||
Offline, no-infra bash. Each script: `#!/usr/bin/env bash`, `set -euo pipefail`, header `Usage:` +
|
||||
`Requirements:`, flag parsing, **prints its pre-registered kill condition**, emits structured
|
||||
(JSON/markdown) output. They are harnesses + rubrics — real corpora are wired later.
|
||||
|
||||
- `reflect-git-history.sh` (**P2** — only-self-reflection bucket): scan `git log` for failure signals
|
||||
(reverts, `fix:`/`hotfix` shortly after a feature merge) over a window; classify each by which gate
|
||||
would catch it (CI / human-review / only-self-reflection) via a pre-registered heuristic; tally.
|
||||
Kill: bucket-3 near-empty → no §7/§8.
|
||||
- `reflect-board-history.sh` (**P3** — outcome detectability): given a task/board export (or the
|
||||
git history of `data/` task files), measure the fraction of completed tasks with a
|
||||
machine-detectable correct/wrong signal within 30 days. Kill: base-rate < 20% → caveat-notes only.
|
||||
- `reflect-calibration.sh` (**P1** — confidence signal): consume a labeled corpus (JSONL of
|
||||
`{confidence, correct}`), compute discrimination (AUC/lift) on the self-rated-high subset, print
|
||||
the metric vs the pre-registered chance threshold. Kill: AUC ≈ chance on the high subset → no §7/§8.
|
||||
|
||||
## 8. CI / quality gates
|
||||
|
||||
- TS packages: `pnpm typecheck` (tsc --noEmit), `pnpm lint` (eslint), `pnpm format:check`
|
||||
(prettier), `pnpm test` (vitest). ESM, NodeNext, `.js` import specifiers, `*.dto.ts` at boundaries.
|
||||
- New files in existing packages need no CI config change; add ≥1 vitest spec per new TS module.
|
||||
- Bash scripts/hook are dev/runtime tooling, not CI-built; keep them `shellcheck`-clean.
|
||||
|
||||
## 9. Acceptance criteria
|
||||
|
||||
1. `REFLECTION_MODE` unset → hook is a strict no-op (`exit 0`, no file written). **(test)**
|
||||
2. With `REFLECTION_MODE=solo`, hook writes a schema-valid `reflection.v1` with correct mechanical
|
||||
fields; self-report merged when `$REFLECTION_INPUT` present, `degraded:true` when absent.
|
||||
3. `evaluateRiskFloor` deterministic across all surfaces; unit-tested incl. auth/data/infra → review,
|
||||
docs/test → no review, empty → `none`/no review.
|
||||
4. `reflection.v1` zod type + JSON Schema agree; sidecar validates against the schema.
|
||||
5. Phase-0 scripts run offline, print kill conditions, emit structured output, shellcheck-clean.
|
||||
6. `pnpm typecheck && pnpm lint && pnpm format:check && pnpm test` green; independent review passed.
|
||||
55
docs/scratchpads/544-agent-reflection-loop.md
Normal file
55
docs/scratchpads/544-agent-reflection-loop.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# Scratchpad — #544 Agent Reflection Loop (durable kernel)
|
||||
|
||||
**Started:** 2026-06-16 · **Branch:** `feat/agent-reflection-loop` · **Base:** `main` @ c461380
|
||||
|
||||
## Goal
|
||||
|
||||
Bake the durable kernel of the agent reflection loop into the Mosaic Stack
|
||||
monorepo through full delivery gates. Kernel only; closed loop (§7–§8) gated on
|
||||
Phase-0. Authoritative spec: `docs/plans/agent-reflection-loop-PRD.md`. Task
|
||||
breakdown: `docs/tasks/544-agent-reflection-loop.md`.
|
||||
|
||||
## Timeline / decisions
|
||||
|
||||
- Mapped house style against `main` truth (the earlier recon had mapped a dirty
|
||||
feature branch and returned non-existent paths; re-cloned `main` clean).
|
||||
- macp uses co-located `*.spec.ts`; types uses `src/<mod>/{*.ts, *.dto.ts, __tests__/*.spec.ts}`.
|
||||
- zod v4 + class-validator/class-transformer present in `@mosaicstack/types`;
|
||||
`packages/types/tsconfig.json` enables `experimentalDecorators`/`emitDecoratorMetadata`.
|
||||
- **Gotcha (fixed):** `class-transformer`'s `@Type` calls `Reflect.getMetadata`
|
||||
at module-load time; the types vitest env has no `reflect-metadata`, so any test
|
||||
importing the reflection barrel crashed on import. `chat.dto.ts` avoids this by
|
||||
using class-validator only. Fix: dropped `@Type`/`@ValidateNested` from the DTO;
|
||||
zod owns deep nested validation.
|
||||
- **Gotcha (fixed):** Stop hook `EXIT` trap referenced a `main`-local `lock` →
|
||||
`unbound variable` under `set -u` at exit. Promoted to a global `LOCKFILE`.
|
||||
- **Gotcha (fixed):** the hook's own lock + `.mosaic/` scratch leaked into
|
||||
`files_changed`. Excluded `^\.mosaic/` from the change-surface scan.
|
||||
|
||||
## Verification evidence
|
||||
|
||||
- macp: typecheck OK, lint OK, **88 tests pass** (15 new risk-floor).
|
||||
- types: typecheck OK, lint OK, **64 tests pass** (10 new reflection).
|
||||
- Root: `pnpm typecheck` (41 tasks), `pnpm lint` (23), `pnpm format:check`, `pnpm build` (23) — all green.
|
||||
- Stop hook smoke (throwaway git repo): TEST1 no-op (mode unset, 0 files);
|
||||
TEST2 solo degraded, `.mosaic/` excluded, auth→needs_review; TEST3 self-report
|
||||
merged, degraded=false; TEST4 lock suppresses re-fire. All pass, always exit 0.
|
||||
- shellcheck clean: hook + `reflect-{git-history,board-history,calibration}.sh`.
|
||||
- Phase-0 smoke: P2 on this repo (142 failures classified), P1 AUC=0.875 on a
|
||||
synthetic fixture, P3 base-rate on a synthetic board — all emit structured output
|
||||
- kill conditions.
|
||||
|
||||
## Open risks / follow-ups
|
||||
|
||||
- Full `pnpm test` (DB-bound packages) validated via CI's postgres service, not
|
||||
locally; affected packages (macp, types) are DB-independent and green here.
|
||||
- sequential-thinking MCP was registered mid-session (effective next session);
|
||||
this session compensated with the written PRD as the planning artifact.
|
||||
- Phase-0 corpora are not yet wired — scripts are harnesses + pre-registered
|
||||
rubrics (P1/P2/P3 tasks tracked in jarvis-brain `agent-reflection-loop` project).
|
||||
|
||||
## Gate status
|
||||
|
||||
- [x] PRD authored · [x] issue #544 created + linked · [x] code + tests
|
||||
- [x] local gates green · [ ] independent code review · [ ] PR opened
|
||||
- [ ] CI terminal green · [ ] merged to main · [ ] issue closed
|
||||
67
docs/tasks/544-agent-reflection-loop.md
Normal file
67
docs/tasks/544-agent-reflection-loop.md
Normal file
@@ -0,0 +1,67 @@
|
||||
# 544: Agent Reflection Loop — durable kernel
|
||||
|
||||
**Issue:** [#544](http://git.mosaicstack.dev/mosaicstack/stack/issues/544)
|
||||
**PRD:** [`docs/plans/agent-reflection-loop-PRD.md`](../plans/agent-reflection-loop-PRD.md)
|
||||
**Branch:** `feat/agent-reflection-loop`
|
||||
|
||||
## Context
|
||||
|
||||
Build the **durable kernel** of the agent reflection loop: passive end-of-run
|
||||
capture of the doer's end-state as structured `reflection.v1` data, plus a
|
||||
deterministic diff **review risk-floor**. The closed calibration / skill-synthesis
|
||||
loop (design §7–§8) stays **gated** behind Phase-0 experiments P1/P2/P3 and is
|
||||
explicitly out of scope here. Source design: jarvis-brain
|
||||
`docs/planning/AGENT-REFLECTION-LOOP.md` (debate-hardened v2).
|
||||
|
||||
Scope rule, non-goals, the full `reflection.v1` field list, and acceptance
|
||||
criteria live in the PRD. This file is the task breakdown + status.
|
||||
|
||||
## Work items
|
||||
|
||||
| # | Item | Path | Status |
|
||||
| --- | ----------------------------------------------------- | --------------------------------------------------------- | ------ |
|
||||
| 1 | Diff risk-floor (pure, deterministic) + unit tests | `packages/macp/src/risk-floor.ts`, `risk-floor.spec.ts` | done |
|
||||
| 2 | `reflection.v1` JSON Schema (documented contract) | `packages/macp/src/schemas/reflection.v1.schema.json` | done |
|
||||
| 3 | `reflection.v1` zod schemas + self-report DTO + tests | `packages/types/src/reflection/*` | done |
|
||||
| 4 | Stop hook (fail-closed capture) | `packages/mosaic/framework/tools/qa/reflect-stop-hook.sh` | done |
|
||||
| 5 | Hook registration (`hooks.Stop`) | `packages/mosaic/framework/runtime/claude/settings.json` | done |
|
||||
| 6 | Phase-0 experiment harnesses (P1/P2/P3) | `scripts/analysis/reflect-*.sh` | done |
|
||||
|
||||
## Design decisions (this implementation)
|
||||
|
||||
- **Mechanical vs self-reported split.** A bash Stop hook cannot author the
|
||||
agent's self-assessment, so it writes the mechanical fields (risk-floor verdict,
|
||||
`files_changed`, ids, provenance) and merges an optional agent-supplied
|
||||
`$REFLECTION_INPUT` self-report; absent/unreadable ⇒ those fields `null` and
|
||||
`provenance.degraded = true`.
|
||||
- **Risk-floor authority.** `evaluateRiskFloor` (TS, tested) is the source of
|
||||
truth. The hook ports the same surface table inline to avoid a node/build
|
||||
dependency on the hook path; the two are documented as kept in sync.
|
||||
- **Hook registration deviation.** `settings-overlays/` has no merge mechanism
|
||||
(docs-only), so a hooks overlay there would be inert. The Stop hook is
|
||||
registered in the canonical `runtime/claude/settings.json` — the same file the
|
||||
`mosaic` launcher reflects into `~/.claude/settings.json`. Still vendored in-repo.
|
||||
- **DTO without class-transformer.** `reflection.dto.ts` uses class-validator only
|
||||
(no `@Type`), matching `chat.dto.ts`, so the module imports without a
|
||||
`reflect-metadata` shim in the types-package test env. Deep nested validation is
|
||||
owned by the zod `ReflectionSelfReportSchema` (the runtime authority the hook uses).
|
||||
- **`.mosaic/` excluded** from the change surface — it is agent scratch
|
||||
(reflections, locks, self-report input), not part of the diff under review.
|
||||
|
||||
## Verification
|
||||
|
||||
- `pnpm --filter @mosaicstack/macp test` → 88 passed (15 new risk-floor).
|
||||
- `pnpm --filter @mosaicstack/types test` → 64 passed (10 new reflection).
|
||||
- Root `pnpm typecheck`, `pnpm lint`, `pnpm format:check`, `pnpm build` → green.
|
||||
- Stop hook smoke: fail-closed no-op (mode unset), solo capture (degraded),
|
||||
self-report merge (degraded=false), re-fire lock guard — all pass.
|
||||
- All bash (hook + 3 Phase-0 scripts) shellcheck-clean; Phase-0 scripts emit
|
||||
structured JSON/markdown and print their pre-registered kill conditions.
|
||||
|
||||
## Activation (post-merge, deployment concern — not a blocker)
|
||||
|
||||
The Stop hook only activates when a launcher/profile sets
|
||||
`REFLECTION_MODE=solo|orchestrated`; unset/`off` is a strict no-op, so global
|
||||
registration is safe. `framework/install.sh` rsyncs the hook into
|
||||
`~/.config/mosaic/tools/qa/`, and the `mosaic` launcher reflects the updated
|
||||
`settings.json` (`hooks.Stop`) into `~/.claude/settings.json`.
|
||||
Reference in New Issue
Block a user