stack/scripts/analysis/reflect-board-history.sh at b76666166e9d73345835e0fb68934b30003475e0

mosaicstack/stack

Hermes Agent b76666166e

ci/woodpecker/push/ci Pipeline was successful

ci/woodpecker/pr/ci Pipeline was successful

feat(agent-reflection): durable kernel — reflection.v1 capture + risk-floor + Phase-0 (#544)

Build the durable kernel of the agent reflection loop: passive end-of-run
capture of the doer's end-state as structured `reflection.v1` data, plus a
deterministic diff-review risk floor. The closed calibration/skill-synthesis
loop (design §7–§8) stays gated behind Phase-0 experiments P1/P2/P3.

- packages/macp: evaluateRiskFloor (pure, deterministic surface classifier)
  + reflection.v1 JSON Schema; 15 unit tests.
- packages/types: reflection.v1 zod schemas + self-report DTO; 10 unit tests.
- framework: fail-closed Stop hook (reflect-stop-hook.sh) writing the sidecar,
  registered as hooks.Stop in runtime/claude/settings.json. Strict no-op unless
  REFLECTION_MODE=solo|orchestrated; never blocks or fails a session.
- scripts/analysis: P1/P2/P3 experiment harnesses with pre-registered kill
  conditions and structured output.
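
To make "pure, deterministic surface classifier" concrete, here is a minimal sketch of how such a function could look. Only the name `evaluateRiskFloor` comes from the commit message; the risk levels, the path patterns, and the `classifyFile` helper are hypothetical and are not taken from packages/macp.

```typescript
// Hypothetical risk levels, ordered from lowest to highest.
type Risk = "low" | "medium" | "high";

const RISK_ORDER: Risk[] = ["low", "medium", "high"];

// Hypothetical surface rules: the first matching pattern decides a file's risk.
const SURFACE_RULES: Array<{ pattern: RegExp; risk: Risk }> = [
  { pattern: /(^|\/)migrations\//, risk: "high" },
  { pattern: /(^|\/)auth\//, risk: "high" },
  { pattern: /\.(ya?ml|json)$/, risk: "medium" },
];

// Classify one changed path; anything unmatched falls back to "low".
function classifyFile(path: string): Risk {
  for (const rule of SURFACE_RULES) {
    if (rule.pattern.test(path)) return rule.risk;
  }
  return "low";
}

// Pure: the result depends only on the input list (no I/O, clock, or randomness).
// The floor is the highest risk seen across all changed files.
function evaluateRiskFloor(filesChanged: string[]): Risk {
  let floor: Risk = "low";
  for (const file of filesChanged) {
    const risk = classifyFile(file);
    if (RISK_ORDER.indexOf(risk) > RISK_ORDER.indexOf(floor)) floor = risk;
  }
  return floor;
}
```

Because the sketch takes the changed-file list as its only input, the same diff always yields the same floor, which is what makes this kind of classifier easy to unit test.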

Mechanical fields (risk, files_changed, ids, provenance) are written by the
hook; self-report fields (confidence, most_likely_wrong, known_not_in_diff) are
merged from an optional $REFLECTION_INPUT, else null + provenance.degraded=true.
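
The merge described above can be sketched as follows. This is an illustration in TypeScript, not the hook's shell implementation: the interface shapes, the field types, and the `buildReflection` name are assumptions, and only the field names come from the commit message. Treating unparseable input the same as missing input is also an assumption.

```typescript
// Hypothetical shapes; only the field names come from the commit message.
interface Mechanical {
  risk: string;
  files_changed: string[];
  session_id: string;
}

interface SelfReport {
  confidence: number | null;
  most_likely_wrong: string | null;
  known_not_in_diff: string | null;
}

interface Reflection extends Mechanical, SelfReport {
  provenance: { degraded: boolean };
}

// rawInput stands in for the contents of the optional $REFLECTION_INPUT.
function buildReflection(mechanical: Mechanical, rawInput?: string): Reflection {
  let report: Partial<SelfReport> | null = null;
  if (rawInput !== undefined) {
    try {
      const parsed: unknown = JSON.parse(rawInput);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        report = parsed as Partial<SelfReport>;
      }
    } catch {
      report = null; // assumption: bad JSON is handled like missing input
    }
  }
  return {
    ...mechanical,
    // Self-report fields fall back to null when no usable input exists.
    confidence: report?.confidence ?? null,
    most_likely_wrong: report?.most_likely_wrong ?? null,
    known_not_in_diff: report?.known_not_in_diff ?? null,
    // Marked degraded only when no self-report could be merged.
    provenance: { degraded: report === null },
  };
}
```

In this sketch the mechanical fields are always present, so a sidecar can be produced even when the self-report is absent.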

Remediations from independent review: an empty or all-.mosaic diff still
writes a sidecar (a grep no-match no longer aborts the hook), and session_id is
sanitized before it is used in a path.
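
The two remediations can be illustrated with a short sketch. It is written in TypeScript for consistency with the packages above, whereas the real hook is a shell script. The allowed character set, the `unknown` fallback, the sidecar file name, and the reading of "all-.mosaic" as "every changed path is under `.mosaic/`" are all assumptions.

```typescript
// Assumption: keep only characters that are safe inside one path segment,
// so values like "../../etc/passwd" cannot escape the sidecar directory.
function sanitizeSessionId(sessionId: string): string {
  const cleaned = sessionId.replace(/[^A-Za-z0-9_-]/g, "_");
  return cleaned.length > 0 ? cleaned : "unknown";
}

// Assumption: paths under .mosaic/ are excluded from review.
// An empty result is a valid outcome, not an error.
function reviewableFiles(changed: string[]): string[] {
  return changed.filter((p) => p.indexOf(".mosaic/") !== 0);
}

// Hypothetical sidecar naming; the real file name is not given in the source.
function sidecarPath(dir: string, sessionId: string): string {
  return `${dir}/${sanitizeSessionId(sessionId)}.reflection.json`;
}
```

The point of the second function is that an empty file list flows through as ordinary data, mirroring the fix where a grep no-match no longer stops the sidecar from being written.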

Refs #544

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-16 15:55:15 -05:00

4.3 KiB

Executable File
