feat(agent-reflection): durable kernel — reflection.v1 capture + risk-floor + Phase-0 (#544)

Build the durable kernel of the agent reflection loop. Passive end-of-run capture of the doer's end-state as structured `reflection.v1` data, plus a deterministic diff review risk-floor. The closed calibration/skill-synthesis loop (design §7–§8) stays gated behind Phase-0 experiments P1/P2/P3. - packages/macp: evaluateRiskFloor (pure, deterministic surface classifier) + reflection.v1 JSON Schema; 15 unit tests. - packages/types: reflection.v1 zod schemas + self-report DTO; 10 unit tests. - framework: fail-closed Stop hook (reflect-stop-hook.sh) writing the sidecar, registered as hooks.Stop in runtime/claude/settings.json. Strict no-op unless REFLECTION_MODE=solo|orchestrated; never blocks or fails a session. - scripts/analysis: P1/P2/P3 experiment harnesses with pre-registered kill conditions and structured output. Mechanical fields (risk, files_changed, ids, provenance) are written by the hook; self-report fields (confidence, most_likely_wrong, known_not_in_diff) are merged from an optional $REFLECTION_INPUT, else null + provenance.degraded=true. Independent review remediations: empty/all-.mosaic diff still writes a sidecar (grep no-match no longer aborts); session_id sanitized before path use. Refs #544 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:55:15 -05:00
parent c461380a4a
commit b76666166e
17 changed files with 1498 additions and 0 deletions
--- a/scripts/analysis/reflect-git-history.sh
+++ b/scripts/analysis/reflect-git-history.sh
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+# reflect-git-history.sh — Phase-0 experiment P2 ("only-self-reflection" bucket)
+#
+# Question: of the failures visible in git history, what fraction would ONLY
+# have been caught by end-of-run self-reflection — i.e. NOT by CI and NOT by
+# independent human review? If that bucket is near-empty, the closed
+# calibration / skill-synthesis loop (design §7–§8) is not worth building.
+#
+# Method: scan `git log` over a window for failure signals (reverts, and
+# fix:/hotfix commits landing shortly after a feature merge). Classify each by
+# the gate most likely to have caught it, using a pre-registered heuristic.
+# This is a HARNESS + RUBRIC; the classifier is deliberately simple and the
+# real corpus/labelling is wired later. It emits a structured tally.
+#
+# Usage:
+#   scripts/analysis/reflect-git-history.sh [--repo PATH] [--since SINCE] [--json|--md]
+#
+# Options:
+#   --repo PATH   repo to analyse (default: current repo)
+#   --since SINCE git log --since value (default: "6 months ago")
+#   --json        emit JSON (default)
+#   --md          emit markdown
+#
+# Requirements: git, awk.
+#
+# PRE-REGISTERED KILL CONDITION:
+#   bucket "only_self_reflection" is near-empty (< 10% of classified failures)
+#   ⇒ do NOT build design §7–§8 (closed loop). Caveat-notes capture only.
+
+set -euo pipefail
+
+REPO="."
+SINCE="6 months ago"
+FORMAT="json"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --repo) REPO="$2"; shift 2 ;;
+    --since) SINCE="$2"; shift 2 ;;
+    --json) FORMAT="json"; shift ;;
+    --md) FORMAT="md"; shift ;;
+    -h|--help) sed -n '2,30p' "$0"; exit 0 ;;
+    *) echo "unknown arg: $1" >&2; exit 2 ;;
+  esac
+done
+
+KILL_CONDITION='bucket only_self_reflection < 10% of classified failures ⇒ do NOT build §7–§8'
+echo "# pre-registered kill condition: ${KILL_CONDITION}" >&2
+
+command -v git >/dev/null 2>&1 || { echo "git required" >&2; exit 3; }
+
+# Collect candidate failure commits: reverts + fix/hotfix subjects.
+mapfile -t LINES < <(
+  git -C "$REPO" log --since="$SINCE" --pretty='%H%x09%s' 2>/dev/null \
+    | grep -iE 'revert|hotfix|hot-fix|regression|fix(\(|:|!| )' || true
+)
+
+total=0; ci=0; human=0; selfonly=0
+for line in "${LINES[@]}"; do
+  [[ -z "$line" ]] && continue
+  subj="${line#*$'\t'}"
+  total=$((total + 1))
+  # Pre-registered classification heuristic (gate most likely to have caught it):
+  #   - build/test/lint/type/ci signals → CI would have caught it
+  #   - security/auth/permission/data/migration → human review would flag it
+  #   - everything else (logic/UX/assumption/edge) → only-self-reflection bucket
+  if printf '%s' "$subj" | grep -qiE 'test|lint|type|build|ci|compile|typo'; then
+    ci=$((ci + 1))
+  elif printf '%s' "$subj" | grep -qiE 'security|auth|permission|rbac|secret|migration|data|sql|injection'; then
+    human=$((human + 1))
+  else
+    selfonly=$((selfonly + 1))
+  fi
+done
+
+pct() { awk "BEGIN{ if ($2==0) print \"0.0\"; else printf \"%.1f\", 100*$1/$2 }"; }
+self_pct="$(pct "$selfonly" "$total")"
+verdict="$(awk "BEGIN{print ($self_pct < 10.0) ? \"KILL §7–§8\" : \"signal present — proceed to deeper labelling\"}")"
+
+if [[ "$FORMAT" == "md" ]]; then
+  cat <<EOF
+## P2 — git-history failure-gate attribution
+
+- window: \`${SINCE}\` · repo: \`${REPO}\`
+- classified failures: **${total}**
+
+| gate | count | share |
+|---|---:|---:|
+| CI would catch | ${ci} | $(pct "$ci" "$total")% |
+| human review would catch | ${human} | $(pct "$human" "$total")% |
+| only-self-reflection | ${selfonly} | ${self_pct}% |
+
+- kill condition: ${KILL_CONDITION}
+- verdict: **${verdict}**
+EOF
+else
+  awk -v t="$total" -v c="$ci" -v h="$human" -v s="$selfonly" -v sp="$self_pct" \
+      -v v="$verdict" -v since="$SINCE" -v repo="$REPO" -v kc="$KILL_CONDITION" 'BEGIN{
+    printf "{\n"
+    printf "  \"experiment\": \"P2-git-history\",\n"
+    printf "  \"repo\": \"%s\",\n", repo
+    printf "  \"since\": \"%s\",\n", since
+    printf "  \"classified_failures\": %d,\n", t
+    printf "  \"buckets\": { \"ci\": %d, \"human_review\": %d, \"only_self_reflection\": %d },\n", c, h, s
+    printf "  \"only_self_reflection_pct\": %s,\n", sp
+    printf "  \"kill_condition\": \"%s\",\n", kc
+    printf "  \"verdict\": \"%s\"\n", v
+    printf "}\n"
+  }'
+fi