Specialist Pipeline — Progressive Refinement Architecture
Status: DRAFT v4 — post architecture review
Created: 2026-03-24
Last Updated: 2026-03-24 20:40 CDT
Vision
Replace "throw it at a Codex worker and hope" with a railed pipeline where each stage narrows scope, increases precision, and catches mistakes before they compound. Spend more time up-front declaring requirements; spend less time at the end fixing broken output.
Core principles:
- One agent, one specialty. No generalists pretending to be experts.
- Agents must be willing to argue, debate, and push back — not eagerly agree and move on.
- The pipeline is a set of customizable rails — agents stay on track, don't get sidetracked or derailed.
- Dynamic composition — only relevant specialists are called in per task.
- Hard gates between stages — mechanical checks + agent oversight for final decision.
- Minimal human oversight once the PRD is declared.
The Pipeline
PRD.md (human declares requirements)
│
▼
BRIEFS (PRD decomposed into discrete work units)
│
▼
BOARD OF DIRECTORS (strategic go/no-go per brief)
│ Static composition. CEO, CTO, CFO, COO.
│ Output: Approved brief with business constraints, priority, budget
│ Board does NOT select technical participants — that's the Brief Analyzer's job
│ Gate: Board consensus required to proceed
│ REJECTED → archive + notify human. NEEDS REVISION → back to Intake.
│
│ POST-RUN REVIEW: Board reviews memos from completed pipeline
│ runs. Analyzes for conflicts, adjusts strategy, feeds learnings
│ back into future briefs. The Board is not fire-and-forget.
│
▼
BRIEF ANALYZER (technical composition)
│ Sonnet agent analyzes approved brief + project context
│ Selects which generalists/specialists participate in each planning stage
│ Separates strategic decisions (Board) from technical composition
│
▼
PLANNING 1 — Architecture (Domain Generalists)
│ Dynamic composition based on brief requirements.
│ Software Architect + relevant generalists only.
│ Output: Architecture Decision Record (ADR)
│ Agents MUST debate trade-offs. No rubber-stamping.
│ Gate: ADR approved, all dissents resolved or recorded
│
▼
PLANNING 2 — Implementation Design (Language/Domain Specialists)
│ Dynamic composition — only languages/domains in the ADR.
│ Output: Implementation spec per component
│ Each specialist argues for their domain's best practices.
│ Gate: All specs reviewed by Architecture, no conflicts
│
▼
PLANNING 3 — Task Decomposition & Estimation
│ Context Manager + Task Distributor
│ Output: Task breakdown with dependency graph, estimates,
│ context packets per worker, acceptance criteria
│ Gate: Every task has one owner, one completion condition,
│ estimated rounds, and explicit test criteria
│
▼
CODING (Workers execute)
│ Codex/Claude workers with specialist subagents loaded
│ Each worker gets: context packet + implementation spec + acceptance criteria
│ Workers stay in their lane — the rails prevent drift
│ Gate: Code compiles, lints, passes unit tests
│
▼
REVIEW (Specialist review)
│ Code reviewer (evidence-driven, severity-ranked)
│ Security auditor (attack paths, secrets, auth)
│ Language specialist for the relevant language
│ Gate: All findings addressed or explicitly accepted with rationale
│
▼
REMEDIATE (if review finds issues)
│ Worker fixes based on review findings
│ Loops back to REVIEW
│ Gate: Same as REVIEW — clean pass required
│
▼
TEST (Integration + acceptance)
│ QA Strategist validates against acceptance criteria from Planning 3
│ Gate: All acceptance criteria pass, no regressions
│
▼
DEPLOY
Infrastructure Lead handles deployment
Gate: Smoke tests pass in target environment
Orchestration — Who Watches the Pipeline?
The Orchestrator (Mosaic's role)
Not me (Jarvis). Not any single agent. The Orchestrator is a dedicated, mechanical process with AI oversight.
The Orchestrator is:
- Primarily mechanical — moves work through stages, enforces gates, tracks state
- AI-assisted at decision points — an agent reviews gate results and makes go/no-go calls
- The thing Mosaic Stack productizes — this IS the engine from the North Star vision
How it works:
- Stage Runner (mechanical): Advances work through the pipeline. Checks gate conditions. Purely deterministic — "did all gate criteria pass? yes → advance. no → hold."
- Gate Reviewer (AI agent): When a gate's mechanical checks pass, the Gate Reviewer does a final sanity check. "The code lints and tests pass, but does this actually solve the problem?" This is the lightweight oversight layer.
- Escalation (to human): If the Gate Reviewer is uncertain, or if debate in a planning stage is unresolved after N rounds, escalate to Jason.
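The mechanical half of the Orchestrator can be sketched as a small state machine. This is an illustrative sketch, not the real implementation — the stage names and `GateResult` shape are assumptions:

```typescript
// Illustrative Stage Runner sketch: purely deterministic stage advancement.
type Stage = "board" | "planning1" | "planning2" | "planning3"
  | "coding" | "review" | "remediate" | "test" | "deploy";

type GateResult = { passed: boolean; failures: string[] };

// Happy-path stage ordering (remediate loops back to review).
const NEXT: Partial<Record<Stage, Stage>> = {
  board: "planning1", planning1: "planning2", planning2: "planning3",
  planning3: "coding", coding: "review", review: "test",
  remediate: "review", test: "deploy",
};

// "Did all gate criteria pass? yes → advance. no → hold."
function advance(current: Stage, gate: GateResult): { stage: Stage; held: boolean } {
  if (!gate.passed) return { stage: current, held: true };
  const next = NEXT[current];
  return next ? { stage: next, held: false } : { stage: current, held: false };
}
```

The Gate Reviewer's semantic check happens between the mechanical pass and the actual advance; only an approval from both moves the work forward.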
What Sends a Plan Back for More Debate?
Triggers for rework/rejection:
- Gate failure — mechanical checks don't pass → automatic rework
- Gate Reviewer dissent — AI reviewer flags a concern → sent back with specific objection
- Unresolved debate — planning agents can't reach consensus after N rounds → escalate or send back with the dissenting positions documented
- Scope creep detection — if a stage's output significantly exceeds the brief's scope → flag and return
- Dependency conflict — Planning 3 finds the task breakdown has circular deps or impossible ordering → return to Planning 2
- Review severity threshold — if Review finds CRITICAL-severity issues → auto-reject back to Coding, no discussion
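These triggers could be modeled as a discriminated union so the state machine routes each one mechanically. A sketch with assumed names — the destinations mirror the rules above:

```typescript
// Rework triggers as a discriminated union (names are illustrative).
type ReworkTrigger =
  | { kind: "gate-failure"; failures: string[] }
  | { kind: "reviewer-dissent"; objection: string }
  | { kind: "unresolved-debate"; rounds: number }
  | { kind: "scope-creep"; details: string }
  | { kind: "dependency-conflict"; cycle: string[] }
  | { kind: "critical-review-finding"; findings: string[] };

// Route each trigger to its destination per the list above.
function routeRework(
  t: ReworkTrigger,
): "same-stage" | "planning2" | "coding" | "escalate-human" {
  switch (t.kind) {
    case "gate-failure":      // mechanical failure → automatic rework in place
    case "reviewer-dissent":  // sent back with the specific objection
    case "scope-creep":       // flag and return
      return "same-stage";
    case "unresolved-debate": // escalate after N rounds
      return "escalate-human";
    case "dependency-conflict": // circular deps → back to Planning 2
      return "planning2";
    case "critical-review-finding": // CRITICAL → back to Coding, no discussion
      return "coding";
  }
}
```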
Human Touchpoints (minimal by design)
- PRD.md — Human writes this. This is where you spend the time.
- Board escalation — Only if the Board can't reach consensus on a brief.
- Planning escalation — Only if debate is unresolved after max rounds.
- Deploy approval — Optional. Could be fully automated for low-risk deploys.
Everything else runs autonomously on rails.
Gate System
Every gate has mechanical checks (automated, deterministic) and an agent review (final judgment call).
| Stage → | Mechanical Checks | Agent Review |
|---|---|---|
| Board → Planning 1 | Brief exists, has success criteria, has budget | Gate Reviewer: "Is this brief well-scoped enough to architect?" |
| Planning 1 → Planning 2 | ADR exists, covers all components in brief | Gate Reviewer: "Does this architecture actually solve the problem?" |
| Planning 2 → Planning 3 | Implementation spec per component, no unresolved conflicts | Gate Reviewer: "Are the specs consistent with each other and the ADR?" |
| Planning 3 → Coding | Task breakdown exists, all tasks have owner + criteria + estimate | Gate Reviewer: "Is this actually implementable as decomposed?" |
| Coding → Review | Compiles, lints, unit tests pass | Gate Reviewer: "Does the code match the implementation spec?" |
| Review → Test (or → Remediate) | All review findings addressed | Gate Reviewer: "Are the fixes real or did the worker just suppress warnings?" |
| Test → Deploy | All acceptance criteria pass, no regressions | Gate Reviewer: "Ready for production?" |
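One way to encode a gate so the Stage Runner can execute it deterministically. The shapes and check names are illustrative; the Gate Reviewer prompt is only handed to the agent after all mechanical checks pass:

```typescript
// Gate = deterministic checks + a prompt for the AI Gate Reviewer.
type MechanicalCheck = { name: string; run: () => boolean };
type Gate = {
  from: string;
  to: string;
  checks: MechanicalCheck[];
  reviewPrompt: string; // given to the Gate Reviewer only if checks pass
};

function runMechanicalChecks(gate: Gate): { passed: boolean; failed: string[] } {
  const failed = gate.checks.filter((c) => !c.run()).map((c) => c.name);
  return { passed: failed.length === 0, failed };
}

// Example: the Planning 3 → Coding gate (check bodies are stand-ins).
const planning3ToCoding: Gate = {
  from: "planning3",
  to: "coding",
  checks: [
    { name: "task breakdown exists", run: () => true },
    { name: "all tasks have owner + criteria + estimate", run: () => true },
  ],
  reviewPrompt: "Is this actually implementable as decomposed?",
};
```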
Dynamic Composition
Board of Directors — STATIC
Always the same participants. These are strategic, not technical.
| Role | Model | Personality |
|---|---|---|
| CEO | Opus | Visionary, asks "does this serve the mission?" |
| CTO | Opus | Technical realist, asks "can we actually build this?" |
| CFO | Sonnet | Cost-conscious, asks "what does this cost vs return?" — needs real analytical depth for budget/ROI, not a lightweight model |
| COO | Sonnet | Operational, asks "what's the timeline and resource impact?" |
Planning Stages — DYNAMIC
The Orchestrator selects participants based on the brief's requirements. Not every specialist is needed for every task.
Selection logic:
- Parse the brief/ADR for languages mentioned → include those Language Specialists
- Parse for infrastructure concerns → include Infra Lead, Docker/Swarm, CI/CD as needed
- Parse for data concerns → include Data Architect, SQL Pro
- Parse for UI concerns → include UX Strategist, Web Design, React/RN Specialist
- Parse for security concerns → include Security Architect
- Always include: Software Architect (Planning 1), QA Strategist (Planning 3)
Example: A TypeScript NestJS API endpoint with Prisma:
- Planning 1: Software Architect, Security Architect, Data Architect
- Planning 2: TypeScript Pro, NestJS Expert, SQL Pro
- Planning 3: Task Distributor, Context Manager
Example: A React dashboard with no backend changes:
- Planning 1: Software Architect, UX Strategist
- Planning 2: React Specialist, Web Design, UX/UI Design
- Planning 3: Task Distributor, Context Manager
Go Pro doesn't sit in on a TypeScript project. Solidity Pro doesn't weigh in on a dashboard.
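The selection logic above could start as a keyword-to-specialist routing table. The patterns and roster names here are illustrative — a real implementation would parse the ADR as well, not just the brief text:

```typescript
// Illustrative keyword → specialist routing (patterns are assumptions).
const RULES: Array<{ pattern: RegExp; specialists: string[] }> = [
  { pattern: /typescript|nestjs/i, specialists: ["TypeScript Pro", "NestJS Expert"] },
  { pattern: /prisma|sql|schema|migration/i, specialists: ["Data Architect", "SQL Pro"] },
  { pattern: /react|dashboard|\bui\b/i, specialists: ["UX Strategist", "React Specialist"] },
  { pattern: /deploy|docker|swarm/i, specialists: ["Infrastructure Lead", "Docker/Swarm"] },
];

function selectSpecialists(brief: string): string[] {
  // Software Architect and Security Architect are always in (security is cross-cutting).
  const selected = new Set(["Software Architect", "Security Architect"]);
  for (const rule of RULES) {
    if (rule.pattern.test(brief)) rule.specialists.forEach((s) => selected.add(s));
  }
  return [...selected];
}
```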
Debate Culture
Agents in planning stages are required to:
- State their position with reasoning — no "sounds good to me"
- Challenge other positions — "I disagree because..."
- Identify risks the others haven't raised — adversarial by design
- Formally dissent if not convinced — dissents are recorded in the ADR/spec
- Not capitulate just to move forward — the Orchestrator tracks rounds and will call time, but agents shouldn't fold under social pressure
Round limits: min 3, max 30. Give the discussion room to work — don't cut debate short; premature consensus produces bad architecture. The Orchestrator tracks rounds and intervenes only when debate is genuinely circular (repeating the same arguments) rather than still productive.
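The circularity check — intervene on repetition, not round count — could be as simple as fingerprinting each round's position summaries. A sketch, assuming a round that exactly repeats an earlier one counts as circular:

```typescript
// Debate circularity detector (sketch). Each round is a list of position summaries.
// `window` = how many repeated rounds before we call the debate circular.
function isDebateCircular(rounds: string[][], window = 2): boolean {
  if (rounds.length < 3) return false; // respect the 3-round minimum
  const seen = new Set<string>();
  let repeats = 0;
  for (const round of rounds) {
    // Order-insensitive fingerprint of the round's positions.
    const key = round.map((p) => p.trim().toLowerCase()).sort().join("|");
    if (seen.has(key)) repeats += 1;
    else seen.add(key);
  }
  return repeats >= window;
}
```

A real version would compare semantic summaries rather than raw strings, but the shape is the same: the Orchestrator calls time on repetition, not on the clock.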
This is enforced via personality in the agent definitions:
- Architects are opinionated and will argue for clean boundaries
- Security Architect is paranoid by design — always looking for what can go wrong
- QA Strategist is skeptical — "prove it works, don't tell me it works"
- Language specialists are purists about their domain's best practices
The goal: By the time code is written, the hard decisions are already made and debated. The workers just execute a well-argued plan.
Model Assignments
| Pipeline Stage | Model | Rationale |
|---|---|---|
| Board of Directors | Opus (CEO/CTO) / Sonnet (CFO/COO) | Strategic deliberation needs depth across the board |
| Planning 1 (Architecture) | Opus | Complex trade-offs, needs deep reasoning |
| Planning 2 (Implementation) | Sonnet | Domain expertise, detailed specs |
| Planning 3 (Decomposition) | Sonnet | Structured output, dependency analysis |
| Coding | Codex | Primary workhorse, separate budget |
| Review | Sonnet (code) + Opus (security) | Code review = Sonnet, security = Opus for depth |
| Remediation | Codex | Same worker, fix the issues |
| Test | Haiku | Mechanical validation, low complexity |
| Deploy | Haiku | Scripted deployment, mechanical |
| Gate Reviewer | Sonnet | Judgment calls, moderate complexity |
| Orchestrator (mechanical) | None — deterministic code | State machine, not AI |
Roster
Board of Directors (static)
| Role | Scope |
|---|---|
| CEO | Vision, priorities, go/no-go |
| CTO | Technical direction, risk tolerance |
| CFO | Budget, cost/benefit |
| COO | Operations, timeline, resource allocation |
Domain Generalists (dynamic — called per brief)
| Role | Scope | Selected When |
|---|---|---|
| Software Architect | System design, component boundaries, data flow, API contracts | Always in Planning 1 |
| Security Architect | Threat modeling, auth patterns, secrets, OWASP | Always — security is cross-cutting; implicit requirements are the norm |
| Infrastructure Lead | Deployment, networking, monitoring, scaling, DR | Brief involves deploy, infra, scaling |
| Data Architect | Schema design, migrations, query strategy, caching | Brief involves DB, data models, migrations |
| QA Strategist | Test strategy, coverage, integration test design | Always in Planning 3 |
| UX Strategist | User flows, information architecture, accessibility | Brief involves UI/frontend |
Language Specialists (dynamic — one language, one agent)
| Specialist | Selected When |
|---|---|
| TypeScript Pro | Project uses TypeScript |
| JavaScript Pro | Project uses vanilla JS / Node.js |
| Go Pro | Project uses Go |
| Rust Pro | Project uses Rust |
| Solidity Pro | Project involves smart contracts |
| Python Pro | Project uses Python |
| SQL Pro | Project involves database queries / Prisma |
| LangChain/AI Pro | Project involves AI/ML/agent frameworks |
Domain Specialists (dynamic — cross-cutting expertise)
| Specialist | Selected When |
|---|---|
| Web Design | Frontend work involving HTML/CSS |
| UX/UI Design | Component design, design system work |
| React Specialist | Frontend uses React |
| React Native Pro | Mobile app work |
| Blockchain/DeFi | Chain interactions, DeFi protocols |
| Docker/Swarm | Containerization, deployment |
| CI/CD | Pipeline changes, deploy automation |
| NestJS Expert | Backend uses NestJS |
Source Material — What to Pull From External Repos
From VoltAgent/awesome-codex-subagents (.toml format)
| File | What We Take | What We Customize |
|---|---|---|
| `09-meta-orchestration/context-manager.toml` | Context packaging for workers | Add our monorepo structure, Gitea CI, project conventions |
| `09-meta-orchestration/task-distributor.toml` | Dependency graphs, write-scope separation, output contracts | Add worktree rules, PR workflow, completion gates |
| `09-meta-orchestration/workflow-orchestrator.toml` | Stage design with explicit wait points and gates | Wire to our pipeline stages |
| `09-meta-orchestration/agent-organizer.toml` | Task decomposition by objective (not file list) | Add our agent registry, model hierarchy rules |
| `04-quality-security/reviewer.toml` | Evidence-driven review, severity ranking | Add NestJS import rules, Prisma gotchas, our recurring bugs |
| `04-quality-security/security-auditor.toml` | Attack path mapping, secrets handling review | Add our Docker Swarm patterns, credential loader conventions |
From VoltAgent/awesome-openclaw-skills (ClawHub)
| Skill | What We Take | How We Use It |
|---|---|---|
| `brainstorming-2` | Socratic pre-coding design workflow | Planning 1 — requirements refinement before architecture |
| `agent-estimation` | Task effort in tool-call rounds | Planning 3 — scope tasks before spawning workers |
| `agent-nestjs-skills` | 40 prioritized NestJS rules with code examples | NestJS specialist + backend workers |
| `agent-team-orchestration` | Structured handoff protocols, task state transitions | Reference for pipeline stage handoffs |
| `b3ehive` | Competitive implementation (3 agents, cross-evaluate) | Critical components: crypto strategies, auth flows |
| `agent-council` | Agent scaffolding automation | Automate specialist creation as we expand |
| `astrai-code-review` | Model routing by diff complexity | Review stage cost optimization |
| `bug-audit` | 6-phase Node.js audit methodology | Periodic codebase health checks |
From VoltAgent/awesome-claude-code-subagents (.md format)
| File | What We Take | Notes |
|---|---|---|
| Language specialist `.md` files | System prompts for TS, Go, Rust, Solidity, etc. | Strip generic stuff, inject project-specific knowledge |
| `09-meta-orchestration/agent-organizer.md` | Detailed organizer pattern | Reference — Codex `.toml` is tighter |
Gaps This Fills
| Gap | Current State | After Pipeline |
|---|---|---|
| No pre-coding design | Brief → Codex starts coding immediately | 3 planning stages before anyone writes code |
| Agents get sidetracked/derailed | No rails, workers drift from task | Mechanical pipeline + context packets keep workers on track |
| No debate on approach | First idea wins | Agents required to argue, dissent, challenge |
| No task estimation | Eyeball everything | Tool-call-round estimation in Planning 3 |
| Code review is a checkbox | "Did it lint? Ship it." | Evidence-driven reviewer + specialist knowledge |
| Security review is hand-waved | Never actually done | Real attack path mapping, secrets review |
| Workers get bad context | Ad-hoc prompts, stale assumptions | Context-manager produces execution-ready packets |
| Task decomposition is sloppy | "Here's a task, go do it" | Dependency graphs, write-scope separation, output contracts |
| Wrong specialists involved | Everyone weighs in on everything | Dynamic composition — only relevant experts |
| No rework mechanism | Ship it or start over | Explicit remediation loop with review re-check |
| Too much human oversight | Jason babysits every stage | Mechanical gates + AI oversight, human only at PRD and escalation |
Implementation Plan
Phase 1 — Foundation (this week)
- Pull and customize Codex subagents: `reviewer.toml`, `security-auditor.toml`, `context-manager.toml`, `task-distributor.toml`, `workflow-orchestrator.toml`
- Inject our project-specific knowledge
- Install to `~/.codex/agents/`
- Define agent personality templates for debate culture (opinionated, adversarial, skeptical)
Phase 2 — Specialist Definitions (next week)
- Create language specialist definitions (TS, JS, Go, Rust, Solidity, Python, SQL, LangChain, C++)
- Create domain specialist definitions (NestJS, React, Docker/Swarm, CI/CD, Web Design, UX/UI, Blockchain/DeFi, React Native)
- Create generalist definitions (Software Architect, Security Architect, Infra Lead, Data Architect, QA Strategist, UX Strategist)
- Format as Codex `.toml` + OpenClaw skills
- Test each against a real past task
Phase 3 — Pipeline Wiring (week after)
- Build the Orchestrator (mechanical stage runner + gate checker)
- Build the Gate Reviewer agent
- Wire dynamic composition (brief → participant selection)
- Wire the debate protocol (round tracking, dissent recording, escalation rules)
- Wire Planning 1 → 2 → 3 handoff contracts
- Wire Review → Remediate → Review loop
- Test end-to-end with a real feature request
Phase 4 — Mosaic Integration (future)
- The Orchestrator becomes a Mosaic Stack feature
- Pipeline stages map to Mosaic task states
- Gate results feed the Mission Control dashboard
- This IS the engine — the dashboard is just the window
Phase 5 — Advanced Patterns (future)
- `b3ehive` competitive implementation for critical paths
- `astrai-code-review` model routing for cost optimization
- `agent-council` automated scaffolding for new specialists
- Estimation feedback loop (compare estimates to actuals)
- Pipeline analytics (which stages catch the most issues, where do we bottleneck)
Resolved Decisions
| # | Question | Decision | Rationale |
|---|---|---|---|
| 1 | Gate Reviewer model | Sonnet for all gates | Sufficient depth for judgment calls; Opus reserved for planning deliberation |
| 2 | Debate rounds | Min 3, Max 30 per stage | Let discussions work. Don't cut short. Intervene on circular repetition, not round count. |
| 3 | PRD format | Use existing Mosaic PRD template | ~/.config/mosaic/templates/docs/PRD.md.template + ~/.config/mosaic/skills-local/prd/SKILL.md already proven. Iterate from there. |
| 4 | Small tasks | Pipeline is for projects/features, not typo fixes | This is for getting a project or feature built smoothly. Single-file fixes go direct to a worker. Threshold: if it needs architecture decisions, it goes through the pipeline. |
| 5 | Specialist memory | Yes — specialists accumulate knowledge with rails | Similar to OpenClaw memory model. Specialists learn from past tasks ("last time X caused Y") but must maintain their specialty rails. Knowledge is domain-scoped, not freeform. |
| 6 | Cost ceiling | ~$500 per pipeline run (11+ stages) | Using subs (Anthropic, OpenAI), so API costs are minimized or eliminated. Budget is time/throughput, not dollars. |
| 7 | Where this lives | Standalone service, Pi under the hood | Must be standalone so it can migrate to Mosaic Stack in the future. Pi (mosaic bootstrap) provides the execution substrate. Already using Pi for BOD. Dogfood → prove → productize. |
PRD Template
The pipeline uses the existing Mosaic PRD infrastructure:
- Template: `~/.config/mosaic/templates/docs/PRD.md.template`
- Skill: `~/.config/mosaic/skills-local/prd/SKILL.md` (guided PRD generation with clarifying questions)
- Guide: `~/.config/mosaic/guides/PRD.md` (hard rules — PRD must exist before coding begins)
Required PRD Sections (from Mosaic guide)
- Problem statement and objective
- In-scope and out-of-scope
- User/stakeholder requirements
- Functional requirements
- Non-functional requirements (security, performance, reliability, observability)
- Acceptance criteria
- Constraints and dependencies
- Risks and open questions
- Testing and verification expectations
- Delivery/milestone intent
The PRD skill also generates user stories with specific acceptance criteria ("Button shows confirmation dialog before deleting" not "Works correctly").
Key rule from Mosaic: Implementation that diverges from PRD without PRD updates is a blocker. Change control: update PRD first → update plan → then implement.
Board Post-Run Review
The Board of Directors is NOT fire-and-forget. After a pipeline run completes (deploy or failure):
- Memos from each stage are compiled into a run summary
- Board reviews the summary for:
- Conflicts between stage outputs
- Scope drift from original brief
- Cost/timeline variance from estimates
- Strategic alignment issues
- Board adjusts strategy, priorities, or constraints for future briefs
- Learnings feed back into specialist memory and Orchestrator heuristics
This closes the loop. The pipeline doesn't just ship code — it learns from every run.
Architecture Review Fixes (v4, 2026-03-24)
Fixes applied based on Sonnet architecture review:
| Finding | Fix Applied |
|---|---|
| Dead-end states (REJECTED, NEEDS REVISION, CI failure, worker confusion) | All paths explicitly defined in orchestrator + Board stage |
| Security Architect conditional (keyword matching misses implicit auth) | Security Architect now ALWAYS included in Planning 1 |
| Board making technical composition decisions | New Brief Analyzer agent handles technical composition after Board approval |
| Orchestrator claimed "purely mechanical" but needs semantic analysis | Split into State Machine (mechanical) + Gate Reviewer (AI). Circularity detection is Gate Reviewer's job. |
| Test→Remediate had no loop limit | Shared 3-loop budget across Review + Test remediation |
| Open-ended debate (3-30 rounds) too loose, framing bias | Structured 3-phase debate: Independent positions → Responses → Synthesis. Tighter round limits (17-53 calls vs 12-120+). |
| Review only gets diff | Review now gets full module context + context packet, not just diff |
| Cross-brief dependency not enforced at runtime | State Machine enforces dependency ordering + file-level locking |
| Gate Reviewer reading full transcripts (context problem) | Gate Reviewer reads structured summaries, requests full transcript only on suspicion |
| No minimum specialist composition for Planning 2 | Guard added: at least 1 Language + 1 Domain specialist required |
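The shared 3-loop remediation budget from the fixes above can be a single counter consumed by both the Review and Test loops. A minimal sketch:

```typescript
// Shared remediation budget across Review + Test loops (3 total, per the v4 fix).
class RemediationBudget {
  private used = 0;
  constructor(private readonly max = 3) {}

  // Returns true if a remediation loop may start; false means escalate instead.
  tryConsume(): boolean {
    if (this.used >= this.max) return false;
    this.used += 1;
    return true;
  }

  get remaining(): number {
    return this.max - this.used;
  }
}
```

The key point is that the budget is one object shared by both loops, so a run can't burn three Review loops and then three more in Test.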
Remaining Open Questions
- Pi integration specifics: How exactly does Pi serve as the execution substrate? Board sessions already work via `mosaic yolo pi`. Does the full pipeline run as a Pi orchestration, or does Pi just handle individual stage sessions?
- Specialist memory storage: OpenBrain? Per-specialist markdown files? Scoped memory namespaces?
- Pipeline analytics: What metrics do we track per run? Stage duration, rework count, gate failure rate, estimate accuracy?
- Parallel briefs: Can multiple briefs from the same PRD run through the pipeline concurrently? Or strictly serial?
- Escalation UX: When the pipeline escalates to Jason, where does that notification go? Discord? TUI? Both?
Connection to Mosaic North Star
This pipeline IS the Mosaic vision, just running on agent infrastructure instead of a proper platform:
- PRD.md → Mosaic's task queue API
- Orchestrator → Mosaic's agent lifecycle management
- Gates → Mosaic's review gates
- Pipeline stages → Mosaic's workflow engine
- Dynamic composition → Mosaic's agent selection
Everything we build here gets dogfooded, refined, and eventually productized as Mosaic Stack features. We're building the engine that Mosaic will sell.
Standalone Architecture (decided)
The pipeline is built as a standalone service — not embedded in OpenClaw or tightly coupled to any single agent framework. This is deliberate:
- Pi (mosaic bootstrap) is the execution substrate — already proven with BOD sessions
- The Orchestrator is a mechanical state machine — it doesn't need an LLM, it needs a process manager
- Stage sessions are Pi/agent sessions — each planning/review stage spawns a session with the right participants
- Migration path to Mosaic Stack is clean — standalone service → Mosaic feature, not "rip out of OpenClaw"
The pattern: dogfood on our projects → track what works → extract into Mosaic Stack as a first-class feature.
References
- VoltAgent/awesome-codex-subagents: https://github.com/VoltAgent/awesome-codex-subagents
- VoltAgent/awesome-claude-code-subagents: https://github.com/VoltAgent/awesome-claude-code-subagents
- VoltAgent/awesome-openclaw-skills: https://github.com/VoltAgent/awesome-openclaw-skills
- Board implementation: `mosaic/board` branch (commit ad4304b)
- Mosaic North Star: `~/.openclaw/workspace/memory/mosaic-north-star.md`
- Existing agent registry: `~/.openclaw/workspace/agents/REGISTRY.yaml`
- Mosaic Queue PRD: `~/src/jarvis-brain/docs/planning/MOSAIC-QUEUE-PRD.md`
Brief Classification System (skip-BOD support)
Added: 2026-03-26
Not every brief needs full Board of Directors review. The classification system lets briefs skip stages based on their nature.
Classes
| Class | Pipeline | Use case |
|---|---|---|
| `strategic` | BOD → BA → Planning 1 → 2 → 3 | New features, architecture, integrations, security, budget decisions |
| `technical` | BA → Planning 1 → 2 → 3 | Refactors, bugfixes, UI tweaks, style changes |
| `hotfix` | Planning 1 → 2 → 3 | Urgent patches — skip both BOD and BA |
Classification priority (highest wins)
1. `--class` CLI flag on `forge run` or `forge resume`
2. YAML frontmatter `class:` field in the brief
3. Auto-classification via keyword analysis
Auto-classification keywords
- Strategic: security, pricing, architecture, integration, budget, strategy, compliance, migration, partnership, launch
- Technical: bugfix, bug, refactor, ui, style, tweak, typo, lint, cleanup, rename, hotfix, patch, css, format
- Default (no keyword match): strategic (conservative — full pipeline)
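The priority order and keyword rules above might look like this in the forge CLI. The function name and regexes are illustrative, and the tie-break (strategic keywords win when both sets match) is an assumption consistent with the conservative default:

```typescript
type BriefClass = "strategic" | "technical" | "hotfix";

// Keyword sets from the lists above (regexes are a sketch).
const STRATEGIC_KEYWORDS =
  /security|pricing|architecture|integration|budget|strategy|compliance|migration|partnership|launch/i;
const TECHNICAL_KEYWORDS =
  /bugfix|bug|refactor|\bui\b|style|tweak|typo|lint|cleanup|rename|hotfix|patch|css|format/i;

// Priority (highest wins): CLI flag > frontmatter `class:` > keyword analysis.
// Assumption: strategic keywords beat technical ones when both match.
function classifyBrief(
  text: string,
  cliFlag?: BriefClass,
  frontmatter?: BriefClass,
): BriefClass {
  if (cliFlag) return cliFlag;
  if (frontmatter) return frontmatter;
  if (STRATEGIC_KEYWORDS.test(text)) return "strategic";
  if (TECHNICAL_KEYWORDS.test(text)) return "technical";
  return "strategic"; // no keyword match → conservative default, full pipeline
}
```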
Overrides
- `--force-board` — forces the BOD stage to run even for technical/hotfix briefs
- `--class` on `resume` — re-classifies a run mid-flight (stages already passed are not re-run)
Backward compatibility
Existing briefs without a class field are auto-classified. The default (no matching keywords) is strategic, so all existing runs get the full pipeline unless keywords trigger technical.