stack/docs/claude/orchestrator.md
Jason Woltje a69904a47b docs(#344): Add CI verification to orchestrator guide
- Document CI configuration requirements
- Add CI verification step to execution loop
- Document auto-diagnosis categories and patterns
- Add CLI integration examples
- Add service integration code examples

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-07 11:22:58 -06:00


Mosaic Stack Orchestrator Guide

Platform-specific orchestrator protocol for Mosaic Stack.

Overview

The orchestrator cold-starts with only a review report location and a minimal kickstart message. It autonomously:

  1. Parses review reports to extract findings
  2. Categorizes findings into phases by severity
  3. Estimates token usage per task
  4. Creates Gitea issues (phase-level)
  5. Bootstraps docs/tasks.md from scratch
  6. Coordinates completion using worker agents

Key principle: The orchestrator is the sole writer of docs/tasks.md. Worker agents execute tasks and report results — they never modify the tracking file.


Orchestrator Boundaries (CRITICAL)

The orchestrator NEVER:

  • Edits source code directly (*.ts, *.tsx, *.js, etc.)
  • Runs quality gates itself (that's the worker's job)
  • Makes commits containing code changes
  • "Quickly fixes" something to save time — this is how drift starts

The orchestrator ONLY:

  • Reads/writes docs/tasks.md
  • Reads/writes docs/orchestrator-learnings.json
  • Spawns workers via the Task tool for ALL code changes
  • Parses worker JSON results
  • Commits task tracking updates (tasks.md, learnings)
  • Outputs status reports and handoff messages

If you find yourself about to edit source code, STOP. Spawn a worker instead. No exceptions. No "quick fixes."

Worker Limits:

  • Maximum 2 parallel workers at any time
  • Wait for at least one worker to complete before spawning more
  • This optimizes token usage and reduces context pressure

Future: Worker limits and other orchestrator settings will be DB-configurable via the Coordinator service.


Bootstrap Templates

Use templates from docs/templates/ (relative to repo root):

# Set environment variables
export PROJECT="mosaic-stack"
export MILESTONE="M6-Feature"
export CURRENT_DATETIME=$(date -Iseconds)
export TASK_PREFIX="MS-SEC"
export PHASE_ISSUE="#337"
export PHASE_BRANCH="fix/security"

# Create tasks.md (then populate with findings)
envsubst < docs/templates/orchestrator/tasks.md.template > docs/tasks.md

# Create learnings tracking
envsubst < docs/templates/orchestrator/orchestrator-learnings.json.template > docs/orchestrator-learnings.json

# Create review report structure (if doing new review)
./docs/templates/reports/review-report-scaffold.sh codebase-review mosaic-stack

Available templates:

| Template                                          | Purpose                         |
| ------------------------------------------------- | ------------------------------- |
| orchestrator/tasks.md.template                    | Task tracking table with schema |
| orchestrator/orchestrator-learnings.json.template | Variance tracking               |
| orchestrator/phase-issue-body.md.template         | Gitea issue body                |
| orchestrator/compaction-summary.md.template       | 60% checkpoint format           |
| reports/review-report-scaffold.sh                 | Creates report directory        |
| scratchpad.md.template                            | Per-task working document       |

See docs/templates/README.md for full documentation.

CLI Tools

Git and CI operations use the @mosaic/cli-tools package:

# Issue operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create -t "Title" -b "Body" -m "Milestone"
pnpm exec mosaic-issue-list -s open -m "Milestone"

# PR operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-pr-create -t "Title" -b "Body" -B develop
pnpm exec mosaic-pr-merge -n 42 -m squash -d

# Milestone operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-milestone-create -t "M7-Feature" -d "Description"

# CI/CD operations (Woodpecker)
pnpm exec mosaic-ci-pipeline-status --latest
pnpm exec mosaic-ci-pipeline-wait -n 42
pnpm exec mosaic-ci-pipeline-logs -n 42

See packages/cli-tools/README.md for full command reference.

CI Configuration

Set these environment variables for Woodpecker CI integration:

export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here"  # Get from ci.mosaicstack.dev/user

Phase 1: Bootstrap

Step 1: Parse Review Reports

Review reports follow this structure:

docs/reports/{report-name}/
├── 00-executive-summary.md   # Start here - overview and counts
├── 01-security-review.md     # Security findings with IDs like SEC-*
├── 02-code-quality-review.md # Code quality findings like CQ-*
├── 03-qa-test-coverage.md    # Test coverage gaps like TEST-*
└── ...

Extract findings by looking for:

  • Finding IDs (e.g., SEC-API-1, CQ-WEB-3, TEST-001)
  • Severity labels: Critical, High, Medium, Low
  • Affected files/components (use for repo column)
  • Specific line numbers or code patterns
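The extraction rules above can be sketched as a small line scanner. This is a minimal sketch, not the orchestrator's actual parser: the `Finding` shape, the ID regex, and the assumption that a severity label appears on the same line as the finding ID are all illustrative.

```typescript
// Hypothetical types and patterns for illustration; not part of the Mosaic Stack codebase.
type FindingSeverity = "Critical" | "High" | "Medium" | "Low";

interface Finding {
  id: string;
  severity: FindingSeverity | null; // null when no label appears on the same line
}

// Assumed ID shape: uppercase segments joined by hyphens, ending in a number
// (matches SEC-API-1, CQ-WEB-3, TEST-001).
const FINDING_ID = /\b[A-Z]+(?:-[A-Z]+)*-\d+\b/;
const SEVERITY_LABEL = /\b(Critical|High|Medium|Low)\b/;

function extractFindings(markdown: string): Finding[] {
  const findings: Finding[] = [];
  const seen = new Set<string>();
  for (const line of markdown.split("\n")) {
    const idMatch = FINDING_ID.exec(line);
    if (!idMatch || seen.has(idMatch[0])) continue; // keep the first mention of each ID
    seen.add(idMatch[0]);
    const sevMatch = SEVERITY_LABEL.exec(line);
    findings.push({
      id: idMatch[0],
      severity: sevMatch ? (sevMatch[1] as FindingSeverity) : null,
    });
  }
  return findings;
}
```

The regex will also match unrelated tokens of the same shape, so results should be checked against the executive summary counts.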

Step 2: Categorize into Phases

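| Severity | Phase | Focus                                   | Branch Pattern    |
| -------- | ----- | --------------------------------------- | ----------------- |
| Critical | 1     | Security vulnerabilities, data exposure | fix/security      |
| High     | 2     | Security hardening, auth gaps           | fix/security      |
| Medium   | 3     | Code quality, performance, bugs         | fix/code-quality  |
| Low      | 4     | Tests, documentation, cleanup           | fix/test-coverage |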

Within each phase, order tasks by:

  1. Blockers first (tasks that unblock others)
  2. Same-file tasks grouped together
  3. Simpler fixes before complex ones
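The severity mapping and ordering rules above can be sketched as a lookup table plus a sort comparator. The `PlannedTask` fields (`unblocksOthers`, `complexity`) are hypothetical, and grouping same-file tasks by sorting on the file path is one possible reading of rule 2.

```typescript
// Illustrative sketch; field names are assumptions, not the orchestrator's real types.
type Severity = "Critical" | "High" | "Medium" | "Low";

interface PhasePlan {
  phase: number;
  branch: string;
}

// Direct transcription of the severity table.
const PHASE_BY_SEVERITY: Record<Severity, PhasePlan> = {
  Critical: { phase: 1, branch: "fix/security" },
  High: { phase: 2, branch: "fix/security" },
  Medium: { phase: 3, branch: "fix/code-quality" },
  Low: { phase: 4, branch: "fix/test-coverage" },
};

interface PlannedTask {
  id: string;
  severity: Severity;
  file: string;
  unblocksOthers: boolean; // hypothetical flag: other tasks depend on this one
  complexity: number; // hypothetical score: lower means simpler
}

function orderTasks(tasks: PlannedTask[]): PlannedTask[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...tasks].sort(
    (a, b) =>
      PHASE_BY_SEVERITY[a.severity].phase - PHASE_BY_SEVERITY[b.severity].phase ||
      Number(b.unblocksOthers) - Number(a.unblocksOthers) || // blockers first
      a.file.localeCompare(b.file) || // same-file tasks end up adjacent
      a.complexity - b.complexity // simpler before complex
  );
}
```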

Step 3: Estimate Token Usage

| Task Type             | Estimate | Examples                                  |
| --------------------- | -------- | ----------------------------------------- |
| Single-line fix       | 3-5K     | Typo, wrong operator, missing null check  |
| Add guard/validation  | 5-8K     | Add auth decorator, input validation      |
| Fix error handling    | 8-12K    | Proper try/catch, error propagation       |
| Refactor pattern      | 10-15K   | Replace KEYS with SCAN, fix memory leak   |
| Add new functionality | 15-25K   | New service method, new component         |
| Write tests           | 15-25K   | Unit tests for untested service           |
| Complex refactor      | 25-40K   | Architectural change, multi-file refactor |

Adjust estimates based on:

  • Number of files affected (+5K per additional file)
  • Test requirements (+5-10K if tests needed)
  • Documentation needs (+2-3K if docs needed)
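As a worked example of the arithmetic above, the following sketch adds the adjustments to a base range. The `TaskKind` names and the `EstimateInput` shape are made up for illustration; only the numbers come from the table and the adjustment list.

```typescript
// Illustrative sketch; the type names are assumptions. All values are in thousands of tokens.
type TaskKind =
  | "single-line-fix"
  | "guard"
  | "error-handling"
  | "refactor-pattern"
  | "new-functionality"
  | "tests"
  | "complex-refactor";

const BASE_RANGE_K: Record<TaskKind, [number, number]> = {
  "single-line-fix": [3, 5],
  guard: [5, 8],
  "error-handling": [8, 12],
  "refactor-pattern": [10, 15],
  "new-functionality": [15, 25],
  tests: [15, 25],
  "complex-refactor": [25, 40],
};

interface EstimateInput {
  kind: TaskKind;
  fileCount: number;
  needsTests: boolean;
  needsDocs: boolean;
}

function estimateRangeK(input: EstimateInput): [number, number] {
  const [low, high] = BASE_RANGE_K[input.kind];
  const extraFiles = Math.max(0, input.fileCount - 1) * 5; // +5K per additional file
  const testCost: [number, number] = input.needsTests ? [5, 10] : [0, 0]; // +5-10K
  const docCost: [number, number] = input.needsDocs ? [2, 3] : [0, 0]; // +2-3K
  return [
    low + extraFiles + testCost[0] + docCost[0],
    high + extraFiles + testCost[1] + docCost[1],
  ];
}
```

For instance, a guard added across 3 files with tests comes to 5-8K base, plus 10K for the two additional files, plus 5-10K for tests, giving a 20-28K range.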

Step 4: Determine Dependencies

Automatic dependency rules:

  1. All tasks in Phase N depend on the Phase N-1 verification task
  2. Tasks touching the same file should be sequential (earlier blocks later)
  3. Auth/security foundation tasks block tasks that rely on them
  4. Each phase ends with a verification task that depends on all phase tasks

Step 5: Create Gitea Issues (Phase-Level)

Create ONE issue per phase using @mosaic/cli-tools:

# Use mosaic CLI tools (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create \
  -t "Phase 1: Critical Security Fixes" \
  -b "## Findings
- SEC-API-1: Description
- SEC-WEB-2: Description

## Acceptance Criteria
- [ ] All critical findings remediated
- [ ] Quality gates passing" \
  -l "security" \
  -m "{milestone-name}"

CLI tools location: packages/cli-tools/bin/ - see packages/cli-tools/README.md for full documentation.

Step 6: Create docs/tasks.md

Create the file with this exact schema:

# Tasks

| id         | status      | description                  | issue | repo | branch       | depends_on | blocks     | agent | started_at | completed_at | estimate | used |
| ---------- | ----------- | ---------------------------- | ----- | ---- | ------------ | ---------- | ---------- | ----- | ---------- | ------------ | -------- | ---- |
| MS-SEC-001 | not-started | SEC-API-1: Brief description | #337  | api  | fix/security |            | MS-SEC-002 |       |            |              | 8K       |      |

Column definitions:

| Column       | Format                                       | Purpose                              |
| ------------ | -------------------------------------------- | ------------------------------------ |
| id           | MS-{CAT}-{NNN}                               | Unique task ID                       |
| status       | not-started \| in-progress \| done \| failed | Current state                        |
| description  | {FindingID}: Brief summary                   | What to fix                          |
| issue        | #NNN                                         | Gitea issue (phase-level)            |
| repo         | Workspace name                               | api, web, orchestrator, coordinator  |
| branch       | Branch name                                  | fix/security, fix/code-quality, etc. |
| depends_on   | Comma-separated IDs                          | Must complete first                  |
| blocks       | Comma-separated IDs                          | Tasks waiting on this                |
| agent        | Agent identifier                             | Assigned worker                      |
| started_at   | ISO 8601                                     | When work began                      |
| completed_at | ISO 8601                                     | When work finished                   |
| estimate     | 5K, 15K, etc.                                | Predicted token usage                |
| used         | 4.2K, 12.8K, etc.                            | Actual usage                         |
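To make the schema concrete, here is a minimal sketch of reading one table row into a typed record. `TaskRow` and `parseTaskRow` are hypothetical names, and the parser assumes no cell contains a literal pipe character.

```typescript
// Illustrative sketch; not the orchestrator's actual tasks.md parser.
type TaskStatus = "not-started" | "in-progress" | "done" | "failed";

interface TaskRow {
  id: string;
  status: TaskStatus;
  description: string;
  issue: string;
  repo: string;
  branch: string;
  dependsOn: string[];
  blocks: string[];
  agent: string;
  startedAt: string;
  completedAt: string;
  estimate: string;
  used: string;
}

const TASK_STATUSES: readonly string[] = ["not-started", "in-progress", "done", "failed"];

function isTaskStatus(value: string): value is TaskStatus {
  return TASK_STATUSES.indexOf(value) !== -1;
}

// Turns "MS-SEC-001, MS-SEC-002" into ["MS-SEC-001", "MS-SEC-002"]; empty cells give [].
function splitIds(cell: string): string[] {
  return cell.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
}

// Returns null for the header row, the separator row, and anything malformed.
function parseTaskRow(line: string): TaskRow | null {
  const cells = line.trim().replace(/^\|/, "").replace(/\|$/, "").split("|").map((c) => c.trim());
  if (cells.length !== 13) return null;
  const cell = (i: number): string => cells[i] ?? "";
  const status = cell(1);
  if (!isTaskStatus(status)) return null;
  return {
    id: cell(0), status, description: cell(2), issue: cell(3), repo: cell(4), branch: cell(5),
    dependsOn: splitIds(cell(6)), blocks: splitIds(cell(7)), agent: cell(8),
    startedAt: cell(9), completedAt: cell(10), estimate: cell(11), used: cell(12),
  };
}
```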

Step 7: Commit Bootstrap

git add docs/tasks.md docs/orchestrator-learnings.json
git commit -m "chore(orchestrator): Bootstrap tasks.md from review report

Parsed {N} findings into {M} tasks across {P} phases.
Estimated total: {X}K tokens."
git push

Phase 2: Execution Loop

1. git pull --rebase
2. Read docs/tasks.md
3. Find next task: status=not-started AND all depends_on are done
4. If no task available:
   - All done? → Report success, run final retrospective, STOP
   - Some blocked? → Report deadlock, STOP
5. Update tasks.md: status=in-progress, agent={identifier}, started_at={now}
6. Spawn worker agent (Task tool) with task details
7. Wait for worker completion
8. Parse worker result (JSON)
9. Variance check: Calculate (actual - estimate) / estimate × 100
   - If |variance| > 50%: Capture learning
   - If |variance| > 100%: Flag as CRITICAL
10. Update tasks.md: status=done/failed, completed_at={now}, used={actual}
11. Cleanup reports: Remove processed report files
12. Commit + push: git add docs/tasks.md && git commit && git push
13. CI verification (if configured):
    - Check latest pipeline status for the branch
    - Wait for pipeline completion (timeout: 30min)
    - On failure: Fetch logs and auto-diagnose
    - Common failures: lint, type-check, test, build, security
    - If CI fails: Mark task as failed, update tasks.md, restart from step 1
14. If phase verification task: Run phase retrospective
15. Check context usage
16. If >= 55%: Output COMPACTION REQUIRED checkpoint, STOP, wait for user
17. If < 55%: Go to step 1
18. After user runs /compact and says "continue": Go to step 1
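Steps 3 and 4 of the loop (pick the next runnable task, or report completion or deadlock) can be sketched as a pure function over the parsed task list. The `LoopTask` and `NextStep` types are hypothetical; the sketch treats any task that is not done, including failed ones, as blocking.

```typescript
// Illustrative sketch of steps 3-4; type names are assumptions.
interface LoopTask {
  id: string;
  status: "not-started" | "in-progress" | "done" | "failed";
  dependsOn: string[];
}

type NextStep =
  | { kind: "run"; task: LoopTask }
  | { kind: "all-done" }
  | { kind: "deadlock"; blocked: string[] };

function pickNext(tasks: LoopTask[]): NextStep {
  const done = new Set(tasks.filter((t) => t.status === "done").map((t) => t.id));
  // Step 3: first not-started task whose dependencies are all done.
  const ready = tasks.find(
    (t) => t.status === "not-started" && t.dependsOn.every((dep) => done.has(dep))
  );
  if (ready) return { kind: "run", task: ready };
  // Step 4: nothing runnable, so either everything is finished or something is stuck.
  if (tasks.every((t) => t.status === "done")) return { kind: "all-done" };
  return {
    kind: "deadlock",
    blocked: tasks.filter((t) => t.status !== "done").map((t) => t.id),
  };
}
```

Keeping this selection logic free of I/O makes it easy to check against a hand-written tasks.md before any worker is spawned.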

Worker Prompt Template

## Task Assignment: {id}

**Description:** {description}
**Repository:** apps/{repo}
**Branch:** {branch}

**Reference:** See `docs/reports/` for detailed finding description. Search for the finding ID.

## Workflow

1. Checkout branch: `(git checkout {branch} && git pull) || git checkout -b {branch} develop`
2. Read the finding details from the report
3. Implement the fix following existing code patterns
4. Run quality gates (ALL must pass):
   ```bash
   pnpm lint && pnpm typecheck && pnpm test
   ```
5. If gates fail: Fix and retry. Do NOT report success with failures.
6. Commit: `git commit -m "fix({finding_id}): brief description"`
7. Push: `git push origin {branch}`
8. Report result as JSON (see format below)

## Result Format (MANDATORY)

```json
{
  "task_id": "{id}",
  "status": "success|failed",
  "used": "5.2K",
  "commit_sha": "abc123",
  "notes": "Brief summary of what was done"
}
```

## Rules

- DO NOT modify docs/tasks.md
- DO NOT claim other tasks
- Complete this single task, report results, done
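On the orchestrator side, step 8 of the loop (parsing the worker's JSON result) could be validated along these lines. This is a sketch under assumptions: `WorkerResult` and `parseWorkerResult` are hypothetical names, every field is treated as required, and `used` is assumed to always follow the `5.2K` form shown in the template.

```typescript
// Illustrative sketch; not the orchestrator's actual result handling.
interface WorkerResult {
  taskId: string;
  status: "success" | "failed";
  usedK: number; // "5.2K" becomes 5.2
  commitSha: string;
  notes: string;
}

function parseWorkerResult(raw: string): WorkerResult {
  const data: unknown = JSON.parse(raw); // throws on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("worker result must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const str = (key: string): string => {
    const value = obj[key];
    if (typeof value !== "string") throw new Error(`missing or non-string field: ${key}`);
    return value;
  };
  const status = str("status");
  if (status !== "success" && status !== "failed") {
    throw new Error(`unexpected status: ${status}`);
  }
  const usedMatch = /^(\d+(?:\.\d+)?)K$/.exec(str("used"));
  if (!usedMatch) throw new Error("used must look like 5.2K");
  return {
    taskId: str("task_id"),
    status,
    usedK: Number(usedMatch[1]),
    commitSha: str("commit_sha"),
    notes: str("notes"),
  };
}
```

Rejecting anything outside the documented shape keeps a malformed worker report from being written into tasks.md as a success.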

CI Verification (Step 13)

After pushing code, the orchestrator can optionally monitor CI pipeline status to catch failures early.

Configuration

CI monitoring requires environment variables:

export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here"

If these variables are not set, CI verification is skipped and the orchestrator logs a warning.

Verification Process

1. After git push, get latest pipeline for the branch
2. Wait for pipeline to complete (timeout: 30 minutes, configurable)
3. Poll every 10 seconds (configurable)
4. On completion:
   - Success: Continue to next step
   - Failure: Fetch logs and auto-diagnose

Auto-Diagnosis

When a pipeline fails, the orchestrator fetches logs and categorizes the failure:

| Category   | Pattern                         | Suggestion                           |
| ---------- | ------------------------------- | ------------------------------------ |
| Lint       | eslint, lint.*error             | Run pnpm lint locally                |
| Type Check | type.*error, tsc.*error         | Run pnpm typecheck locally           |
| Test       | test.*fail, vitest.*fail        | Run pnpm test locally                |
| Build      | build.*fail, compilation.*fail  | Run pnpm build locally               |
| Security   | secret, security, vulnerability | Review security scan, remove secrets |
| Unknown    | (fallback)                      | Review full logs                     |
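The table above can be read as an ordered list of regex rules with a fallback. The sketch below is not the `CIOperationsService` implementation: first-match-wins in table order and case-insensitive matching are assumptions, and `diagnose` is a hypothetical name.

```typescript
// Illustrative sketch; rule order and the "i" flag are assumptions.
interface DiagnosisRule {
  category: string;
  patterns: RegExp[];
  suggestion: string;
}

const DIAGNOSIS_RULES: DiagnosisRule[] = [
  { category: "Lint", patterns: [/eslint/i, /lint.*error/i], suggestion: "Run pnpm lint locally" },
  { category: "Type Check", patterns: [/type.*error/i, /tsc.*error/i], suggestion: "Run pnpm typecheck locally" },
  { category: "Test", patterns: [/test.*fail/i, /vitest.*fail/i], suggestion: "Run pnpm test locally" },
  { category: "Build", patterns: [/build.*fail/i, /compilation.*fail/i], suggestion: "Run pnpm build locally" },
  { category: "Security", patterns: [/secret/i, /security/i, /vulnerability/i], suggestion: "Review security scan, remove secrets" },
];

function diagnose(log: string): { category: string; suggestion: string } {
  for (const rule of DIAGNOSIS_RULES) {
    // "." does not cross newlines, so each pattern must match within a single log line.
    if (rule.patterns.some((pattern) => pattern.test(log))) {
      return { category: rule.category, suggestion: rule.suggestion };
    }
  }
  return { category: "Unknown", suggestion: "Review full logs" };
}
```

Because the patterns are broad, rule order matters: a log that mentions both a lint error and a test failure is reported as Lint under this ordering.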

Failure Handling

When CI fails:

  1. Log the failure category and diagnosis
  2. Update task status to failed in tasks.md
  3. Include diagnosis in task notes
  4. Commit the task update
  5. Options:
    • Re-spawn worker with error context to fix
    • Skip task and continue (if non-critical)
    • Stop and alert (if critical blocker)

CLI Integration

Use @mosaic/cli-tools for CI operations:

# Check latest pipeline status
pnpm exec mosaic-ci-pipeline-status --latest

# Wait for specific pipeline
pnpm exec mosaic-ci-pipeline-wait -n 42 -t 1800

# Get logs on failure
pnpm exec mosaic-ci-pipeline-logs -n 42

Service Integration

The CIOperationsService (in apps/orchestrator/src/ci/) provides:

  • getLatestPipeline(repo) - Get most recent pipeline
  • getPipeline(repo, number) - Get specific pipeline
  • waitForPipeline(repo, number, options) - Wait with auto-diagnosis
  • getPipelineLogs(repo, number) - Fetch logs

Example usage in orchestrator:

// After git push
const repo = "mosaic/stack";
const pipeline = await ciService.getLatestPipeline(repo);

if (pipeline) {
  const result = await ciService.waitForPipeline(repo, pipeline.number, {
    timeout: 1800,
    fetchLogsOnFailure: true,
  });

  if (!result.success && result.diagnosis) {
    this.logger.warn(`CI failed: ${result.diagnosis.category}`);
    this.logger.warn(`Suggestion: ${result.diagnosis.suggestion}`);
    // Handle failure...
  }
}

Context Threshold Protocol (Orchestrator Replacement)

Threshold: 55-60% context usage

Why replacement, not compaction?

  • Compaction causes protocol drift — agent "remembers" gist but loses specifics
  • Post-compaction agents may violate core rules (e.g., letting workers modify tasks.md)
  • Fresh orchestrator has 100% protocol fidelity
  • All state lives in docs/tasks.md — the orchestrator is stateless and replaceable

At threshold (55-60%):

  1. Complete current task
  2. Persist all state:
    • Update docs/tasks.md with all progress
    • Update docs/orchestrator-learnings.json with variances
    • Commit and push both files
  3. Output ORCHESTRATOR HANDOFF message with ready-to-use takeover kickstart
  4. STOP COMPLETELY — do not continue working

Handoff message format:

---
⚠️ ORCHESTRATOR HANDOFF REQUIRED

Context: {X}% — Replacement recommended to prevent drift

Progress: {completed}/{total} tasks ({percentage}%)
Current phase: Phase {N} ({phase_name})

State persisted:
- docs/tasks.md ✓
- docs/orchestrator-learnings.json ✓

## Takeover Kickstart

Copy and paste this to spawn a fresh orchestrator:

---
## Continuation Mission

Continue {mission_description} from existing state.

## Setup
- Project: /home/localadmin/src/mosaic-stack
- State: docs/tasks.md (already populated)
- Protocol: docs/claude/orchestrator.md
- Quality gates: pnpm lint && pnpm typecheck && pnpm test

## Resume Point
- Next task: {task_id}
- Phase: {current_phase}
- Progress: {completed}/{total} tasks ({percentage}%)

## Instructions
1. Read docs/claude/orchestrator.md for protocol
2. Read docs/tasks.md to understand current state
3. Continue execution from task {task_id}
4. Follow Two-Phase Completion Protocol
5. You are the SOLE writer of docs/tasks.md
---

STOP: Terminate this session and spawn fresh orchestrator with the kickstart above.
---

Future: Coordinator Automation

When the Mosaic Stack Coordinator service is implemented, it will:

  • Monitor orchestrator stdout for context percentage
  • Detect the handoff checkpoint message
  • Parse the takeover kickstart
  • Automatically spawn fresh orchestrator
  • Log handoff events for debugging

For now, the human acts as Coordinator.

Rules:

  • Do NOT attempt to compact yourself — compaction causes drift
  • Do NOT continue past 60%
  • Do NOT claim you can "just continue" — protocol drift is real
  • STOP means STOP — the user (Coordinator) will spawn your replacement

Two-Phase Completion Protocol

Each major phase uses a two-phase approach to maximize completion while managing diminishing returns.

Bulk Phase (Target: 90%)

  • Focus on tractable errors
  • Parallelize where possible
  • When 90% reached, transition to Polish (do NOT declare success)

Polish Phase (Target: 100%)

  1. Inventory: List all remaining errors with file:line

  2. Categorize:

    | Category      | Criteria                | Action                        |
    | ------------- | ----------------------- | ----------------------------- |
    | Quick-win     | <5 min, straightforward | Fix immediately               |
    | Medium        | 5-30 min, clear path    | Fix in order                  |
    | Hard          | >30 min or uncertain    | Attempt 15 min, then document |
    | Architectural | Requires design change  | Document and defer            |

  3. Work priority: Quick-win → Medium → Hard

  4. Document deferrals in docs/deferred-errors.md:

    ## MS-XXX: [Error description]
    
    - File: path/to/file.ts:123
    - Error: [exact error message]
    - Category: Hard | Architectural | Framework Limitation
    - Reason: [why this is non-trivial]
    - Suggested approach: [how to fix in future]
    - Risk: Low | Medium | High
    
  5. Phase complete when:

    • All Quick-win/Medium fixed
    • All Hard attempted (fixed or documented)
    • Architectural items documented with justification

Phase Boundary Rule

Do NOT proceed to the next major phase until the current phase reaches Polish completion:

✅ Phase 2 Bulk: 91%
✅ Phase 2 Polish: 118 errors triaged
   - 40 medium → fixed
   - 78 low → EACH documented with rationale
✅ Phase 2 Complete: Created docs/deferred-errors.md
→ NOW proceed to Phase 3

❌ WRONG: Phase 2 at 91%, "low priority acceptable", starting Phase 3

Reporting

When transitioning from Bulk to Polish:

Phase X Bulk Complete: {N}% ({fixed}/{total})
Entering Polish Phase: {remaining} errors to triage

When Polish Phase complete:

Phase X Complete: {final_pct}% ({fixed}/{total})
- Quick-wins: {n} fixed
- Medium: {n} fixed
- Hard: {n} fixed, {n} documented
- Framework limitations: {n} documented

Learning & Retrospective

Variance Thresholds

| Variance | Action                                               |
| -------- | ---------------------------------------------------- |
| 0-30%    | Log only (acceptable)                                |
| 30-50%   | Flag for review                                      |
| 50-100%  | Capture learning to docs/orchestrator-learnings.json |
| >100%    | CRITICAL — review task classification                |

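The variance thresholds, combined with the formula from step 9 of the execution loop, can be sketched as follows. The function names are hypothetical. The table's bands share their endpoints, so this sketch assigns a boundary value to the lower band, which matches the strict "> 50%" and "> 100%" comparisons in the loop.

```typescript
// Illustrative sketch; boundary handling is an assumption.
type VarianceAction = "log" | "flag-for-review" | "capture-learning" | "critical";

// (actual - estimate) / estimate × 100, with both values in thousands of tokens.
function variancePercent(estimateK: number, usedK: number): number {
  if (estimateK <= 0) throw new Error("estimate must be positive");
  return ((usedK - estimateK) / estimateK) * 100;
}

function varianceAction(percent: number): VarianceAction {
  const magnitude = Math.abs(percent); // overruns and underruns are treated alike
  if (magnitude > 100) return "critical";
  if (magnitude > 50) return "capture-learning";
  if (magnitude > 30) return "flag-for-review";
  return "log";
}
```

For example, a task estimated at 8K that used 12.8K has a variance of about 60%, which falls in the capture-learning band.
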
Task Type Classification

| Type         | Keywords                               | Base Estimate    |
| ------------ | -------------------------------------- | ---------------- |
| STYLE_FIX    | "formatting", "prettier", "lint"       | 3-5K             |
| BULK_CLEANUP | "unused", "warnings", "~N files"       | file_count × 550 |
| GUARD_ADD    | "add guard", "decorator", "validation" | 5-8K             |
| SECURITY_FIX | "sanitize", "injection", "XSS"         | 8-12K × 2.5      |
| AUTH_ADD     | "authentication", "auth"               | 15-25K           |
| REFACTOR     | "refactor", "replace", "migrate"       | 10-15K           |
| TEST_ADD     | "add tests", "coverage"                | 15-25K           |
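One way to apply the keyword table is a first-match classifier over the task description. This is a sketch only: the precedence (table order), the case-insensitive substring matching, and the reading of `file_count × 550` as a token count are all assumptions.

```typescript
// Illustrative sketch; precedence and matching rules are assumptions.
type TaskType =
  | "STYLE_FIX"
  | "BULK_CLEANUP"
  | "GUARD_ADD"
  | "SECURITY_FIX"
  | "AUTH_ADD"
  | "REFACTOR"
  | "TEST_ADD";

const TYPE_KEYWORDS: Array<[TaskType, RegExp]> = [
  ["STYLE_FIX", /formatting|prettier|lint/i],
  ["BULK_CLEANUP", /unused|warnings|~\d+ files/i],
  ["GUARD_ADD", /add guard|decorator|validation/i],
  ["SECURITY_FIX", /sanitize|injection|xss/i],
  ["AUTH_ADD", /auth/i], // also covers "authentication"
  ["REFACTOR", /refactor|replace|migrate/i],
  ["TEST_ADD", /add tests|coverage/i],
];

// Returns the first type whose keywords appear in the description, or null if none match.
function classifyTask(description: string): TaskType | null {
  for (const [type, pattern] of TYPE_KEYWORDS) {
    if (pattern.test(description)) return type;
  }
  return null;
}

// BULK_CLEANUP base estimate: file_count × 550 (assumed to be tokens).
function bulkCleanupTokens(fileCount: number): number {
  return fileCount * 550;
}
```

Descriptions that hit several rows, such as "add tests for auth service", resolve to the earlier row (AUTH_ADD here), so the row order should be reviewed whenever variance data points to a misclassification.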

Report Cleanup

QA automation generates report files in docs/reports/qa-automation/pending/. Clean up after processing.

| Event              | Action                                |
| ------------------ | ------------------------------------- |
| Task success       | Delete matching reports from pending/ |
| Task failed        | Move reports to escalated/            |
| Phase verification | Clean up all pending/ reports         |
| Milestone complete | Archive or delete escalated/          |

Stopping Criteria

ONLY stop if:

  1. All tasks in docs/tasks.md are done
  2. Critical blocker preventing progress (document and alert)
  3. Context usage >= 55% — output COMPACTION REQUIRED checkpoint and wait
  4. Absolute context limit reached AND cannot compact further

DO NOT stop to ask "should I continue?" — the answer is always YES. DO stop at 55-60% — output the compaction checkpoint and wait for user to run /compact.


Sprint Completion Protocol

When all tasks in docs/tasks.md are done (or triaged as deferred), archive the sprint artifacts before stopping. This preserves them for post-mortems, variance calibration, and historical reference.

Archive Steps

  1. Create archive directory (if it doesn't exist):

    mkdir -p docs/tasks/
    
  2. Move tasks.md to archive:

    mv docs/tasks.md docs/tasks/{milestone-name}-tasks.md
    

    Example: docs/tasks/M6-AgentOrchestration-Fixes-tasks.md

  3. Move learnings to archive:

    mv docs/orchestrator-learnings.json docs/tasks/{milestone-name}-learnings.json
    
  4. Commit the archive:

    git add docs/tasks/
    git rm docs/tasks.md docs/orchestrator-learnings.json 2>/dev/null || true
    git commit -m "chore(orchestrator): Archive {milestone-name} sprint artifacts
    
    {completed}/{total} tasks completed, {deferred} deferred.
    Archived to docs/tasks/ for post-mortem reference."
    git push
    
  5. Run final retrospective — review variance patterns and propose updates to estimation heuristics.

Recovery

If an orchestrator starts and docs/tasks.md does not exist, check docs/tasks/ for the most recent archive:

ls -t docs/tasks/*-tasks.md 2>/dev/null | head -1

If found, this may indicate another session archived the file. The orchestrator should:

  1. Report what it found in docs/tasks/
  2. Ask whether to resume from the archived file or bootstrap fresh
  3. If resuming: copy the archive back to docs/tasks.md and continue

Retention Policy

Keep all archived sprints indefinitely. They are small text files and valuable for:

  • Post-mortem analysis
  • Estimation variance calibration across milestones
  • Understanding what was deferred and why
  • Onboarding new orchestrators to project history

Kickstart Message Format

## Mission

Remediate findings from the codebase review.

## Setup

- Project: /home/localadmin/src/mosaic-stack
- Review: docs/reports/{report-name}/
- Quality gates: pnpm lint && pnpm typecheck && pnpm test
- Milestone: {milestone-name}
- Task prefix: MS

## Protocol

Read docs/claude/orchestrator.md for full instructions.

## Start

Bootstrap from the review report, then execute until complete.

Quick Reference

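| Phase     | Action                                                                      |
| --------- | --------------------------------------------------------------------------- |
| Bootstrap | Parse reports → Categorize → Estimate → Create issues → Create tasks.md     |
| Execute   | Loop: claim → spawn worker → update → commit                                |
| Compact   | At 60%: summarize, clear history, continue                                  |
| Stop      | Queue empty, blocker, or context limit                                      |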

Orchestrator owns tasks.md. Workers execute and report. Single writer eliminates conflicts.