stack/docs/claude/orchestrator.md
Jason Woltje a69904a47b docs(#344): Add CI verification to orchestrator guide
- Document CI configuration requirements
- Add CI verification step to execution loop
- Document auto-diagnosis categories and patterns
- Add CLI integration examples
- Add service integration code examples

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-07 11:22:58 -06:00

# Mosaic Stack Orchestrator Guide
> Platform-specific orchestrator protocol for Mosaic Stack.
## Overview
The orchestrator **cold-starts** with only a review report location and a minimal kickstart message. It autonomously:
1. Parses review reports to extract findings
2. Categorizes findings into phases by severity
3. Estimates token usage per task
4. Creates Gitea issues (phase-level)
5. Bootstraps `docs/tasks.md` from scratch
6. Coordinates completion using worker agents
**Key principle:** The orchestrator is the **sole writer** of `docs/tasks.md`. Worker agents execute tasks and report results — they never modify the tracking file.
---
## Orchestrator Boundaries (CRITICAL)
**The orchestrator NEVER:**
- Edits source code directly (`*.ts`, `*.tsx`, `*.js`, etc.)
- Runs quality gates itself (that's the worker's job)
- Makes commits containing code changes
- "Quickly fixes" something to save time — this is how drift starts
**The orchestrator ONLY:**
- Reads/writes `docs/tasks.md`
- Reads/writes `docs/orchestrator-learnings.json`
- Spawns workers via the Task tool for ALL code changes
- Parses worker JSON results
- Commits task tracking updates (tasks.md, learnings)
- Outputs status reports and handoff messages
**If you find yourself about to edit source code, STOP.**
Spawn a worker instead. No exceptions. No "quick fixes."
**Worker Limits:**
- Maximum **2 parallel workers** at any time
- Wait for at least one worker to complete before spawning more
- This optimizes token usage and reduces context pressure
> **Future:** Worker limits and other orchestrator settings will be DB-configurable via the Coordinator service.
---
## Bootstrap Templates
Use templates from `docs/templates/` (relative to repo root):
```bash
# Set environment variables
export PROJECT="mosaic-stack"
export MILESTONE="M6-Feature"
export CURRENT_DATETIME=$(date -Iseconds)
export TASK_PREFIX="MS-SEC"
export PHASE_ISSUE="#337"
export PHASE_BRANCH="fix/security"
# Create tasks.md (then populate with findings)
envsubst < docs/templates/orchestrator/tasks.md.template > docs/tasks.md
# Create learnings tracking
envsubst < docs/templates/orchestrator/orchestrator-learnings.json.template > docs/orchestrator-learnings.json
# Create review report structure (if doing new review)
./docs/templates/reports/review-report-scaffold.sh codebase-review mosaic-stack
```
**Available templates:**
| Template | Purpose |
| --------------------------------------------------- | ------------------------------- |
| `orchestrator/tasks.md.template` | Task tracking table with schema |
| `orchestrator/orchestrator-learnings.json.template` | Variance tracking |
| `orchestrator/phase-issue-body.md.template` | Gitea issue body |
| `orchestrator/compaction-summary.md.template` | 60% checkpoint format |
| `reports/review-report-scaffold.sh` | Creates report directory |
| `scratchpad.md.template` | Per-task working document |
See `docs/templates/README.md` for full documentation.
### CLI Tools
Git and CI operations use the `@mosaic/cli-tools` package:
```bash
# Issue operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create -t "Title" -b "Body" -m "Milestone"
pnpm exec mosaic-issue-list -s open -m "Milestone"
# PR operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-pr-create -t "Title" -b "Body" -B develop
pnpm exec mosaic-pr-merge -n 42 -m squash -d
# Milestone operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-milestone-create -t "M7-Feature" -d "Description"
# CI/CD operations (Woodpecker)
pnpm exec mosaic-ci-pipeline-status --latest
pnpm exec mosaic-ci-pipeline-wait -n 42
pnpm exec mosaic-ci-pipeline-logs -n 42
```
See `packages/cli-tools/README.md` for full command reference.
### CI Configuration
Set these environment variables for Woodpecker CI integration:
```bash
export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here" # Get from ci.mosaicstack.dev/user
```
---
## Phase 1: Bootstrap
### Step 1: Parse Review Reports
Review reports follow this structure:
```
docs/reports/{report-name}/
├── 00-executive-summary.md # Start here - overview and counts
├── 01-security-review.md # Security findings with IDs like SEC-*
├── 02-code-quality-review.md # Code quality findings like CQ-*
├── 03-qa-test-coverage.md # Test coverage gaps like TEST-*
└── ...
```
**Extract findings by looking for:**
- Finding IDs (e.g., `SEC-API-1`, `CQ-WEB-3`, `TEST-001`)
- Severity labels: Critical, High, Medium, Low
- Affected files/components (use for `repo` column)
- Specific line numbers or code patterns
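The extraction step can be sketched as a line scan. This is a minimal illustration, assuming each finding ID sits on the same line as its severity label; the `ExtractedFinding` type, the `extractFindings` function, and the ID pattern (built only from the example prefixes above) are hypothetical and not part of the Mosaic codebase.

```typescript
type FindingSeverity = "Critical" | "High" | "Medium" | "Low";

// Hypothetical shape for an extracted finding.
interface ExtractedFinding {
  id: string;
  severity: FindingSeverity | null; // null when no label is on the same line
}

// Matches IDs shaped like SEC-API-1, CQ-WEB-3, TEST-001 (assumed prefixes).
const FINDING_ID = /\b(?:SEC|CQ|TEST)(?:-[A-Z]+)?-\d+\b/;
const SEVERITY_WORD = /\b(Critical|High|Medium|Low)\b/i;

function normalizeSeverity(word: string): FindingSeverity {
  const lower = word.toLowerCase();
  return (lower.charAt(0).toUpperCase() + lower.slice(1)) as FindingSeverity;
}

function extractFindings(reportText: string): ExtractedFinding[] {
  const findings: ExtractedFinding[] = [];
  for (const line of reportText.split("\n")) {
    const idMatch = line.match(FINDING_ID);
    if (!idMatch) continue; // lines without a finding ID are ignored
    const sevMatch = line.match(SEVERITY_WORD);
    findings.push({
      id: idMatch[0],
      severity: sevMatch ? normalizeSeverity(sevMatch[1]) : null,
    });
  }
  return findings;
}
```

Reports that put the severity on a separate line or in a table column would need a different parser; the sketch only shows the kind of pattern matching involved.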
### Step 2: Categorize into Phases
| Severity | Phase | Focus | Branch Pattern |
| -------- | ----- | --------------------------------------- | ------------------- |
| Critical | 1 | Security vulnerabilities, data exposure | `fix/security` |
| High | 2 | Security hardening, auth gaps | `fix/security` |
| Medium | 3 | Code quality, performance, bugs | `fix/code-quality` |
| Low | 4 | Tests, documentation, cleanup | `fix/test-coverage` |
**Within each phase, order tasks by:**
1. Blockers first (tasks that unblock others)
2. Same-file tasks grouped together
3. Simpler fixes before complex ones
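As an illustration, the severity table above maps directly onto a lookup. The names `PHASE_BY_SEVERITY` and `groupByPhase` are assumptions made for this sketch; only the phase numbers and branch patterns come from the table.

```typescript
type Severity = "Critical" | "High" | "Medium" | "Low";

// Phase number and branch pattern per severity, copied from the table above.
const PHASE_BY_SEVERITY: Record<Severity, { phase: number; branch: string }> = {
  Critical: { phase: 1, branch: "fix/security" },
  High: { phase: 2, branch: "fix/security" },
  Medium: { phase: 3, branch: "fix/code-quality" },
  Low: { phase: 4, branch: "fix/test-coverage" },
};

// Groups findings into phases while preserving their input order.
function groupByPhase<T extends { severity: Severity }>(findings: T[]): Map<number, T[]> {
  const phases = new Map<number, T[]>();
  for (const finding of findings) {
    const { phase } = PHASE_BY_SEVERITY[finding.severity];
    const bucket = phases.get(phase) ?? [];
    bucket.push(finding);
    phases.set(phase, bucket);
  }
  return phases;
}
```

Ordering within a phase (blockers first, same-file grouping, simpler fixes first) still needs judgment and is not attempted here.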
### Step 3: Estimate Token Usage
| Task Type | Estimate | Examples |
| --------------------- | -------- | ----------------------------------------- |
| Single-line fix | 3-5K | Typo, wrong operator, missing null check |
| Add guard/validation | 5-8K | Add auth decorator, input validation |
| Fix error handling | 8-12K | Proper try/catch, error propagation |
| Refactor pattern | 10-15K | Replace KEYS with SCAN, fix memory leak |
| Add new functionality | 15-25K | New service method, new component |
| Write tests | 15-25K | Unit tests for untested service |
| Complex refactor | 25-40K | Architectural change, multi-file refactor |
**Adjust estimates based on:**
- Number of files affected (+5K per additional file)
- Test requirements (+5-10K if tests needed)
- Documentation needs (+2-3K if docs needed)
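A worked example of the adjustments, as a sketch: the function name `adjustEstimate` and its input shape are hypothetical, and all figures are in thousands of tokens. It assumes "additional file" means every file beyond the first.

```typescript
interface EstimateInput {
  baseRangeK: [number, number]; // base range from the task-type table, e.g. [5, 8]
  fileCount: number;            // total number of files the task touches
  needsTests: boolean;
  needsDocs: boolean;
}

// Returns an adjusted [low, high] range in thousands of tokens.
function adjustEstimate(input: EstimateInput): [number, number] {
  let [low, high] = input.baseRangeK;
  const extraFilesK = Math.max(0, input.fileCount - 1) * 5; // +5K per additional file
  low += extraFilesK;
  high += extraFilesK;
  if (input.needsTests) {
    low += 5; // +5-10K if tests are needed
    high += 10;
  }
  if (input.needsDocs) {
    low += 2; // +2-3K if docs are needed
    high += 3;
  }
  return [low, high];
}
```

For example, a guard/validation task (5-8K) that touches two files and needs tests comes out at 15-23K.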
### Step 4: Determine Dependencies
**Automatic dependency rules:**
1. All tasks in Phase N depend on the Phase N-1 verification task
2. Tasks touching the same file should be sequential (earlier blocks later)
3. Auth/security foundation tasks block tasks that rely on them
4. Each phase ends with a verification task that depends on all phase tasks
### Step 5: Create Gitea Issues (Phase-Level)
Create ONE issue per phase using `@mosaic/cli-tools`:
```bash
# Use mosaic CLI tools (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create \
-t "Phase 1: Critical Security Fixes" \
-b "## Findings
- SEC-API-1: Description
- SEC-WEB-2: Description
## Acceptance Criteria
- [ ] All critical findings remediated
- [ ] Quality gates passing" \
-l "security" \
-m "{milestone-name}"
```
**CLI tools location:** `packages/cli-tools/bin/` - see `packages/cli-tools/README.md` for full documentation.
### Step 6: Create docs/tasks.md
Create the file with this exact schema:
```markdown
# Tasks
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used |
| ---------- | ----------- | ---------------------------- | ----- | ---- | ------------ | ---------- | ---------- | ----- | ---------- | ------------ | -------- | ---- |
| MS-SEC-001 | not-started | SEC-API-1: Brief description | #337 | api | fix/security | | MS-SEC-002 | | | | 8K | |
```
**Column definitions:**
| Column | Format | Purpose |
| -------------- | ---------------------------------------------------- | ------------------------------------------- |
| `id` | `MS-{CAT}-{NNN}` | Unique task ID |
| `status` | `not-started` \| `in-progress` \| `done` \| `failed` | Current state |
| `description` | `{FindingID}: Brief summary` | What to fix |
| `issue` | `#NNN` | Gitea issue (phase-level) |
| `repo` | Workspace name | `api`, `web`, `orchestrator`, `coordinator` |
| `branch` | Branch name | `fix/security`, `fix/code-quality`, etc. |
| `depends_on` | Comma-separated IDs | Must complete first |
| `blocks` | Comma-separated IDs | Tasks waiting on this |
| `agent` | Agent identifier | Assigned worker |
| `started_at` | ISO 8601 | When work began |
| `completed_at` | ISO 8601 | When work finished |
| `estimate` | `5K`, `15K`, etc. | Predicted token usage |
| `used` | `4.2K`, `12.8K`, etc. | Actual usage |
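To show how the schema can be consumed, here is a minimal row parser. It is a sketch under stated assumptions: `parseTaskRow` is a hypothetical helper, and it assumes no cell contains a literal `|` character.

```typescript
// Column order copied from the schema above.
const TASK_COLUMNS = [
  "id", "status", "description", "issue", "repo", "branch", "depends_on",
  "blocks", "agent", "started_at", "completed_at", "estimate", "used",
] as const;

type TaskRow = Record<(typeof TASK_COLUMNS)[number], string>;

// Parses one table line; returns null for headers, separators, and non-table lines.
function parseTaskRow(line: string): TaskRow | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("|")) return null;
  const cells = trimmed
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
  if (cells.length !== TASK_COLUMNS.length) return null;
  if (cells[0] === "id" || /^:?-+:?$/.test(cells[0])) return null; // header or separator row
  const row = {} as TaskRow;
  TASK_COLUMNS.forEach((column, index) => {
    row[column] = cells[index];
  });
  return row;
}
```

Empty cells come back as empty strings, so `depends_on` can be split on commas after filtering out blanks.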
### Step 7: Commit Bootstrap
```bash
git add docs/tasks.md docs/orchestrator-learnings.json
git commit -m "chore(orchestrator): Bootstrap tasks.md from review report
Parsed {N} findings into {M} tasks across {P} phases.
Estimated total: {X}K tokens."
git push
```
---
## Phase 2: Execution Loop
```
1. git pull --rebase
2. Read docs/tasks.md
3. Find next task: status=not-started AND all depends_on are done
4. If no task available:
- All done? → Report success, run final retrospective, STOP
- Some blocked? → Report deadlock, STOP
5. Update tasks.md: status=in-progress, agent={identifier}, started_at={now}
6. Spawn worker agent (Task tool) with task details
7. Wait for worker completion
8. Parse worker result (JSON)
9. Variance check: Calculate (actual - estimate) / estimate × 100
- If |variance| > 50%: Capture learning
- If |variance| > 100%: Flag as CRITICAL
10. Update tasks.md: status=done/failed, completed_at={now}, used={actual}
11. Cleanup reports: Remove processed report files
12. Commit + push: git add docs/tasks.md && git commit && git push
13. CI verification (if configured):
- Check latest pipeline status for the branch
- Wait for pipeline completion (timeout: 30min)
- On failure: Fetch logs and auto-diagnose
- Common failures: lint, type-check, test, build, security
- If CI fails: Mark task as failed, update tasks.md, restart from step 1
14. If phase verification task: Run phase retrospective
15. Check context usage
16. If >= 55%: Persist state, output ORCHESTRATOR HANDOFF message (see Context Threshold Protocol), STOP
17. If < 55%: Go to step 1
18. A replacement orchestrator started from the takeover kickstart resumes at step 1
```
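Steps 3 and 4 of the loop can be sketched as a pure function. The `LoopTask` and `NextStep` types and the `selectNext` name are hypothetical; the sketch assumes it is called when no worker is currently running, so any unfinished task without a runnable successor counts as a deadlock.

```typescript
interface LoopTask {
  id: string;
  status: "not-started" | "in-progress" | "done" | "failed";
  dependsOn: string[];
}

type NextStep =
  | { kind: "run"; task: LoopTask }
  | { kind: "all-done" }
  | { kind: "deadlock"; unfinished: string[] };

function selectNext(tasks: LoopTask[]): NextStep {
  const done = new Set(tasks.filter((t) => t.status === "done").map((t) => t.id));
  // Step 3: the first not-started task whose dependencies are all done.
  const ready = tasks.find(
    (t) => t.status === "not-started" && t.dependsOn.every((dep) => done.has(dep)),
  );
  if (ready) return { kind: "run", task: ready };
  // Step 4: nothing runnable, so either everything is done or something is blocked.
  if (tasks.every((t) => t.status === "done")) return { kind: "all-done" };
  return {
    kind: "deadlock",
    unfinished: tasks.filter((t) => t.status !== "done").map((t) => t.id),
  };
}
```

Because the function only reads the parsed table, a replacement orchestrator can call it on a fresh read of `docs/tasks.md` and arrive at the same next task.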
---
## Worker Prompt Template
````markdown
## Task Assignment: {id}
**Description:** {description}
**Repository:** apps/{repo}
**Branch:** {branch}
**Reference:** See `docs/reports/` for detailed finding description. Search for the finding ID.
## Workflow
1. Checkout branch: `git checkout {branch} || git checkout -b {branch} develop && git pull`
2. Read the finding details from the report
3. Implement the fix following existing code patterns
4. Run quality gates (ALL must pass):
```bash
pnpm lint && pnpm typecheck && pnpm test
```
5. If gates fail: Fix and retry. Do NOT report success with failures.
6. Commit: `git commit -m "fix({finding_id}): brief description"`
7. Push: `git push origin {branch}`
8. Report result as JSON (see format below)
## Result Format (MANDATORY)
```json
{
"task_id": "{id}",
"status": "success|failed",
"used": "5.2K",
"commit_sha": "abc123",
"notes": "Brief summary of what was done"
}
```
## Rules
- DO NOT modify docs/tasks.md
- DO NOT claim other tasks
- Complete this single task, report results, done
````
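On the orchestrator side, the worker's JSON can be validated before anything is written to `docs/tasks.md`. The following is a minimal sketch: `parseWorkerResult` and `usedToTokens` are hypothetical helpers, and treating all five fields as required strings is an assumption based on the format being marked mandatory.

```typescript
interface WorkerResult {
  task_id: string;
  status: "success" | "failed";
  used: string;
  commit_sha: string;
  notes: string;
}

const RESULT_FIELDS = ["task_id", "status", "used", "commit_sha", "notes"];

// Parses and validates the worker's JSON; throws with a specific message on any problem.
function parseWorkerResult(raw: string): WorkerResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Worker result must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  for (const field of RESULT_FIELDS) {
    if (typeof obj[field] !== "string") {
      throw new Error(`Missing or non-string field: ${field}`);
    }
  }
  if (obj.status !== "success" && obj.status !== "failed") {
    throw new Error(`Unexpected status: ${String(obj.status)}`);
  }
  return obj as unknown as WorkerResult;
}

// Converts a figure such as "5.2K" into a token count (5200).
function usedToTokens(used: string): number {
  const match = used.trim().match(/^(\d+(?:\.\d+)?)K$/i);
  if (!match) throw new Error(`Unrecognized token figure: ${used}`);
  return Math.round(parseFloat(match[1]) * 1000);
}
```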
---
## CI Verification (Step 13)
After pushing code, the orchestrator can optionally monitor CI pipeline status to catch failures early.
### Configuration
CI monitoring requires environment variables:
```bash
export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here"
```
If these variables are not set, the orchestrator skips CI verification and logs a warning.
### Verification Process
```
1. After git push, get latest pipeline for the branch
2. Wait for pipeline to complete (timeout: 30 minutes, configurable)
3. Poll every 10 seconds (configurable)
4. On completion:
- Success: Continue to next step
- Failure: Fetch logs and auto-diagnose
```
### Auto-Diagnosis
When a pipeline fails, the orchestrator fetches logs and categorizes the failure:
| Category | Pattern | Suggestion |
| ---------- | ------------------------------------- | ------------------------------------ |
| Lint | `eslint`, `lint.*error` | Run `pnpm lint` locally |
| Type Check | `type.*error`, `tsc.*error` | Run `pnpm typecheck` locally |
| Test | `test.*fail`, `vitest.*fail` | Run `pnpm test` locally |
| Build | `build.*fail`, `compilation.*fail` | Run `pnpm build` locally |
| Security | `secret`, `security`, `vulnerability` | Review security scan, remove secrets |
| Unknown | (fallback) | Review full logs |
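The table can be read as an ordered list of regular expressions. This sketch is not the `CIOperationsService` implementation; `DIAGNOSIS_RULES` and `diagnose` are hypothetical names, and case-insensitive matching with first-match-wins in table order is an assumption.

```typescript
interface Diagnosis {
  category: string;
  suggestion: string;
}

// Patterns and suggestions copied from the table above; order decides ties.
const DIAGNOSIS_RULES: Array<Diagnosis & { patterns: RegExp[] }> = [
  { category: "Lint", patterns: [/eslint/i, /lint.*error/i], suggestion: "Run `pnpm lint` locally" },
  { category: "Type Check", patterns: [/type.*error/i, /tsc.*error/i], suggestion: "Run `pnpm typecheck` locally" },
  { category: "Test", patterns: [/test.*fail/i, /vitest.*fail/i], suggestion: "Run `pnpm test` locally" },
  { category: "Build", patterns: [/build.*fail/i, /compilation.*fail/i], suggestion: "Run `pnpm build` locally" },
  { category: "Security", patterns: [/secret/i, /security/i, /vulnerability/i], suggestion: "Review security scan, remove secrets" },
];

function diagnose(logs: string): Diagnosis {
  for (const rule of DIAGNOSIS_RULES) {
    if (rule.patterns.some((pattern) => pattern.test(logs))) {
      return { category: rule.category, suggestion: rule.suggestion };
    }
  }
  return { category: "Unknown", suggestion: "Review full logs" };
}
```

Since `.` does not match newlines in these patterns, each one has to match within a single log line. A log that mentions several categories is reported under the first one in table order.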
### Failure Handling
When CI fails:
1. Log the failure category and diagnosis
2. Update task status to `failed` in `tasks.md`
3. Include diagnosis in task notes
4. Commit the task update
5. **Options:**
- Re-spawn worker with error context to fix
- Skip task and continue (if non-critical)
- Stop and alert (if critical blocker)
### CLI Integration
Use `@mosaic/cli-tools` for CI operations:
```bash
# Check latest pipeline status
pnpm exec mosaic-ci-pipeline-status --latest
# Wait for specific pipeline
pnpm exec mosaic-ci-pipeline-wait -n 42 -t 1800
# Get logs on failure
pnpm exec mosaic-ci-pipeline-logs -n 42
```
### Service Integration
The `CIOperationsService` (in `apps/orchestrator/src/ci/`) provides:
- `getLatestPipeline(repo)` - Get most recent pipeline
- `getPipeline(repo, number)` - Get specific pipeline
- `waitForPipeline(repo, number, options)` - Wait with auto-diagnosis
- `getPipelineLogs(repo, number)` - Fetch logs
Example usage in orchestrator:
```typescript
// After git push
const repo = "mosaic/stack";
const pipeline = await ciService.getLatestPipeline(repo);
if (pipeline) {
const result = await ciService.waitForPipeline(repo, pipeline.number, {
timeout: 1800,
fetchLogsOnFailure: true,
});
if (!result.success && result.diagnosis) {
this.logger.warn(`CI failed: ${result.diagnosis.category}`);
this.logger.warn(`Suggestion: ${result.diagnosis.suggestion}`);
// Handle failure...
}
}
```
---
## Context Threshold Protocol (Orchestrator Replacement)
**Threshold:** 55-60% context usage
**Why replacement, not compaction?**
- Compaction causes **protocol drift**: the agent "remembers" the gist but loses specifics
- Post-compaction agents may violate core rules (e.g., letting workers modify tasks.md)
- Fresh orchestrator has **100% protocol fidelity**
- All state lives in `docs/tasks.md` — the orchestrator is **stateless and replaceable**
**At threshold (55-60%):**
1. Complete current task
2. Persist all state:
- Update docs/tasks.md with all progress
- Update docs/orchestrator-learnings.json with variances
- Commit and push both files
3. Output **ORCHESTRATOR HANDOFF** message with ready-to-use takeover kickstart
4. **STOP COMPLETELY** — do not continue working
**Handoff message format:**
```
---
⚠️ ORCHESTRATOR HANDOFF REQUIRED
Context: {X}% — Replacement recommended to prevent drift
Progress: {completed}/{total} tasks ({percentage}%)
Current phase: Phase {N} ({phase_name})
State persisted:
- docs/tasks.md ✓
- docs/orchestrator-learnings.json ✓
## Takeover Kickstart
Copy and paste this to spawn a fresh orchestrator:
---
## Continuation Mission
Continue {mission_description} from existing state.
## Setup
- Project: /home/localadmin/src/mosaic-stack
- State: docs/tasks.md (already populated)
- Protocol: docs/claude/orchestrator.md
- Quality gates: pnpm lint && pnpm typecheck && pnpm test
## Resume Point
- Next task: {task_id}
- Phase: {current_phase}
- Progress: {completed}/{total} tasks ({percentage}%)
## Instructions
1. Read docs/claude/orchestrator.md for protocol
2. Read docs/tasks.md to understand current state
3. Continue execution from task {task_id}
4. Follow Two-Phase Completion Protocol
5. You are the SOLE writer of docs/tasks.md
---
STOP: Terminate this session and spawn fresh orchestrator with the kickstart above.
---
```
**Future: Coordinator Automation**
When the Mosaic Stack Coordinator service is implemented, it will:
- Monitor orchestrator stdout for context percentage
- Detect the handoff checkpoint message
- Parse the takeover kickstart
- Automatically spawn fresh orchestrator
- Log handoff events for debugging
For now, the human acts as Coordinator.
**Rules:**
- Do NOT attempt to compact yourself — compaction causes drift
- Do NOT continue past 60%
- Do NOT claim you can "just continue" — protocol drift is real
- STOP means STOP — the user (Coordinator) will spawn your replacement
---
## Two-Phase Completion Protocol
Each major phase uses a two-phase approach to maximize completion while managing diminishing returns.
### Bulk Phase (Target: 90%)
- Focus on tractable errors
- Parallelize where possible
- When 90% reached, transition to Polish (do NOT declare success)
### Polish Phase (Target: 100%)
1. **Inventory:** List all remaining errors with file:line
2. **Categorize:**
| Category | Criteria | Action |
|----------|----------|--------|
| Quick-win | <5 min, straightforward | Fix immediately |
| Medium | 5-30 min, clear path | Fix in order |
| Hard | >30 min or uncertain | Attempt 15 min, then document |
| Architectural | Requires design change | Document and defer |
3. **Work priority:** Quick-win → Medium → Hard
4. **Document deferrals** in `docs/deferred-errors.md`:
```markdown
## MS-XXX: [Error description]
- File: path/to/file.ts:123
- Error: [exact error message]
- Category: Hard | Architectural | Framework Limitation
- Reason: [why this is non-trivial]
- Suggested approach: [how to fix in future]
- Risk: Low | Medium | High
```
5. **Phase complete when:**
- All Quick-win/Medium fixed
- All Hard attempted (fixed or documented)
- Architectural items documented with justification
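The triage categories can be expressed as a small decision function. This is a sketch only: the `RemainingError` shape and `triage` name are hypothetical, and it assumes a rough fix-time estimate is available for each error (with `null` meaning uncertain).

```typescript
type PolishCategory = "Quick-win" | "Medium" | "Hard" | "Architectural";

interface RemainingError {
  location: string;                // file:line, as listed in the inventory step
  estimatedMinutes: number | null; // null when the effort is uncertain
  needsDesignChange: boolean;
}

function triage(error: RemainingError): PolishCategory {
  if (error.needsDesignChange) return "Architectural";
  if (error.estimatedMinutes === null || error.estimatedMinutes > 30) return "Hard";
  if (error.estimatedMinutes < 5) return "Quick-win";
  return "Medium"; // 5-30 minutes with a clear path
}

// Work priority from step 3; architectural items are documented rather than worked.
const WORK_ORDER: PolishCategory[] = ["Quick-win", "Medium", "Hard"];
```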
### Phase Boundary Rule
Do NOT proceed to the next major phase until the current phase reaches Polish completion:
```
✅ Phase 2 Bulk: 91%
✅ Phase 2 Polish: 118 errors triaged
- 40 medium → fixed
- 78 low → EACH documented with rationale
✅ Phase 2 Complete: Created docs/deferred-errors.md
→ NOW proceed to Phase 3
❌ WRONG: Phase 2 at 91%, "low priority acceptable", starting Phase 3
```
### Reporting
When transitioning from Bulk to Polish:
```
Phase X Bulk Complete: {N}% ({fixed}/{total})
Entering Polish Phase: {remaining} errors to triage
```
When Polish Phase complete:
```
Phase X Complete: {final_pct}% ({fixed}/{total})
- Quick-wins: {n} fixed
- Medium: {n} fixed
- Hard: {n} fixed, {n} documented
- Framework limitations: {n} documented
```
---
## Learning & Retrospective
### Variance Thresholds
| Variance | Action |
| -------- | ------------------------------------------------------ |
| 0-30% | Log only (acceptable) |
| 30-50% | Flag for review |
| 50-100% | Capture learning to `docs/orchestrator-learnings.json` |
| >100% | CRITICAL — review task classification |
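The thresholds can be applied to the variance formula from the execution loop, `(actual - estimate) / estimate × 100`. In this sketch the function names are hypothetical, the absolute value is used (as in the loop's `|variance|` checks), and boundary values (exactly 30, 50, or 100) are assigned to the lower band, which is an assumption because the table ranges overlap at their edges.

```typescript
type VarianceAction = "log" | "flag-for-review" | "capture-learning" | "critical";

// Signed variance in percent; positive means the task used more than estimated.
function variancePercent(estimateTokens: number, actualTokens: number): number {
  return ((actualTokens - estimateTokens) / estimateTokens) * 100;
}

function varianceAction(percent: number): VarianceAction {
  const magnitude = Math.abs(percent);
  if (magnitude > 100) return "critical";
  if (magnitude > 50) return "capture-learning";
  if (magnitude > 30) return "flag-for-review";
  return "log";
}
```

For example, an 8K estimate that used 20K is a 150% variance and falls in the CRITICAL band.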
### Task Type Classification
| Type | Keywords | Base Estimate |
| ------------ | -------------------------------------- | ---------------- |
| STYLE_FIX | "formatting", "prettier", "lint" | 3-5K |
| BULK_CLEANUP | "unused", "warnings", "~N files" | file_count × 550 |
| GUARD_ADD | "add guard", "decorator", "validation" | 5-8K |
| SECURITY_FIX | "sanitize", "injection", "XSS" | 8-12K × 2.5 |
| AUTH_ADD | "authentication", "auth" | 15-25K |
| REFACTOR | "refactor", "replace", "migrate" | 10-15K |
| TEST_ADD | "add tests", "coverage" | 15-25K |
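A keyword classifier along these lines might look as follows. It is a sketch with assumptions: `classifyTask` and `bulkCleanupEstimate` are hypothetical names, matching is case-insensitive, the first matching row in table order wins, and `~N files` is interpreted as a tilde followed by a number and the word "files".

```typescript
// Keywords copied from the table above; order decides ties.
const TASK_TYPE_RULES: Array<{ type: string; keywords: Array<string | RegExp> }> = [
  { type: "STYLE_FIX", keywords: ["formatting", "prettier", "lint"] },
  { type: "BULK_CLEANUP", keywords: ["unused", "warnings", /~\d+ files/] },
  { type: "GUARD_ADD", keywords: ["add guard", "decorator", "validation"] },
  { type: "SECURITY_FIX", keywords: ["sanitize", "injection", "xss"] },
  { type: "AUTH_ADD", keywords: ["authentication", "auth"] },
  { type: "REFACTOR", keywords: ["refactor", "replace", "migrate"] },
  { type: "TEST_ADD", keywords: ["add tests", "coverage"] },
];

function classifyTask(description: string): string {
  const text = description.toLowerCase();
  for (const rule of TASK_TYPE_RULES) {
    const hit = rule.keywords.some((keyword) =>
      typeof keyword === "string" ? text.includes(keyword) : keyword.test(text),
    );
    if (hit) return rule.type;
  }
  return "UNCLASSIFIED"; // no keyword matched; estimate manually
}

// BULK_CLEANUP base estimate from the table: file_count × 550 tokens.
function bulkCleanupEstimate(fileCount: number): number {
  return fileCount * 550;
}
```

Substring matching is deliberately crude, so a description such as "add auth decorator" lands in GUARD_ADD because that row comes first.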
---
## Report Cleanup
QA automation generates report files in `docs/reports/qa-automation/pending/`. Clean up after processing.
| Event | Action |
| ------------------ | --------------------------------------- |
| Task success | Delete matching reports from `pending/` |
| Task failed | Move reports to `escalated/` |
| Phase verification | Clean up all `pending/` reports |
| Milestone complete | Archive or delete `escalated/` |
---
## Stopping Criteria
**ONLY stop if:**
1. All tasks in docs/tasks.md are `done`
2. Critical blocker preventing progress (document and alert)
3. Context usage >= 55%: persist state, output the ORCHESTRATOR HANDOFF message, and stop
4. Absolute context limit reached
**DO NOT stop to ask "should I continue?"** The answer is always YES.
**DO stop at 55-60%.** Output the handoff message (see Context Threshold Protocol) and wait for the user to spawn a replacement orchestrator.
---
## Sprint Completion Protocol
When all tasks in `docs/tasks.md` are `done` (or triaged as `deferred`), archive the sprint artifacts before stopping. This preserves them for post-mortems, variance calibration, and historical reference.
### Archive Steps
1. **Create archive directory** (if it doesn't exist):
```bash
mkdir -p docs/tasks/
```
2. **Move tasks.md to archive:**
```bash
mv docs/tasks.md docs/tasks/{milestone-name}-tasks.md
```
Example: `docs/tasks/M6-AgentOrchestration-Fixes-tasks.md`
3. **Move learnings to archive:**
```bash
mv docs/orchestrator-learnings.json docs/tasks/{milestone-name}-learnings.json
```
4. **Commit the archive:**
```bash
git add docs/tasks/
git rm docs/tasks.md docs/orchestrator-learnings.json 2>/dev/null || true
git commit -m "chore(orchestrator): Archive {milestone-name} sprint artifacts
{completed}/{total} tasks completed, {deferred} deferred.
Archived to docs/tasks/ for post-mortem reference."
git push
```
5. **Run final retrospective** — review variance patterns and propose updates to estimation heuristics.
### Recovery
If an orchestrator starts and `docs/tasks.md` does not exist, check `docs/tasks/` for the most recent archive:
```bash
ls -t docs/tasks/*-tasks.md 2>/dev/null | head -1
```
If found, this may indicate another session archived the file. The orchestrator should:
1. Report what it found in `docs/tasks/`
2. Ask whether to resume from the archived file or bootstrap fresh
3. If resuming: copy the archive back to `docs/tasks.md` and continue
### Retention Policy
Keep all archived sprints indefinitely. They are small text files and valuable for:
- Post-mortem analysis
- Estimation variance calibration across milestones
- Understanding what was deferred and why
- Onboarding new orchestrators to project history
---
## Kickstart Message Format
```markdown
## Mission
Remediate findings from the codebase review.
## Setup
- Project: /home/localadmin/src/mosaic-stack
- Review: docs/reports/{report-name}/
- Quality gates: pnpm lint && pnpm typecheck && pnpm test
- Milestone: {milestone-name}
- Task prefix: MS
## Protocol
Read docs/claude/orchestrator.md for full instructions.
## Start
Bootstrap from the review report, then execute until complete.
```
---
## Quick Reference
| Phase | Action |
| --------- | ----------------------------------------------------------------------- |
| Bootstrap | Parse reports → Categorize → Estimate → Create issues → Create tasks.md |
| Execute | Loop: claim → spawn worker → update → commit |
| Handoff   | At 55-60%: persist state, output handoff message, stop                  |
| Stop | Queue empty, blocker, or context limit |
**Orchestrator owns tasks.md. Workers execute and report. Single writer eliminates conflicts.**