# Mosaic Stack Orchestrator Guide

> Platform-specific orchestrator protocol for Mosaic Stack.

## Overview

The orchestrator **cold-starts** with just a review report location and a minimal kickstart message. It autonomously:

1. Parses review reports to extract findings
2. Categorizes findings into phases by severity
3. Estimates token usage per task
4. Creates Gitea issues (phase-level)
5. Bootstraps `docs/tasks.md` from scratch
6. Coordinates completion using worker agents

**Key principle:** The orchestrator is the **sole writer** of `docs/tasks.md`. Worker agents execute tasks and report results — they never modify the tracking file.

---

## Orchestrator Boundaries (CRITICAL)

**The orchestrator NEVER:**

- Edits source code directly (`*.ts`, `*.tsx`, `*.js`, etc.)
- Runs quality gates itself (that's the worker's job)
- Makes commits containing code changes
- "Quickly fixes" something to save time — this is how drift starts

**The orchestrator ONLY:**

- Reads/writes `docs/tasks.md`
- Reads/writes `docs/orchestrator-learnings.json`
- Spawns workers via the Task tool for ALL code changes
- Parses worker JSON results
- Commits task tracking updates (tasks.md, learnings)
- Outputs status reports and handoff messages

**If you find yourself about to edit source code, STOP.**
Spawn a worker instead. No exceptions. No "quick fixes."

**Worker Limits:**

- Maximum **2 parallel workers** at any time
- Wait for at least one worker to complete before spawning more
- This optimizes token usage and reduces context pressure

> **Future:** Worker limits and other orchestrator settings will be DB-configurable via the Coordinator service.

---

## Bootstrap Templates

Use templates from `docs/templates/` (relative to repo root):

```bash
# Set environment variables
export PROJECT="mosaic-stack"
export MILESTONE="M6-Feature"
export CURRENT_DATETIME=$(date -Iseconds)
export TASK_PREFIX="MS-SEC"
export PHASE_ISSUE="#337"
export PHASE_BRANCH="fix/security"

# Create tasks.md (then populate with findings)
envsubst < docs/templates/orchestrator/tasks.md.template > docs/tasks.md

# Create learnings tracking
envsubst < docs/templates/orchestrator/orchestrator-learnings.json.template > docs/orchestrator-learnings.json

# Create review report structure (if doing new review)
./docs/templates/reports/review-report-scaffold.sh codebase-review mosaic-stack
```

**Available templates:**

| Template                                            | Purpose                         |
| --------------------------------------------------- | ------------------------------- |
| `orchestrator/tasks.md.template`                    | Task tracking table with schema |
| `orchestrator/orchestrator-learnings.json.template` | Variance tracking               |
| `orchestrator/phase-issue-body.md.template`         | Gitea issue body                |
| `orchestrator/compaction-summary.md.template`       | 60% checkpoint format           |
| `reports/review-report-scaffold.sh`                 | Creates report directory        |
| `scratchpad.md.template`                            | Per-task working document       |

See `docs/templates/README.md` for full documentation.

### CLI Tools

Git and CI operations use the `@mosaic/cli-tools` package:

```bash
# Issue operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create -t "Title" -b "Body" -m "Milestone"
pnpm exec mosaic-issue-list -s open -m "Milestone"

# PR operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-pr-create -t "Title" -b "Body" -B develop
pnpm exec mosaic-pr-merge -n 42 -m squash -d

# Milestone operations (auto-detects Gitea vs GitHub)
pnpm exec mosaic-milestone-create -t "M7-Feature" -d "Description"

# CI/CD operations (Woodpecker)
pnpm exec mosaic-ci-pipeline-status --latest
pnpm exec mosaic-ci-pipeline-wait -n 42
pnpm exec mosaic-ci-pipeline-logs -n 42
```

See `packages/cli-tools/README.md` for full command reference.

### CI Configuration

Set these environment variables for Woodpecker CI integration:

```bash
export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here"  # Get from ci.mosaicstack.dev/user
```

---

## Phase 1: Bootstrap

### Step 1: Parse Review Reports

Review reports follow this structure:

```
docs/reports/{report-name}/
├── 00-executive-summary.md   # Start here - overview and counts
├── 01-security-review.md     # Security findings with IDs like SEC-*
├── 02-code-quality-review.md # Code quality findings like CQ-*
├── 03-qa-test-coverage.md    # Test coverage gaps like TEST-*
└── ...
```

**Extract findings by looking for:**

- Finding IDs (e.g., `SEC-API-1`, `CQ-WEB-3`, `TEST-001`)
- Severity labels: Critical, High, Medium, Low
- Affected files/components (use for the `repo` column)
- Specific line numbers or code patterns
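
A minimal TypeScript sketch of this extraction step follows. It assumes, purely for illustration, that a finding's severity label appears on the same line as its ID (for example in a heading); real reports may be laid out differently. `extractFindings`, `Finding`, and both regular expressions are hypothetical names, not part of the Mosaic Stack codebase.

```typescript
type Severity = "Critical" | "High" | "Medium" | "Low";

interface Finding {
  id: string;
  severity: Severity | null; // null when no label appears on the same line
  line: number; // 1-based line number within the report
}

// Matches IDs shaped like SEC-API-1, CQ-WEB-3, or TEST-001.
const FINDING_ID = /\b(?:SEC|CQ|TEST)-(?:[A-Z]+-)?\d+\b/g;
// Assumption: severity is written as one of the four capitalized labels.
const SEVERITY_LABEL = /\b(Critical|High|Medium|Low)\b/;

function extractFindings(report: string): Finding[] {
  const findings: Finding[] = [];
  const seen = new Set<string>();
  report.split("\n").forEach((text, index) => {
    const ids = text.match(FINDING_ID) || [];
    const severityMatch = SEVERITY_LABEL.exec(text);
    for (const id of ids) {
      if (seen.has(id)) continue; // keep only the first mention of each ID
      seen.add(id);
      findings.push({
        id,
        severity: severityMatch ? (severityMatch[1] as Severity) : null,
        line: index + 1,
      });
    }
  });
  return findings;
}
```

Findings returned with `severity: null` would still need a manual look at the surrounding report text before they can be assigned to a phase.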

### Step 2: Categorize into Phases

| Severity | Phase | Focus                                   | Branch Pattern      |
| -------- | ----- | --------------------------------------- | ------------------- |
| Critical | 1     | Security vulnerabilities, data exposure | `fix/security`      |
| High     | 2     | Security hardening, auth gaps           | `fix/security`      |
| Medium   | 3     | Code quality, performance, bugs         | `fix/code-quality`  |
| Low      | 4     | Tests, documentation, cleanup           | `fix/test-coverage` |

**Within each phase, order tasks by:**

1. Blockers first (tasks that unblock others)
2. Same-file tasks grouped together
3. Simpler fixes before complex ones
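
The severity table and ordering rules above could be expressed roughly as follows. This is one possible reading of the rules, not the orchestrator's actual logic: the `PlannedTask` shape, the `file` and `blocksCount` fields, and the use of the token estimate as a proxy for "simpler" are all assumptions made for the sketch.

```typescript
type Severity = "Critical" | "High" | "Medium" | "Low";

interface PhasePlan {
  phase: number;
  branch: string;
}

// Mirrors the severity table above.
const PHASE_BY_SEVERITY: Record<Severity, PhasePlan> = {
  Critical: { phase: 1, branch: "fix/security" },
  High: { phase: 2, branch: "fix/security" },
  Medium: { phase: 3, branch: "fix/code-quality" },
  Low: { phase: 4, branch: "fix/test-coverage" },
};

interface PlannedTask {
  id: string;
  severity: Severity;
  file: string; // hypothetical: primary file the task touches
  estimateK: number; // estimate in thousands of tokens, used as a complexity proxy
  blocksCount: number; // how many other tasks wait on this one
}

// Sorts by phase, then blockers first, then by file (to group same-file
// tasks), then by ascending estimate (simpler fixes first).
function orderTasks(tasks: PlannedTask[]): PlannedTask[] {
  return tasks.slice().sort((a, b) => {
    const phaseDiff =
      PHASE_BY_SEVERITY[a.severity].phase - PHASE_BY_SEVERITY[b.severity].phase;
    if (phaseDiff !== 0) return phaseDiff;
    const aRank = a.blocksCount > 0 ? 0 : 1;
    const bRank = b.blocksCount > 0 ? 0 : 1;
    if (aRank !== bRank) return aRank - bRank;
    if (a.file !== b.file) return a.file < b.file ? -1 : 1;
    return a.estimateK - b.estimateK;
  });
}
```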

### Step 3: Estimate Token Usage

| Task Type             | Estimate | Examples                                  |
| --------------------- | -------- | ----------------------------------------- |
| Single-line fix       | 3-5K     | Typo, wrong operator, missing null check  |
| Add guard/validation  | 5-8K     | Add auth decorator, input validation      |
| Fix error handling    | 8-12K    | Proper try/catch, error propagation       |
| Refactor pattern      | 10-15K   | Replace KEYS with SCAN, fix memory leak   |
| Add new functionality | 15-25K   | New service method, new component         |
| Write tests           | 15-25K   | Unit tests for untested service           |
| Complex refactor      | 25-40K   | Architectural change, multi-file refactor |

**Adjust estimates based on:**

- Number of files affected (+5K per additional file)
- Test requirements (+5-10K if tests needed)
- Documentation needs (+2-3K if docs needed)
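
As a worked example of these adjustments, the sketch below applies them to a base estimate and returns a low/high range. The function and field names are hypothetical; only the increments (+5K per additional file, +5-10K for tests, +2-3K for docs) come from the list above.

```typescript
interface EstimateInput {
  baseK: number; // base estimate in thousands of tokens, taken from the table above
  fileCount: number; // total number of files affected (1 or more)
  needsTests: boolean;
  needsDocs: boolean;
}

interface EstimateRange {
  lowK: number;
  highK: number;
}

function adjustEstimate(input: EstimateInput): EstimateRange {
  // The first file is covered by the base estimate; each extra file adds 5K.
  const extraFiles = Math.max(0, input.fileCount - 1);
  const fileCostK = extraFiles * 5;
  return {
    lowK: input.baseK + fileCostK + (input.needsTests ? 5 : 0) + (input.needsDocs ? 2 : 0),
    highK: input.baseK + fileCostK + (input.needsTests ? 10 : 0) + (input.needsDocs ? 3 : 0),
  };
}
```

For instance, a guard addition with an 8K base that touches three files and needs tests comes out at 8 + 10 + 5 = 23K on the low end and 8 + 10 + 10 = 28K on the high end.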

### Step 4: Determine Dependencies

**Automatic dependency rules:**

1. All tasks in Phase N depend on the Phase N-1 verification task
2. Tasks touching the same file should be sequential (earlier blocks later)
3. Auth/security foundation tasks block tasks that rely on them
4. Each phase ends with a verification task that depends on all phase tasks
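
Rules 1, 2, and 4 are mechanical enough to sketch in code; rule 3 needs judgment about which tasks are foundational and is left out here. The `DraftTask` shape and the `V-{phase}` ID format for verification tasks are invented for this illustration and are not the IDs the orchestrator actually uses.

```typescript
interface DraftTask {
  id: string;
  phase: number;
  file: string; // hypothetical: primary file the task touches
}

// Returns task ID -> depends_on IDs, adding one verification task per phase.
// Tasks are assumed to be listed in their intended execution order.
function buildDependencies(tasks: DraftTask[]): Map<string, string[]> {
  const deps = new Map<string, string[]>();
  const phases = Array.from(new Set(tasks.map((t) => t.phase))).sort((a, b) => a - b);
  let previousVerify: string | null = null;

  for (const phase of phases) {
    const phaseTasks = tasks.filter((t) => t.phase === phase);
    const lastByFile = new Map<string, string>(); // latest task seen per file

    for (const task of phaseTasks) {
      const taskDeps: string[] = [];
      if (previousVerify !== null) taskDeps.push(previousVerify); // rule 1
      const earlier = lastByFile.get(task.file);
      if (earlier !== undefined) taskDeps.push(earlier); // rule 2
      lastByFile.set(task.file, task.id);
      deps.set(task.id, taskDeps);
    }

    const verifyId = `V-${phase}`; // hypothetical verification task ID
    deps.set(verifyId, phaseTasks.map((t) => t.id)); // rule 4
    previousVerify = verifyId;
  }
  return deps;
}
```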

### Step 5: Create Gitea Issues (Phase-Level)

Create ONE issue per phase using `@mosaic/cli-tools`:

```bash
# Use mosaic CLI tools (auto-detects Gitea vs GitHub)
pnpm exec mosaic-issue-create \
  -t "Phase 1: Critical Security Fixes" \
  -b "## Findings
- SEC-API-1: Description
- SEC-WEB-2: Description

## Acceptance Criteria
- [ ] All critical findings remediated
- [ ] Quality gates passing" \
  -l "security" \
  -m "{milestone-name}"
```

**CLI tools location:** `packages/cli-tools/bin/` - see `packages/cli-tools/README.md` for full documentation.

### Step 6: Create docs/tasks.md

Create the file with this exact schema:

```markdown
# Tasks

| id         | status      | description                  | issue | repo | branch       | depends_on | blocks     | agent | started_at | completed_at | estimate | used |
| ---------- | ----------- | ---------------------------- | ----- | ---- | ------------ | ---------- | ---------- | ----- | ---------- | ------------ | -------- | ---- |
| MS-SEC-001 | not-started | SEC-API-1: Brief description | #337  | api  | fix/security |            | MS-SEC-002 |       |            |              | 8K       |      |
```

**Column definitions:**

| Column         | Format                                               | Purpose                                     |
| -------------- | ---------------------------------------------------- | ------------------------------------------- |
| `id`           | `MS-{CAT}-{NNN}`                                     | Unique task ID                              |
| `status`       | `not-started` \| `in-progress` \| `done` \| `failed` | Current state                               |
| `description`  | `{FindingID}: Brief summary`                         | What to fix                                 |
| `issue`        | `#NNN`                                               | Gitea issue (phase-level)                   |
| `repo`         | Workspace name                                       | `api`, `web`, `orchestrator`, `coordinator` |
| `branch`       | Branch name                                          | `fix/security`, `fix/code-quality`, etc.    |
| `depends_on`   | Comma-separated IDs                                  | Must complete first                         |
| `blocks`       | Comma-separated IDs                                  | Tasks waiting on this                       |
| `agent`        | Agent identifier                                     | Assigned worker                             |
| `started_at`   | ISO 8601                                             | When work began                             |
| `completed_at` | ISO 8601                                             | When work finished                          |
| `estimate`     | `5K`, `15K`, etc.                                    | Predicted token usage                       |
| `used`         | `4.2K`, `12.8K`, etc.                                | Actual usage                                |
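
To make the schema concrete, here is a sketch of a typed row plus a parser for one table line. It assumes cells never contain a literal `|`, which holds for the example row above but is not guaranteed in general. `TaskRow` and `parseTaskRow` are illustrative names only.

```typescript
type TaskStatus = "not-started" | "in-progress" | "done" | "failed";

interface TaskRow {
  id: string;
  status: TaskStatus;
  description: string;
  issue: string;
  repo: string;
  branch: string;
  dependsOn: string[];
  blocks: string[];
  agent: string;
  startedAt: string;
  completedAt: string;
  estimate: string;
  used: string;
}

const VALID_STATUSES: TaskStatus[] = ["not-started", "in-progress", "done", "failed"];

// Splits a comma-separated ID cell, dropping empty entries.
function splitIds(cell: string): string[] {
  return cell.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
}

function parseTaskRow(line: string): TaskRow {
  // Drop the outer pipes, then split into trimmed cells.
  const cells = line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((c) => c.trim());
  if (cells.length !== 13) {
    throw new Error(`Expected 13 columns, got ${cells.length}`);
  }
  const status = cells[1] as TaskStatus;
  if (VALID_STATUSES.indexOf(status) === -1) {
    throw new Error(`Unknown status: ${cells[1]}`);
  }
  return {
    id: cells[0],
    status,
    description: cells[2],
    issue: cells[3],
    repo: cells[4],
    branch: cells[5],
    dependsOn: splitIds(cells[6]),
    blocks: splitIds(cells[7]),
    agent: cells[8],
    startedAt: cells[9],
    completedAt: cells[10],
    estimate: cells[11],
    used: cells[12],
  };
}
```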

### Step 7: Commit Bootstrap

```bash
git add docs/tasks.md docs/orchestrator-learnings.json
git commit -m "chore(orchestrator): Bootstrap tasks.md from review report

Parsed {N} findings into {M} tasks across {P} phases.
Estimated total: {X}K tokens."
git push
```

---

## Phase 2: Execution Loop

```
1. git pull --rebase
2. Read docs/tasks.md
3. Find next task: status=not-started AND all depends_on are done
4. If no task available:
   - All done? → Report success, run final retrospective, STOP
   - Some blocked? → Report deadlock, STOP
5. Update tasks.md: status=in-progress, agent={identifier}, started_at={now}
6. Spawn worker agent (Task tool) with task details
7. Wait for worker completion
8. Parse worker result (JSON)
9. Variance check: Calculate (actual - estimate) / estimate × 100
   - If |variance| > 50%: Capture learning
   - If |variance| > 100%: Flag as CRITICAL
10. Update tasks.md: status=done/failed, completed_at={now}, used={actual}
11. Cleanup reports: Remove processed report files
12. Commit + push: git add docs/tasks.md && git commit && git push
13. CI verification (if configured):
    - Check latest pipeline status for the branch
    - Wait for pipeline completion (timeout: 30min)
    - On failure: Fetch logs and auto-diagnose
    - Common failures: lint, type-check, test, build, security
    - If CI fails: Mark task as failed, update tasks.md, restart from step 1
14. If phase verification task: Run phase retrospective
15. Check context usage
16. If >= 55%: Output ORCHESTRATOR HANDOFF message (see Context Threshold Protocol), STOP, wait for user
17. If < 55%: Go to step 1
18. After the user spawns a fresh orchestrator with the takeover kickstart: it resumes at step 1
```
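
Steps 3 and 4 of the loop, combined with the two-worker limit, can be sketched as a pure selection function. This is an illustration under assumed types, not the orchestrator's implementation; `selectNext`, `Task`, and the `Selection` variants are hypothetical.

```typescript
type TaskStatus = "not-started" | "in-progress" | "done" | "failed";

interface Task {
  id: string;
  status: TaskStatus;
  dependsOn: string[];
}

type Selection =
  | { kind: "run"; tasks: Task[] } // tasks to spawn workers for now
  | { kind: "wait" } // workers are still running; nothing new can start
  | { kind: "complete" } // every task is done
  | { kind: "deadlock"; blocked: string[] }; // unfinished tasks that cannot start

function selectNext(tasks: Task[], maxWorkers: number = 2): Selection {
  const done = new Set(tasks.filter((t) => t.status === "done").map((t) => t.id));
  const running = tasks.filter((t) => t.status === "in-progress").length;
  // A task is ready when it has not started and all of its dependencies are done.
  const ready = tasks.filter(
    (t) => t.status === "not-started" && t.dependsOn.every((d) => done.has(d)),
  );
  const freeSlots = Math.max(0, maxWorkers - running);

  if (ready.length > 0 && freeSlots > 0) {
    return { kind: "run", tasks: ready.slice(0, freeSlots) };
  }
  if (running > 0) return { kind: "wait" };
  if (tasks.every((t) => t.status === "done")) return { kind: "complete" };
  return {
    kind: "deadlock",
    blocked: tasks.filter((t) => t.status !== "done").map((t) => t.id),
  };
}
```

Keeping the selection free of side effects means the same function can be re-run after every `git pull --rebase` without tracking any state beyond `docs/tasks.md`.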

---

## Worker Prompt Template

````markdown
## Task Assignment: {id}

**Description:** {description}
**Repository:** apps/{repo}
**Branch:** {branch}

**Reference:** See `docs/reports/` for detailed finding description. Search for the finding ID.

## Workflow

1. Checkout branch: `git checkout {branch} || git checkout -b {branch} develop && git pull`
2. Read the finding details from the report
3. Implement the fix following existing code patterns
4. Run quality gates (ALL must pass):
   ```bash
   pnpm lint && pnpm typecheck && pnpm test
   ```
5. If gates fail: Fix and retry. Do NOT report success with failures.
6. Commit: `git commit -m "fix({finding_id}): brief description"`
7. Push: `git push origin {branch}`
8. Report result as JSON (see format below)

## Result Format (MANDATORY)

```json
{
  "task_id": "{id}",
  "status": "success|failed",
  "used": "5.2K",
  "commit_sha": "abc123",
  "notes": "Brief summary of what was done"
}
```

## Rules

- DO NOT modify docs/tasks.md
- DO NOT claim other tasks
- Complete this single task, report results, done
````
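
On the orchestrator side, the mandatory result format can be validated before `tasks.md` is updated. The sketch below assumes the `used` field always has the form `<number>K`, as in the template; `parseWorkerResult` and `WorkerResult` are hypothetical names.

```typescript
interface WorkerResult {
  taskId: string;
  status: "success" | "failed";
  usedK: number; // "5.2K" becomes 5.2
  commitSha: string;
  notes: string;
}

function parseWorkerResult(raw: string): WorkerResult {
  const data = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Result must be a JSON object");
  }
  if (typeof data.task_id !== "string" || data.task_id === "") {
    throw new Error("Missing task_id");
  }
  if (data.status !== "success" && data.status !== "failed") {
    throw new Error(`Invalid status: ${String(data.status)}`);
  }
  // Assumption: token usage is reported as a decimal followed by "K".
  const usedMatch = /^(\d+(?:\.\d+)?)K$/.exec(String(data.used));
  if (!usedMatch) {
    throw new Error(`Unparseable used value: ${String(data.used)}`);
  }
  return {
    taskId: data.task_id,
    status: data.status,
    usedK: parseFloat(usedMatch[1]),
    commitSha: typeof data.commit_sha === "string" ? data.commit_sha : "",
    notes: typeof data.notes === "string" ? data.notes : "",
  };
}
```

Rejecting malformed results early keeps a worker's free-form output from leaking into the tracking file.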

---

## CI Verification (Step 13)

After pushing code, the orchestrator can optionally monitor CI pipeline status to catch failures early.

### Configuration

CI monitoring requires environment variables:

```bash
export WOODPECKER_SERVER="https://ci.mosaicstack.dev"
export WOODPECKER_TOKEN="your-token-here"
```

If not configured, CI verification is skipped (the orchestrator logs a warning).

### Verification Process

```
1. After git push, get latest pipeline for the branch
2. Wait for pipeline to complete (timeout: 30 minutes, configurable)
3. Poll every 10 seconds (configurable)
4. On completion:
   - Success: Continue to next step
   - Failure: Fetch logs and auto-diagnose
```

### Auto-Diagnosis

When a pipeline fails, the orchestrator fetches logs and categorizes the failure:

| Category   | Pattern                               | Suggestion                           |
| ---------- | ------------------------------------- | ------------------------------------ |
| Lint       | `eslint`, `lint.*error`               | Run `pnpm lint` locally              |
| Type Check | `type.*error`, `tsc.*error`           | Run `pnpm typecheck` locally         |
| Test       | `test.*fail`, `vitest.*fail`          | Run `pnpm test` locally              |
| Build      | `build.*fail`, `compilation.*fail`    | Run `pnpm build` locally             |
| Security   | `secret`, `security`, `vulnerability` | Review security scan, remove secrets |
| Unknown    | (fallback)                            | Review full logs                     |
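
The table can be read as an ordered list of regex rules where the first match wins. The sketch below encodes it that way; case-insensitive matching and the top-to-bottom rule order are assumptions, and `diagnose` is a hypothetical stand-in rather than the real `CIOperationsService` logic.

```typescript
interface Diagnosis {
  category: string;
  suggestion: string;
}

interface DiagnosisRule extends Diagnosis {
  patterns: RegExp[];
}

// Patterns copied from the table above. Note that `.` does not cross
// newlines, so both words of a pattern must appear on the same log line.
const DIAGNOSIS_RULES: DiagnosisRule[] = [
  { category: "Lint", patterns: [/eslint/i, /lint.*error/i], suggestion: "Run `pnpm lint` locally" },
  { category: "Type Check", patterns: [/type.*error/i, /tsc.*error/i], suggestion: "Run `pnpm typecheck` locally" },
  { category: "Test", patterns: [/test.*fail/i, /vitest.*fail/i], suggestion: "Run `pnpm test` locally" },
  { category: "Build", patterns: [/build.*fail/i, /compilation.*fail/i], suggestion: "Run `pnpm build` locally" },
  { category: "Security", patterns: [/secret/i, /security/i, /vulnerability/i], suggestion: "Review security scan, remove secrets" },
];

function diagnose(logs: string): Diagnosis {
  for (const rule of DIAGNOSIS_RULES) {
    if (rule.patterns.some((p) => p.test(logs))) {
      return { category: rule.category, suggestion: rule.suggestion };
    }
  }
  return { category: "Unknown", suggestion: "Review full logs" };
}
```

Because the first matching rule wins, a log that mentions both a lint error and a failed build is reported as Lint under this ordering.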

### Failure Handling

When CI fails:

1. Log the failure category and diagnosis
2. Update task status to `failed` in `tasks.md`
3. Include diagnosis in task notes
4. Commit the task update
5. **Options:**
   - Re-spawn worker with error context to fix
   - Skip task and continue (if non-critical)
   - Stop and alert (if critical blocker)

### CLI Integration

Use `@mosaic/cli-tools` for CI operations:

```bash
# Check latest pipeline status
pnpm exec mosaic-ci-pipeline-status --latest

# Wait for specific pipeline
pnpm exec mosaic-ci-pipeline-wait -n 42 -t 1800

# Get logs on failure
pnpm exec mosaic-ci-pipeline-logs -n 42
```

### Service Integration

The `CIOperationsService` (in `apps/orchestrator/src/ci/`) provides:

- `getLatestPipeline(repo)` - Get most recent pipeline
- `getPipeline(repo, number)` - Get specific pipeline
- `waitForPipeline(repo, number, options)` - Wait with auto-diagnosis
- `getPipelineLogs(repo, number)` - Fetch logs

Example usage in orchestrator:

```typescript
// After git push
const repo = "mosaic/stack";
const pipeline = await ciService.getLatestPipeline(repo);

if (pipeline) {
  const result = await ciService.waitForPipeline(repo, pipeline.number, {
    timeout: 1800, // seconds (30 minutes)
    fetchLogsOnFailure: true,
  });

  if (!result.success && result.diagnosis) {
    this.logger.warn(`CI failed: ${result.diagnosis.category}`);
    this.logger.warn(`Suggestion: ${result.diagnosis.suggestion}`);
    // Handle failure...
  }
}
```

---

## Context Threshold Protocol (Orchestrator Replacement)

**Threshold:** 55-60% context usage

**Why replacement, not compaction?**

- Compaction causes **protocol drift**: the agent "remembers" the gist but loses specifics
- Post-compaction agents may violate core rules (e.g., letting workers modify tasks.md)
- Fresh orchestrator has **100% protocol fidelity**
- All state lives in `docs/tasks.md` — the orchestrator is **stateless and replaceable**

**At threshold (55-60%):**

1. Complete current task
2. Persist all state:
   - Update docs/tasks.md with all progress
   - Update docs/orchestrator-learnings.json with variances
   - Commit and push both files
3. Output **ORCHESTRATOR HANDOFF** message with ready-to-use takeover kickstart
4. **STOP COMPLETELY** — do not continue working
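
The threshold check itself is simple enough to state as code. In this sketch the 55% lower bound comes from the protocol above, while the function names and the rounding of the progress percentage are illustrative assumptions.

```typescript
type ContextAction = "continue" | "handoff";

// Lower bound of the 55-60% threshold window.
const HANDOFF_THRESHOLD_PERCENT = 55;

function contextAction(contextPercent: number): ContextAction {
  return contextPercent >= HANDOFF_THRESHOLD_PERCENT ? "handoff" : "continue";
}

// Builds the "Progress:" line that appears in the handoff message.
function progressLine(completed: number, total: number): string {
  const percentage = total === 0 ? 0 : Math.round((completed / total) * 100);
  return `Progress: ${completed}/${total} tasks (${percentage}%)`;
}
```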

**Handoff message format:**

```
---
⚠️ ORCHESTRATOR HANDOFF REQUIRED

Context: {X}% — Replacement recommended to prevent drift

Progress: {completed}/{total} tasks ({percentage}%)
Current phase: Phase {N} ({phase_name})

State persisted:
- docs/tasks.md ✓
- docs/orchestrator-learnings.json ✓

## Takeover Kickstart

Copy and paste this to spawn a fresh orchestrator:

---
## Continuation Mission

Continue {mission_description} from existing state.

## Setup
- Project: /home/localadmin/src/mosaic-stack
- State: docs/tasks.md (already populated)
- Protocol: docs/claude/orchestrator.md
- Quality gates: pnpm lint && pnpm typecheck && pnpm test

## Resume Point
- Next task: {task_id}
- Phase: {current_phase}
- Progress: {completed}/{total} tasks ({percentage}%)

## Instructions
1. Read docs/claude/orchestrator.md for protocol
2. Read docs/tasks.md to understand current state
3. Continue execution from task {task_id}
4. Follow Two-Phase Completion Protocol
5. You are the SOLE writer of docs/tasks.md
---

STOP: Terminate this session and spawn fresh orchestrator with the kickstart above.
---
```

**Future: Coordinator Automation**

When the Mosaic Stack Coordinator service is implemented, it will:

- Monitor orchestrator stdout for context percentage
- Detect the handoff checkpoint message
- Parse the takeover kickstart
- Automatically spawn fresh orchestrator
- Log handoff events for debugging

For now, the human acts as Coordinator.

**Rules:**

- Do NOT attempt to compact yourself — compaction causes drift
- Do NOT continue past 60%
- Do NOT claim you can "just continue" — protocol drift is real
- STOP means STOP — the user (Coordinator) will spawn your replacement

---

## Two-Phase Completion Protocol

Each major phase is completed in two sub-phases, Bulk and then Polish, to maximize completion while managing diminishing returns.

### Bulk Phase (Target: 90%)

- Focus on tractable errors
- Parallelize where possible
- When 90% reached, transition to Polish (do NOT declare success)

### Polish Phase (Target: 100%)

1. **Inventory:** List all remaining errors with file:line
2. **Categorize:**

   | Category      | Criteria                | Action                        |
   | ------------- | ----------------------- | ----------------------------- |
   | Quick-win     | <5 min, straightforward | Fix immediately               |
   | Medium        | 5-30 min, clear path    | Fix in order                  |
   | Hard          | >30 min or uncertain    | Attempt 15 min, then document |
   | Architectural | Requires design change  | Document and defer            |

3. **Work priority:** Quick-win → Medium → Hard
4. **Document deferrals** in `docs/deferred-errors.md`:

   ```markdown
   ## MS-XXX: [Error description]

   - File: path/to/file.ts:123
   - Error: [exact error message]
   - Category: Hard | Architectural | Framework Limitation
   - Reason: [why this is non-trivial]
   - Suggested approach: [how to fix in future]
   - Risk: Low | Medium | High
   ```

5. **Phase complete when:**
   - All Quick-win/Medium fixed
   - All Hard attempted (fixed or documented)
   - Architectural items documented with justification

### Phase Boundary Rule

Do NOT proceed to the next major phase until the current phase reaches Polish completion:

```
✅ Phase 2 Bulk: 91%
✅ Phase 2 Polish: 118 errors triaged
   - 40 medium → fixed
   - 78 low → EACH documented with rationale
✅ Phase 2 Complete: Created docs/deferred-errors.md
→ NOW proceed to Phase 3

❌ WRONG: Phase 2 at 91%, "low priority acceptable", starting Phase 3
```

### Reporting

When transitioning from Bulk to Polish:

```
Phase X Bulk Complete: {N}% ({fixed}/{total})
Entering Polish Phase: {remaining} errors to triage
```

When the Polish Phase is complete:

```
Phase X Complete: {final_pct}% ({fixed}/{total})
- Quick-wins: {n} fixed
- Medium: {n} fixed
- Hard: {n} fixed, {n} documented
- Framework limitations: {n} documented
```

---

## Learning & Retrospective

### Variance Thresholds

| Variance | Action                                                 |
| -------- | ------------------------------------------------------ |
| 0-30%    | Log only (acceptable)                                  |
| 30-50%   | Flag for review                                        |
| 50-100%  | Capture learning to `docs/orchestrator-learnings.json` |
| >100%    | CRITICAL — review task classification                  |
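
The variance formula from the execution loop, `(actual - estimate) / estimate × 100`, and the thresholds above can be sketched as follows. The table's ranges share their endpoints, so this sketch assumes each boundary value falls in the lower band (exactly 50% is "flag", not "capture"); that choice and the function names are assumptions.

```typescript
type VarianceAction = "log" | "flag" | "capture" | "critical";

// Signed variance in percent; positive means the task used more than estimated.
function variancePercent(estimateK: number, usedK: number): number {
  if (estimateK <= 0) throw new Error("estimate must be positive");
  return ((usedK - estimateK) / estimateK) * 100;
}

// Thresholds apply to the magnitude, so large underruns are captured too.
function varianceAction(percent: number): VarianceAction {
  const magnitude = Math.abs(percent);
  if (magnitude > 100) return "critical";
  if (magnitude > 50) return "capture";
  if (magnitude > 30) return "flag";
  return "log";
}
```

For example, a task estimated at 8K that used 12K has a variance of (12 - 8) / 8 × 100 = 50%, which is flagged for review under this reading, while 10K estimated and 4K used gives -60% and is captured as a learning.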

### Task Type Classification

| Type         | Keywords                               | Base Estimate    |
| ------------ | -------------------------------------- | ---------------- |
| STYLE_FIX    | "formatting", "prettier", "lint"       | 3-5K             |
| BULK_CLEANUP | "unused", "warnings", "~N files"       | file_count × 550 |
| GUARD_ADD    | "add guard", "decorator", "validation" | 5-8K             |
| SECURITY_FIX | "sanitize", "injection", "XSS"         | 8-12K × 2.5      |
| AUTH_ADD     | "authentication", "auth"               | 15-25K           |
| REFACTOR     | "refactor", "replace", "migrate"       | 10-15K           |
| TEST_ADD     | "add tests", "coverage"                | 15-25K           |
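
A keyword classifier along these lines might look like the sketch below. Matching is case-insensitive and the first matching row wins; both behaviors are assumptions, as is treating the `file_count × 550` figure as a raw token count. Note that a bare `auth` keyword also matches unrelated words such as "author".

```typescript
type TaskType =
  | "STYLE_FIX"
  | "BULK_CLEANUP"
  | "GUARD_ADD"
  | "SECURITY_FIX"
  | "AUTH_ADD"
  | "REFACTOR"
  | "TEST_ADD";

// Keywords copied from the table above, checked in table order.
const TYPE_KEYWORDS: Array<[TaskType, RegExp]> = [
  ["STYLE_FIX", /formatting|prettier|lint/i],
  ["BULK_CLEANUP", /unused|warnings|~\d+ files/i],
  ["GUARD_ADD", /add guard|decorator|validation/i],
  ["SECURITY_FIX", /sanitize|injection|xss/i],
  ["AUTH_ADD", /authentication|auth/i],
  ["REFACTOR", /refactor|replace|migrate/i],
  ["TEST_ADD", /add tests|coverage/i],
];

// Returns null when no keyword matches, so the task can be classified by hand.
function classifyTask(description: string): TaskType | null {
  for (const [type, pattern] of TYPE_KEYWORDS) {
    if (pattern.test(description)) return type;
  }
  return null;
}

// BULK_CLEANUP base estimate in tokens: file_count × 550.
function bulkCleanupEstimate(fileCount: number): number {
  return fileCount * 550;
}
```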

---

## Report Cleanup

QA automation generates report files in `docs/reports/qa-automation/pending/`. Clean up after processing.

| Event              | Action                                  |
| ------------------ | --------------------------------------- |
| Task success       | Delete matching reports from `pending/` |
| Task failed        | Move reports to `escalated/`            |
| Phase verification | Clean up all `pending/` reports         |
| Milestone complete | Archive or delete `escalated/`          |

---

## Stopping Criteria

**ONLY stop if:**

1. All tasks in docs/tasks.md are `done`
2. Critical blocker preventing progress (document and alert)
3. Context usage >= 55%: output the ORCHESTRATOR HANDOFF message and stop (see Context Threshold Protocol)
4. Absolute context limit reached

**DO NOT stop to ask "should I continue?"** — the answer is always YES.
**DO stop at 55-60%.** Output the handoff message and wait for the user to spawn a replacement orchestrator.

---

## Sprint Completion Protocol

When all tasks in `docs/tasks.md` are `done` (or triaged as `deferred`), archive the sprint artifacts before stopping. This preserves them for post-mortems, variance calibration, and historical reference.

### Archive Steps

1. **Create archive directory** (if it doesn't exist):

   ```bash
   mkdir -p docs/tasks/
   ```

2. **Move tasks.md to archive:**

   ```bash
   mv docs/tasks.md docs/tasks/{milestone-name}-tasks.md
   ```

   Example: `docs/tasks/M6-AgentOrchestration-Fixes-tasks.md`

3. **Move learnings to archive:**

   ```bash
   mv docs/orchestrator-learnings.json docs/tasks/{milestone-name}-learnings.json
   ```

4. **Commit the archive:**

   ```bash
   git add docs/tasks/
   git rm docs/tasks.md docs/orchestrator-learnings.json 2>/dev/null || true
   git commit -m "chore(orchestrator): Archive {milestone-name} sprint artifacts

   {completed}/{total} tasks completed, {deferred} deferred.
   Archived to docs/tasks/ for post-mortem reference."
   git push
   ```

5. **Run final retrospective** — review variance patterns and propose updates to estimation heuristics.

### Recovery

If an orchestrator starts and `docs/tasks.md` does not exist, check `docs/tasks/` for the most recent archive:

```bash
ls -t docs/tasks/*-tasks.md 2>/dev/null | head -1
```

If an archive is found, another session may have archived the file. The orchestrator should:

1. Report what it found in `docs/tasks/`
2. Ask whether to resume from the archived file or bootstrap fresh
3. If resuming: copy the archive back to `docs/tasks.md` and continue

### Retention Policy

Keep all archived sprints indefinitely. They are small text files and valuable for:

- Post-mortem analysis
- Estimation variance calibration across milestones
- Understanding what was deferred and why
- Onboarding new orchestrators to project history

---

## Kickstart Message Format

```markdown
## Mission

Remediate findings from the codebase review.

## Setup

- Project: /home/localadmin/src/mosaic-stack
- Review: docs/reports/{report-name}/
- Quality gates: pnpm lint && pnpm typecheck && pnpm test
- Milestone: {milestone-name}
- Task prefix: MS

## Protocol

Read docs/claude/orchestrator.md for full instructions.

## Start

Bootstrap from the review report, then execute until complete.
```

---

## Quick Reference

| Phase     | Action                                                                  |
| --------- | ----------------------------------------------------------------------- |
| Bootstrap | Parse reports → Categorize → Estimate → Create issues → Create tasks.md |
| Execute   | Loop: claim → spawn worker → update → commit                            |
| Handoff   | At 55-60%: persist state, output handoff message, stop                  |
| Stop      | Queue empty, blocker, or context limit                                  |

**Orchestrator owns tasks.md. Workers execute and report. Single writer eliminates conflicts.**