diff --git a/docs/MACP-BRIEF-TEMPLATE.md b/docs/MACP-BRIEF-TEMPLATE.md
new file mode 100644
index 0000000..7641eee
--- /dev/null
+++ b/docs/MACP-BRIEF-TEMPLATE.md
@@ -0,0 +1,86 @@
+# MACP Task Brief Template
+
+**Use this template for all MACP task briefs.** Workers that receive briefs not following this structure should flag it as an issue.
+
+---
+
+```markdown
+#
+
+### Constraints:
+-
+
+---
+
+## Task 2:
+
+
+---
+
+## Tests (MANDATORY)
+
+**Every brief MUST include a Tests section. Workers MUST write tests before or alongside implementation. Tests MUST pass before committing.**
+
+### Test file: `tests/test_.py`
+
+### Test cases:
+1. `test_` —
+2. `test_` —
+...
+
+### Test runner:
+~~~bash
+python3 -m unittest discover -s tests -p 'test_*.py' -v
+~~~
+
+---
+
+## Verification
+
+1. All tests pass: ``
+2. Python syntax: `python3 -c "import "`
+3.
+
+## Ground Rules
+- Python 3.10+ stdlib only (no pip dependencies)
+- Commit message: `feat: ` (conventional commits)
+- Push to `feat/` branch when done
+```
+
+---
+
+## Brief Sizing Rules
+
+| Brief Type | Max Items | Rationale |
+|------------|-----------|-----------|
+| **Build** (new code) | 2-3 | High cognitive load per item |
+| **Fix** (surgical changes) | 5-7 | Low cognitive load, exact file/line/fix |
+| **Review** | 1 | Naturally focused |
+| **Test** (add tests) | 3-4 | Medium load, but well-scoped |
+
+The key metric is **cognitive load per item**, not item count.
+- Build = construction (high load)
+- Fix = scalpel (low load)
+- Review = naturally focused
+- Test = moderate (reading existing code + writing test logic)
diff --git a/docs/tasks/MACP-PHASE2A-tests.md b/docs/tasks/MACP-PHASE2A-tests.md
new file mode 100644
index 0000000..7b09b1f
--- /dev/null
+++ b/docs/tasks/MACP-PHASE2A-tests.md
@@ -0,0 +1,81 @@
+# MACP Phase 2A — Test Suite
+
+**Branch:** `feat/macp-phase2a` (commit on top of existing)
+**Repo worktree:** `~/src/mosaic-bootstrap-worktrees/macp-phase2a`
+
+---
+
+## Objective
+
+Write a comprehensive test suite for the Phase 2A event bridge code using Python `unittest` (stdlib only). Tests must run with `python3 -m unittest discover tests/`. They should also run under `python3 -m pytest tests/` where pytest is already installed, but pytest must not be a required dependency.
+
+---
+
+## Task 1: Test infrastructure (`tests/conftest.py` + `tests/run_tests.sh`)
+
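+Create a `tests/` directory at the repo root containing:
+- `conftest.py` — shared fixtures (temp directories, sample events, sample config) written as plain helpers that test modules import explicitly, since `unittest` does not auto-load `conftest.py`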
+- `run_tests.sh` — simple runner: `python3 -m unittest discover -s tests -p 'test_*.py' -v`
+- `__init__.py` — empty, makes tests a package
+
+The sample events fixture should include one event of each type: `task.assigned`, `task.started`, `task.completed`, `task.failed`, `task.escalated`, `task.gated`, `task.retry.scheduled`
+
+---
+
+## Task 2: Event watcher tests (`tests/test_event_watcher.py`)
+
+Test the `EventWatcher` class from `tools/orchestrator-matrix/events/event_watcher.py`.
+
+### Test cases:
+1. `test_poll_empty_file` — No events file exists → returns empty list
+2. `test_poll_new_events` — Write 3 events to ndjson, poll → returns all 3
+3. `test_cursor_persistence` — Poll once (reads 3), poll again → returns 0 (cursor saved)
+4. `test_cursor_survives_restart` — Poll, create new watcher instance, poll → no duplicates
+5. `test_corrupt_line_skipped` — Insert a corrupt JSON line between valid events → valid events returned, corrupt skipped
+6. `test_callback_filtering` — Register callback for `task.completed` only → only completed events trigger it
+7. `test_callback_receives_events` — Register callback, poll → callback called with correct event dicts
+8. `test_file_grows_between_polls` — Poll (gets 2), append 3 more, poll → gets 3
+
+---
+
+## Task 3: Webhook adapter tests (`tests/test_webhook_adapter.py`)
+
+Test `send_webhook` and `create_webhook_callback` from `tools/orchestrator-matrix/events/webhook_adapter.py`.
+
+### Test cases:
+1. `test_send_webhook_success` — Mock HTTP response 200 → returns True
+2. `test_send_webhook_failure` — Mock HTTP response 500 → returns False
+3. `test_send_webhook_timeout` — Mock timeout → returns False, no crash
+4. `test_send_webhook_retry` — Mock 500 then 200 → retries and succeeds
+5. `test_event_filter` — Config with filter `["task.completed"]` → callback ignores `task.started`
+6. `test_webhook_disabled` — Config with `enabled: false` → no HTTP call made
+7. `test_ssrf_blocked` — URL with a loopback or private IP (e.g. 127.0.0.1, 10.x) → blocked, returns False
+
+Use `unittest.mock.patch` to mock `urllib.request.urlopen`.
+
+---
+
+## Task 4: Discord formatter tests (`tests/test_discord_formatter.py`)
+
+Test `format_event` and `format_summary` from `tools/orchestrator-matrix/events/discord_formatter.py`.
+
+### Test cases:
+1. `test_format_completed` — Completed event → contains "✅" and task ID
+2. `test_format_failed` — Failed event → contains "❌" and error message
+3. `test_format_escalated` — Escalated event → contains "🚨" and escalation reason
+4. `test_format_gated` — Gated event → contains "🔍"
+5. `test_format_started` — Started event → contains "⚙️" and runtime info
+6. `test_format_unknown_type` — Unknown event type → returns None
+7. `test_sanitize_control_chars` — Event with control characters in message → stripped in output
+8. `test_sanitize_mentions` — Event with `@everyone` in message → neutralized in output
+9. `test_format_summary` — List of mixed events → summary with counts
+
+---
+
+## Verification
+
+After writing tests:
+1. `cd ~/src/mosaic-bootstrap-worktrees/macp-phase2a && python3 -m unittest discover -s tests -p 'test_*.py' -v` — ALL tests must pass
+2. Fix any failures before committing
+
+Commit: `test: add comprehensive test suite for Phase 2A event bridge`
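
The mocking approach named in Task 3 (`unittest.mock.patch` on `urllib.request.urlopen`) can be sketched as follows. This is only an illustration under stated assumptions: `send_webhook_sketch` is a hypothetical stand-in written for this example, and the real `send_webhook` in `webhook_adapter.py` may have a different signature, retry behavior, and return contract. The URL is a placeholder as well. Note that the real `urlopen` raises `HTTPError` on a 500 response rather than returning it, so the failure case is mocked with `side_effect`.

```python
import json
import unittest
import urllib.error
import urllib.request
from unittest import mock


def send_webhook_sketch(url: str, payload: dict, timeout: float = 5.0) -> bool:
    """Hypothetical stand-in for send_webhook; the real function may differ."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and TimeoutError are all OSError subclasses.
        return False


def fake_response(status: int) -> mock.MagicMock:
    """Build a mock usable as the context manager returned by urlopen."""
    response = mock.MagicMock()
    response.status = status
    response.__enter__.return_value = response
    return response


class SendWebhookSketchTests(unittest.TestCase):
    URL = "https://hooks.example.invalid/macp"  # placeholder, never contacted
    EVENT = {"type": "task.completed", "task_id": "T-1"}  # assumed event shape

    @mock.patch("urllib.request.urlopen")
    def test_success(self, urlopen):
        urlopen.return_value = fake_response(200)
        self.assertTrue(send_webhook_sketch(self.URL, self.EVENT))
        urlopen.assert_called_once()

    @mock.patch("urllib.request.urlopen")
    def test_http_500(self, urlopen):
        urlopen.side_effect = urllib.error.HTTPError(
            self.URL, 500, "Server Error", None, None
        )
        self.assertFalse(send_webhook_sketch(self.URL, self.EVENT))

    @mock.patch("urllib.request.urlopen")
    def test_timeout(self, urlopen):
        urlopen.side_effect = TimeoutError("timed out")
        self.assertFalse(send_webhook_sketch(self.URL, self.EVENT))
```

Saved as a `test_*.py` file under `tests/`, a module like this would be picked up by the `unittest discover` command in the Verification section. Patching `urllib.request.urlopen` by its module path works here because the function under test looks the name up on the module at call time. If the real adapter imports `urlopen` directly into its own namespace, the patch target would need to be that module's name instead.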