# PRD: MACP Phase 2A Event Bridge + Notification System
## Metadata
- Owner: Jarvis
- Date: 2026-03-27
- Status: in-progress
- Best-Guess Mode: true
## Problem Statement
MACP Phase 1 writes structured lifecycle events to `.mosaic/orchestrator/events.ndjson`, but no repo-local bridge consumes those events for external systems. Phase 2A adds a portable watcher, webhook delivery, and Discord-friendly formatting so MACP event streams can drive OpenClaw integrations and human-facing notifications.
## Objectives
1. Add a synchronous event watcher that tails `events.ndjson` using stdlib-only file polling and persists cursor state across restarts.
2. Add a webhook adapter that can forward selected MACP events to a configured HTTP endpoint with bounded retries.
3. Add a Discord formatter that turns task lifecycle events into concise human-readable strings.
4. Extend the `mosaic macp` CLI with a `watch` command for one-shot or continuous event bridge execution.
## Scope
### In Scope
1. New `tools/orchestrator-matrix/events/` package with watcher, webhook adapter, and Discord formatter modules.
2. Cursor persistence at `.mosaic/orchestrator/event_cursor.json`.
3. `mosaic macp watch [--webhook] [--once]` CLI support using `.mosaic/orchestrator/config.json`.
4. Stdlib-only verification of watcher polling, webhook delivery, Discord formatting, CLI watch behavior, and cursor persistence.
5. Developer documentation and sitemap updates covering the Phase 2A event bridge.
6. A repo-local unittest suite under `tests/` that covers watcher polling/cursor behavior, webhook delivery logic, and Discord formatting.
### Out of Scope
1. Adding Discord transport or webhook server hosting inside this repository.
2. Replacing the existing Matrix transport bridge.
3. Introducing async, threads, or third-party Python packages.
4. Changing event emission behavior in the controller beyond consuming the existing event stream.
## User/Stakeholder Requirements
1. External systems must be able to consume MACP events without reading the NDJSON file directly.
2. The watcher must remain portable across environments, so file polling is required instead of platform-specific file watching.
3. Restarting the watcher must not replay previously consumed events.
4. Webhook delivery failures must be logged and isolated so the watcher loop continues running.
5. Discord formatting must stay concise and useful for task lifecycle visibility.
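
Requirement 4 above (failures are logged and isolated) can be illustrated with a small wrapper. This is only a sketch: the name `isolate_failures` and the `[macp]` log prefix are hypothetical and do not come from the Phase 2A code.

```python
import sys
from typing import Callable

EventCallback = Callable[[dict], None]


def isolate_failures(callback: EventCallback, name: str) -> EventCallback:
    """Wrap a callback so that its exceptions are logged to stderr instead of propagating."""

    def safe_callback(event: dict) -> None:
        try:
            callback(event)
        except Exception as exc:
            # Deliberately broad: one failing consumer must not stop the watcher loop.
            print(f"[macp] callback {name} failed: {type(exc).__name__}: {exc}", file=sys.stderr)

    return safe_callback
```

Catching `Exception` rather than `BaseException` leaves `KeyboardInterrupt` and `SystemExit` untouched, so a continuous watch can still be stopped from the terminal.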
## Functional Requirements
1. `EventWatcher` must watch `.mosaic/orchestrator/events.ndjson`, parse appended JSON lines, and invoke registered callbacks for matching event types.
2. `EventWatcher.poll_once()` must tolerate a missing events file, truncated/corrupt lines, and cursor positions that are stale after file truncation.
3. Cursor writes must be atomic and stored at `.mosaic/orchestrator/event_cursor.json`.
4. `send_webhook(event, config)` must POST JSON to the configured URL using `urllib.request`, optionally adding a bearer token, respecting timeout, and retrying with exponential backoff.
5. `create_webhook_callback(config)` must return a callback that swallows/logs failures instead of raising into the watcher loop.
6. `format_event(event)` must support `task.completed`, `task.failed`, `task.escalated`, `task.gated`, and `task.started`, including useful task metadata when present.
7. `format_summary(events)` must produce a short batch summary suitable for notification digests.
8. `bin/mosaic-macp` must expose `watch`, optionally enabling webhook delivery from config, and support one-shot polling with `--once`.
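
Requirements 2 and 3 (missing file, stale cursor after truncation, atomic cursor writes) could be approached as in the sketch below. It assumes the cursor file stores a byte offset as `{"offset": N}`; that schema and the helper names `load_cursor`, `save_cursor`, and `read_new_bytes` are illustrative assumptions, not the actual `EventWatcher` internals.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def load_cursor(cursor_path: Path) -> int:
    """Return the saved byte offset, or 0 when the cursor is missing or unreadable."""
    try:
        offset = json.loads(cursor_path.read_text(encoding="utf-8"))["offset"]
    except (OSError, ValueError, KeyError, TypeError):
        return 0
    return offset if isinstance(offset, int) and offset >= 0 else 0


def save_cursor(cursor_path: Path, offset: int) -> None:
    """Write the cursor to a temp file in the same directory, then rename it into place."""
    cursor_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(cursor_path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"offset": offset}, handle)  # assumed cursor schema
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_name, cursor_path)
    except BaseException:
        with contextlib.suppress(OSError):
            os.unlink(tmp_name)
        raise


def read_new_bytes(events_path: Path, offset: int) -> tuple[int, bytes]:
    """Return (start_offset, data) for the bytes appended since `offset`."""
    try:
        size = events_path.stat().st_size
    except FileNotFoundError:
        return 0, b""  # a missing events file is treated like an empty one
    if offset > size:
        offset = 0  # the file shrank, so the cursor is stale; restart from the top
    with events_path.open("rb") as handle:
        handle.seek(offset)
        return offset, handle.read()
```

Tracking a byte offset rather than a line count keeps the truncation check to a single `stat()` call and avoids rescanning the file on every poll.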
## Non-Functional Requirements
1. Security: no secrets embedded in code or logs; the auth token is sent only via a request header, and only when configured.
2. Performance: each webhook attempt must be bounded by `timeout_seconds`; no event-processing path may hang indefinitely.
3. Reliability: corrupt input lines and callback delivery failures must be logged to stderr and skipped without crashing the watcher.
4. Portability: Python 3.10+ stdlib only; no OS-specific file watcher APIs.
5. Observability: warnings and failures must be clear enough to diagnose cursor, parsing, and webhook problems.
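
The security, performance, and reliability points above might translate into a `send_webhook(event, config)` shaped like the following sketch. Only `timeout_seconds` is named in this PRD; the config keys `url`, `auth_token`, and `retry_count`, the default values, the boolean return value, and the log wording are all assumptions made for illustration.

```python
import json
import sys
import time
import urllib.error
import urllib.request


def send_webhook(event: dict, config: dict) -> bool:
    """POST one event as JSON; return True on a 2xx response, False once all attempts fail."""
    url = config["url"]  # hypothetical key
    timeout = float(config.get("timeout_seconds", 10))
    retries = int(config.get("retry_count", 2))  # hypothetical key: extra attempts after the first
    headers = {"Content-Type": "application/json"}
    token = config.get("auth_token")  # hypothetical key
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(event).encode("utf-8")

    for attempt in range(retries + 1):
        request = urllib.request.Request(url, data=body, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                if 200 <= response.status < 300:
                    return True
                reason = f"HTTP {response.status}"
        except OSError as exc:
            # URLError, HTTPError, and TimeoutError are all OSError subclasses.
            # Log only the error class so headers and the token never reach the logs.
            reason = type(exc).__name__
        print(f"[macp] webhook attempt {attempt + 1} failed: {reason}", file=sys.stderr)
        if attempt < retries:
            time.sleep(2 ** attempt)  # exponential backoff: 1 s, 2 s, 4 s, ...
    return False
```

With these assumed defaults, the worst case for one event is three attempts of up to 10 s each plus 1 s and 2 s of backoff, so roughly 33 s, which keeps a poll cycle bounded even when the endpoint is down.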
## Acceptance Criteria
1. `EventWatcher.poll_once()` reads newly appended events, returns parsed dicts, invokes registered callbacks, and skips already-consumed events after restart.
2. Webhook delivery posts matching events to a local test endpoint, supports bearer auth configuration, and performs a bounded number of retries on failure.
3. Discord formatter returns expected concise strings for the required task lifecycle event types and a usable batch summary.
4. `mosaic macp watch --once` processes events from a bootstrapped repo state without error and honors `--webhook`.
5. Cursor persistence prevents replay on a second run and resets safely when the events file is truncated.
6. `python3 -m unittest discover -s tests -p 'test_*.py' -v` passes with stdlib-only tests for the Phase 2A event bridge modules.
## Constraints and Dependencies
1. Python implementation must use stdlib only and support Python 3.10+.
2. Shell CLI behavior must remain bash-based and consistent with the existing Mosaic command style.
3. The watcher consumes the event schema already emitted by Phase 1 controller logic.
4. Webhook configuration lives under `.mosaic/orchestrator/config.json` at `macp.webhook`.
## Risks and Open Questions
1. Risk: partial writes may leave an incomplete trailing JSON line that must not advance the cursor incorrectly.
2. Risk: synchronous webhook retries can slow one poll cycle if the endpoint is unavailable; timeout and retry behavior must remain bounded.
3. Risk: event payloads may omit optional metadata fields, so formatter output must degrade cleanly.
4. ASSUMPTION: the watcher should advance past corrupt lines after logging them so a single bad line does not permanently stall downstream consumption.
5. ASSUMPTION: CLI `watch` should default to no-op callback processing when no delivery option is enabled, while still updating the cursor and reporting processed count.
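
Risk 1 and assumption 4 pull in opposite directions: an incomplete trailing line must not be consumed, while a complete but corrupt line must be. One way to reconcile them is sketched below; the helper name `parse_chunk` and the log messages are hypothetical.

```python
import json
import sys


def parse_chunk(chunk: bytes) -> tuple[list[dict], int]:
    """Parse the complete NDJSON lines in `chunk`; return (events, bytes_consumed)."""
    # Consume only up to the last newline. A trailing partial write stays unread,
    # so the cursor never advances into the middle of a JSON line.
    end = chunk.rfind(b"\n") + 1  # 0 when the chunk holds no complete line
    events: list[dict] = []
    for raw in chunk[:end].splitlines():
        if not raw.strip():
            continue
        try:
            event = json.loads(raw)
        except ValueError:
            # Covers invalid JSON and invalid UTF-8. The line is complete, so it is
            # logged and skipped rather than retried forever.
            print("[macp] skipping corrupt event line", file=sys.stderr)
            continue
        if isinstance(event, dict):
            events.append(event)
        else:
            print("[macp] skipping non-object event line", file=sys.stderr)
    return events, end
```

The caller would add `bytes_consumed` to the previous offset before persisting the cursor, so the unfinished tail is re-read on the next poll once the writer has completed it.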
## Testing and Verification Expectations
1. Baseline checks: Python bytecode compilation/import validation for new modules and shell syntax validation for `bin/mosaic-macp`.
2. Situational tests: temporary orchestrator state exercising watcher polling, callback filtering, webhook POST capture/mocking, formatter sanitization, CLI one-shot watch execution, and cursor persistence across repeated runs.
3. Evidence format: command-level results recorded in the scratchpad and summarized against acceptance criteria.
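
The "temporary orchestrator state" in item 2 could be provided by a shared `unittest` base class along these lines. The class names, the `append_event` helper, and the event fields are hypothetical; the smoke test only checks the fixture itself, not the Phase 2A modules.

```python
import json
import tempfile
import unittest
from pathlib import Path


class OrchestratorStateTestCase(unittest.TestCase):
    """Base class that gives each test an isolated `.mosaic/orchestrator/` directory."""

    def setUp(self) -> None:
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.state_dir = Path(self._tmp.name) / ".mosaic" / "orchestrator"
        self.state_dir.mkdir(parents=True)
        self.events_path = self.state_dir / "events.ndjson"
        self.cursor_path = self.state_dir / "event_cursor.json"

    def append_event(self, event: dict) -> None:
        """Append one event as a single NDJSON line, as the Phase 1 controller would."""
        with self.events_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(event) + "\n")


class FixtureSmokeTest(OrchestratorStateTestCase):
    def test_append_event_writes_one_line_per_event(self) -> None:
        self.append_event({"event_type": "task.started", "task_id": "T-1"})
        self.append_event({"event_type": "task.completed", "task_id": "T-1"})
        lines = self.events_path.read_text(encoding="utf-8").splitlines()
        self.assertEqual(
            [json.loads(line)["event_type"] for line in lines],
            ["task.started", "task.completed"],
        )
        # No watcher has run yet, so no cursor file should exist.
        self.assertFalse(self.cursor_path.exists())
```

Saved under `tests/` in a file matching `test_*.py`, such a class would be picked up by the `python3 -m unittest discover -s tests -p 'test_*.py' -v` command listed in the acceptance criteria.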
## Milestone / Delivery Intent
1. Target milestone/version: Phase 2A observability bridge
2. Definition of done: code merged to `main`, CI terminal green, issue `#10` closed, and verification evidence recorded against all acceptance criteria.