feat(#372): track orchestrator agent task completions via telemetry

- Instrument Coordinator.process_queue() with timing and telemetry events - Instrument OrchestrationLoop.process_next_issue() with quality gate tracking - Add agent-to-telemetry mapping (model, provider, harness per agent name) - Map difficulty levels to Complexity enum and gate names to QualityGate enum - Track retry counts per issue (increment on failure, clear on success) - Emit FAILURE outcome on agent spawn failure or quality gate rejection - Non-blocking: telemetry errors are logged and swallowed, never delay tasks - Pass telemetry client from FastAPI lifespan to Coordinator constructor - Add 33 unit tests covering all telemetry scenarios Refs #372 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:52:54 -06:00
parent ed23293e1a
commit d6c6af10d9
3 changed files with 1057 additions and 1 deletions
--- a/apps/coordinator/src/main.py
+++ b/apps/coordinator/src/main.py
@@ -100,6 +100,8 @@ async def lifespan(app: FastAPI) -> AsyncIterator[dict[str, Any]]:
        _coordinator = Coordinator(
            queue_manager=queue_manager,
            poll_interval=settings.coordinator_poll_interval,
+            telemetry_client=mosaic_telemetry_client,
+            instance_id=mosaic_telemetry_config.instance_id or "",
        )
        logger.info(
            f"Coordinator initialized (poll interval: {settings.coordinator_poll_interval}s, "