feat(#372): track orchestrator agent task completions via telemetry

- Instrument Coordinator.process_queue() with timing and telemetry events
- Instrument OrchestrationLoop.process_next_issue() with quality gate tracking
- Add agent-to-telemetry mapping (model, provider, harness per agent name)
- Map difficulty levels to Complexity enum and gate names to QualityGate enum
- Track retry counts per issue (increment on failure, clear on success)
- Emit FAILURE outcome on agent spawn failure or quality gate rejection
- Keep telemetry non-blocking: errors are logged and swallowed and never delay tasks
- Pass telemetry client from FastAPI lifespan to Coordinator constructor
- Add 33 unit tests covering all telemetry scenarios
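
A minimal sketch of the retry-count and non-blocking behaviour listed above. It is not the committed implementation: `RetryTracker`, `emit_safely`, and the plain-dict event shape are hypothetical names chosen for illustration, and the real telemetry client API is not shown here.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


class RetryTracker:
    """Hypothetical helper: counts failed attempts per issue number."""

    def __init__(self) -> None:
        self._counts: dict[int, int] = {}

    def record_failure(self, issue_number: int) -> int:
        # Increment on failure and return the new retry count.
        self._counts[issue_number] = self._counts.get(issue_number, 0) + 1
        return self._counts[issue_number]

    def record_success(self, issue_number: int) -> None:
        # Clear the counter once the issue completes successfully.
        self._counts.pop(issue_number, None)

    def count(self, issue_number: int) -> int:
        return self._counts.get(issue_number, 0)


def emit_safely(emit: Callable[[dict], None], event: dict) -> bool:
    """Call a telemetry emitter; log and swallow any error it raises.

    `emit` stands in for whatever the telemetry client exposes (assumed).
    Returns True if the event was handed off, False if the emitter failed.
    """
    try:
        emit(event)
        return True
    except Exception:
        # Telemetry must never fail or delay the task itself.
        logger.warning("telemetry emit failed; continuing", exc_info=True)
        return False
```

With this shape, a caller would record a failure, pass the resulting count into the event it emits, and carry on regardless of whether the emit succeeded.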

Refs #372

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:52:54 -06:00
parent ed23293e1a
commit d6c6af10d9
3 changed files with 1057 additions and 1 deletions


@@ -100,6 +100,8 @@ async def lifespan(app: FastAPI) -> AsyncIterator[dict[str, Any]]:
     _coordinator = Coordinator(
         queue_manager=queue_manager,
         poll_interval=settings.coordinator_poll_interval,
+        telemetry_client=mosaic_telemetry_client,
+        instance_id=mosaic_telemetry_config.instance_id or "",
     )
     logger.info(
         f"Coordinator initialized (poll interval: {settings.coordinator_poll_interval}s, "