Add OpenTelemetry Tracing Infrastructure #131

Closed
opened 2026-01-30 21:29:13 +00:00 by jason.woltje · 0 comments
Owner

Implement OpenTelemetry distributed tracing and observability.

Objective: Add comprehensive tracing for all HTTP requests and LLM calls with GenAI semantic conventions.

Tasks:

  • Add OpenTelemetry dependencies to package.json
  • Create telemetry.service.ts for OTEL SDK initialization
  • Create telemetry.interceptor.ts for HTTP request tracing
  • Create llm-telemetry.decorator.ts for LLM-specific spans
  • Create span-context.service.ts for context propagation
  • Configure Jaeger exporter
  • Instrument all LLM provider calls
  • Add trace context to existing logging
  • Write documentation in docs/3-architecture/telemetry.md
  • Benchmark performance impact
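As a sketch of the context-propagation task, span-context.service.ts could start from a W3C Trace Context (`traceparent`) header parser. The names below (`SpanContext`, `parseTraceparent`) are illustrative, not from the codebase; the OTEL SDK's own propagators would normally handle this, so this is only a minimal standalone illustration of the header format:

```typescript
// Minimal W3C Trace Context parsing (traceparent, version 00).
// Format: 00-{32 hex trace-id}-{16 hex span-id}-{2 hex flags}
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

interface SpanContext {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

function parseTraceparent(header: string): SpanContext | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero trace or span IDs are invalid per the W3C spec.
  if (traceId === "0".repeat(32) || spanId === "0".repeat(16)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}
```

In the real service this parsed context would seed child spans for outbound LLM calls, keeping the trace continuous across process boundaries.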

Features:

  • Automatic HTTP request spans
  • LLM call instrumentation with token counts
  • Error tracking with stack traces
  • GenAI semantic conventions support
  • Jaeger/Zipkin export
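For the GenAI semantic conventions feature, LLM spans would carry standardized `gen_ai.*` attributes. A hypothetical helper for llm-telemetry.decorator.ts might look like this (the `LlmCallInfo` shape is assumed, not from the codebase):

```typescript
// Assumed shape for a completed LLM call; field names are illustrative.
interface LlmCallInfo {
  provider: string;       // e.g. "openai", "anthropic"
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Maps a call result onto OpenTelemetry GenAI semantic convention attributes.
function buildGenAiAttributes(call: LlmCallInfo): Record<string, string | number> {
  return {
    "gen_ai.system": call.provider,
    "gen_ai.request.model": call.model,
    "gen_ai.usage.input_tokens": call.inputTokens,
    "gen_ai.usage.output_tokens": call.outputTokens,
  };
}
```

The decorator would attach these attributes to the active span after each provider call, so token counts and model names show up directly in Jaeger.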

Acceptance Criteria:

  • All HTTP requests create spans
  • LLM calls show token counts and latency
  • Traces viewable in Jaeger
  • No more than 5% performance overhead
  • Documentation complete
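The 5% overhead criterion can be checked mechanically in the benchmark. A minimal sketch, assuming the benchmark measures baseline vs. instrumented latency (function names are illustrative):

```typescript
// Percent latency overhead of the instrumented run relative to baseline.
function overheadPercent(baselineMs: number, instrumentedMs: number): number {
  if (baselineMs <= 0) throw new Error("baseline must be positive");
  return ((instrumentedMs - baselineMs) / baselineMs) * 100;
}

// Pass/fail check against the acceptance budget (default 5%).
function withinBudget(baselineMs: number, instrumentedMs: number, budgetPct = 5): boolean {
  return overheadPercent(baselineMs, instrumentedMs) <= budgetPct;
}
```

In practice the benchmark should compare medians or p95 across many requests rather than single samples, since per-request jitter can easily exceed 5%.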

Related: Epic #121, Phase 3 OpenTelemetry
Depends on: #127 (refactored LlmService)

jason.woltje added the phase-3, p0, api labels 2026-01-30 21:29:13 +00:00
jason.woltje added this to the M4-LLM (0.0.4) milestone 2026-01-30 23:40:48 +00:00

Reference: mosaic/stack#131