Track LLM task completions via Mosaic Telemetry #371

Closed
opened 2026-02-15 05:28:26 +00:00 by jason.woltje · 1 comment
Summary

Instrument the LLM service layer to emit TaskCompletionEvents through the Mosaic Telemetry client after each LLM interaction completes. This is the primary data source for token usage tracking, cost analysis, and prediction model training.

Context

The telemetry system tracks AI coding task completions with rich metadata. The LLM service (apps/api/src/llm/) is where all provider calls happen — this is the natural integration point.

Note: This is separate from the existing OpenTelemetry (OTEL) instrumentation which handles request tracing/spans. Mosaic Telemetry tracks higher-level task completion metrics for cost forecasting and quality analysis.

Requirements

Event Construction

After each LLM call completes, build a TaskCompletionEvent using EventBuilder:

```typescript
const event = telemetry.eventBuilder.build({
  taskType: 'implementation',
  complexity: 'medium',
  harness: 'api_direct',
  model: 'claude-sonnet-4-5-20250929',
  provider: 'anthropic',
  taskDurationMs: elapsed,
  estimatedInputTokens: promptTokenEstimate,
  estimatedOutputTokens: completionTokenEstimate,
  actualInputTokens: response.usage.input_tokens,
  actualOutputTokens: response.usage.output_tokens,
  estimatedCostUsdMicros: preEstimate,
  actualCostUsdMicros: computedCost,
  qualityGatePassed: true,
  qualityGatesRun: ['build', 'typecheck'],
  qualityGatesFailed: [],
  contextCompactions: 0,
  contextRotations: 0,
  contextUtilizationFinal: 0.0,
  outcome: 'success',
  retryCount: 0,
});
telemetry.trackTaskCompletion(event);
```

Integration Points

  1. LlmService.chat() — Standard chat completions
  2. LlmService.chatStream() — Streaming completions (aggregate tokens after stream ends)
  3. LlmService.embed() — Embedding operations
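A minimal sketch of the non-blocking wrapper these integration points could share. The `TelemetryClient` shape and `withTelemetry` helper are illustrative assumptions, not the real service API; the point is that the telemetry call is never awaited on the response path and its failures never surface to the caller.

```typescript
// Illustrative sketch — not the actual LlmService API.
type TelemetryClient = { trackTaskCompletion(event: unknown): void };

// Runs an LLM call, then queues a telemetry event built from the result.
// Tracking happens after the result is in hand and is wrapped in try/catch,
// so telemetry can never delay or break the response to the user.
async function withTelemetry<T>(
  telemetry: TelemetryClient,
  buildEvent: (result: T, elapsedMs: number) => unknown,
  call: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  const result = await call();
  try {
    telemetry.trackTaskCompletion(buildEvent(result, Date.now() - start));
  } catch {
    // Swallow telemetry errors: tracking is best-effort by design.
  }
  return result;
}
```

For `chatStream()`, the same wrapper would apply with `call` resolving only after the stream ends, once aggregated token counts are available.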

Provider-Specific Token Extraction

Each provider returns usage data differently:

  • Anthropic: response.usage.input_tokens, response.usage.output_tokens
  • OpenAI: response.usage.prompt_tokens, response.usage.completion_tokens
  • Ollama: response.eval_count, response.prompt_eval_count

Normalize all to the common actual_input_tokens / actual_output_tokens fields.
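The normalization step above could look like the following sketch, assuming each provider's usage payload has the field names listed (the `NormalizedUsage` type name is an illustration):

```typescript
type NormalizedUsage = { actualInputTokens: number; actualOutputTokens: number };

// Maps each provider's usage payload to the common event fields.
// Field names follow the per-provider shapes documented above.
function normalizeUsage(provider: string, usage: Record<string, number>): NormalizedUsage {
  switch (provider) {
    case 'anthropic':
      return { actualInputTokens: usage.input_tokens, actualOutputTokens: usage.output_tokens };
    case 'openai':
      return { actualInputTokens: usage.prompt_tokens, actualOutputTokens: usage.completion_tokens };
    case 'ollama':
      // Note the reversed naming: prompt_eval_count is input, eval_count is output.
      return { actualInputTokens: usage.prompt_eval_count, actualOutputTokens: usage.eval_count };
    default:
      throw new Error(`Unknown provider: ${provider}`);
  }
}
```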

Cost Calculation

  • Maintain a cost table (or use predictions) for $/token by model
  • Store in microdollars (USD * 1,000,000) as integers
  • Example: $0.003/1K input tokens = 3000 microdollars per 1K tokens
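A sketch of the microdollar arithmetic, with a made-up model entry using the example rate above ($0.003/1K input = 3,000 microdollars per 1K tokens). The table contents and names here are assumptions for illustration, not real pricing:

```typescript
// Rates in microdollars per 1K tokens. Entries are illustrative only.
const MICROS_PER_1K: Record<string, { input: number; output: number }> = {
  'example-model': { input: 3000, output: 15000 }, // $0.003 in / $0.015 out per 1K tokens
};

// Computes actualCostUsdMicros as an integer, matching the event schema.
function costUsdMicros(model: string, inputTokens: number, outputTokens: number): number {
  const rate = MICROS_PER_1K[model];
  if (!rate) throw new Error(`No pricing entry for model: ${model}`);
  // Divide by 1000 because rates are per 1K tokens; round to keep an integer.
  return Math.round((inputTokens * rate.input + outputTokens * rate.output) / 1000);
}
```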

Task Type Inference

Map the calling context to a TaskType:

  • Chat conversations → implementation or planning (based on system prompt)
  • Brain queries → planning
  • Code generation requests → implementation
  • Review requests → code_review
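The mapping above could be sketched as a single function. The context identifiers and the system-prompt heuristic below are assumptions for illustration; the real calling contexts would come from the service layer:

```typescript
type TaskType = 'implementation' | 'planning' | 'code_review';

// Maps an (assumed) calling-context identifier to a TaskType.
// Chat is disambiguated by a simple system-prompt keyword heuristic.
function inferTaskType(context: string, systemPrompt = ''): TaskType {
  switch (context) {
    case 'brain_query':
      return 'planning';
    case 'code_generation':
      return 'implementation';
    case 'review':
      return 'code_review';
    case 'chat':
      return /\b(plan|design|architect)/i.test(systemPrompt) ? 'planning' : 'implementation';
    default:
      return 'implementation'; // conservative fallback
  }
}
```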

Acceptance Criteria

  • All LLM calls emit TaskCompletionEvents
  • Token usage accurately captured per provider
  • Cost calculated in microdollars
  • Streaming responses aggregate tokens correctly
  • Events queued (non-blocking) — never delays LLM response to user
  • Task type inferred from context
  • Unit tests with mocked telemetry client
  • Integration test verifying event structure matches schema
jason.woltje added the ai label 2026-02-15 05:28:26 +00:00
jason.woltje added this to the M10-Telemetry (0.0.10) milestone 2026-02-15 05:31:19 +00:00

Completed in commit 639881f on feature/m10-telemetry. Created LlmTelemetryTrackerService with fire-and-forget tracking, llm-cost-table.ts with microdollar pricing. Instrumented LlmService chat/chatStream/embed. 69 unit tests.


Reference: mosaic/stack#371