telemetry-client-js/docs/integration-guide.md
Jason Woltje
2026-02-14 23:02:46 -06:00
Integration Guide

This guide covers how to integrate @mosaicstack/telemetry-client into your applications. The SDK targets Mosaic Telemetry API v1 (event schema version 1.0).

Prerequisites

  • Node.js >= 18 (for native fetch and crypto.randomUUID())
  • A Mosaic Telemetry API key and instance ID (issued by an administrator via the admin API)

Installation

Configure the Gitea npm registry in your project's .npmrc:

@mosaicstack:registry=https://git.mosaicstack.dev/api/packages/mosaic/npm/

Then install:

# Latest stable release (from main)
npm install @mosaicstack/telemetry-client

# Latest dev build (from develop)
npm install @mosaicstack/telemetry-client@dev

Branch    Dist-tag  Version format                    Example
main      latest    {version}                         0.1.0
develop   dev       {version}-dev.{YYYYMMDDHHmmss}    0.1.0-dev.20260215050000

The package ships ESM-only with TypeScript declarations and has zero runtime dependencies.

Environment Setup

Store your credentials in environment variables — never hardcode them.

# .env (not committed — add to .gitignore)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=msk_your_api_key_here
TELEMETRY_INSTANCE_ID=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d
# .env.example (committed — documents required variables)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=your-api-key
TELEMETRY_INSTANCE_ID=your-instance-uuid
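Because the client constructor requires all three values, it can help to validate them once at startup rather than scattering non-null assertions. A minimal sketch — the loadTelemetryEnv helper below is hypothetical, not part of the SDK:

```typescript
// Hypothetical startup helper, not part of @mosaicstack/telemetry-client.
// Fails fast with a clear message when a required variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Call once at startup, before constructing TelemetryClient.
function loadTelemetryEnv() {
  return {
    serverUrl: requireEnv('TELEMETRY_API_URL'),
    apiKey: requireEnv('TELEMETRY_API_KEY'),
    instanceId: requireEnv('TELEMETRY_INSTANCE_ID'),
  };
}
```

Calling this before the client is constructed turns a missing variable into one obvious startup error instead of a runtime failure deep in the submission path.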

Instrumenting a Next.js App

Next.js server actions and API routes run on Node.js, so the SDK works directly. Create a shared singleton and track events from your server-side code.

1. Create a telemetry singleton

// lib/telemetry.ts
import {
  TelemetryClient,
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
} from '@mosaicstack/telemetry-client';

let client: TelemetryClient | null = null;

export function getTelemetryClient(): TelemetryClient {
  if (!client) {
    client = new TelemetryClient({
      serverUrl: process.env.TELEMETRY_API_URL!,
      apiKey: process.env.TELEMETRY_API_KEY!,
      instanceId: process.env.TELEMETRY_INSTANCE_ID!,
      enabled: process.env.NODE_ENV === 'production',
      onError: (err) => console.error('[telemetry]', err.message),
    });
    client.start();
  }
  return client;
}

// Re-export enums for convenience
export { TaskType, Complexity, Harness, Provider, Outcome, QualityGate };

2. Track events from an API route

// app/api/task-complete/route.ts
import { NextResponse } from 'next/server';
import { getTelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@/lib/telemetry';

export async function POST(request: Request) {
  const body = await request.json();

  const client = getTelemetryClient();
  const event = client.eventBuilder.build({
    task_duration_ms: body.durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.MEDIUM,
    harness: Harness.CLAUDE_CODE,
    model: body.model,
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: body.estimatedInputTokens,
    estimated_output_tokens: body.estimatedOutputTokens,
    actual_input_tokens: body.actualInputTokens,
    actual_output_tokens: body.actualOutputTokens,
    estimated_cost_usd_micros: body.estimatedCostMicros,
    actual_cost_usd_micros: body.actualCostMicros,
    quality_gate_passed: body.qualityGatePassed,
    quality_gates_run: body.qualityGatesRun,
    quality_gates_failed: body.qualityGatesFailed,
    context_compactions: body.contextCompactions,
    context_rotations: body.contextRotations,
    context_utilization_final: body.contextUtilization,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
  });

  client.track(event);

  return NextResponse.json({ status: 'queued' });
}

3. Graceful shutdown

Next.js doesn't provide a built-in shutdown hook, but you can handle SIGTERM:

// instrumentation.ts (Next.js instrumentation file)
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { getTelemetryClient } = await import('./lib/telemetry');

    // Ensure the client starts on server boot
    getTelemetryClient();

    // Flush remaining events on shutdown
    const shutdown = async () => {
      const client = getTelemetryClient();
      await client.stop();
      process.exit(0);
    };

    process.on('SIGTERM', shutdown);
    process.on('SIGINT', shutdown);
  }
}

Instrumenting a Node.js Service

The same pattern applies to a standalone Node.js service (Express, Fastify, a plain script, etc.).

1. Initialize and start

// src/telemetry.ts
import { TelemetryClient } from '@mosaicstack/telemetry-client';

export const telemetry = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL ?? 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  onError: (err) => console.error('[telemetry]', err.message),
});

telemetry.start();

2. Track events after task completion

// src/task-runner.ts
import {
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
} from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

async function runTask() {
  const startTime = Date.now();

  // ... run your AI coding task ...

  const durationMs = Date.now() - startTime;

  const event = telemetry.eventBuilder.build({
    task_duration_ms: durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.HIGH,
    harness: Harness.CLAUDE_CODE,
    model: 'claude-sonnet-4-5-20250929',
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: 200000,
    estimated_output_tokens: 80000,
    actual_input_tokens: 215000,
    actual_output_tokens: 72000,
    estimated_cost_usd_micros: 1200000,
    actual_cost_usd_micros: 1150000,
    quality_gate_passed: true,
    quality_gates_run: [
      QualityGate.BUILD,
      QualityGate.LINT,
      QualityGate.TEST,
      QualityGate.TYPECHECK,
    ],
    quality_gates_failed: [],
    context_compactions: 3,
    context_rotations: 1,
    context_utilization_final: 0.85,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
    repo_size_category: 'medium',
  });

  telemetry.track(event);
}

3. Graceful shutdown

// src/main.ts
import { telemetry } from './telemetry.js';

async function main() {
  // ... your application logic ...

  // On shutdown, flush remaining events
  process.on('SIGTERM', async () => {
    await telemetry.stop();
    process.exit(0);
  });
}

main();

Using Predictions

The telemetry API provides crowd-sourced predictions for token usage, cost, and duration based on historical data. The SDK caches these predictions locally.

Pre-populate the cache

Call refreshPredictions() at startup with the dimension combinations your application uses:

import { TaskType, Provider, Complexity } from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

// Fetch predictions for all combinations you'll need
await telemetry.refreshPredictions([
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.HIGH },
  { task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
]);

Read cached predictions

const prediction = telemetry.getPrediction({
  task_type: TaskType.IMPLEMENTATION,
  model: 'claude-sonnet-4-5-20250929',
  provider: Provider.ANTHROPIC,
  complexity: Complexity.MEDIUM,
});

if (prediction?.prediction) {
  const p = prediction.prediction;
  console.log('Token predictions (median):', {
    inputTokens: p.input_tokens.median,
    outputTokens: p.output_tokens.median,
  });
  console.log('Cost prediction:', `$${(p.cost_usd_micros.median / 1_000_000).toFixed(2)}`);
  console.log('Duration prediction:', `${(p.duration_ms.median / 1000).toFixed(0)}s`);
  console.log('Correction factors:', {
    input: p.correction_factors.input,   // >1.0 means estimates tend to be too low
    output: p.correction_factors.output,
  });
  console.log('Quality:', {
    gatePassRate: `${(p.quality.gate_pass_rate * 100).toFixed(0)}%`,
    successRate: `${(p.quality.success_rate * 100).toFixed(0)}%`,
  });

  // Check confidence level
  if (prediction.metadata.confidence === 'low') {
    console.warn('Low confidence — small sample size or fallback was applied');
  }
}

Understand fallback behavior

When the server doesn't have enough data for an exact match, it broadens the query by dropping dimensions (e.g., ignoring complexity). The metadata fields tell you what happened:

fallback_level  Meaning
0               Exact match on all dimensions
1+              Some dimensions were dropped to find data
-1              No prediction data available at any level
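A caller can branch on this value when deciding how much to trust a prediction. A sketch, assuming the prediction's metadata object exposes fallback_level (as named in the table above) alongside the confidence field shown earlier:

```typescript
// Illustrative only: interprets the fallback_level values documented above.
// The PredictionMetadata shape is an assumption, not the SDK's exact type.
type PredictionMetadata = {
  fallback_level: number;
  confidence: 'low' | 'medium' | 'high';
};

function describeFallback(meta: PredictionMetadata): string {
  if (meta.fallback_level === -1) {
    return 'no data available at any level; fall back to local estimates';
  }
  if (meta.fallback_level === 0) {
    return 'exact match on all dimensions';
  }
  return `broadened query: ${meta.fallback_level} dimension(s) dropped`;
}
```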

Environment-Specific Configuration

Development

const client = new TelemetryClient({
  serverUrl: 'http://localhost:8000',         // Local dev server
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  dryRun: true,                               // Don't send real data
  submitIntervalMs: 10_000,                    // Flush more frequently for debugging
  onError: (err) => console.error('[telemetry]', err),
});

Production

const client = new TelemetryClient({
  serverUrl: 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  submitIntervalMs: 300_000,                   // 5 min (default)
  maxRetries: 3,                               // Retry on transient failures
  onError: (err) => {
    // Route to your observability stack
    logger.error('Telemetry submission failed', { error: err.message });
  },
});

Conditional enable/disable

const client = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL!,
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  enabled: process.env.TELEMETRY_ENABLED !== 'false',  // Opt-out via env var
});

When enabled is false, track() returns immediately without queuing.


Error Handling

The SDK is designed to never disrupt your application:

  • track() never throws. All errors are caught and routed to the onError callback.
  • Failed batches are re-queued. If a submission fails, events are prepended back to the queue for the next flush cycle.
  • Exponential backoff with jitter. Retries use 1s base delay, doubling up to 60s, with random jitter to prevent thundering herd.
  • Retry-After header support. On HTTP 429 (rate limited), the SDK respects the server's Retry-After header.
  • HTTP 403 is not retried. An API key / instance ID mismatch is a permanent error.
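The retry schedule described above can be expressed as a pure function. This is an illustrative reimplementation of the documented behavior, not the SDK's internal code:

```typescript
// Illustrative sketch of the documented retry schedule:
// 1s base delay, doubling per attempt, capped at 60s, plus random jitter.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  const exponential = Math.min(baseMs * 2 ** attempt, capMs);
  const jitter = Math.random() * baseMs; // spreads retries to avoid thundering herd
  return exponential + jitter;
}
```

Attempt 0 waits roughly 1 to 2 seconds; by attempt 6 the exponential term has hit the 60-second cap, so every later retry waits about a minute.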

Custom error handling

const client = new TelemetryClient({
  // ...
  onError: (error) => {
    if (error.message.includes('HTTP 403')) {
      console.error('Telemetry auth failed — check API key and instance ID');
    } else if (error.message.includes('HTTP 429')) {
      console.warn('Telemetry rate limited — events will be retried');
    } else {
      console.error('Telemetry error:', error.message);
    }
  },
});

Batch Submission Behavior

The SDK batches events for efficiency:

  1. track(event) adds the event to an in-memory queue (bounded, FIFO eviction at capacity).
  2. Every submitIntervalMs (default: 5 minutes), the background timer drains the queue in batches of up to batchSize (default/max: 100).
  3. Each batch is POSTed to POST /v1/events/batch with exponential backoff on failure.
  4. Calling stop() flushes all remaining events before resolving.

The server accepts up to 100 events per batch and supports partial success — some events may be accepted while others (e.g., duplicates) are rejected.


API Version Compatibility

SDK Version  API Version          Schema Version
0.1.x        v1 (/v1/ endpoints)  1.0

The EventBuilder automatically sets schema_version: "1.0" on every event. The SDK submits to /v1/events/batch and queries /v1/predictions/batch.

When the telemetry API introduces a v2, this SDK will add support in a new major release. The server supports two API versions simultaneously during a 6-month deprecation window.