# Integration Guide

This guide covers how to integrate `@mosaicstack/telemetry-client` into your applications. The SDK targets **Mosaic Telemetry API v1** (event schema version `1.0`).

## Prerequisites

- Node.js >= 18 (for native `fetch` and `crypto.randomUUID()`)
- A Mosaic Telemetry API key and instance ID (issued by an administrator via the admin API)

## Installation

Configure the Gitea npm registry in your project's `.npmrc`:

```ini
@mosaicstack:registry=https://git.mosaicstack.dev/api/packages/mosaic/npm/
```

Then install:

```bash
# Latest stable release (from main)
npm install @mosaicstack/telemetry-client

# Latest dev build (from develop)
npm install @mosaicstack/telemetry-client@dev
```

| Branch | Dist-tag | Version format | Example |
|--------|----------|----------------|---------|
| `main` | `latest` | `{version}` | `0.1.0` |
| `develop` | `dev` | `{version}-dev.{YYYYMMDDHHmmss}` | `0.1.0-dev.20260215050000` |

The package ships ESM-only with TypeScript declarations and has zero runtime dependencies.

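
If tooling needs to tell stable releases from dev builds, the two version formats in the table above can be told apart with a small parser. The sketch below is illustrative only: `parsePackageVersion` is a hypothetical helper, not part of the SDK, and it assumes the `YYYYMMDDHHmmss` stamp is in UTC, which this guide does not state.

```typescript
// Hypothetical helper (not part of the SDK): classify a published version
// string according to the version formats listed in the table above.
interface ParsedVersion {
  base: string; // e.g. "0.1.0"
  channel: 'latest' | 'dev'; // dist-tag implied by the format
  builtAt?: Date; // only set for dev builds
}

function parsePackageVersion(version: string): ParsedVersion | null {
  const match = /^(\d+\.\d+\.\d+)(?:-dev\.(\d{14}))?$/.exec(version);
  if (!match) return null;

  const base = match[1];
  const stamp = match[2];
  if (base === undefined) return null;
  if (stamp === undefined) return { base, channel: 'latest' };

  // Assumption: the build stamp is expressed in UTC.
  const part = (from: number, to: number) => Number(stamp.slice(from, to));
  const builtAt = new Date(
    Date.UTC(part(0, 4), part(4, 6) - 1, part(6, 8), part(8, 10), part(10, 12), part(12, 14)),
  );
  return { base, channel: 'dev', builtAt };
}
```

Anything that matches neither format returns `null`, so callers can decide how to treat unexpected tags.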
## Environment Setup

Store your credentials in environment variables; never hardcode them.

```bash
# .env (not committed; add it to .gitignore)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=msk_your_api_key_here
TELEMETRY_INSTANCE_ID=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d
```

```bash
# .env.example (committed; documents the required variables)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=your-api-key
TELEMETRY_INSTANCE_ID=your-instance-uuid
```

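The examples in this guide read these variables with non-null assertions (`!`), which hides a missing value until the first request fails. One way to fail fast at startup is sketched below. `loadTelemetryEnv` and `TelemetryEnv` are hypothetical names, not SDK exports; the function takes the environment as a parameter (for example `process.env`) so it stays easy to test.

```typescript
// Hypothetical helper (not part of the SDK): check the three variables
// documented above before constructing a client.
interface TelemetryEnv {
  serverUrl: string;
  apiKey: string;
  instanceId: string;
}

const REQUIRED_VARS = [
  'TELEMETRY_API_URL',
  'TELEMETRY_API_KEY',
  'TELEMETRY_INSTANCE_ID',
] as const;

// Call as loadTelemetryEnv(process.env) in a Node.js application.
function loadTelemetryEnv(env: Record<string, string | undefined>): TelemetryEnv {
  // Treat unset, empty, and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing telemetry environment variables: ${missing.join(', ')}`);
  }
  return {
    serverUrl: env['TELEMETRY_API_URL'] as string,
    apiKey: env['TELEMETRY_API_KEY'] as string,
    instanceId: env['TELEMETRY_INSTANCE_ID'] as string,
  };
}
```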
---

## Instrumenting a Next.js App

Next.js server actions and API routes run on Node.js, so the SDK works directly. Create a shared singleton and track events from your server-side code.

### 1. Create a telemetry singleton

```typescript
// lib/telemetry.ts
import {
  TelemetryClient,
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
} from '@mosaicstack/telemetry-client';

let client: TelemetryClient | null = null;

export function getTelemetryClient(): TelemetryClient {
  if (!client) {
    client = new TelemetryClient({
      serverUrl: process.env.TELEMETRY_API_URL!,
      apiKey: process.env.TELEMETRY_API_KEY!,
      instanceId: process.env.TELEMETRY_INSTANCE_ID!,
      enabled: process.env.NODE_ENV === 'production',
      onError: (err) => console.error('[telemetry]', err.message),
    });
    client.start();
  }
  return client;
}

// Re-export enums for convenience
export { TaskType, Complexity, Harness, Provider, Outcome, QualityGate };
```

### 2. Track events from an API route

```typescript
// app/api/task-complete/route.ts
import { NextResponse } from 'next/server';
import { getTelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@/lib/telemetry';

export async function POST(request: Request) {
  const body = await request.json();

  const client = getTelemetryClient();
  const event = client.eventBuilder.build({
    task_duration_ms: body.durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.MEDIUM,
    harness: Harness.CLAUDE_CODE,
    model: body.model,
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: body.estimatedInputTokens,
    estimated_output_tokens: body.estimatedOutputTokens,
    actual_input_tokens: body.actualInputTokens,
    actual_output_tokens: body.actualOutputTokens,
    estimated_cost_usd_micros: body.estimatedCostMicros,
    actual_cost_usd_micros: body.actualCostMicros,
    quality_gate_passed: body.qualityGatePassed,
    quality_gates_run: body.qualityGatesRun,
    quality_gates_failed: body.qualityGatesFailed,
    context_compactions: body.contextCompactions,
    context_rotations: body.contextRotations,
    context_utilization_final: body.contextUtilization,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
  });

  client.track(event);

  return NextResponse.json({ status: 'queued' });
}
```

### 3. Graceful shutdown

Next.js doesn't provide a built-in shutdown hook, but you can handle `SIGTERM` and `SIGINT` yourself:

```typescript
// instrumentation.ts (Next.js instrumentation file)
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { getTelemetryClient } = await import('./lib/telemetry');

    // Ensure the client starts on server boot
    const client = getTelemetryClient();

    // Flush remaining events on shutdown
    const shutdown = async () => {
      await client.stop();
      process.exit(0);
    };

    process.on('SIGTERM', shutdown);
    process.on('SIGINT', shutdown);
  }
}
```

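Because the same handler is registered for two signals, a second signal arriving during the flush would start a second shutdown. A small guard can make the handler idempotent. This is a sketch only: `runOnce` is a hypothetical helper, and the guide does not say whether `stop()` is already safe to call twice.

```typescript
// Hypothetical helper: wrap an async routine so that repeated calls share
// the first in-flight promise instead of starting the routine again.
function runOnce<T>(fn: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = fn();
    }
    return pending;
  };
}

// Usage inside register(), assuming `client` is the telemetry singleton:
//   const shutdown = runOnce(async () => {
//     await client.stop();
//     process.exit(0);
//   });
//   process.on('SIGTERM', shutdown);
//   process.on('SIGINT', shutdown);
```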
---

## Instrumenting a Node.js Service

This section applies to a standalone Node.js service (Express, Fastify, a plain script, and so on).

### 1. Initialize and start

```typescript
// src/telemetry.ts
import { TelemetryClient } from '@mosaicstack/telemetry-client';

export const telemetry = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL ?? 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  onError: (err) => console.error('[telemetry]', err.message),
});

telemetry.start();
```

### 2. Track events after task completion

```typescript
// src/task-runner.ts
import {
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
} from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

async function runTask() {
  const startTime = Date.now();

  // ... run your AI coding task ...

  const durationMs = Date.now() - startTime;

  const event = telemetry.eventBuilder.build({
    task_duration_ms: durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.HIGH,
    harness: Harness.CLAUDE_CODE,
    model: 'claude-sonnet-4-5-20250929',
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: 200000,
    estimated_output_tokens: 80000,
    actual_input_tokens: 215000,
    actual_output_tokens: 72000,
    estimated_cost_usd_micros: 1200000,
    actual_cost_usd_micros: 1150000,
    quality_gate_passed: true,
    quality_gates_run: [
      QualityGate.BUILD,
      QualityGate.LINT,
      QualityGate.TEST,
      QualityGate.TYPECHECK,
    ],
    quality_gates_failed: [],
    context_compactions: 3,
    context_rotations: 1,
    context_utilization_final: 0.85,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
    repo_size_category: 'medium',
  });

  telemetry.track(event);
}
```

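The `*_usd_micros` fields appear to hold millionths of a US dollar (the predictions example in this guide divides by `1_000_000` to print dollars), so `1200000` above would be $1.20. If your own cost figures are in dollars, a conversion along these lines may help. `usdToMicros` and `microsToUsd` are hypothetical helpers, not SDK exports.

```typescript
// Assumption: one "micro" is one millionth of a US dollar.
const MICROS_PER_USD = 1_000_000;

// Convert a dollar amount to whole micros, rounding away float noise.
function usdToMicros(usd: number): number {
  if (!Number.isFinite(usd) || usd < 0) {
    throw new RangeError(`Invalid USD amount: ${usd}`);
  }
  return Math.round(usd * MICROS_PER_USD);
}

// Format micros as a dollar string with two decimals, for logs.
function microsToUsd(micros: number): string {
  return (micros / MICROS_PER_USD).toFixed(2);
}
```

Rounding to an integer keeps floating-point artifacts out of the submitted values.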
### 3. Graceful shutdown

```typescript
// src/main.ts
import { telemetry } from './telemetry.js';

async function main() {
  // ... your application logic ...

  // On shutdown, flush remaining events
  process.on('SIGTERM', async () => {
    await telemetry.stop();
    process.exit(0);
  });
}

main();
```

---

## Using Predictions

The telemetry API provides crowd-sourced predictions for token usage, cost, and duration based on historical data. The SDK caches these predictions locally.

### Pre-populate the cache

Call `refreshPredictions()` at startup with the dimension combinations your application uses:

```typescript
import { TaskType, Provider, Complexity } from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

// Fetch predictions for all combinations you'll need
await telemetry.refreshPredictions([
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.HIGH },
  { task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
]);
```

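When the list of combinations grows, writing each entry by hand gets error-prone. The sketch below builds the full cross product instead. It is illustrative only: `buildPredictionQueries` is a hypothetical helper, and plain strings stand in for the SDK's `TaskType`, `Provider`, and `Complexity` enums, whose underlying values this guide does not show.

```typescript
// Shape mirrors the query objects passed to refreshPredictions() above.
// Assumption: string fields stand in for the SDK enum types.
interface PredictionQuery {
  task_type: string;
  model: string;
  provider: string;
  complexity: string;
}

// Hypothetical helper: every task type x model x complexity combination.
function buildPredictionQueries(
  taskTypes: readonly string[],
  models: ReadonlyArray<{ model: string; provider: string }>,
  complexities: readonly string[],
): PredictionQuery[] {
  const queries: PredictionQuery[] = [];
  for (const task_type of taskTypes) {
    for (const { model, provider } of models) {
      for (const complexity of complexities) {
        queries.push({ task_type, model, provider, complexity });
      }
    }
  }
  return queries;
}
```

A full cross product can request combinations you never use, so trim the inputs to what your application actually runs.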
### Read cached predictions

```typescript
const prediction = telemetry.getPrediction({
  task_type: TaskType.IMPLEMENTATION,
  model: 'claude-sonnet-4-5-20250929',
  provider: Provider.ANTHROPIC,
  complexity: Complexity.MEDIUM,
});

if (prediction?.prediction) {
  const p = prediction.prediction;
  console.log('Token predictions (median):', {
    inputTokens: p.input_tokens.median,
    outputTokens: p.output_tokens.median,
  });
  console.log('Cost prediction:', `$${(p.cost_usd_micros.median / 1_000_000).toFixed(2)}`);
  console.log('Duration prediction:', `${(p.duration_ms.median / 1000).toFixed(0)}s`);
  console.log('Correction factors:', {
    input: p.correction_factors.input, // >1.0 means estimates tend to be too low
    output: p.correction_factors.output,
  });
  console.log('Quality:', {
    gatePassRate: `${(p.quality.gate_pass_rate * 100).toFixed(0)}%`,
    successRate: `${(p.quality.success_rate * 100).toFixed(0)}%`,
  });

  // Check confidence level
  if (prediction.metadata.confidence === 'low') {
    console.warn('Low confidence: small sample size or fallback was applied');
  }
}
```

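The comment above says a correction factor above 1.0 means estimates tend to be too low. One plausible reading is that the factor is a multiplicative ratio of actual to estimated tokens, in which case an estimate can be adjusted as sketched here. That interpretation is an assumption, and `applyCorrection` is a hypothetical helper rather than an SDK function.

```typescript
// Assumption: the factor is a multiplicative actual/estimated ratio.
// Falls back to the raw estimate when no usable factor is available.
function applyCorrection(estimate: number, factor: number | null | undefined): number {
  if (factor == null || !Number.isFinite(factor) || factor <= 0) {
    return estimate;
  }
  return Math.round(estimate * factor);
}
```

For example, with an estimate of 200000 input tokens and a factor of 1.075, the adjusted figure is 215000.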
### Understand fallback behavior

When the server doesn't have enough data for an exact match, it broadens the query by dropping dimensions (e.g., ignoring complexity). The `metadata` fields tell you what happened:

| `fallback_level` | Meaning |
|-------------------|---------|
| `0` | Exact match on all dimensions |
| `1+` | Some dimensions were dropped to find data |
| `-1` | No prediction data available at any level |

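
Application code often wants to branch on these three cases rather than on the raw number. A minimal sketch follows: `classifyFallback` and `FallbackStatus` are hypothetical names, and treating any other negative value like `-1` is an assumption, since the table only documents `-1`.

```typescript
type FallbackStatus = 'exact' | 'broadened' | 'none';

// Hypothetical helper: map a fallback_level value onto the table's cases.
function classifyFallback(fallbackLevel: number): FallbackStatus {
  if (fallbackLevel === 0) return 'exact'; // matched on all dimensions
  if (fallbackLevel >= 1) return 'broadened'; // some dimensions were dropped
  return 'none'; // -1 (and, by assumption, any other negative value)
}
```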
---

## Environment-Specific Configuration

### Development

```typescript
const client = new TelemetryClient({
  serverUrl: 'http://localhost:8000', // Local dev server
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  dryRun: true, // Don't send real data
  submitIntervalMs: 10_000, // Flush more frequently for debugging
  onError: (err) => console.error('[telemetry]', err),
});
```

### Production

```typescript
const client = new TelemetryClient({
  serverUrl: 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  submitIntervalMs: 300_000, // 5 min (default)
  maxRetries: 3, // Retry on transient failures
  onError: (err) => {
    // Route to your observability stack
    logger.error('Telemetry submission failed', { error: err.message });
  },
});
```

### Conditional enable/disable

```typescript
const client = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL!,
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  enabled: process.env.TELEMETRY_ENABLED !== 'false', // Opt-out via env var
});
```

When `enabled` is `false`, `track()` returns immediately without queuing.

---

## Error Handling

The SDK is designed to never disrupt your application:

- **`track()` never throws.** All errors are caught and routed to the `onError` callback.
- **Failed batches are re-queued.** If a submission fails, events are prepended back to the queue for the next flush cycle.
- **Exponential backoff with jitter.** Retries use a 1s base delay, doubling up to 60s, with random jitter to avoid a thundering herd.
- **`Retry-After` header support.** On HTTP 429 (rate limited), the SDK respects the server's `Retry-After` header.
- **HTTP 403 is not retried.** An API key / instance ID mismatch is a permanent error.

### Custom error handling

```typescript
const client = new TelemetryClient({
  // ...
  onError: (error) => {
    if (error.message.includes('HTTP 403')) {
      console.error('Telemetry auth failed: check API key and instance ID');
    } else if (error.message.includes('HTTP 429')) {
      console.warn('Telemetry rate limited: events will be retried');
    } else {
      console.error('Telemetry error:', error.message);
    }
  },
});
```

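To make the retry policy described in this section concrete, the sketch below computes a delay from the stated parameters: 1s base, doubling, capped at 60s, with `Retry-After` taking precedence. It illustrates the general technique and is not the SDK's actual code. The "equal jitter" formula (half fixed, half random) and the name `retryDelayMs` are assumptions, since the guide does not say how jitter is applied.

```typescript
const BASE_DELAY_MS = 1_000; // 1s base delay
const MAX_DELAY_MS = 60_000; // cap at 60s

// Hypothetical helper. `attempt` is 0-based: 0 is the first retry.
// `random` is injectable so the function can be tested deterministically.
function retryDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  random: () => number = Math.random,
): number {
  // A server-provided Retry-After value wins over the computed backoff.
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  // Double the base delay per attempt, up to the cap.
  const capped = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  // Assumption: "equal jitter", i.e. half the delay fixed and half random.
  const half = capped / 2;
  return Math.round(half + random() * half);
}
```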
---

## Batch Submission Behavior

The SDK batches events for efficiency:

1. `track(event)` adds the event to a bounded in-memory queue (FIFO eviction at capacity).
2. Every `submitIntervalMs` (default: 5 minutes), the background timer drains the queue in batches of up to `batchSize` (default/max: 100).
3. Each batch is sent to `POST /v1/events/batch`, with exponential backoff on failure.
4. Calling `stop()` flushes all remaining events before resolving.

The server accepts up to **100 events per batch** and supports **partial success**: some events may be accepted while others (e.g., duplicates) are rejected.

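
The queueing steps above can be pictured with a small data structure. This is a sketch of the general technique, not the SDK's internals: the class name `BoundedQueue` is hypothetical, and reading "FIFO eviction" as "drop the oldest event when full" is an assumption.

```typescript
// Hypothetical illustration of a bounded queue with batch draining.
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get size(): number {
    return this.items.length;
  }

  // Add an event; when full, evict the oldest one first (assumption).
  enqueue(item: T): void {
    if (this.items.length >= this.capacity) {
      this.items.shift();
    }
    this.items.push(item);
  }

  // Remove and return up to batchSize events from the front.
  drain(batchSize: number): T[] {
    return this.items.splice(0, batchSize);
  }

  // Put a failed batch back at the front for the next flush cycle,
  // dropping the oldest entries if that overflows the capacity.
  prepend(batch: T[]): void {
    this.items.unshift(...batch);
    const overflow = this.items.length - this.capacity;
    if (overflow > 0) {
      this.items.splice(0, overflow);
    }
  }
}
```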
---

## API Version Compatibility

| SDK Version | API Version | Schema Version |
|-------------|-------------|----------------|
| 0.1.x | v1 (`/v1/` endpoints) | `1.0` |

The `EventBuilder` automatically sets `schema_version: "1.0"` on every event. The SDK submits to `/v1/events/batch` and queries `/v1/predictions/batch`.

When the telemetry API introduces a v2, this SDK will add support in a new major release. The server supports two API versions simultaneously during a 6-month deprecation window.