docs(#1): SDK integration guide, API reference, and CI pipeline
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed

- Rewrite README with quick start, config table, prediction usage, API version note
- Add docs/integration-guide.md with Next.js and Node.js examples, env-specific
  config, error handling patterns, batch behavior, and API version compatibility
- Add docs/api-reference.md with full reference for all exported classes, methods,
  types, and enums
- Add .woodpecker.yml with quality gates (lint, typecheck, format, security audit,
  test with coverage) and npm publish to Gitea registry
- Add AGENTS.md and update CLAUDE.md with project conventions

Fixes #1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:38:19 -06:00
parent 177720e523
commit 231a799a46
6 changed files with 1303 additions and 52 deletions

91
.woodpecker.yml Normal file

@@ -0,0 +1,91 @@
when:
  - event: [push, pull_request, manual]
variables:
  - &node_image "node:22-alpine"
  - &install_deps |
    corepack enable
    npm ci
steps:
  install:
    image: *node_image
    commands:
      - *install_deps
  lint:
    image: *node_image
    commands:
      - *install_deps
      - npm run lint
    depends_on:
      - install
  typecheck:
    image: *node_image
    commands:
      - *install_deps
      - npm run typecheck
    depends_on:
      - install
  format-check:
    image: *node_image
    commands:
      - *install_deps
      - npm run format:check
    depends_on:
      - install
  security-audit:
    image: *node_image
    commands:
      - npm audit --audit-level=high
    depends_on:
      - install
  test:
    image: *node_image
    commands:
      - *install_deps
      - npm run test:coverage
    depends_on:
      - install
  build:
    image: *node_image
    commands:
      - *install_deps
      - npm run build
    depends_on:
      - lint
      - typecheck
      - format-check
      - security-audit
      - test
  publish:
    image: *node_image
    environment:
      GITEA_TOKEN:
        from_secret: gitea_token
    commands:
      - *install_deps
      - npm run build
      - |
        echo "//git.mosaicstack.dev/api/packages/mosaic/npm/:_authToken=$$GITEA_TOKEN" > .npmrc
        echo "@mosaicstack:registry=https://git.mosaicstack.dev/api/packages/mosaic/npm/" >> .npmrc
      - |
        CURRENT=$(node -p "require('./package.json').version")
        PUBLISHED=$(npm view @mosaicstack/telemetry-client version 2>/dev/null || echo "0.0.0")
        if [ "$$CURRENT" = "$$PUBLISHED" ]; then
          echo "Version $$CURRENT already published, skipping"
          exit 0
        fi
        echo "Publishing $$CURRENT (was $$PUBLISHED)"
        npm publish --access public
    when:
      - branch: [main, develop]
        event: [push, manual, tag]
    depends_on:
      - build

72
AGENTS.md Normal file

@@ -0,0 +1,72 @@
# mosaic-telemetry-client-js — Agent Context
> Patterns, gotchas, and orchestrator integration for AI agents working on this project.
> **Update this file** when you discover reusable patterns or non-obvious requirements.
## Codebase Patterns
<!-- Add project-specific patterns as you discover them -->
<!-- Examples: -->
<!-- - Use `httpx.AsyncClient` for external HTTP calls -->
<!-- - All routes require authentication via `Depends(get_current_user)` -->
<!-- - Config is loaded from environment variables via `settings.py` -->
## Common Gotchas
<!-- Add things that trip up agents -->
<!-- Examples: -->
<!-- - Remember to run migrations after schema changes -->
<!-- - Frontend env vars need NEXT_PUBLIC_ prefix -->
<!-- - Tests require a running PostgreSQL instance -->
## Quality Gates
**All must pass before any commit:**
```bash
npm run lint && npm run typecheck && npm test
```
## Orchestrator Integration
### Task Prefix
Use `MOSAIC-TELEMETRY-CLIENT-JS` as the prefix for orchestrated tasks (e.g., `MOSAIC-TELEMETRY-CLIENT-JS-SEC-001`).
### Package/Directory Names
<!-- List key directories the orchestrator needs to know about -->
| Directory | Purpose |
|-----------|---------|
| `src/` | Main source code |
| `tests/` | Test files |
| `docs/scratchpads/` | Working documents |
### Worker Checklist
When completing an orchestrated task:
1. Read the finding details from the report
2. Implement the fix following existing code patterns
3. Run quality gates (ALL must pass)
4. Commit with: `git commit -m "fix({finding_id}): brief description"`
5. Report result as JSON to orchestrator
### Post-Coding Review
After implementing changes, the orchestrator will run:
1. **Codex code review**: `~/.claude/scripts/codex/codex-code-review.sh --uncommitted`
2. **Codex security review**: `~/.claude/scripts/codex/codex-security-review.sh --uncommitted`
3. If blockers/critical findings: remediation task created
4. If clean: task marked done
## Directory-Specific Context
<!-- Add sub-AGENTS.md files in subdirectories if needed -->
<!-- Example: -->
<!-- - `src/api/AGENTS.md` — API-specific patterns -->
<!-- - `src/components/AGENTS.md` — Component conventions -->
## Testing Approaches
<!-- Document how tests should be written for this project -->
<!-- Examples: -->
<!-- - Unit tests use pytest with fixtures in conftest.py -->
<!-- - Integration tests require DATABASE_URL env var -->
<!-- - E2E tests use Playwright -->

CLAUDE.md

@@ -28,3 +28,55 @@ npm run build # Build to dist/
- `track()` never throws — catches everything, routes to `onError` callback
- Zero runtime deps: uses native `fetch` (Node 18+), `crypto.randomUUID()`, `setInterval`
- All types are standalone — no dependency on the telemetry server package
## Conditional Documentation Loading
**Read the relevant guide before starting work:**
| Task Type | Guide |
|-----------|-------|
| Bootstrapping a new project | `~/.claude/agent-guides/bootstrap.md` |
| Orchestrating autonomous tasks | `~/.claude/agent-guides/orchestrator.md` |
| Ralph autonomous development | `~/.claude/agent-guides/ralph-autonomous.md` |
| Frontend development | `~/.claude/agent-guides/frontend.md` |
| Backend/API development | `~/.claude/agent-guides/backend.md` |
| TypeScript strict typing | `~/.claude/agent-guides/typescript.md` |
| Code review | `~/.claude/agent-guides/code-review.md` |
| Authentication/Authorization | `~/.claude/agent-guides/authentication.md` |
| Infrastructure/DevOps | `~/.claude/agent-guides/infrastructure.md` |
| QA/Testing | `~/.claude/agent-guides/qa-testing.md` |
| Secrets management (Vault) | `~/.claude/agent-guides/vault-secrets.md` |
## Commits
```
<type>(#issue): Brief description
Detailed explanation if needed.
Fixes #123
```
Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
## Secrets Management
**NEVER hardcode secrets.** Use `.env` files (gitignored) or a secrets manager.
```bash
# .env.example is committed (with placeholders)
# .env is NOT committed (contains real values)
```
Ensure `.gitignore` includes `.env*` (except `.env.example`).
## Multi-Agent Coordination
When multiple agents work on this project:
1. `git pull --rebase` before editing
2. `git pull --rebase` before pushing
3. If conflicts, **alert the user** — don't auto-resolve data conflicts

133
README.md

@@ -2,7 +2,9 @@
TypeScript client SDK for [Mosaic Stack Telemetry](https://tel.mosaicstack.dev). Reports task-completion metrics from AI coding harnesses and queries crowd-sourced predictions.
**Zero runtime dependencies** — uses native `fetch`, `crypto.randomUUID()`, and `setInterval`.
**Zero runtime dependencies** — uses native `fetch`, `crypto.randomUUID()`, and `setInterval`. Requires Node.js 18+.
**Targets Mosaic Telemetry API v1** (`/v1/` endpoints, event schema version `1.0`).
## Installation
@@ -13,17 +15,26 @@ npm install @mosaicstack/telemetry-client
## Quick Start
```typescript
import { TelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@mosaicstack/telemetry-client';
import {
TelemetryClient,
TaskType,
Complexity,
Harness,
Provider,
Outcome,
QualityGate,
} from '@mosaicstack/telemetry-client';
// 1. Create and start the client
const client = new TelemetryClient({
serverUrl: 'https://tel.mosaicstack.dev',
apiKey: 'your-64-char-hex-api-key',
instanceId: 'your-instance-uuid',
serverUrl: 'https://tel-api.mosaicstack.dev',
apiKey: process.env.TELEMETRY_API_KEY!,
instanceId: process.env.TELEMETRY_INSTANCE_ID!,
});
client.start();
client.start(); // begins background batch submission every 5 minutes
// Build and track an event
// 2. Build and track an event
const event = client.eventBuilder.build({
task_duration_ms: 45000,
task_type: TaskType.IMPLEMENTATION,
@@ -31,83 +42,101 @@ const event = client.eventBuilder.build({
harness: Harness.CLAUDE_CODE,
model: 'claude-sonnet-4-5-20250929',
provider: Provider.ANTHROPIC,
estimated_input_tokens: 5000,
estimated_output_tokens: 2000,
actual_input_tokens: 5500,
actual_output_tokens: 2200,
estimated_cost_usd_micros: 30000,
actual_cost_usd_micros: 33000,
estimated_input_tokens: 105000,
estimated_output_tokens: 45000,
actual_input_tokens: 112340,
actual_output_tokens: 38760,
estimated_cost_usd_micros: 630000,
actual_cost_usd_micros: 919200,
quality_gate_passed: true,
quality_gates_run: [],
quality_gates_run: [QualityGate.BUILD, QualityGate.LINT, QualityGate.TEST],
quality_gates_failed: [],
context_compactions: 0,
context_compactions: 2,
context_rotations: 0,
context_utilization_final: 0.4,
context_utilization_final: 0.72,
outcome: Outcome.SUCCESS,
retry_count: 0,
language: 'typescript',
repo_size_category: 'medium',
});
client.track(event);
client.track(event); // queues the event (never throws)
// When shutting down
await client.stop();
```
## Querying Predictions
```typescript
const query = {
// 3. Query predictions
const prediction = client.getPrediction({
task_type: TaskType.IMPLEMENTATION,
model: 'claude-sonnet-4-5-20250929',
provider: Provider.ANTHROPIC,
complexity: Complexity.MEDIUM,
};
});
// Fetch from server and cache locally
await client.refreshPredictions([query]);
// Get cached prediction (returns null if not cached)
const prediction = client.getPrediction(query);
if (prediction?.prediction) {
console.log('Median input tokens:', prediction.prediction.input_tokens.median);
console.log('Median cost (microdollars):', prediction.prediction.cost_usd_micros.median);
}
// 4. Shut down gracefully (flushes remaining events)
await client.stop();
```
## Configuration
```typescript
const client = new TelemetryClient({
serverUrl: 'https://tel.mosaicstack.dev', // Required
apiKey: 'your-api-key', // Required (64-char hex)
instanceId: 'your-uuid', // Required
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `serverUrl` | `string` | **required** | Telemetry API base URL |
| `apiKey` | `string` | **required** | Bearer token for authentication |
| `instanceId` | `string` | **required** | UUID identifying this instance |
| `enabled` | `boolean` | `true` | Set `false` to disable — `track()` becomes a no-op |
| `submitIntervalMs` | `number` | `300_000` | Background flush interval (5 min) |
| `maxQueueSize` | `number` | `1000` | Max queued events before FIFO eviction |
| `batchSize` | `number` | `100` | Events per batch submission (server max: 100) |
| `requestTimeoutMs` | `number` | `10_000` | HTTP request timeout |
| `predictionCacheTtlMs` | `number` | `21_600_000` | Prediction cache TTL (6 hours) |
| `dryRun` | `boolean` | `false` | Log events instead of sending them |
| `maxRetries` | `number` | `3` | Retry attempts with exponential backoff |
| `onError` | `(error: Error) => void` | silent | Error callback |
// Optional
enabled: true, // Set false to disable (track() becomes no-op)
submitIntervalMs: 300_000, // Background flush interval (default: 5 min)
maxQueueSize: 1000, // Max queued events (default: 1000, FIFO eviction)
batchSize: 100, // Events per batch (default/max: 100)
requestTimeoutMs: 10_000, // HTTP timeout (default: 10s)
predictionCacheTtlMs: 21_600_000, // Prediction cache TTL (default: 6 hours)
dryRun: false, // Log events instead of sending
maxRetries: 3, // Retry attempts on failure
onError: (err) => console.error(err), // Error callback
## Querying Predictions
Predictions are crowd-sourced token/cost/duration estimates from the telemetry API. The SDK caches them locally with a configurable TTL.
```typescript
// Fetch predictions from the server and cache locally
await client.refreshPredictions([
{ task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
{ task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
]);
// Read from cache (returns null if not cached or expired)
const prediction = client.getPrediction({
task_type: TaskType.IMPLEMENTATION,
model: 'claude-sonnet-4-5-20250929',
provider: Provider.ANTHROPIC,
complexity: Complexity.MEDIUM,
});
if (prediction?.prediction) {
console.log('Median input tokens:', prediction.prediction.input_tokens.median);
console.log('Median cost ($):', prediction.prediction.cost_usd_micros.median / 1_000_000);
console.log('Confidence:', prediction.metadata.confidence);
}
```
## Dry-Run Mode
For testing without sending data:
For development and testing without sending data to the server:
```typescript
const client = new TelemetryClient({
serverUrl: 'https://tel.mosaicstack.dev',
serverUrl: 'https://tel-api.mosaicstack.dev',
apiKey: 'test-key',
instanceId: 'test-uuid',
dryRun: true,
});
```
In dry-run mode, `track()` still queues events and `flush()` still runs, but the `BatchSubmitter` returns synthetic `accepted` responses without making HTTP calls.
## Documentation
- **[Integration Guide](docs/integration-guide.md)** — Next.js and Node.js examples, environment-specific configuration, error handling patterns
- **[API Reference](docs/api-reference.md)** — Full reference for all exported classes, methods, types, and enums
## License
MPL-2.0

602
docs/api-reference.md Normal file

@@ -0,0 +1,602 @@
# API Reference
Complete reference for all classes, methods, types, and enums exported by `@mosaicstack/telemetry-client`.
**SDK version:** 0.1.0
**Targets:** Mosaic Telemetry API v1, event schema version `1.0`
---
## TelemetryClient
Main entry point. Queues task-completion events for background batch submission and provides access to cached predictions.
```typescript
import { TelemetryClient } from '@mosaicstack/telemetry-client';
```
### Constructor
```typescript
new TelemetryClient(config: TelemetryConfig)
```
Creates a new client instance. Does **not** start background submission — call `start()` to begin.
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `eventBuilder` | `EventBuilder` | Builder for constructing `TaskCompletionEvent` objects |
| `queueSize` | `number` | Number of events currently in the queue |
| `isRunning` | `boolean` | Whether background submission is active |
### Methods
#### `start(): void`
Start background batch submission via `setInterval`. Idempotent — calling `start()` multiple times has no effect.
#### `stop(): Promise<void>`
Stop background submission and flush all remaining events. Idempotent. Returns a promise that resolves when the final flush completes.
#### `track(event: TaskCompletionEvent): void`
Queue an event for batch submission. **Never throws** — all errors are caught and routed to the `onError` callback.
When `enabled` is `false`, this method returns immediately without queuing.
When the queue is at capacity (`maxQueueSize`), the oldest event is evicted to make room.
#### `getPrediction(query: PredictionQuery): PredictionResponse | null`
Get a cached prediction for the given query dimensions. Returns `null` if no prediction is cached or the cache entry has expired.
#### `refreshPredictions(queries: PredictionQuery[]): Promise<void>`
Fetch predictions from the server via `POST /v1/predictions/batch` and store them in the local cache. The predictions endpoint is public — no authentication required.
Accepts up to 50 queries per call (server limit).
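Because the server accepts at most 50 queries per call, a caller holding a longer list may need to split it first. The following is a minimal sketch of that splitting step. The `chunk` helper is hypothetical and not part of the SDK, and whether `refreshPredictions()` splits oversized lists on its own is not stated in this reference.

```typescript
// Hypothetical helper (not exported by the SDK): split a list into
// consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// 120 placeholder queries become batches of 50, 50, and 20.
const placeholderQueries = Array.from({ length: 120 }, (_, i) => i);
const batchSizes = chunk(placeholderQueries, 50).map((b) => b.length);
```

Each batch could then be passed to `await client.refreshPredictions(batch)` in turn.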
---
## EventBuilder
Convenience builder that auto-fills `event_id`, `timestamp`, `instance_id`, and `schema_version`.
```typescript
import { EventBuilder } from '@mosaicstack/telemetry-client';
```
Access via `client.eventBuilder` — you don't normally construct this directly.
### Methods
#### `build(params: EventBuilderParams): TaskCompletionEvent`
Build a complete `TaskCompletionEvent` from the given parameters.
Auto-generated fields:
- `event_id`: `crypto.randomUUID()`
- `timestamp`: `new Date().toISOString()`
- `instance_id`: from client config
- `schema_version`: `"1.0"`
---
## EventQueue
Bounded FIFO queue for telemetry events. Used internally by `TelemetryClient`.
```typescript
import { EventQueue } from '@mosaicstack/telemetry-client';
```
### Constructor
```typescript
new EventQueue(maxSize: number)
```
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `size` | `number` | Current number of events in the queue |
| `isEmpty` | `boolean` | Whether the queue is empty |
### Methods
#### `enqueue(event: TaskCompletionEvent): void`
Add an event. Evicts the oldest event if at capacity.
#### `drain(maxItems: number): TaskCompletionEvent[]`
Remove and return up to `maxItems` events from the front.
#### `prepend(events: TaskCompletionEvent[]): void`
Prepend events back to the front (used for re-enqueue on submission failure). Respects `maxSize` — excess events are dropped.
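The queue semantics described in this section (evict the oldest on overflow, drain from the front, prepend on failure) can be sketched as follows. This is an illustrative stand-in, not the SDK's `EventQueue` source. The class name `BoundedQueue` is hypothetical, and the choice of which events `prepend()` drops when there is not enough room is an assumption.

```typescript
// Illustrative stand-in for EventQueue; the generic parameter replaces
// TaskCompletionEvent so the sketch is self-contained.
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly maxSize: number;

  constructor(maxSize: number) {
    if (maxSize < 1) throw new RangeError('maxSize must be at least 1');
    this.maxSize = maxSize;
  }

  get size(): number {
    return this.items.length;
  }

  get isEmpty(): boolean {
    return this.items.length === 0;
  }

  // Add to the back; evict from the front (oldest) when full.
  enqueue(item: T): void {
    if (this.items.length >= this.maxSize) {
      this.items.shift();
    }
    this.items.push(item);
  }

  // Remove and return up to maxItems from the front.
  drain(maxItems: number): T[] {
    return this.items.splice(0, maxItems);
  }

  // Put items back at the front. ASSUMPTION: when the result would exceed
  // maxSize, the tail of the prepended batch is dropped and events that
  // are already queued are kept.
  prepend(items: T[]): void {
    const room = Math.max(0, this.maxSize - this.items.length);
    this.items = [...items.slice(0, room), ...this.items];
  }
}

const queue = new BoundedQueue<string>(3);
['a', 'b', 'c', 'd'].forEach((e) => queue.enqueue(e)); // 'a' is evicted
const firstBatch = queue.drain(2); // takes 'b' and 'c'
queue.prepend(firstBatch); // simulate a failed submission
const afterRetry = queue.drain(10); // 'b', 'c', 'd' in original order
```

Prepending rather than re-enqueueing keeps failed events ahead of newer ones, so submission order is preserved across retries.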
---
## BatchSubmitter
Handles HTTP submission of event batches with retry logic.
```typescript
import { BatchSubmitter } from '@mosaicstack/telemetry-client';
```
### Methods
#### `submit(events: TaskCompletionEvent[]): Promise<SubmitResult>`
Submit a batch to `POST /v1/events/batch`. Retries with exponential backoff (1s base, 60s max, with jitter) on transient failures. Respects the server's `Retry-After` header on HTTP 429.
In dry-run mode, returns a synthetic success response without making HTTP calls.
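The retry timing described for `submit()` can be sketched as a small pure function. This is not the SDK's implementation. The function name `retryDelayMs` is hypothetical, the "full jitter" formula is an assumption (the reference only says "with jitter"), and treating `Retry-After` as a replacement for the computed delay is also an assumption.

```typescript
const BASE_DELAY_MS = 1_000; // 1s base, per the description above
const MAX_DELAY_MS = 60_000; // 60s cap, per the description above

// attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
// ASSUMPTION: "full jitter", a uniform random value in [0, cappedDelay).
function retryDelayMs(
  attempt: number,
  retryAfterMs?: number,
  random: () => number = Math.random,
): number {
  // ASSUMPTION: a server-provided Retry-After value replaces the backoff.
  if (retryAfterMs !== undefined) {
    return retryAfterMs;
  }
  const exponential = BASE_DELAY_MS * 2 ** attempt;
  const capped = Math.min(exponential, MAX_DELAY_MS);
  return Math.floor(random() * capped);
}

// Pinning the random source to 0.5 makes the delays deterministic.
const half = () => 0.5;
const delays = [0, 1, 2, 10].map((attempt) => retryDelayMs(attempt, undefined, half));
const afterThrottle = retryDelayMs(3, 5_000, half);
```

Passing the random source as a parameter keeps the function testable. In production the default `Math.random` would be used.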
---
## PredictionCache
In-memory TTL cache for prediction responses.
```typescript
import { PredictionCache } from '@mosaicstack/telemetry-client';
```
### Constructor
```typescript
new PredictionCache(ttlMs: number)
```
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `size` | `number` | Number of entries in cache (may include expired entries) |
### Methods
#### `get(query: PredictionQuery): PredictionResponse | null`
Retrieve a cached prediction. Returns `null` if not cached or expired (expired entries are lazily deleted).
#### `set(query: PredictionQuery, response: PredictionResponse): void`
Store a prediction with TTL.
#### `clear(): void`
Clear all cached predictions.
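The lazy-expiry behaviour described above (expired entries stay counted in `size` until a read removes them) can be illustrated with a generic TTL cache. This is a sketch under stated assumptions, not the SDK's `PredictionCache`. The `TtlCache` class, the injectable clock, and the `keyOf` key format are all hypothetical.

```typescript
// Illustrative stand-in for PredictionCache with string keys and an
// injectable clock so expiry can be shown without waiting.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // May count entries that have expired but have not been read since.
  get size(): number {
    return this.entries.size;
  }

  get(key: string): V | null {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazy deletion on read
      return null;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  clear(): void {
    this.entries.clear();
  }
}

// ASSUMPTION: a cache key built by joining the four query dimensions.
const keyOf = (q: { task_type: string; model: string; provider: string; complexity: string }) =>
  [q.task_type, q.model, q.provider, q.complexity].join('|');

let fakeNow = 0;
const cache = new TtlCache<number>(1_000, () => fakeNow);
const key = keyOf({ task_type: 'implementation', model: 'm', provider: 'anthropic', complexity: 'medium' });
cache.set(key, 42);
const fresh = cache.get(key); // value is still within its TTL
fakeNow = 1_000; // TTL has elapsed
const sizeBeforeRead = cache.size; // expired entry is still counted
const stale = cache.get(key); // read detects expiry and deletes the entry
const sizeAfterRead = cache.size;
```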
---
## Configuration Types
### TelemetryConfig
User-facing configuration passed to the `TelemetryClient` constructor.
```typescript
import type { TelemetryConfig } from '@mosaicstack/telemetry-client';
```
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `serverUrl` | `string` | Yes | — | Telemetry API base URL (e.g., `"https://tel-api.mosaicstack.dev"`) |
| `apiKey` | `string` | Yes | — | Bearer token for `POST /v1/events/batch` authentication |
| `instanceId` | `string` | Yes | — | UUID identifying this Mosaic Stack instance |
| `enabled` | `boolean` | No | `true` | When `false`, `track()` is a no-op |
| `submitIntervalMs` | `number` | No | `300_000` | Background flush interval in ms (5 min) |
| `maxQueueSize` | `number` | No | `1000` | Maximum events held in queue before FIFO eviction |
| `batchSize` | `number` | No | `100` | Events per batch (server max: 100) |
| `requestTimeoutMs` | `number` | No | `10_000` | HTTP request timeout in ms |
| `predictionCacheTtlMs` | `number` | No | `21_600_000` | Prediction cache TTL in ms (6 hours) |
| `dryRun` | `boolean` | No | `false` | Simulate submissions without HTTP calls |
| `maxRetries` | `number` | No | `3` | Retry attempts on transient failure |
| `onError` | `(error: Error) => void` | No | silent | Callback invoked on errors |
### ResolvedConfig
Internal configuration with all defaults applied. All fields are required (non-optional).
```typescript
import type { ResolvedConfig } from '@mosaicstack/telemetry-client';
```
### resolveConfig
```typescript
import { resolveConfig } from '@mosaicstack/telemetry-client';
function resolveConfig(config: TelemetryConfig): ResolvedConfig
```
Apply defaults to a `TelemetryConfig`, producing a `ResolvedConfig`. Strips trailing slashes from `serverUrl`.
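A rough sketch of what this resolution step involves, using a trimmed-down config shape. The names `SketchConfig`, `SketchResolved`, and `resolveSketch` are hypothetical, and the real `resolveConfig` handles more fields than shown here. The default values are taken from the `TelemetryConfig` table above.

```typescript
// Trimmed-down config types for illustration only.
interface SketchConfig {
  serverUrl: string;
  enabled?: boolean;
  submitIntervalMs?: number;
  batchSize?: number;
}

type SketchResolved = Required<SketchConfig>;

function resolveSketch(config: SketchConfig): SketchResolved {
  return {
    // Strip one or more trailing slashes so paths such as /v1/events/batch
    // can be appended without producing "//".
    serverUrl: config.serverUrl.replace(/\/+$/, ''),
    // `??` keeps explicit falsy values such as `enabled: false`.
    enabled: config.enabled ?? true,
    submitIntervalMs: config.submitIntervalMs ?? 300_000,
    batchSize: config.batchSize ?? 100,
  };
}

const resolved = resolveSketch({
  serverUrl: 'https://tel-api.mosaicstack.dev/',
  enabled: false,
});
```

Using `??` rather than `||` matters for boolean and numeric options, because `||` would silently replace `false` or `0` with the default.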
---
## Event Types
### EventBuilderParams
Parameters accepted by `EventBuilder.build()`. Excludes auto-generated fields (`event_id`, `timestamp`, `instance_id`, `schema_version`).
```typescript
import type { EventBuilderParams } from '@mosaicstack/telemetry-client';
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `task_duration_ms` | `number` | Yes | Wall-clock time in ms (0–86,400,000) |
| `task_type` | `TaskType` | Yes | Category of work performed |
| `complexity` | `Complexity` | Yes | Task complexity level |
| `harness` | `Harness` | Yes | Coding tool / execution environment |
| `model` | `string` | Yes | Model identifier (1–100 chars) |
| `provider` | `Provider` | Yes | LLM provider |
| `estimated_input_tokens` | `number` | Yes | Pre-task input token estimate (0–10,000,000) |
| `estimated_output_tokens` | `number` | Yes | Pre-task output token estimate (0–10,000,000) |
| `actual_input_tokens` | `number` | Yes | Actual input tokens consumed (0–10,000,000) |
| `actual_output_tokens` | `number` | Yes | Actual output tokens generated (0–10,000,000) |
| `estimated_cost_usd_micros` | `number` | Yes | Estimated cost in microdollars (0–100,000,000) |
| `actual_cost_usd_micros` | `number` | Yes | Actual cost in microdollars (0–100,000,000) |
| `quality_gate_passed` | `boolean` | Yes | Whether all quality gates passed |
| `quality_gates_run` | `QualityGate[]` | Yes | Gates that were executed |
| `quality_gates_failed` | `QualityGate[]` | Yes | Gates that failed |
| `context_compactions` | `number` | Yes | Context compaction count (0–100) |
| `context_rotations` | `number` | Yes | Context rotation count (0–50) |
| `context_utilization_final` | `number` | Yes | Final context utilization ratio (0.0–1.0) |
| `outcome` | `Outcome` | Yes | Task result |
| `retry_count` | `number` | Yes | Number of retries (0–20) |
| `language` | `string \| null` | No | Primary programming language (max 30 chars) |
| `repo_size_category` | `RepoSizeCategory \| null` | No | Repository size bucket |
### TaskCompletionEvent
Full event object submitted to the server. Extends `EventBuilderParams` with auto-generated identity fields.
```typescript
import type { TaskCompletionEvent } from '@mosaicstack/telemetry-client';
```
Additional fields (auto-generated by `EventBuilder`):
| Field | Type | Description |
|-------|------|-------------|
| `instance_id` | `string` | UUID identifying the submitting instance |
| `event_id` | `string` | Unique UUID for deduplication |
| `schema_version` | `string` | Always `"1.0"` |
| `timestamp` | `string` | ISO 8601 datetime |
---
## Prediction Types
### PredictionQuery
Query parameters for fetching a prediction.
```typescript
import type { PredictionQuery } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `task_type` | `TaskType` | Task type to predict for |
| `model` | `string` | Model identifier |
| `provider` | `Provider` | LLM provider |
| `complexity` | `Complexity` | Complexity level |
### PredictionResponse
Response from the predictions endpoint.
```typescript
import type { PredictionResponse } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `prediction` | `PredictionData \| null` | Prediction data, or `null` if no data available |
| `metadata` | `PredictionMetadata` | Sample size, confidence, fallback info |
### PredictionData
Statistical prediction for a dimension combination.
```typescript
import type { PredictionData } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `input_tokens` | `TokenDistribution` | Input token distribution (p10/p25/median/p75/p90) |
| `output_tokens` | `TokenDistribution` | Output token distribution |
| `cost_usd_micros` | `Record<string, number>` | Cost stats — `{ median: number }` |
| `duration_ms` | `Record<string, number>` | Duration stats — `{ median: number }` |
| `correction_factors` | `CorrectionFactors` | Actual-to-estimated token ratios |
| `quality` | `QualityPrediction` | Quality gate pass rate and success rate |
### TokenDistribution
Percentile distribution of token counts.
```typescript
import type { TokenDistribution } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `p10` | `number` | 10th percentile |
| `p25` | `number` | 25th percentile |
| `median` | `number` | 50th percentile (median) |
| `p75` | `number` | 75th percentile |
| `p90` | `number` | 90th percentile |
### CorrectionFactors
Ratio of actual to estimated tokens. Values >1.0 mean estimates tend to be too low.
```typescript
import type { CorrectionFactors } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `input` | `number` | Actual / estimated input tokens |
| `output` | `number` | Actual / estimated output tokens |
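Since each factor is the ratio of actual to estimated tokens, multiplying a fresh estimate by it gives a corrected figure. A minimal sketch follows. The helper name `correctedEstimate` is hypothetical, and the factor values are made up for illustration.

```typescript
// Scale a pre-task estimate by the crowd-sourced actual/estimated ratio.
function correctedEstimate(estimatedTokens: number, factor: number): number {
  return Math.round(estimatedTokens * factor);
}

// Made-up factors for illustration only.
const factors = { input: 1.25, output: 0.8 };
const correctedInput = correctedEstimate(100_000, factors.input); // estimate was too low
const correctedOutput = correctedEstimate(40_000, factors.output); // estimate was too high
```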
### QualityPrediction
Predicted quality gate and success rates.
```typescript
import type { QualityPrediction } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `gate_pass_rate` | `number` | Fraction of events where all quality gates pass (0.0–1.0) |
| `success_rate` | `number` | Fraction of events with `outcome: "success"` (0.0–1.0) |
### PredictionMetadata
Metadata about a prediction response.
```typescript
import type { PredictionMetadata } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `sample_size` | `number` | Number of events used to compute this prediction |
| `fallback_level` | `number` | 0 = exact match, 1+ = dimensions dropped, -1 = no data |
| `confidence` | `'none' \| 'low' \| 'medium' \| 'high'` | Confidence level |
| `last_updated` | `string \| null` | ISO 8601 timestamp of last computation |
| `dimensions_matched` | `Record<string, string \| null> \| null` | Matched dimensions (`null` values indicate fallback) |
| `fallback_note` | `string \| null` | Human-readable fallback explanation |
| `cache_hit` | `boolean` | Whether served from server-side cache |
**Confidence level criteria:**
| Level | Criteria |
|-------|----------|
| `none` | No data available. `prediction` is `null`. |
| `low` | Sample size < 30 or fallback was applied |
| `medium` | Sample size 30–99, exact match |
| `high` | Sample size >= 100, exact match |
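The criteria in this table can be restated as a small function. The confidence value is computed by the server and the SDK only reads it, so this is a restatement of the table and not the server's code. The function name `confidenceFor` is hypothetical, and mapping a sample size of 0 to `none` is an assumption.

```typescript
type Confidence = 'none' | 'low' | 'medium' | 'high';

// fallbackLevel follows the metadata table: 0 = exact match,
// 1+ = dimensions dropped, -1 = no data.
function confidenceFor(sampleSize: number, fallbackLevel: number): Confidence {
  // ASSUMPTION: a sample size of 0 is treated the same as "no data".
  if (fallbackLevel < 0 || sampleSize === 0) return 'none';
  if (fallbackLevel > 0 || sampleSize < 30) return 'low';
  if (sampleSize < 100) return 'medium';
  return 'high';
}

const levels = [
  confidenceFor(0, -1), // no data
  confidenceFor(29, 0), // exact match, small sample
  confidenceFor(30, 0), // exact match, lower bound of medium
  confidenceFor(99, 0), // exact match, upper bound of medium
  confidenceFor(100, 0), // exact match, large sample
  confidenceFor(500, 1), // large sample, but a fallback was applied
];
```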
---
## Batch Types
### BatchEventRequest
Request body for `POST /v1/events/batch`.
```typescript
import type { BatchEventRequest } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `events` | `TaskCompletionEvent[]` | 1–100 events to submit |
### BatchEventResponse
Response from `POST /v1/events/batch`.
```typescript
import type { BatchEventResponse } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `accepted` | `number` | Count of accepted events |
| `rejected` | `number` | Count of rejected events |
| `results` | `BatchEventResult[]` | Per-event result details |
### BatchEventResult
Per-event result within a batch response.
```typescript
import type { BatchEventResult } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `event_id` | `string` | The event's UUID |
| `status` | `'accepted' \| 'rejected'` | Whether the event was accepted |
| `error` | `string \| null` | Error message if rejected |
### SubmitResult
Internal result type from `BatchSubmitter.submit()`.
```typescript
import type { SubmitResult } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | `boolean` | Whether the submission succeeded |
| `response` | `BatchEventResponse \| undefined` | Server response (on success) |
| `retryAfterMs` | `number \| undefined` | Retry delay from 429 response |
| `error` | `Error \| undefined` | Error details (on failure) |
### BatchPredictionRequest
Request body for `POST /v1/predictions/batch`.
```typescript
import type { BatchPredictionRequest } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `queries` | `PredictionQuery[]` | 1–50 prediction queries |
### BatchPredictionResponse
Response from `POST /v1/predictions/batch`.
```typescript
import type { BatchPredictionResponse } from '@mosaicstack/telemetry-client';
```
| Field | Type | Description |
|-------|------|-------------|
| `results` | `PredictionResponse[]` | One response per query, in request order |
---
## Enums
All enums use string values matching the server's API contract.
### TaskType
```typescript
import { TaskType } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description |
|--------|-------|-------------|
| `PLANNING` | `"planning"` | Architecture design, task breakdown |
| `IMPLEMENTATION` | `"implementation"` | Writing new code |
| `CODE_REVIEW` | `"code_review"` | Reviewing existing code |
| `TESTING` | `"testing"` | Writing or running tests |
| `DEBUGGING` | `"debugging"` | Investigating and fixing bugs |
| `REFACTORING` | `"refactoring"` | Restructuring existing code |
| `DOCUMENTATION` | `"documentation"` | Writing docs, comments, READMEs |
| `CONFIGURATION` | `"configuration"` | Config files, CI/CD, infrastructure |
| `SECURITY_AUDIT` | `"security_audit"` | Security review, vulnerability analysis |
| `UNKNOWN` | `"unknown"` | Unclassified task type (fallback) |
### Complexity
```typescript
import { Complexity } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description | Typical Token Budget |
|--------|-------|-------------|---------------------|
| `LOW` | `"low"` | Simple fixes, typos, config changes | 50,000 |
| `MEDIUM` | `"medium"` | Standard features, moderate logic | 150,000 |
| `HIGH` | `"high"` | Complex features, multi-file changes | 350,000 |
| `CRITICAL` | `"critical"` | Major refactoring, architectural changes | 750,000 |
### Harness
```typescript
import { Harness } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description |
|--------|-------|-------------|
| `CLAUDE_CODE` | `"claude_code"` | Anthropic Claude Code CLI |
| `OPENCODE` | `"opencode"` | OpenCode CLI |
| `KILO_CODE` | `"kilo_code"` | Kilo Code VS Code extension |
| `AIDER` | `"aider"` | Aider AI pair programming |
| `API_DIRECT` | `"api_direct"` | Direct API calls (no harness) |
| `OLLAMA_LOCAL` | `"ollama_local"` | Ollama local inference |
| `CUSTOM` | `"custom"` | Custom or unrecognized harness |
| `UNKNOWN` | `"unknown"` | Harness not reported |
### Provider
```typescript
import { Provider } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description |
|--------|-------|-------------|
| `ANTHROPIC` | `"anthropic"` | Anthropic (Claude models) |
| `OPENAI` | `"openai"` | OpenAI (GPT models) |
| `OPENROUTER` | `"openrouter"` | OpenRouter (multi-provider routing) |
| `OLLAMA` | `"ollama"` | Ollama (local/self-hosted) |
| `GOOGLE` | `"google"` | Google (Gemini models) |
| `MISTRAL` | `"mistral"` | Mistral AI |
| `CUSTOM` | `"custom"` | Custom or unrecognized provider |
| `UNKNOWN` | `"unknown"` | Provider not reported |
### QualityGate
```typescript
import { QualityGate } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description |
|--------|-------|-------------|
| `BUILD` | `"build"` | Code compiles/builds successfully |
| `LINT` | `"lint"` | Linter passes with no errors |
| `TEST` | `"test"` | Unit/integration tests pass |
| `COVERAGE` | `"coverage"` | Code coverage meets threshold (85%) |
| `TYPECHECK` | `"typecheck"` | Type checker passes |
| `SECURITY` | `"security"` | Security scan passes |
### Outcome
```typescript
import { Outcome } from '@mosaicstack/telemetry-client';
```
| Member | Value | Description |
|--------|-------|-------------|
| `SUCCESS` | `"success"` | Task completed, all quality gates passed |
| `FAILURE` | `"failure"` | Task failed after all retries |
| `PARTIAL` | `"partial"` | Task partially completed (some gates passed) |
| `TIMEOUT` | `"timeout"` | Task exceeded time or token budget |
### RepoSizeCategory
```typescript
import { RepoSizeCategory } from '@mosaicstack/telemetry-client';
```
| Member | Value | Approximate LOC | Description |
|--------|-------|-----------------|-------------|
| `TINY` | `"tiny"` | < 1,000 | Scripts, single-file projects |
| `SMALL` | `"small"` | 1,000–10,000 | Small libraries, tools |
| `MEDIUM` | `"medium"` | 10,000–100,000 | Standard applications |
| `LARGE` | `"large"` | 100,000–1,000,000 | Large applications, monorepos |
| `HUGE` | `"huge"` | > 1,000,000 | Enterprise codebases |
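
As a rough illustration of how these ranges might be applied, the sketch below maps a line count to one of the string values in the table. The helper name `categorizeRepoSize` is hypothetical and not part of the SDK. Treating each lower bound as inclusive is an assumption, since the table does not say which bucket a boundary value belongs to.

```typescript
// Hypothetical helper, not part of the SDK: maps a line count to one of the
// RepoSizeCategory string values listed in the table above.
type RepoSizeValue = 'tiny' | 'small' | 'medium' | 'large' | 'huge';

function categorizeRepoSize(linesOfCode: number): RepoSizeValue {
  if (!Number.isFinite(linesOfCode) || linesOfCode < 0) {
    throw new RangeError('linesOfCode must be a non-negative finite number');
  }
  // Assumption: lower bounds are inclusive, so exactly 1,000 LOC is "small".
  if (linesOfCode < 1_000) return 'tiny';
  if (linesOfCode < 10_000) return 'small';
  if (linesOfCode < 100_000) return 'medium';
  if (linesOfCode <= 1_000_000) return 'large';
  return 'huge';
}
```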
---
## Server API Endpoints Used
The SDK communicates with these Mosaic Telemetry API v1 endpoints:
| SDK Method | HTTP Endpoint | Auth Required |
|------------|---------------|---------------|
| `flush()` (internal) | `POST /v1/events/batch` | Yes (Bearer token) |
| `refreshPredictions()` | `POST /v1/predictions/batch` | No (public) |
For the full server API specification, see the [Mosaic Telemetry API Reference](https://tel-api.mosaicstack.dev/v1/docs).

docs/integration-guide.md Normal file

@@ -0,0 +1,405 @@
# Integration Guide
This guide covers how to integrate `@mosaicstack/telemetry-client` into your applications. The SDK targets **Mosaic Telemetry API v1** (event schema version `1.0`).
## Prerequisites
- Node.js >= 18 (for native `fetch` and `crypto.randomUUID()`)
- A Mosaic Telemetry API key and instance ID (issued by an administrator via the admin API)
## Installation
```bash
npm install @mosaicstack/telemetry-client
```
The package is ESM-only, ships with TypeScript declarations, and has zero runtime dependencies.
## Environment Setup
Store your credentials in environment variables — never hardcode them.
```bash
# .env (not committed — add to .gitignore)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=msk_your_api_key_here
TELEMETRY_INSTANCE_ID=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d
```
```bash
# .env.example (committed — documents required variables)
TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
TELEMETRY_API_KEY=your-api-key
TELEMETRY_INSTANCE_ID=your-instance-uuid
```
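
A small validation step can surface a missing variable at startup instead of at the first failed submission. The sketch below is one way to do that; the helper `readTelemetryEnv` and the `TelemetryEnv` type are hypothetical names, not SDK exports. It takes the environment as a plain record so it can be exercised without touching `process.env`.

```typescript
// Hypothetical helper, not part of the SDK: collects the three variables
// documented above and fails fast when any of them is missing or empty.
interface TelemetryEnv {
  serverUrl: string;
  apiKey: string;
  instanceId: string;
}

function readTelemetryEnv(env: Record<string, string | undefined>): TelemetryEnv {
  const required = ['TELEMETRY_API_URL', 'TELEMETRY_API_KEY', 'TELEMETRY_INSTANCE_ID'] as const;
  // Treat both undefined and the empty string as missing.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing telemetry environment variables: ${missing.join(', ')}`);
  }
  return {
    serverUrl: env['TELEMETRY_API_URL'] as string,
    apiKey: env['TELEMETRY_API_KEY'] as string,
    instanceId: env['TELEMETRY_INSTANCE_ID'] as string,
  };
}
```

Calling `readTelemetryEnv(process.env)` and spreading the result into the `TelemetryClient` options would be one alternative to non-null assertions (`!`) on each variable.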
---
## Instrumenting a Next.js App
Next.js server actions and API routes run on Node.js, so the SDK works directly. Create a shared singleton and track events from your server-side code.
### 1. Create a telemetry singleton
```typescript
// lib/telemetry.ts
import {
  TelemetryClient,
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
} from '@mosaicstack/telemetry-client';

let client: TelemetryClient | null = null;

// Lazily create and start a single shared client for the whole server process.
export function getTelemetryClient(): TelemetryClient {
  if (!client) {
    client = new TelemetryClient({
      serverUrl: process.env.TELEMETRY_API_URL!,
      apiKey: process.env.TELEMETRY_API_KEY!,
      instanceId: process.env.TELEMETRY_INSTANCE_ID!,
      enabled: process.env.NODE_ENV === 'production',
      onError: (err) => console.error('[telemetry]', err.message),
    });
    client.start();
  }
  return client;
}

// Re-export enums for convenience
export { TaskType, Complexity, Harness, Provider, Outcome, QualityGate };
```
### 2. Track events from an API route
```typescript
// app/api/task-complete/route.ts
import { NextResponse } from 'next/server';
import { getTelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@/lib/telemetry';

export async function POST(request: Request) {
  const body = await request.json();
  const client = getTelemetryClient();

  const event = client.eventBuilder.build({
    task_duration_ms: body.durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.MEDIUM,
    harness: Harness.CLAUDE_CODE,
    model: body.model,
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: body.estimatedInputTokens,
    estimated_output_tokens: body.estimatedOutputTokens,
    actual_input_tokens: body.actualInputTokens,
    actual_output_tokens: body.actualOutputTokens,
    estimated_cost_usd_micros: body.estimatedCostMicros,
    actual_cost_usd_micros: body.actualCostMicros,
    quality_gate_passed: body.qualityGatePassed,
    quality_gates_run: body.qualityGatesRun,
    quality_gates_failed: body.qualityGatesFailed,
    context_compactions: body.contextCompactions,
    context_rotations: body.contextRotations,
    context_utilization_final: body.contextUtilization,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
  });

  // track() only queues the event; submission happens in the background.
  client.track(event);
  return NextResponse.json({ status: 'queued' });
}
```
### 3. Graceful shutdown
Next.js doesn't provide a built-in shutdown hook, but you can handle `SIGTERM` and `SIGINT` yourself from the instrumentation file:
```typescript
// instrumentation.ts (Next.js instrumentation file)
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { getTelemetryClient } = await import('./lib/telemetry');

    // Ensure the client starts on server boot
    const client = getTelemetryClient();

    // Flush remaining events on shutdown
    const shutdown = async () => {
      await client.stop();
      process.exit(0);
    };
    process.on('SIGTERM', shutdown);
    process.on('SIGINT', shutdown);
  }
}
```
---
## Instrumenting a Node.js Service
The same pattern applies to a standalone Node.js service (Express, Fastify, a plain script, and so on).
### 1. Initialize and start
```typescript
// src/telemetry.ts
import { TelemetryClient } from '@mosaicstack/telemetry-client';

export const telemetry = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL ?? 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  onError: (err) => console.error('[telemetry]', err.message),
});

telemetry.start();
```
### 2. Track events after task completion
```typescript
// src/task-runner.ts
import {
  TaskType,
  Complexity,
  Harness,
  Provider,
  Outcome,
  QualityGate,
  RepoSizeCategory,
} from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

async function runTask() {
  const startTime = Date.now();

  // ... run your AI coding task ...

  const durationMs = Date.now() - startTime;

  const event = telemetry.eventBuilder.build({
    task_duration_ms: durationMs,
    task_type: TaskType.IMPLEMENTATION,
    complexity: Complexity.HIGH,
    harness: Harness.CLAUDE_CODE,
    model: 'claude-sonnet-4-5-20250929',
    provider: Provider.ANTHROPIC,
    estimated_input_tokens: 200000,
    estimated_output_tokens: 80000,
    actual_input_tokens: 215000,
    actual_output_tokens: 72000,
    estimated_cost_usd_micros: 1200000,
    actual_cost_usd_micros: 1150000,
    quality_gate_passed: true,
    quality_gates_run: [
      QualityGate.BUILD,
      QualityGate.LINT,
      QualityGate.TEST,
      QualityGate.TYPECHECK,
    ],
    quality_gates_failed: [],
    context_compactions: 3,
    context_rotations: 1,
    context_utilization_final: 0.85,
    outcome: Outcome.SUCCESS,
    retry_count: 0,
    language: 'typescript',
    // Use the enum member rather than the raw string 'medium'
    repo_size_category: RepoSizeCategory.MEDIUM,
  });

  telemetry.track(event);
}
```
### 3. Graceful shutdown
```typescript
// src/main.ts
import { telemetry } from './telemetry.js';

async function main() {
  // ... your application logic ...

  // On shutdown, flush remaining events
  process.on('SIGTERM', async () => {
    await telemetry.stop();
    process.exit(0);
  });
}

main();
```
---
## Using Predictions
The telemetry API provides crowd-sourced predictions for token usage, cost, and duration based on historical data. The SDK caches these predictions locally.
### Pre-populate the cache
Call `refreshPredictions()` at startup with the dimension combinations your application uses:
```typescript
import { TaskType, Provider, Complexity } from '@mosaicstack/telemetry-client';
import { telemetry } from './telemetry.js';

// Fetch predictions for all combinations you'll need
await telemetry.refreshPredictions([
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.HIGH },
  { task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
]);
```
### Read cached predictions
```typescript
const prediction = telemetry.getPrediction({
  task_type: TaskType.IMPLEMENTATION,
  model: 'claude-sonnet-4-5-20250929',
  provider: Provider.ANTHROPIC,
  complexity: Complexity.MEDIUM,
});

if (prediction?.prediction) {
  const p = prediction.prediction;

  console.log('Token predictions (median):', {
    inputTokens: p.input_tokens.median,
    outputTokens: p.output_tokens.median,
  });
  console.log('Cost prediction:', `$${(p.cost_usd_micros.median / 1_000_000).toFixed(2)}`);
  console.log('Duration prediction:', `${(p.duration_ms.median / 1000).toFixed(0)}s`);
  console.log('Correction factors:', {
    input: p.correction_factors.input, // >1.0 means estimates tend to be too low
    output: p.correction_factors.output,
  });
  console.log('Quality:', {
    gatePassRate: `${(p.quality.gate_pass_rate * 100).toFixed(0)}%`,
    successRate: `${(p.quality.success_rate * 100).toFixed(0)}%`,
  });

  // Check confidence level
  if (prediction.metadata.confidence === 'low') {
    console.warn('Low confidence: small sample size or fallback was applied');
  }
}
```
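
One possible use of these values is to adjust a local estimate before a task starts. The sketch below assumes a correction factor is the ratio of actual to estimated usage, which is consistent with the ">1.0 means estimates tend to be too low" comment but is not confirmed elsewhere in this guide. Both helper names are hypothetical and not part of the SDK.

```typescript
// Hypothetical helpers, not part of the SDK.

// Scales a local token estimate by a correction factor. Assumption: the factor
// is the ratio of actual to estimated usage, so multiplying corrects the estimate.
function applyCorrection(estimatedTokens: number, correctionFactor: number): number {
  return Math.round(estimatedTokens * correctionFactor);
}

// Converts micro-dollars (1 USD = 1,000,000 micros) to a display string.
function formatUsdMicros(micros: number): string {
  return `$${(micros / 1_000_000).toFixed(2)}`;
}
```

Under that assumption, an estimate of 200,000 input tokens with a factor of 1.1 becomes 220,000.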
### Understand fallback behavior
When the server doesn't have enough data for an exact match, it broadens the query by dropping dimensions (e.g., ignoring complexity). The `metadata` fields tell you what happened:
| `fallback_level` | Meaning |
|-------------------|---------|
| `0` | Exact match on all dimensions |
| `1+` | Some dimensions were dropped to find data |
| `-1` | No prediction data available at any level |
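
The table can be collapsed into a three-way decision in calling code. The following is a minimal sketch; `classifyFallback` and `FallbackKind` are hypothetical names, not SDK exports, and the function takes only the numeric `fallback_level` value read from `metadata`.

```typescript
// Hypothetical helper, not part of the SDK: turns a fallback_level value into
// one of the three cases from the table above.
type FallbackKind = 'exact' | 'broadened' | 'none';

function classifyFallback(fallbackLevel: number): FallbackKind {
  if (fallbackLevel === -1) return 'none'; // no data at any level
  if (fallbackLevel === 0) return 'exact'; // all dimensions matched
  if (fallbackLevel >= 1) return 'broadened'; // some dimensions were dropped
  throw new RangeError(`Unexpected fallback_level: ${fallbackLevel}`);
}
```

A caller might use `'exact'` predictions directly, treat `'broadened'` ones as rougher guidance, and fall back to its own estimates on `'none'`.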
---
## Environment-Specific Configuration
### Development
```typescript
const client = new TelemetryClient({
  serverUrl: 'http://localhost:8000', // Local dev server
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  dryRun: true, // Don't send real data
  submitIntervalMs: 10_000, // Flush more frequently for debugging
  onError: (err) => console.error('[telemetry]', err),
});
```
### Production
```typescript
const client = new TelemetryClient({
  serverUrl: 'https://tel-api.mosaicstack.dev',
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  submitIntervalMs: 300_000, // 5 min (default)
  maxRetries: 3, // Retry on transient failures
  onError: (err) => {
    // Route to your observability stack; `logger` stands in for your
    // application's own logger and is not provided by the SDK.
    logger.error('Telemetry submission failed', { error: err.message });
  },
});
```
### Conditional enable/disable
```typescript
const client = new TelemetryClient({
  serverUrl: process.env.TELEMETRY_API_URL!,
  apiKey: process.env.TELEMETRY_API_KEY!,
  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
  enabled: process.env.TELEMETRY_ENABLED !== 'false', // Opt out via env var
});
```
When `enabled` is `false`, `track()` returns immediately without queuing.
---
## Error Handling
The SDK is designed to never disrupt your application:
- **`track()` never throws.** All errors are caught and routed to the `onError` callback.
- **Failed batches are re-queued.** If a submission fails, events are prepended back to the queue for the next flush cycle.
- **Exponential backoff with jitter.** Retries start at a 1s base delay and double up to a 60s cap, with random jitter so that many clients do not retry in lockstep (the thundering-herd problem).
- **`Retry-After` header support.** On HTTP 429 (rate limited), the SDK respects the server's `Retry-After` header.
- **HTTP 403 is not retried.** An API key / instance ID mismatch is a permanent error.
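
The retry timing described above can be sketched as follows. This is an illustration, not the SDK's internal code: the function names are hypothetical, and the jitter formula (up to 25% added on top of the capped delay) is an assumption, since this guide only says that jitter is random. The `Retry-After` parser handles both forms the HTTP header allows, a number of seconds or an HTTP date.

```typescript
// Illustrative only: the SDK's exact jitter formula is not documented here.
// Assumption: up to 25% random jitter is added on top of the capped delay.
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  random: () => number = Math.random,
  baseMs = 1_000,
  maxMs = 60_000,
): number {
  const exponential = baseMs * 2 ** attempt; // 1s, 2s, 4s, ...
  const capped = Math.min(maxMs, exponential);
  return Math.round(capped + capped * 0.25 * random());
}

// Parses a Retry-After header value, which is either a number of seconds or an
// HTTP date. Returns the wait in milliseconds, or null when absent or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | null {
  if (header === null || header.trim() === '') return null;
  const seconds = Number(header);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const dateMs = Date.parse(header);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}
```

Passing the random source as a parameter keeps the delay calculation deterministic in tests.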
### Custom error handling
```typescript
const client = new TelemetryClient({
  // ...
  onError: (error) => {
    if (error.message.includes('HTTP 403')) {
      console.error('Telemetry auth failed: check API key and instance ID');
    } else if (error.message.includes('HTTP 429')) {
      console.warn('Telemetry rate limited: events will be retried');
    } else {
      console.error('Telemetry error:', error.message);
    }
  },
});
```
---
## Batch Submission Behavior
The SDK batches events for efficiency:
1. `track(event)` adds the event to an in-memory queue (bounded, FIFO eviction at capacity).
2. Every `submitIntervalMs` (default: 5 minutes), the background timer drains the queue in batches of up to `batchSize` (default/max: 100).
3. Each batch is sent via `POST /v1/events/batch`, with exponential backoff on failure.
4. Calling `stop()` flushes all remaining events before resolving.
The server accepts up to **100 events per batch** and supports **partial success** — some events may be accepted while others (e.g., duplicates) are rejected.
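
The queueing behavior in the steps above (bounded capacity, FIFO eviction, batch draining, and putting a failed batch back at the front) can be modeled with a small class. This is a hypothetical sketch for illustration and not the SDK's internal queue. In particular, what happens when a re-queued batch overflows capacity is an assumption.

```typescript
// Hypothetical sketch, not the SDK's internal queue.
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Adds an item; when full, the oldest item is evicted first (FIFO eviction).
  enqueue(item: T): void {
    if (this.items.length >= this.capacity) this.items.shift();
    this.items.push(item);
  }

  // Removes and returns up to `batchSize` of the oldest items.
  drain(batchSize: number): T[] {
    return this.items.splice(0, batchSize);
  }

  // Puts a failed batch back at the front so it is retried first.
  // Assumption: if this overflows capacity, the oldest items are dropped.
  prepend(batch: T[]): void {
    this.items = [...batch, ...this.items];
    if (this.items.length > this.capacity) {
      this.items.splice(0, this.items.length - this.capacity);
    }
  }

  get size(): number {
    return this.items.length;
  }
}
```

Under this model, a flush cycle would call `drain(batchSize)` repeatedly until the queue is empty and call `prepend(batch)` when a submission fails.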
---
## API Version Compatibility
| SDK Version | API Version | Schema Version |
|-------------|-------------|----------------|
| 0.1.x | v1 (`/v1/` endpoints) | `1.0` |
The `EventBuilder` automatically sets `schema_version: "1.0"` on every event. The SDK submits to `/v1/events/batch` and queries `/v1/predictions/batch`.
When the telemetry API introduces a v2, this SDK will add support in a new major release. The server supports two API versions simultaneously during a 6-month deprecation window.