docs(#1): SDK integration guide, API reference, and CI pipeline

- Rewrite README with quick start, config table, prediction usage, API version note
- Add docs/integration-guide.md with Next.js and Node.js examples, env-specific config, error handling patterns, batch behavior, and API version compatibility
- Add docs/api-reference.md with full reference for all exported classes, methods, types, and enums
- Add .woodpecker.yml with quality gates (lint, typecheck, format, security audit, test with coverage) and npm publish to Gitea registry
- Add AGENTS.md and update CLAUDE.md with project conventions

Fixes #1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:38:19 -06:00
parent 177720e523
commit 231a799a46
6 changed files with 1303 additions and 52 deletions
--- a/.woodpecker.yml
+++ b/.woodpecker.yml
@@ -0,0 +1,91 @@
+when:
+  - event: [push, pull_request, manual]
+
+variables:
+  - &node_image "node:22-alpine"
+  - &install_deps |
+    corepack enable
+    npm ci
+
+steps:
+  install:
+    image: *node_image
+    commands:
+      - *install_deps
+
+  lint:
+    image: *node_image
+    commands:
+      - *install_deps
+      - npm run lint
+    depends_on:
+      - install
+
+  typecheck:
+    image: *node_image
+    commands:
+      - *install_deps
+      - npm run typecheck
+    depends_on:
+      - install
+
+  format-check:
+    image: *node_image
+    commands:
+      - *install_deps
+      - npm run format:check
+    depends_on:
+      - install
+
+  security-audit:
+    image: *node_image
+    commands:
+      - npm audit --audit-level=high
+    depends_on:
+      - install
+
+  test:
+    image: *node_image
+    commands:
+      - *install_deps
+      - npm run test:coverage
+    depends_on:
+      - install
+
+  build:
+    image: *node_image
+    commands:
+      - *install_deps
+      - npm run build
+    depends_on:
+      - lint
+      - typecheck
+      - format-check
+      - security-audit
+      - test
+
+  publish:
+    image: *node_image
+    environment:
+      GITEA_TOKEN:
+        from_secret: gitea_token
+    commands:
+      - *install_deps
+      - npm run build
+      - |
+        echo "//git.mosaicstack.dev/api/packages/mosaic/npm/:_authToken=$$GITEA_TOKEN" > .npmrc
+        echo "@mosaicstack:registry=https://git.mosaicstack.dev/api/packages/mosaic/npm/" >> .npmrc
+      - |
+        CURRENT=$(node -p "require('./package.json').version")
+        PUBLISHED=$(npm view @mosaicstack/telemetry-client version 2>/dev/null || echo "0.0.0")
+        if [ "$$CURRENT" = "$$PUBLISHED" ]; then
+          echo "Version $$CURRENT already published, skipping"
+          exit 0
+        fi
+        echo "Publishing $$CURRENT (was $$PUBLISHED)"
+        npm publish --access public
+    when:
+      - branch: [main, develop]
+        event: [push, manual, tag]
+    depends_on:
+      - build
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,72 @@
+# mosaic-telemetry-client-js — Agent Context
+
+> Patterns, gotchas, and orchestrator integration for AI agents working on this project.
+> **Update this file** when you discover reusable patterns or non-obvious requirements.
+
+## Codebase Patterns
+
+<!-- Add project-specific patterns as you discover them -->
+<!-- Examples: -->
+<!-- - Use `httpx.AsyncClient` for external HTTP calls -->
+<!-- - All routes require authentication via `Depends(get_current_user)` -->
+<!-- - Config is loaded from environment variables via `settings.py` -->
+
+## Common Gotchas
+
+<!-- Add things that trip up agents -->
+<!-- Examples: -->
+<!-- - Remember to run migrations after schema changes -->
+<!-- - Frontend env vars need NEXT_PUBLIC_ prefix -->
+<!-- - Tests require a running PostgreSQL instance -->
+
+## Quality Gates
+
+**All must pass before any commit:**
+
+```bash
+npm run lint && npm run typecheck && npm test
+```
+
+## Orchestrator Integration
+
+### Task Prefix
+Use `MOSAIC-TELEMETRY-CLIENT-JS` as the prefix for orchestrated tasks (e.g., `MOSAIC-TELEMETRY-CLIENT-JS-SEC-001`).
+
+### Package/Directory Names
+<!-- List key directories the orchestrator needs to know about -->
+
+| Directory | Purpose |
+|-----------|---------|
+| `src/` | Main source code |
+| `tests/` | Test files |
+| `docs/scratchpads/` | Working documents |
+
+### Worker Checklist
+When completing an orchestrated task:
+1. Read the finding details from the report
+2. Implement the fix following existing code patterns
+3. Run quality gates (ALL must pass)
+4. Commit with: `git commit -m "fix({finding_id}): brief description"`
+5. Report result as JSON to orchestrator
+
+### Post-Coding Review
+After implementing changes, the orchestrator will run:
+1. **Codex code review** — `~/.claude/scripts/codex/codex-code-review.sh --uncommitted`
+2. **Codex security review** — `~/.claude/scripts/codex/codex-security-review.sh --uncommitted`
+3. If blockers/critical findings: remediation task created
+4. If clean: task marked done
+
+## Directory-Specific Context
+
+<!-- Add sub-AGENTS.md files in subdirectories if needed -->
+<!-- Example: -->
+<!-- - `src/api/AGENTS.md` — API-specific patterns -->
+<!-- - `src/components/AGENTS.md` — Component conventions -->
+
+## Testing Approaches
+
+<!-- Document how tests should be written for this project -->
+<!-- Examples: -->
+<!-- - Unit tests use pytest with fixtures in conftest.py -->
+<!-- - Integration tests require DATABASE_URL env var -->
+<!-- - E2E tests use Playwright -->
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -28,3 +28,55 @@ npm run build        # Build to dist/
 - `track()` never throws — catches everything, routes to `onError` callback
 - Zero runtime deps: uses native `fetch` (Node 18+), `crypto.randomUUID()`, `setInterval`
 - All types are standalone — no dependency on the telemetry server package
+
+## Conditional Documentation Loading
+
+**Read the relevant guide before starting work:**
+
+| Task Type | Guide |
+|-----------|-------|
+| Bootstrapping a new project | `~/.claude/agent-guides/bootstrap.md` |
+| Orchestrating autonomous tasks | `~/.claude/agent-guides/orchestrator.md` |
+| Ralph autonomous development | `~/.claude/agent-guides/ralph-autonomous.md` |
+| Frontend development | `~/.claude/agent-guides/frontend.md` |
+| Backend/API development | `~/.claude/agent-guides/backend.md` |
+| TypeScript strict typing | `~/.claude/agent-guides/typescript.md` |
+| Code review | `~/.claude/agent-guides/code-review.md` |
+| Authentication/Authorization | `~/.claude/agent-guides/authentication.md` |
+| Infrastructure/DevOps | `~/.claude/agent-guides/infrastructure.md` |
+| QA/Testing | `~/.claude/agent-guides/qa-testing.md` |
+| Secrets management (Vault) | `~/.claude/agent-guides/vault-secrets.md` |
+
+
+## Commits
+
+```
+<type>(#issue): Brief description
+
+Detailed explanation if needed.
+
+Fixes #123
+```
+
+Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
+
+
+## Secrets Management
+
+**NEVER hardcode secrets.** Use `.env` files (gitignored) or a secrets manager.
+
+```bash
+# .env.example is committed (with placeholders)
+# .env is NOT committed (contains real values)
+```
+
+Ensure `.gitignore` includes `.env*` (except `.env.example`).
+
+
+## Multi-Agent Coordination
+
+When multiple agents work on this project:
+1. `git pull --rebase` before editing
+2. `git pull --rebase` before pushing
+3. If conflicts, **alert the user** — don't auto-resolve data conflicts
+
--- a/README.md
+++ b/README.md
@@ -2,7 +2,9 @@

 TypeScript client SDK for [Mosaic Stack Telemetry](https://tel.mosaicstack.dev). Reports task-completion metrics from AI coding harnesses and queries crowd-sourced predictions.

-**Zero runtime dependencies** — uses native `fetch`, `crypto.randomUUID()`, and `setInterval`.
+**Zero runtime dependencies** — uses native `fetch`, `crypto.randomUUID()`, and `setInterval`. Requires Node.js 18+.
+
+**Targets Mosaic Telemetry API v1** (`/v1/` endpoints, event schema version `1.0`).

 ## Installation

@@ -13,17 +15,26 @@ npm install @mosaicstack/telemetry-client
 ## Quick Start

 ```typescript
-import { TelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@mosaicstack/telemetry-client';
+import {
+  TelemetryClient,
+  TaskType,
+  Complexity,
+  Harness,
+  Provider,
+  Outcome,
+  QualityGate,
+} from '@mosaicstack/telemetry-client';

+// 1. Create and start the client
 const client = new TelemetryClient({
-  serverUrl: 'https://tel.mosaicstack.dev',
-  apiKey: 'your-64-char-hex-api-key',
-  instanceId: 'your-instance-uuid',
+  serverUrl: 'https://tel-api.mosaicstack.dev',
+  apiKey: process.env.TELEMETRY_API_KEY!,
+  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
 });

-client.start();
+client.start(); // begins background batch submission every 5 minutes

-// Build and track an event
+// 2. Build and track an event
 const event = client.eventBuilder.build({
  task_duration_ms: 45000,
  task_type: TaskType.IMPLEMENTATION,
@@ -31,83 +42,101 @@ const event = client.eventBuilder.build({
  harness: Harness.CLAUDE_CODE,
  model: 'claude-sonnet-4-5-20250929',
  provider: Provider.ANTHROPIC,
-  estimated_input_tokens: 5000,
-  estimated_output_tokens: 2000,
-  actual_input_tokens: 5500,
-  actual_output_tokens: 2200,
-  estimated_cost_usd_micros: 30000,
-  actual_cost_usd_micros: 33000,
+  estimated_input_tokens: 105000,
+  estimated_output_tokens: 45000,
+  actual_input_tokens: 112340,
+  actual_output_tokens: 38760,
+  estimated_cost_usd_micros: 630000,
+  actual_cost_usd_micros: 919200,
  quality_gate_passed: true,
-  quality_gates_run: [],
+  quality_gates_run: [QualityGate.BUILD, QualityGate.LINT, QualityGate.TEST],
  quality_gates_failed: [],
-  context_compactions: 0,
+  context_compactions: 2,
  context_rotations: 0,
-  context_utilization_final: 0.4,
+  context_utilization_final: 0.72,
  outcome: Outcome.SUCCESS,
  retry_count: 0,
+  language: 'typescript',
+  repo_size_category: 'medium',
 });

-client.track(event);
+client.track(event); // queues the event (never throws)

-// When shutting down
-await client.stop();
-```
-
-## Querying Predictions
-
-```typescript
-const query = {
+// 3. Read a cached prediction (null until refreshPredictions() has populated the cache)
+const prediction = client.getPrediction({
  task_type: TaskType.IMPLEMENTATION,
  model: 'claude-sonnet-4-5-20250929',
  provider: Provider.ANTHROPIC,
  complexity: Complexity.MEDIUM,
-};
+});

-// Fetch from server and cache locally
-await client.refreshPredictions([query]);
-
-// Get cached prediction (returns null if not cached)
-const prediction = client.getPrediction(query);
-if (prediction?.prediction) {
-  console.log('Median input tokens:', prediction.prediction.input_tokens.median);
-  console.log('Median cost (microdollars):', prediction.prediction.cost_usd_micros.median);
-}
+// 4. Shut down gracefully (flushes remaining events)
+await client.stop();
 ```

 ## Configuration

-```typescript
-const client = new TelemetryClient({
-  serverUrl: 'https://tel.mosaicstack.dev',  // Required
-  apiKey: 'your-api-key',                     // Required (64-char hex)
-  instanceId: 'your-uuid',                    // Required
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `serverUrl` | `string` | **required** | Telemetry API base URL |
+| `apiKey` | `string` | **required** | Bearer token for authentication |
+| `instanceId` | `string` | **required** | UUID identifying this instance |
+| `enabled` | `boolean` | `true` | Set `false` to disable — `track()` becomes a no-op |
+| `submitIntervalMs` | `number` | `300_000` | Background flush interval (5 min) |
+| `maxQueueSize` | `number` | `1000` | Max queued events before FIFO eviction |
+| `batchSize` | `number` | `100` | Events per batch submission (server max: 100) |
+| `requestTimeoutMs` | `number` | `10_000` | HTTP request timeout |
+| `predictionCacheTtlMs` | `number` | `21_600_000` | Prediction cache TTL (6 hours) |
+| `dryRun` | `boolean` | `false` | Log events instead of sending them |
+| `maxRetries` | `number` | `3` | Retry attempts with exponential backoff |
+| `onError` | `(error: Error) => void` | silent | Error callback |

-  // Optional
-  enabled: true,                  // Set false to disable (track() becomes no-op)
-  submitIntervalMs: 300_000,      // Background flush interval (default: 5 min)
-  maxQueueSize: 1000,             // Max queued events (default: 1000, FIFO eviction)
-  batchSize: 100,                 // Events per batch (default/max: 100)
-  requestTimeoutMs: 10_000,       // HTTP timeout (default: 10s)
-  predictionCacheTtlMs: 21_600_000, // Prediction cache TTL (default: 6 hours)
-  dryRun: false,                  // Log events instead of sending
-  maxRetries: 3,                  // Retry attempts on failure
-  onError: (err) => console.error(err),  // Error callback
+## Querying Predictions
+
+Predictions are crowd-sourced token/cost/duration estimates from the telemetry API. The SDK caches them locally with a configurable TTL.
+
+```typescript
+// Fetch predictions from the server and cache locally
+await client.refreshPredictions([
+  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
+  { task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
+]);
+
+// Read from cache (returns null if not cached or expired)
+const prediction = client.getPrediction({
+  task_type: TaskType.IMPLEMENTATION,
+  model: 'claude-sonnet-4-5-20250929',
+  provider: Provider.ANTHROPIC,
+  complexity: Complexity.MEDIUM,
 });
+
+if (prediction?.prediction) {
+  console.log('Median input tokens:', prediction.prediction.input_tokens.median);
+  console.log('Median cost ($):', prediction.prediction.cost_usd_micros.median / 1_000_000);
+  console.log('Confidence:', prediction.metadata.confidence);
+}
 ```

 ## Dry-Run Mode

-For testing without sending data:
+For development and testing without sending data to the server:

 ```typescript
 const client = new TelemetryClient({
-  serverUrl: 'https://tel.mosaicstack.dev',
+  serverUrl: 'https://tel-api.mosaicstack.dev',
  apiKey: 'test-key',
  instanceId: 'test-uuid',
  dryRun: true,
 });
 ```

+In dry-run mode, `track()` still queues events and `flush()` still runs, but the `BatchSubmitter` returns synthetic `accepted` responses without making HTTP calls.
+
+## Documentation
+
+- **[Integration Guide](docs/integration-guide.md)** — Next.js and Node.js examples, environment-specific configuration, error handling patterns
+- **[API Reference](docs/api-reference.md)** — Full reference for all exported classes, methods, types, and enums
+
 ## License

 MPL-2.0
--- a/docs/api-reference.md
+++ b/docs/api-reference.md
@@ -0,0 +1,602 @@
+# API Reference
+
+Complete reference for all classes, methods, types, and enums exported by `@mosaicstack/telemetry-client`.
+
+**SDK version:** 0.1.0
+**Targets:** Mosaic Telemetry API v1, event schema version `1.0`
+
+---
+
+## TelemetryClient
+
+Main entry point. Queues task-completion events for background batch submission and provides access to cached predictions.
+
+```typescript
+import { TelemetryClient } from '@mosaicstack/telemetry-client';
+```
+
+### Constructor
+
+```typescript
+new TelemetryClient(config: TelemetryConfig)
+```
+
+Creates a new client instance. Does **not** start background submission — call `start()` to begin.
+
+### Properties
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `eventBuilder` | `EventBuilder` | Builder for constructing `TaskCompletionEvent` objects |
+| `queueSize` | `number` | Number of events currently in the queue |
+| `isRunning` | `boolean` | Whether background submission is active |
+
+### Methods
+
+#### `start(): void`
+
+Start background batch submission via `setInterval`. Idempotent — calling `start()` multiple times has no effect.
+
+#### `stop(): Promise<void>`
+
+Stop background submission and flush all remaining events. Idempotent. Returns a promise that resolves when the final flush completes.
+
+#### `track(event: TaskCompletionEvent): void`
+
+Queue an event for batch submission. **Never throws** — all errors are caught and routed to the `onError` callback.
+
+When `enabled` is `false`, this method returns immediately without queuing.
+
+When the queue is at capacity (`maxQueueSize`), the oldest event is evicted to make room.
+
+#### `getPrediction(query: PredictionQuery): PredictionResponse | null`
+
+Get a cached prediction for the given query dimensions. Returns `null` if no prediction is cached or the cache entry has expired.
+
+#### `refreshPredictions(queries: PredictionQuery[]): Promise<void>`
+
+Fetch predictions from the server via `POST /v1/predictions/batch` and store them in the local cache. The predictions endpoint is public — no authentication required.
+
+Accepts up to 50 queries per call (server limit).
+
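The 50-query limit above means a caller with a longer list needs to split it. The reference does not say whether the SDK splits oversized lists itself, so the following is only a sketch of doing it on the caller side. `chunk`, `refreshAll`, and the `PredictionRefresher` interface are hypothetical names introduced for illustration; only `refreshPredictions` comes from the reference.

```typescript
// Hypothetical helper, not part of the SDK: split a list into chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Minimal structural type so the sketch stands alone; the real client exposes more.
interface PredictionRefresher<Q> {
  refreshPredictions(queries: Q[]): Promise<void>;
}

const MAX_QUERIES_PER_CALL = 50; // server limit stated in the reference

// Refresh an arbitrarily long query list, one server-sized group at a time.
async function refreshAll<Q>(
  client: PredictionRefresher<Q>,
  queries: readonly Q[],
): Promise<void> {
  for (const group of chunk(queries, MAX_QUERIES_PER_CALL)) {
    await client.refreshPredictions(group);
  }
}
```

Groups are sent sequentially here to keep the load on the public endpoint modest; sending them in parallel would also work if latency matters more.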
+---
+
+## EventBuilder
+
+Convenience builder that auto-fills `event_id`, `timestamp`, `instance_id`, and `schema_version`.
+
+```typescript
+import { EventBuilder } from '@mosaicstack/telemetry-client';
+```
+
+Access via `client.eventBuilder` — you don't normally construct this directly.
+
+### Methods
+
+#### `build(params: EventBuilderParams): TaskCompletionEvent`
+
+Build a complete `TaskCompletionEvent` from the given parameters.
+
+Auto-generated fields:
+- `event_id` — `crypto.randomUUID()`
+- `timestamp` — `new Date().toISOString()`
+- `instance_id` — from client config
+- `schema_version` — `"1.0"`
+
+---
+
+## EventQueue
+
+Bounded FIFO queue for telemetry events. Used internally by `TelemetryClient`.
+
+```typescript
+import { EventQueue } from '@mosaicstack/telemetry-client';
+```
+
+### Constructor
+
+```typescript
+new EventQueue(maxSize: number)
+```
+
+### Properties
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `size` | `number` | Current number of events in the queue |
+| `isEmpty` | `boolean` | Whether the queue is empty |
+
+### Methods
+
+#### `enqueue(event: TaskCompletionEvent): void`
+
+Add an event. Evicts the oldest event if at capacity.
+
+#### `drain(maxItems: number): TaskCompletionEvent[]`
+
+Remove and return up to `maxItems` events from the front.
+
+#### `prepend(events: TaskCompletionEvent[]): void`
+
+Prepend events back to the front (used for re-enqueue on submission failure). Respects `maxSize` — excess events are dropped.
+
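The queue semantics described above (evict oldest on overflow, drain from the front, prepend on failure) can be sketched as follows. This is an illustrative re-implementation, not the SDK's source: `BoundedQueue` is a hypothetical name, and the choice to keep the earliest of the prepended items when space runs out is an assumption, since the reference only says that excess events are dropped.

```typescript
// Illustrative bounded FIFO queue; not the SDK's EventQueue implementation.
class BoundedQueue<T> {
  private items: T[] = [];
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get size(): number {
    return this.items.length;
  }

  get isEmpty(): boolean {
    return this.items.length === 0;
  }

  /** Add an item, evicting the oldest one if the queue is full. */
  enqueue(item: T): void {
    if (this.maxSize <= 0) return; // a zero-capacity queue holds nothing
    if (this.items.length >= this.maxSize) {
      this.items.shift();
    }
    this.items.push(item);
  }

  /** Remove and return up to `maxItems` items from the front. */
  drain(maxItems: number): T[] {
    return this.items.splice(0, Math.max(0, maxItems));
  }

  /** Put items back at the front; anything beyond the free capacity is dropped (assumption: the tail of `items`). */
  prepend(items: T[]): void {
    const free = Math.max(0, this.maxSize - this.items.length);
    this.items = [...items.slice(0, free), ...this.items];
  }
}
```

With a capacity of 3, enqueuing 1 through 4 leaves 2, 3, 4 in the queue; draining two items and prepending them again restores the original order.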
+---
+
+## BatchSubmitter
+
+Handles HTTP submission of event batches with retry logic.
+
+```typescript
+import { BatchSubmitter } from '@mosaicstack/telemetry-client';
+```
+
+### Methods
+
+#### `submit(events: TaskCompletionEvent[]): Promise<SubmitResult>`
+
+Submit a batch to `POST /v1/events/batch`. Retries with exponential backoff (1s base, 60s max, with jitter) on transient failures. Respects the server's `Retry-After` header on HTTP 429.
+
+In dry-run mode, returns a synthetic success response without making HTTP calls.
+
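To make the retry timing above concrete, here is a minimal sketch of an exponential backoff calculation with a 1s base and 60s cap. The jitter strategy ("full jitter", a uniform random delay between 0 and the capped value) is an assumption, since the reference only says "with jitter". `backoffDelayMs` and `nextDelayMs` are hypothetical names, not SDK exports.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based).
// Full jitter is an assumption; the SDK may use a different jitter strategy.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1_000,
  maxMs: number = 60_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// A Retry-After hint from an HTTP 429 takes precedence over the computed delay.
function nextDelayMs(attempt: number, retryAfterMs?: number): number {
  return retryAfterMs ?? backoffDelayMs(attempt);
}
```

The `random` parameter exists only so the calculation can be tested deterministically; production callers would leave it at the default.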
+---
+
+## PredictionCache
+
+In-memory TTL cache for prediction responses.
+
+```typescript
+import { PredictionCache } from '@mosaicstack/telemetry-client';
+```
+
+### Constructor
+
+```typescript
+new PredictionCache(ttlMs: number)
+```
+
+### Properties
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `size` | `number` | Number of entries in cache (may include expired entries) |
+
+### Methods
+
+#### `get(query: PredictionQuery): PredictionResponse | null`
+
+Retrieve a cached prediction. Returns `null` if not cached or expired (expired entries are lazily deleted).
+
+#### `set(query: PredictionQuery, response: PredictionResponse): void`
+
+Store a prediction with TTL.
+
+#### `clear(): void`
+
+Clear all cached predictions.
+
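The lazy-deletion behavior described above explains why `size` may count expired entries: an entry is only removed when it is next read. A minimal sketch under assumed names (`TtlCache`, `cacheKey`, and the `|`-joined key format are all hypothetical, not the SDK's internals):

```typescript
// Illustrative TTL cache with lazy expiry; not the SDK's PredictionCache implementation.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Entry count; may include expired entries that have not been read yet. */
  get size(): number {
    return this.entries.size;
  }

  get(key: string): V | null {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazy deletion on read
      return null;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  clear(): void {
    this.entries.clear();
  }
}

// Hypothetical key derivation from the four query dimensions.
function cacheKey(q: {
  task_type: string;
  model: string;
  provider: string;
  complexity: string;
}): string {
  return [q.task_type, q.model, q.provider, q.complexity].join('|');
}
```

The injectable `now` clock is a testing convenience; it lets a test advance time without waiting for a real TTL to elapse.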
+---
+
+## Configuration Types
+
+### TelemetryConfig
+
+User-facing configuration passed to the `TelemetryClient` constructor.
+
+```typescript
+import type { TelemetryConfig } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `serverUrl` | `string` | Yes | — | Telemetry API base URL (e.g., `"https://tel-api.mosaicstack.dev"`) |
+| `apiKey` | `string` | Yes | — | Bearer token for `POST /v1/events/batch` authentication |
+| `instanceId` | `string` | Yes | — | UUID identifying this Mosaic Stack instance |
+| `enabled` | `boolean` | No | `true` | When `false`, `track()` is a no-op |
+| `submitIntervalMs` | `number` | No | `300_000` | Background flush interval in ms (5 min) |
+| `maxQueueSize` | `number` | No | `1000` | Maximum events held in queue before FIFO eviction |
+| `batchSize` | `number` | No | `100` | Events per batch (server max: 100) |
+| `requestTimeoutMs` | `number` | No | `10_000` | HTTP request timeout in ms |
+| `predictionCacheTtlMs` | `number` | No | `21_600_000` | Prediction cache TTL in ms (6 hours) |
+| `dryRun` | `boolean` | No | `false` | Simulate submissions without HTTP calls |
+| `maxRetries` | `number` | No | `3` | Retry attempts on transient failure |
+| `onError` | `(error: Error) => void` | No | silent | Callback invoked on errors |
+
+### ResolvedConfig
+
+Internal configuration with all defaults applied. All fields are required (non-optional).
+
+```typescript
+import type { ResolvedConfig } from '@mosaicstack/telemetry-client';
+```
+
+### resolveConfig
+
+```typescript
+import { resolveConfig } from '@mosaicstack/telemetry-client';
+
+function resolveConfig(config: TelemetryConfig): ResolvedConfig
+```
+
+Apply defaults to a `TelemetryConfig`, producing a `ResolvedConfig`. Strips trailing slashes from `serverUrl`.
+
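Default resolution as described above can be sketched like this, using the defaults from the `TelemetryConfig` table. `SketchConfig` and `resolveSketchConfig` are hypothetical stand-ins so the example is self-contained; the real `resolveConfig` may differ in detail.

```typescript
// Stand-in for TelemetryConfig, mirroring the fields in the table above.
interface SketchConfig {
  serverUrl: string;
  apiKey: string;
  instanceId: string;
  enabled?: boolean;
  submitIntervalMs?: number;
  maxQueueSize?: number;
  batchSize?: number;
  requestTimeoutMs?: number;
  predictionCacheTtlMs?: number;
  dryRun?: boolean;
  maxRetries?: number;
  onError?: (error: Error) => void;
}

// Apply documented defaults; every field of the result is non-optional.
function resolveSketchConfig(config: SketchConfig): Required<SketchConfig> {
  return {
    serverUrl: config.serverUrl.replace(/\/+$/, ''), // strip trailing slashes
    apiKey: config.apiKey,
    instanceId: config.instanceId,
    enabled: config.enabled ?? true,
    submitIntervalMs: config.submitIntervalMs ?? 300_000,
    maxQueueSize: config.maxQueueSize ?? 1000,
    batchSize: config.batchSize ?? 100,
    requestTimeoutMs: config.requestTimeoutMs ?? 10_000,
    predictionCacheTtlMs: config.predictionCacheTtlMs ?? 21_600_000,
    dryRun: config.dryRun ?? false,
    maxRetries: config.maxRetries ?? 3,
    onError: config.onError ?? (() => {}), // "silent" default
  };
}
```

Using `??` rather than `||` keeps explicit falsy values such as `enabled: false` or `maxRetries: 0` instead of replacing them with defaults.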
+---
+
+## Event Types
+
+### EventBuilderParams
+
+Parameters accepted by `EventBuilder.build()`. Excludes auto-generated fields (`event_id`, `timestamp`, `instance_id`, `schema_version`).
+
+```typescript
+import type { EventBuilderParams } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `task_duration_ms` | `number` | Yes | Wall-clock time in ms (0–86,400,000) |
+| `task_type` | `TaskType` | Yes | Category of work performed |
+| `complexity` | `Complexity` | Yes | Task complexity level |
+| `harness` | `Harness` | Yes | Coding tool / execution environment |
+| `model` | `string` | Yes | Model identifier (1–100 chars) |
+| `provider` | `Provider` | Yes | LLM provider |
+| `estimated_input_tokens` | `number` | Yes | Pre-task input token estimate (0–10,000,000) |
+| `estimated_output_tokens` | `number` | Yes | Pre-task output token estimate (0–10,000,000) |
+| `actual_input_tokens` | `number` | Yes | Actual input tokens consumed (0–10,000,000) |
+| `actual_output_tokens` | `number` | Yes | Actual output tokens generated (0–10,000,000) |
+| `estimated_cost_usd_micros` | `number` | Yes | Estimated cost in microdollars (0–100,000,000) |
+| `actual_cost_usd_micros` | `number` | Yes | Actual cost in microdollars (0–100,000,000) |
+| `quality_gate_passed` | `boolean` | Yes | Whether all quality gates passed |
+| `quality_gates_run` | `QualityGate[]` | Yes | Gates that were executed |
+| `quality_gates_failed` | `QualityGate[]` | Yes | Gates that failed |
+| `context_compactions` | `number` | Yes | Context compaction count (0–100) |
+| `context_rotations` | `number` | Yes | Context rotation count (0–50) |
+| `context_utilization_final` | `number` | Yes | Final context utilization ratio (0.0–1.0) |
+| `outcome` | `Outcome` | Yes | Task result |
+| `retry_count` | `number` | Yes | Number of retries (0–20) |
+| `language` | `string \| null` | No | Primary programming language (max 30 chars) |
+| `repo_size_category` | `RepoSizeCategory \| null` | No | Repository size bucket |
+
+### TaskCompletionEvent
+
+Full event object submitted to the server. Extends `EventBuilderParams` with auto-generated identity fields.
+
+```typescript
+import type { TaskCompletionEvent } from '@mosaicstack/telemetry-client';
+```
+
+Additional fields (auto-generated by `EventBuilder`):
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `instance_id` | `string` | UUID identifying the submitting instance |
+| `event_id` | `string` | Unique UUID for deduplication |
+| `schema_version` | `string` | Always `"1.0"` |
+| `timestamp` | `string` | ISO 8601 datetime |
+
+---
+
+## Prediction Types
+
+### PredictionQuery
+
+Query parameters for fetching a prediction.
+
+```typescript
+import type { PredictionQuery } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `task_type` | `TaskType` | Task type to predict for |
+| `model` | `string` | Model identifier |
+| `provider` | `Provider` | LLM provider |
+| `complexity` | `Complexity` | Complexity level |
+
+### PredictionResponse
+
+Response from the predictions endpoint.
+
+```typescript
+import type { PredictionResponse } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `prediction` | `PredictionData \| null` | Prediction data, or `null` if no data available |
+| `metadata` | `PredictionMetadata` | Sample size, confidence, fallback info |
+
+### PredictionData
+
+Statistical prediction for a dimension combination.
+
+```typescript
+import type { PredictionData } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `input_tokens` | `TokenDistribution` | Input token distribution (p10/p25/median/p75/p90) |
+| `output_tokens` | `TokenDistribution` | Output token distribution |
+| `cost_usd_micros` | `Record<string, number>` | Cost stats — `{ median: number }` |
+| `duration_ms` | `Record<string, number>` | Duration stats — `{ median: number }` |
+| `correction_factors` | `CorrectionFactors` | Actual-to-estimated token ratios |
+| `quality` | `QualityPrediction` | Quality gate pass rate and success rate |
+
+### TokenDistribution
+
+Percentile distribution of token counts.
+
+```typescript
+import type { TokenDistribution } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `p10` | `number` | 10th percentile |
+| `p25` | `number` | 25th percentile |
+| `median` | `number` | 50th percentile (median) |
+| `p75` | `number` | 75th percentile |
+| `p90` | `number` | 90th percentile |
+
+### CorrectionFactors
+
+Ratio of actual to estimated tokens. Values >1.0 mean estimates tend to be too low.
+
+```typescript
+import type { CorrectionFactors } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `input` | `number` | Actual / estimated input tokens |
+| `output` | `number` | Actual / estimated output tokens |
+
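As a worked example of the ratios above, a caller could scale its own pre-task estimates by the correction factors. This is one possible use, not something the SDK does for you; `applyCorrection` and the camelCase field names are hypothetical.

```typescript
// Stand-in for the CorrectionFactors shape documented above.
interface Factors {
  input: number;
  output: number;
}

interface TokenEstimate {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical helper: scale estimates by the actual/estimated ratios.
function applyCorrection(estimate: TokenEstimate, factors: Factors): TokenEstimate {
  return {
    inputTokens: Math.round(estimate.inputTokens * factors.input),
    outputTokens: Math.round(estimate.outputTokens * factors.output),
  };
}

// Illustrative factors: input estimates run 7% low, output estimates run 14% high.
const corrected = applyCorrection(
  { inputTokens: 105_000, outputTokens: 45_000 },
  { input: 1.07, output: 0.86 },
);
console.log(corrected.inputTokens, corrected.outputTokens);
```

The factor values here are made up for illustration; real values come from `prediction.correction_factors` in a `PredictionResponse`.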
+### QualityPrediction
+
+Predicted quality gate and success rates.
+
+```typescript
+import type { QualityPrediction } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `gate_pass_rate` | `number` | Fraction of events where all quality gates pass (0.0–1.0) |
+| `success_rate` | `number` | Fraction of events with `outcome: "success"` (0.0–1.0) |
+
+### PredictionMetadata
+
+Metadata about a prediction response.
+
+```typescript
+import type { PredictionMetadata } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `sample_size` | `number` | Number of events used to compute this prediction |
+| `fallback_level` | `number` | 0 = exact match, 1+ = dimensions dropped, -1 = no data |
+| `confidence` | `'none' \| 'low' \| 'medium' \| 'high'` | Confidence level |
+| `last_updated` | `string \| null` | ISO 8601 timestamp of last computation |
+| `dimensions_matched` | `Record<string, string \| null> \| null` | Matched dimensions (`null` values indicate fallback) |
+| `fallback_note` | `string \| null` | Human-readable fallback explanation |
+| `cache_hit` | `boolean` | Whether served from server-side cache |
+
+**Confidence level criteria:**
+
+| Level | Criteria |
+|-------|----------|
+| `none` | No data available. `prediction` is `null`. |
+| `low` | Sample size < 30 or fallback was applied |
+| `medium` | Sample size 30–99, exact match |
+| `high` | Sample size >= 100, exact match |
+
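The criteria table above can be restated as a small function. The server computes `confidence`, so clients normally just read it from the metadata; this sketch only makes the thresholds explicit. `confidenceFor` is a hypothetical name, and treating `fallback_level` of -1 as the "no data" signal follows the `PredictionMetadata` table.

```typescript
type Confidence = 'none' | 'low' | 'medium' | 'high';

// Hypothetical restatement of the documented criteria; not an SDK export.
function confidenceFor(sampleSize: number, fallbackLevel: number): Confidence {
  if (fallbackLevel < 0) return 'none'; // -1 = no data
  if (fallbackLevel > 0 || sampleSize < 30) return 'low'; // fallback applied or small sample
  if (sampleSize < 100) return 'medium'; // 30 to 99, exact match
  return 'high'; // 100 or more, exact match
}
```

Note that a large sample with any fallback applied still maps to `low`, because the match is no longer exact.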
+---
+
+## Batch Types
+
+### BatchEventRequest
+
+Request body for `POST /v1/events/batch`.
+
+```typescript
+import type { BatchEventRequest } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `events` | `TaskCompletionEvent[]` | 1–100 events to submit |
+
+### BatchEventResponse
+
+Response from `POST /v1/events/batch`.
+
+```typescript
+import type { BatchEventResponse } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `accepted` | `number` | Count of accepted events |
+| `rejected` | `number` | Count of rejected events |
+| `results` | `BatchEventResult[]` | Per-event result details |
+
+### BatchEventResult
+
+Per-event result within a batch response.
+
+```typescript
+import type { BatchEventResult } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `event_id` | `string` | The event's UUID |
+| `status` | `'accepted' \| 'rejected'` | Whether the event was accepted |
+| `error` | `string \| null` | Error message if rejected |
+
+### SubmitResult
+
+Internal result type from `BatchSubmitter.submit()`.
+
+```typescript
+import type { SubmitResult } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `success` | `boolean` | Whether the submission succeeded |
+| `response` | `BatchEventResponse \| undefined` | Server response (on success) |
+| `retryAfterMs` | `number \| undefined` | Retry delay from 429 response |
+| `error` | `Error \| undefined` | Error details (on failure) |
+
+### BatchPredictionRequest
+
+Request body for `POST /v1/predictions/batch`.
+
+```typescript
+import type { BatchPredictionRequest } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `queries` | `PredictionQuery[]` | 1–50 prediction queries |
+
+### BatchPredictionResponse
+
+Response from `POST /v1/predictions/batch`.
+
+```typescript
+import type { BatchPredictionResponse } from '@mosaicstack/telemetry-client';
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `results` | `PredictionResponse[]` | One response per query, in request order |
+
+---
+
+## Enums
+
+All enums use string values matching the server's API contract.
+
+### TaskType
+
+```typescript
+import { TaskType } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description |
+|--------|-------|-------------|
+| `PLANNING` | `"planning"` | Architecture design, task breakdown |
+| `IMPLEMENTATION` | `"implementation"` | Writing new code |
+| `CODE_REVIEW` | `"code_review"` | Reviewing existing code |
+| `TESTING` | `"testing"` | Writing or running tests |
+| `DEBUGGING` | `"debugging"` | Investigating and fixing bugs |
+| `REFACTORING` | `"refactoring"` | Restructuring existing code |
+| `DOCUMENTATION` | `"documentation"` | Writing docs, comments, READMEs |
+| `CONFIGURATION` | `"configuration"` | Config files, CI/CD, infrastructure |
+| `SECURITY_AUDIT` | `"security_audit"` | Security review, vulnerability analysis |
+| `UNKNOWN` | `"unknown"` | Unclassified task type (fallback) |
+
+### Complexity
+
+```typescript
+import { Complexity } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description | Typical Token Budget |
+|--------|-------|-------------|---------------------|
+| `LOW` | `"low"` | Simple fixes, typos, config changes | 50,000 |
+| `MEDIUM` | `"medium"` | Standard features, moderate logic | 150,000 |
+| `HIGH` | `"high"` | Complex features, multi-file changes | 350,000 |
+| `CRITICAL` | `"critical"` | Major refactoring, architectural changes | 750,000 |
+
+### Harness
+
+```typescript
+import { Harness } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description |
+|--------|-------|-------------|
+| `CLAUDE_CODE` | `"claude_code"` | Anthropic Claude Code CLI |
+| `OPENCODE` | `"opencode"` | OpenCode CLI |
+| `KILO_CODE` | `"kilo_code"` | Kilo Code VS Code extension |
+| `AIDER` | `"aider"` | Aider AI pair programming |
+| `API_DIRECT` | `"api_direct"` | Direct API calls (no harness) |
+| `OLLAMA_LOCAL` | `"ollama_local"` | Ollama local inference |
+| `CUSTOM` | `"custom"` | Custom or unrecognized harness |
+| `UNKNOWN` | `"unknown"` | Harness not reported |
+
+### Provider
+
+```typescript
+import { Provider } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description |
+|--------|-------|-------------|
+| `ANTHROPIC` | `"anthropic"` | Anthropic (Claude models) |
+| `OPENAI` | `"openai"` | OpenAI (GPT models) |
+| `OPENROUTER` | `"openrouter"` | OpenRouter (multi-provider routing) |
+| `OLLAMA` | `"ollama"` | Ollama (local/self-hosted) |
+| `GOOGLE` | `"google"` | Google (Gemini models) |
+| `MISTRAL` | `"mistral"` | Mistral AI |
+| `CUSTOM` | `"custom"` | Custom or unrecognized provider |
+| `UNKNOWN` | `"unknown"` | Provider not reported |
+
+### QualityGate
+
+```typescript
+import { QualityGate } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description |
+|--------|-------|-------------|
+| `BUILD` | `"build"` | Code compiles/builds successfully |
+| `LINT` | `"lint"` | Linter passes with no errors |
+| `TEST` | `"test"` | Unit/integration tests pass |
+| `COVERAGE` | `"coverage"` | Code coverage meets threshold (85%) |
+| `TYPECHECK` | `"typecheck"` | Type checker passes |
+| `SECURITY` | `"security"` | Security scan passes |
+
+### Outcome
+
+```typescript
+import { Outcome } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Description |
+|--------|-------|-------------|
+| `SUCCESS` | `"success"` | Task completed, all quality gates passed |
+| `FAILURE` | `"failure"` | Task failed after all retries |
+| `PARTIAL` | `"partial"` | Task partially completed (some gates passed) |
+| `TIMEOUT` | `"timeout"` | Task exceeded time or token budget |
+
+### RepoSizeCategory
+
+```typescript
+import { RepoSizeCategory } from '@mosaicstack/telemetry-client';
+```
+
+| Member | Value | Approximate LOC | Description |
+|--------|-------|-----------------|-------------|
+| `TINY` | `"tiny"` | < 1,000 | Scripts, single-file projects |
+| `SMALL` | `"small"` | 1,000–10,000 | Small libraries, tools |
+| `MEDIUM` | `"medium"` | 10,000–100,000 | Standard applications |
+| `LARGE` | `"large"` | 100,000–1,000,000 | Large applications, monorepos |
+| `HUGE` | `"huge"` | > 1,000,000 | Enterprise codebases |
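+
+If you derive the category from a line count, a small classifier keeps the thresholds in one place. The sketch below is an assumption-laden illustration: `categorizeRepoSize` is a hypothetical helper, not an SDK export, and because the ranges in the table share their endpoints, it assigns each shared boundary (1,000, 10,000, 100,000) to the larger category.
+
+```typescript
+// Local stand-in for the string values of the RepoSizeCategory enum above.
+type RepoSizeValue = 'tiny' | 'small' | 'medium' | 'large' | 'huge';
+
+// Hypothetical helper: map an approximate line count to a category.
+function categorizeRepoSize(linesOfCode: number): RepoSizeValue {
+  if (!Number.isFinite(linesOfCode) || linesOfCode < 0) {
+    throw new RangeError('linesOfCode must be a non-negative number');
+  }
+  if (linesOfCode < 1_000) return 'tiny';
+  if (linesOfCode < 10_000) return 'small';
+  if (linesOfCode < 100_000) return 'medium';
+  // The table lists "huge" as strictly greater than 1,000,000.
+  if (linesOfCode <= 1_000_000) return 'large';
+  return 'huge';
+}
+```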
+
+---
+
+## Server API Endpoints Used
+
+The SDK communicates with these Mosaic Telemetry API v1 endpoints:
+
+| SDK Method | HTTP Endpoint | Auth Required |
+|------------|---------------|---------------|
+| `flush()` (internal) | `POST /v1/events/batch` | Yes (Bearer token) |
+| `refreshPredictions()` | `POST /v1/predictions/batch` | No (public) |
+
+For the full server API specification, see the [Mosaic Telemetry API Reference](https://tel-api.mosaicstack.dev/v1/docs).
--- a/docs/integration-guide.md
+++ b/docs/integration-guide.md
@@ -0,0 +1,405 @@
+# Integration Guide
+
+This guide covers how to integrate `@mosaicstack/telemetry-client` into your applications. The SDK targets **Mosaic Telemetry API v1** (event schema version `1.0`).
+
+## Prerequisites
+
+- Node.js >= 18 (for native `fetch` and `crypto.randomUUID()`)
+- A Mosaic Telemetry API key and instance ID (issued by an administrator via the admin API)
+
+## Installation
+
+```bash
+npm install @mosaicstack/telemetry-client
+```
+
+The package ships as ESM only, with TypeScript declarations and zero runtime dependencies.
+
+## Environment Setup
+
+Store your credentials in environment variables — never hardcode them.
+
+```bash
+# .env (not committed — add to .gitignore)
+TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
+TELEMETRY_API_KEY=msk_your_api_key_here
+TELEMETRY_INSTANCE_ID=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d
+```
+
+```bash
+# .env.example (committed — documents required variables)
+TELEMETRY_API_URL=https://tel-api.mosaicstack.dev
+TELEMETRY_API_KEY=your-api-key
+TELEMETRY_INSTANCE_ID=your-instance-uuid
+```
+
+---
+
+## Instrumenting a Next.js App
+
+Next.js server actions and API routes run on Node.js, so the SDK works directly. Create a shared singleton and track events from your server-side code.
+
+### 1. Create a telemetry singleton
+
+```typescript
+// lib/telemetry.ts
+import {
+  TelemetryClient,
+  TaskType,
+  Complexity,
+  Harness,
+  Provider,
+  Outcome,
+  QualityGate,
+} from '@mosaicstack/telemetry-client';
+
+let client: TelemetryClient | null = null;
+
+export function getTelemetryClient(): TelemetryClient {
+  if (!client) {
+    client = new TelemetryClient({
+      serverUrl: process.env.TELEMETRY_API_URL!,
+      apiKey: process.env.TELEMETRY_API_KEY!,
+      instanceId: process.env.TELEMETRY_INSTANCE_ID!,
+      enabled: process.env.NODE_ENV === 'production',
+      onError: (err) => console.error('[telemetry]', err.message),
+    });
+    client.start();
+  }
+  return client;
+}
+
+// Re-export enums for convenience
+export { TaskType, Complexity, Harness, Provider, Outcome, QualityGate };
+```
+
+### 2. Track events from an API route
+
+```typescript
+// app/api/task-complete/route.ts
+import { NextResponse } from 'next/server';
+import { getTelemetryClient, TaskType, Complexity, Harness, Provider, Outcome } from '@/lib/telemetry';
+
+export async function POST(request: Request) {
+  const body = await request.json();
+
+  const client = getTelemetryClient();
+  const event = client.eventBuilder.build({
+    task_duration_ms: body.durationMs,
+    task_type: TaskType.IMPLEMENTATION,
+    complexity: Complexity.MEDIUM,
+    harness: Harness.CLAUDE_CODE,
+    model: body.model,
+    provider: Provider.ANTHROPIC,
+    estimated_input_tokens: body.estimatedInputTokens,
+    estimated_output_tokens: body.estimatedOutputTokens,
+    actual_input_tokens: body.actualInputTokens,
+    actual_output_tokens: body.actualOutputTokens,
+    estimated_cost_usd_micros: body.estimatedCostMicros,
+    actual_cost_usd_micros: body.actualCostMicros,
+    quality_gate_passed: body.qualityGatePassed,
+    quality_gates_run: body.qualityGatesRun,
+    quality_gates_failed: body.qualityGatesFailed,
+    context_compactions: body.contextCompactions,
+    context_rotations: body.contextRotations,
+    context_utilization_final: body.contextUtilization,
+    outcome: Outcome.SUCCESS,
+    retry_count: 0,
+    language: 'typescript',
+  });
+
+  client.track(event);
+
+  return NextResponse.json({ status: 'queued' });
+}
+```
+
+### 3. Graceful shutdown
+
+Next.js doesn't provide a built-in shutdown hook, but you can handle `SIGTERM` and `SIGINT` yourself in the instrumentation file:
+
+```typescript
+// instrumentation.ts (Next.js instrumentation file)
+export async function register() {
+  if (process.env.NEXT_RUNTIME === 'nodejs') {
+    const { getTelemetryClient } = await import('./lib/telemetry');
+
+    // Create and start the client on server boot
+    const client = getTelemetryClient();
+
+    // Flush remaining events on shutdown
+    const shutdown = async () => {
+      await client.stop();
+      process.exit(0);
+    };
+
+    process.on('SIGTERM', shutdown);
+    process.on('SIGINT', shutdown);
+  }
+}
+```
+
+---
+
+## Instrumenting a Node.js Service
+
+Use this setup for a standalone Node.js service (Express, Fastify, a plain script, and so on).
+
+### 1. Initialize and start
+
+```typescript
+// src/telemetry.ts
+import { TelemetryClient } from '@mosaicstack/telemetry-client';
+
+export const telemetry = new TelemetryClient({
+  serverUrl: process.env.TELEMETRY_API_URL ?? 'https://tel-api.mosaicstack.dev',
+  apiKey: process.env.TELEMETRY_API_KEY!,
+  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
+  onError: (err) => console.error('[telemetry]', err.message),
+});
+
+telemetry.start();
+```
+
+### 2. Track events after task completion
+
+```typescript
+// src/task-runner.ts
+import {
+  TaskType,
+  Complexity,
+  Harness,
+  Provider,
+  Outcome,
+  QualityGate,
+  RepoSizeCategory,
+} from '@mosaicstack/telemetry-client';
+import { telemetry } from './telemetry.js';
+
+async function runTask() {
+  const startTime = Date.now();
+
+  // ... run your AI coding task ...
+
+  const durationMs = Date.now() - startTime;
+
+  const event = telemetry.eventBuilder.build({
+    task_duration_ms: durationMs,
+    task_type: TaskType.IMPLEMENTATION,
+    complexity: Complexity.HIGH,
+    harness: Harness.CLAUDE_CODE,
+    model: 'claude-sonnet-4-5-20250929',
+    provider: Provider.ANTHROPIC,
+    estimated_input_tokens: 200000,
+    estimated_output_tokens: 80000,
+    actual_input_tokens: 215000,
+    actual_output_tokens: 72000,
+    estimated_cost_usd_micros: 1200000,
+    actual_cost_usd_micros: 1150000,
+    quality_gate_passed: true,
+    quality_gates_run: [
+      QualityGate.BUILD,
+      QualityGate.LINT,
+      QualityGate.TEST,
+      QualityGate.TYPECHECK,
+    ],
+    quality_gates_failed: [],
+    context_compactions: 3,
+    context_rotations: 1,
+    context_utilization_final: 0.85,
+    outcome: Outcome.SUCCESS,
+    retry_count: 0,
+    language: 'typescript',
+    // Use the enum member rather than a raw string literal
+    repo_size_category: RepoSizeCategory.MEDIUM,
+  });
+
+  telemetry.track(event);
+}
+```
+
+### 3. Graceful shutdown
+
+```typescript
+// src/main.ts
+import { telemetry } from './telemetry.js';
+
+async function main() {
+  // ... your application logic ...
+
+  // On shutdown, flush remaining events
+  process.on('SIGTERM', async () => {
+    await telemetry.stop();
+    process.exit(0);
+  });
+}
+
+main();
+```
+
+---
+
+## Using Predictions
+
+The telemetry API provides crowd-sourced predictions for token usage, cost, and duration based on historical data. The SDK caches these predictions locally.
+
+### Pre-populate the cache
+
+Call `refreshPredictions()` at startup with the dimension combinations your application uses:
+
+```typescript
+import { TaskType, Provider, Complexity } from '@mosaicstack/telemetry-client';
+import { telemetry } from './telemetry.js';
+
+// Fetch predictions for all combinations you'll need
+await telemetry.refreshPredictions([
+  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
+  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.MEDIUM },
+  { task_type: TaskType.IMPLEMENTATION, model: 'claude-sonnet-4-5-20250929', provider: Provider.ANTHROPIC, complexity: Complexity.HIGH },
+  { task_type: TaskType.TESTING, model: 'claude-haiku-4-5-20251001', provider: Provider.ANTHROPIC, complexity: Complexity.LOW },
+]);
+```
+
+### Read cached predictions
+
+```typescript
+const prediction = telemetry.getPrediction({
+  task_type: TaskType.IMPLEMENTATION,
+  model: 'claude-sonnet-4-5-20250929',
+  provider: Provider.ANTHROPIC,
+  complexity: Complexity.MEDIUM,
+});
+
+if (prediction?.prediction) {
+  const p = prediction.prediction;
+  console.log('Token predictions (median):', {
+    inputTokens: p.input_tokens.median,
+    outputTokens: p.output_tokens.median,
+  });
+  console.log('Cost prediction:', `$${(p.cost_usd_micros.median / 1_000_000).toFixed(2)}`);
+  console.log('Duration prediction:', `${(p.duration_ms.median / 1000).toFixed(0)}s`);
+  console.log('Correction factors:', {
+    input: p.correction_factors.input,   // >1.0 means estimates tend to be too low
+    output: p.correction_factors.output,
+  });
+  console.log('Quality:', {
+    gatePassRate: `${(p.quality.gate_pass_rate * 100).toFixed(0)}%`,
+    successRate: `${(p.quality.success_rate * 100).toFixed(0)}%`,
+  });
+
+  // Check confidence level
+  if (prediction.metadata.confidence === 'low') {
+    console.warn('Low confidence — small sample size or fallback was applied');
+  }
+}
+```
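+
+One possible use of the correction factors is to adjust your own estimates before starting a task. The sketch below assumes a factor is roughly the ratio of historical actual usage to historical estimates, which is consistent with the note that values above 1.0 mean estimates tend to be too low. `applyCorrection` is a hypothetical helper, not an SDK method.
+
+```typescript
+// Hypothetical helper: scale a raw token estimate by a correction factor.
+// Assumes factor ~ (historical actual / historical estimate).
+function applyCorrection(estimatedTokens: number, correctionFactor: number): number {
+  // Ignore unusable factors rather than corrupting the estimate.
+  if (!Number.isFinite(correctionFactor) || correctionFactor <= 0) {
+    return estimatedTokens;
+  }
+  return Math.round(estimatedTokens * correctionFactor);
+}
+```
+
+For example, `applyCorrection(200_000, p.correction_factors.input)` would raise a 200,000-token input estimate when the factor is above 1.0.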
+
+### Understand fallback behavior
+
+When the server doesn't have enough data for an exact match, it broadens the query by dropping dimensions (e.g., ignoring complexity). The `fallback_level` field in `metadata` tells you what happened:
+
+| `fallback_level` | Meaning |
+|-------------------|---------|
+| `0` | Exact match on all dimensions |
+| `1+` | Some dimensions were dropped to find data |
+| `-1` | No prediction data available at any level |
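+
+The table above can be turned into a small branching helper. This is a sketch under assumptions: `FallbackMetadata` is a minimal local shape written for illustration (the SDK's real metadata type may carry more fields), and `describeFallback` is a hypothetical function.
+
+```typescript
+// Minimal local shape for illustration only.
+interface FallbackMetadata {
+  fallback_level: number;
+}
+
+type MatchQuality = 'exact' | 'broadened' | 'none';
+
+// Hypothetical helper: translate fallback_level into a readable label.
+function describeFallback(metadata: FallbackMetadata): MatchQuality {
+  if (metadata.fallback_level === 0) return 'exact'; // all dimensions matched
+  if (metadata.fallback_level > 0) return 'broadened'; // some dimensions dropped
+  return 'none'; // -1: no data at any level
+}
+```
+
+A caller might trust `'exact'` results as-is, treat `'broadened'` results as rough guidance, and fall back to its own estimates on `'none'`.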
+
+---
+
+## Environment-Specific Configuration
+
+### Development
+
+```typescript
+const client = new TelemetryClient({
+  serverUrl: 'http://localhost:8000',         // Local dev server
+  apiKey: process.env.TELEMETRY_API_KEY!,
+  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
+  dryRun: true,                               // Don't send real data
+  submitIntervalMs: 10_000,                    // Flush more frequently for debugging
+  onError: (err) => console.error('[telemetry]', err),
+});
+```
+
+### Production
+
+```typescript
+const client = new TelemetryClient({
+  serverUrl: 'https://tel-api.mosaicstack.dev',
+  apiKey: process.env.TELEMETRY_API_KEY!,
+  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
+  submitIntervalMs: 300_000,                   // 5 min (default)
+  maxRetries: 3,                               // Retry on transient failures
+  onError: (err) => {
+    // Route to your observability stack (`logger` is your application's logger, not an SDK export)
+    logger.error('Telemetry submission failed', { error: err.message });
+  },
+});
+```
+
+### Conditional enable/disable
+
+```typescript
+const client = new TelemetryClient({
+  serverUrl: process.env.TELEMETRY_API_URL!,
+  apiKey: process.env.TELEMETRY_API_KEY!,
+  instanceId: process.env.TELEMETRY_INSTANCE_ID!,
+  enabled: process.env.TELEMETRY_ENABLED !== 'false',  // Opt-out via env var
+});
+```
+
+When `enabled` is `false`, `track()` returns immediately without queuing.
+
+---
+
+## Error Handling
+
+The SDK is designed to never disrupt your application:
+
+- **`track()` never throws.** All errors are caught and routed to the `onError` callback.
+- **Failed batches are re-queued.** If a submission fails, events are prepended back to the queue for the next flush cycle.
+- **Exponential backoff with jitter.** Retries start at a 1s base delay and double up to a 60s cap, with random jitter to avoid a thundering herd.
+- **`Retry-After` header support.** On HTTP 429 (rate limited), the SDK respects the server's `Retry-After` header.
+- **HTTP 403 is not retried.** An API key / instance ID mismatch is a permanent error.
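+
+To make the backoff schedule concrete, the sketch below computes a delay from the 1s base and 60s cap listed above. It is not the SDK's internal code: `backoffDelayMs` is a hypothetical helper, and the "equal jitter" strategy (half fixed, half random) is an assumption, since the exact jitter formula is not documented here.
+
+```typescript
+// Illustrative constants taken from the bullet list above.
+const BASE_DELAY_MS = 1_000;
+const MAX_DELAY_MS = 60_000;
+
+// Hypothetical helper. `attempt` is zero-based (0 = first retry).
+// `random` is injectable so the result can be made deterministic in tests.
+function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
+  // Double the base delay on each attempt, capped at the maximum.
+  const capped = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
+  // Equal jitter (an assumption): keep half the delay fixed and randomize
+  // the other half so that clients do not retry in lockstep.
+  const half = capped / 2;
+  return Math.round(half + random() * half);
+}
+```
+
+With this scheme, the first retry waits between 0.5s and 1s, and retries from the seventh onward wait between 30s and 60s.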
+
+### Custom error handling
+
+```typescript
+const client = new TelemetryClient({
+  // ...
+  onError: (error) => {
+    if (error.message.includes('HTTP 403')) {
+      console.error('Telemetry auth failed — check API key and instance ID');
+    } else if (error.message.includes('HTTP 429')) {
+      console.warn('Telemetry rate limited — events will be retried');
+    } else {
+      console.error('Telemetry error:', error.message);
+    }
+  },
+});
+```
+
+---
+
+## Batch Submission Behavior
+
+The SDK batches events for efficiency:
+
+1. `track(event)` adds the event to an in-memory queue (bounded, FIFO eviction at capacity).
+2. Every `submitIntervalMs` (default: 5 minutes), the background timer drains the queue in batches of up to `batchSize` (default and maximum: 100).
+3. Each batch is sent to `POST /v1/events/batch`, with exponential backoff on failure.
+4. Calling `stop()` flushes all remaining events before resolving.
+
+The server accepts up to **100 events per batch** and supports **partial success** — some events may be accepted while others (e.g., duplicates) are rejected.
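+
+The queueing steps above can be pictured with a small data structure. The sketch below is a simplified model, not the SDK's implementation: `BoundedQueue` is a hypothetical class, and it assumes that eviction at capacity drops the oldest event and that re-queued batches are trimmed the same way.
+
+```typescript
+// Hypothetical model of a bounded in-memory event queue.
+class BoundedQueue<T> {
+  private items: T[] = [];
+  private readonly capacity: number;
+
+  constructor(capacity: number) {
+    if (!Number.isInteger(capacity) || capacity < 1) {
+      throw new RangeError('capacity must be a positive integer');
+    }
+    this.capacity = capacity;
+  }
+
+  // Add an event; when full, evict the oldest one first (assumed policy).
+  enqueue(item: T): void {
+    if (this.items.length >= this.capacity) {
+      this.items.shift();
+    }
+    this.items.push(item);
+  }
+
+  // Remove and return up to `batchSize` of the oldest events.
+  drain(batchSize: number): T[] {
+    return this.items.splice(0, batchSize);
+  }
+
+  // Put a failed batch back at the front, keeping at most `capacity` items.
+  prepend(batch: readonly T[]): void {
+    this.items = [...batch, ...this.items].slice(-this.capacity);
+  }
+
+  get size(): number {
+    return this.items.length;
+  }
+}
+```
+
+A flush cycle in this model calls `drain(100)` repeatedly until the queue is empty and calls `prepend(batch)` whenever a submission fails.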
+
+---
+
+## API Version Compatibility
+
+| SDK Version | API Version | Schema Version |
+|-------------|-------------|----------------|
+| 0.1.x | v1 (`/v1/` endpoints) | `1.0` |
+
+The `EventBuilder` automatically sets `schema_version: "1.0"` on every event. The SDK submits to `/v1/events/batch` and queries `/v1/predictions/batch`.
+
+When the telemetry API introduces a v2, this SDK will add support in a new major release. The server supports two API versions simultaneously during a 6-month deprecation window.