- Create LlmTelemetryTrackerService for non-blocking event emission
- Normalize token usage across Anthropic, OpenAI, Ollama providers
- Add cost table with per-token pricing in microdollars
- Instrument chat, chatStream, and embed methods
- Infer task type from calling context
- Aggregate streaming tokens after stream ends with fallback estimation
- Add 69 unit tests for tracker service, cost table, and LLM service
Refs #371
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactor LlmService to delegate to LlmManagerService instead of using
Ollama directly. This enables multiple provider support and user-specific
provider configuration.
Changes:
- Remove direct Ollama client from LlmService
- Delegate all LLM operations to provider via LlmManagerService
- Update health status to use provider-agnostic interface
- Add PrismaModule to LlmModule for manager service
- Maintain backward compatibility with existing API
- Achieve 89.74% test coverage
Fixes#127
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Ollama client library (ollama npm package)
- Create LlmService for chat completion and embeddings
- Support streaming responses via Server-Sent Events
- Add configuration via env vars (OLLAMA_HOST, OLLAMA_TIMEOUT)
- Create endpoints: GET /llm/health, GET /llm/models, POST /llm/chat, POST /llm/embed
- Replace old OllamaModule with new LlmModule
- Add comprehensive tests with >85% coverage
Closes#21