Commit Graph

253 Commits

Author SHA1 Message Date
eca2c46e9d merge: resolve conflicts with develop (telemetry + lockfile)
Some checks failed
ci/woodpecker/push/infra Pipeline was successful
ci/woodpecker/push/api Pipeline failed
ci/woodpecker/push/web Pipeline failed
ci/woodpecker/push/orchestrator Pipeline failed
ci/woodpecker/push/coordinator Pipeline was successful
Keep both Mosaic Telemetry section (from develop) and Matrix Dev
Environment section (from feature branch) in .env.example.
Regenerate pnpm-lock.yaml with both dependency trees merged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 12:12:43 -06:00
8d19ac1f4b fix(#377): remediate code review and security findings
Some checks failed
ci/woodpecker/push/infra Pipeline was successful
ci/woodpecker/push/api Pipeline failed
- Fix sendThreadMessage room mismatch: use channelId from options instead of hardcoded controlRoomId
- Add .catch() to fire-and-forget handleRoomMessage to prevent silent error swallowing
- Wrap dispatchJob in try-catch for user-visible error reporting in handleFixCommand
- Add MATRIX_BOT_USER_ID validation in connect() to prevent infinite message loops
- Fix streamResponse error masking: wrap finally/catch side-effects in try-catch
- Replace unsafe type assertion with public getClient() in MatrixRoomService
- Add orphaned room warning in provisionRoom on DB failure
- Add provider identity to Herald error logs
- Add channelId to ThreadMessageOptions interface and all callers
- Add missing env var warnings in BridgeModule factory
- Fix JSON injection in setup-bot.sh: use jq for safe JSON construction

Fixes #377

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:00:53 -06:00
9cc70dbe31 test(#385): Matrix bridge integration tests
- BridgeModule DI verification (conditional loading)
- Command flow: message -> parser -> dispatch
- Herald multi-provider broadcast
- Room-workspace mapping integration
- Streaming flow verification
- Multi-provider coexistence

Refs #385

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:39:59 -06:00
93cd31435b feat(#383): Streaming AI responses via Matrix message edits
Some checks failed
ci/woodpecker/push/api Pipeline failed
- Add MatrixStreamingService with editMessage, setTypingIndicator, streamResponse
- Rate-limited edits (500ms) for incremental streaming output
- Typing indicator management during generation
- Graceful error handling and fallback for non-streaming scenarios
- Add optional editMessage to IChatProvider interface
- Add getClient() accessor to MatrixService for streaming service
- Register MatrixStreamingService in BridgeModule
- Tests: 20 tests pass

Refs #383

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:34:36 -06:00
ad24720616 feat(#382): Herald Service: broadcast to all active chat providers
Some checks failed
ci/woodpecker/push/api Pipeline failed
- Replace direct DiscordService injection with CHAT_PROVIDERS array
- Herald broadcasts to ALL active chat providers (Discord, Matrix, future)
- Graceful error handling — one provider failure doesn't block others
- Skips disconnected providers automatically
- Tests verify multi-provider broadcasting behavior
- Fix lint: remove unnecessary conditional in matrix.service.ts

Refs #382
2026-02-15 02:25:55 -06:00
771ed484e4 feat(#379): Register MatrixService in BridgeModule with conditional loading
Some checks failed
ci/woodpecker/push/api Pipeline failed
- Add CHAT_PROVIDERS injection token for bridge-agnostic access
- Conditional loading based on env vars (DISCORD_BOT_TOKEN, MATRIX_ACCESS_TOKEN)
- Both bridges can run simultaneously
- No crash if neither bridge is configured
- Tests verify all configuration combinations

Refs #379
2026-02-15 02:18:55 -06:00
7d22c2490a feat(#380): Workspace-to-Matrix-Room mapping and provisioning
Some checks failed
ci/woodpecker/push/api Pipeline failed
- Add matrix_room_id column to workspace table (migration)
- Create MatrixRoomService for room provisioning and mapping
- Auto-create Matrix room on workspace provisioning (when configured)
- Support manual room linking for existing workspaces
- Unit tests for all mapping operations

Refs #380
2026-02-15 02:16:29 -06:00
306c2e5bd8 fix(#371): resolve TypeScript strictness errors in telemetry tracking
Some checks failed
ci/woodpecker/push/coordinator Pipeline failed
ci/woodpecker/push/orchestrator Pipeline was successful
ci/woodpecker/push/api Pipeline was successful
ci/woodpecker/push/web Pipeline failed
- llm-cost-table.ts: Add undefined guard for MODEL_COSTS lookup
- llm-telemetry-tracker.service.ts: Allow undefined in callingContext
  for exactOptionalPropertyTypes compatibility

Refs #371

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:10:22 -06:00
ed23293e1a feat(#373): prediction integration for cost estimation
- Create PredictionService for pre-task cost/token estimates
- Refresh common predictions on startup
- Integrate predictions into LLM telemetry tracker
- Add GET /api/telemetry/estimate endpoint
- Graceful degradation when no prediction data available
- Add unit tests for prediction service

Refs #373

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:10:22 -06:00
fcecf3654b feat(#371): track LLM task completions via Mosaic Telemetry
- Create LlmTelemetryTrackerService for non-blocking event emission
- Normalize token usage across Anthropic, OpenAI, Ollama providers
- Add cost table with per-token pricing in microdollars
- Instrument chat, chatStream, and embed methods
- Infer task type from calling context
- Aggregate streaming tokens after stream ends with fallback estimation
- Add 69 unit tests for tracker service, cost table, and LLM service

Refs #371

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:10:22 -06:00
314dd24dce feat(#369): install @mosaicstack/telemetry-client in API
- Add .npmrc with scoped Gitea npm registry for @mosaicstack packages
- Create MosaicTelemetryModule (global, lifecycle-aware) at
  apps/api/src/mosaic-telemetry/
- Create MosaicTelemetryService wrapping TelemetryClient with
  convenience methods: trackTaskCompletion, getPrediction,
  refreshPredictions, eventBuilder
- Create mosaic-telemetry.config.ts for env var integration via
  NestJS ConfigService
- Register MosaicTelemetryModule in AppModule
- Add 32 unit tests covering module init, service methods, disabled
  mode, dry-run mode, and lifecycle management

Refs #369

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:10:22 -06:00
5b5d3811d6 feat(#378): Install matrix-bot-sdk and create MatrixService skeleton
Some checks failed
ci/woodpecker/push/api Pipeline failed
ci/woodpecker/push/orchestrator Pipeline failed
ci/woodpecker/push/web Pipeline failed
- Add matrix-bot-sdk dependency to @mosaic/api
- Create MatrixService implementing IChatProvider interface
- Support connect/disconnect, message sending, thread management
- Parse @mosaic and !mosaic command prefixes
- Delegate commands to StitcherService (same flow as Discord)
- Add comprehensive unit tests with mocked MatrixClient (31 tests)
- Add Matrix env vars to .env.example

Refs #378

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:04:39 -06:00
8ce6843af2 fix(database,api): add 6 missing table migrations and fix CORS health checks
Some checks failed
ci/woodpecker/push/api Pipeline failed
ci/woodpecker/manual/infra Pipeline was successful
ci/woodpecker/manual/orchestrator Pipeline was successful
ci/woodpecker/manual/coordinator Pipeline was successful
ci/woodpecker/manual/api Pipeline was successful
ci/woodpecker/manual/web Pipeline was successful
Database: 6 models in the Prisma schema had no CREATE TABLE migration:
cron_schedules, workspace_llm_settings, quality_gates, task_rejections,
token_budgets, llm_usage_logs. Same root cause as the federation tables.

CORS: Health check requests (Docker, load balancers) don't send Origin
headers. The CORS config was rejecting these in production, causing
/health to return 500 and Docker to mark the container as unhealthy.
Requests without Origin headers are not cross-origin per the CORS spec
and should be allowed through.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:49:13 -06:00
d2003a7b03 fix(api): make federation config validation non-fatal at startup
All checks were successful
ci/woodpecker/push/api Pipeline was successful
Federation is optional and should not prevent the app from starting
when DEFAULT_WORKSPACE_ID is not set. Changed from throwing (crash)
to logging a warning. The endpoint-level validation in the controller
still rejects requests when federation is unconfigured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:08:09 -06:00
8733a643bf fix(api): remove "type": "module" conflicting with CommonJS build output
All checks were successful
ci/woodpecker/push/api Pipeline was successful
The NestJS tsconfig compiles to CommonJS (module: "CommonJS") but
package.json had "type": "module", causing Node.js v24 to treat the
CJS output as ESM and fail with "exports is not defined in ES module
scope" at startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:53:43 -06:00
91307c87cc fix(database): add missing federation table migrations
All checks were successful
ci/woodpecker/push/api Pipeline was successful
Federation models (FederationConnection, FederatedIdentity,
FederationMessage) and their enums were defined in the Prisma schema
but never had CREATE TABLE migrations. This caused the
20260203_add_federation_event_subscriptions migration to fail with
"relation federation_messages does not exist".

Adds new migration 20260202200000 to create the 3 missing enums,
3 missing tables, all indexes, and foreign keys. Removes the
now-redundant ALTER TABLE from the 20260203 migration since
event_type is created with the table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:29:37 -06:00
14162b9213 fix(api): use node_modules prisma binary in entrypoint
All checks were successful
ci/woodpecker/push/api Pipeline was successful
npx is unavailable in production image since npm is removed.
Use ./node_modules/.bin/prisma directly instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:05:46 -06:00
bcee4fa601 fix(api): auto-run migrations on container start and fix ESM warning
All checks were successful
ci/woodpecker/push/api Pipeline was successful
- Add docker-entrypoint.sh that runs prisma migrate deploy before
  starting the app, ensuring all tables exist on deployment
- Add "type": "module" to package.json to eliminate Node.js ESM
  reparsing warning for eslint.config.js

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 16:47:57 -06:00
0ca3945061 fix(api): resolve Docker startup failures (secrets, Redis, Prisma)
- Pass BETTER_AUTH_SECRET through all 6 docker-compose files to API container
- Fix BullModule to parse VALKEY_URL instead of VALKEY_HOST/VALKEY_PORT,
  matching all other Redis consumers in the codebase
- Migrate Prisma encryption from removed $use() middleware to $extends()
  client extensions (Prisma 6.x compatibility), keeping extends PrismaClient
  pattern with only account and llmProviderInstance getters overridden

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 11:04:04 -06:00
7b892d5197 fix(api): import AuthModule in FederationModule for DI resolution
All checks were successful
ci/woodpecker/push/api Pipeline was successful
AuthGuard used across federation controllers depends on AuthService,
which requires AuthModule to be imported. Matches pattern used by
TasksModule, ProjectsModule, and CredentialsModule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:36:11 -06:00
e23490a5f7 fix(api): remove redundant CsrfGuard from FederationController
All checks were successful
ci/woodpecker/push/api Pipeline was successful
CsrfGuard is already applied globally via APP_GUARD in AppModule.
The explicit @UseGuards(CsrfGuard) on FederationController caused a
DI error because CsrfService is not provided in FederationModule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:14:03 -06:00
Jason Woltje
0363a14098 fix(#367): migrate Node.js 20 → 24 LTS
All checks were successful
ci/woodpecker/push/orchestrator Pipeline was successful
ci/woodpecker/push/web Pipeline was successful
ci/woodpecker/push/api Pipeline was successful
Node.js 24 (Krypton) entered Active LTS on 2026-02-09. Update all
Dockerfiles, CI pipelines, and engine constraint from node:20-alpine
to node:24-alpine. Corrected .trivyignore: tar CVEs come from Next.js
16.1.6 bundled tar@7.5.2 (not npm). Orchestrator and API images are
clean; web image needs Next.js upstream fix.

Fixes #367

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 15:20:01 -06:00
Jason Woltje
3833805a93 fix(ci): mitigate 11 upstream CVEs at source instead of suppressing
Some checks failed
ci/woodpecker/push/web Pipeline failed
ci/woodpecker/push/infra Pipeline was successful
ci/woodpecker/push/orchestrator Pipeline failed
ci/woodpecker/push/api Pipeline was successful
- docker/postgres/Dockerfile: build gosu from source with Go 1.26 via
  multi-stage build (eliminates 1 CRITICAL + 5 HIGH Go stdlib CVEs)
- apps/{api,web,orchestrator}/Dockerfile: remove npm from production
  images (eliminates 5 HIGH CVEs in npm's bundled cross-spawn/glob/tar)
- .trivyignore: trimmed from 16 to 5 CVEs (OpenBao only — 4 false
  positives from Go pseudo-version + 1 real Go stdlib waiting on upstream)

Fixes #363

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 17:10:44 -06:00
e368083e84 fix(api): import AuthModule in CredentialsModule for DI resolution
All checks were successful
ci/woodpecker/push/build Pipeline was successful
CredentialsController uses AuthGuard which depends on AuthService.
NestJS resolves guard dependencies in the module context, so
CredentialsModule needs to import AuthModule directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 21:14:20 -06:00
9ff1e69860 chore(api): remove debug statements from Dockerfile
Remove temporary debug RUN layers that were added during initial
build troubleshooting. These add build time and leak directory
structure into build logs unnecessarily.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 20:54:37 -06:00
ab64583951 fix: resolve deployment crashes in coordinator and API services
Coordinator: install all dependencies from pyproject.toml instead of
hardcoded subset (missing slowapi, anthropic, opentelemetry-*).

API: FederationAgentService now gracefully disables when orchestrator
URL is not configured instead of throwing and crashing the app.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:41:54 -06:00
c4f6552e12 docs(agents): add AGENTS.md context files for all modules
Adds directory-specific agent context templates for AI-assisted
development across all apps and packages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 22:04:43 -06:00
Jason Woltje
946d84442a fix(deps): patch axios DoS and transitive prototype pollution/decompression vulns
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
Bump axios ^1.13.4→^1.13.5 (GHSA-43fc-jf86-j433). Add pnpm overrides for
lodash/lodash-es >=4.17.23 and undici >=6.23.0 to resolve transitive
vulnerabilities via chevrotain and discord.js.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:07:10 -06:00
709499c167 fix(api,orchestrator): fix remaining dependency injection issues
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
API:
- Add AuthModule import to JobEventsModule
- Add AuthModule import to JobStepsModule
- Fixes: AuthGuard dependency resolution in job modules

Orchestrator:
- Add @Optional() decorator to docker parameter in DockerSandboxService
- Fixes: NestJS trying to inject Docker class as dependency

All modules using AuthGuard must import AuthModule.
Docker parameter is optional for testing, needs @Optional() decorator.
2026-02-08 22:24:37 -06:00
ecfd02541f fix(test): add VaultService dependencies to job-events performance test
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add ConfigService mock for encryption configuration
- Add VaultService and CryptoService to test module
- Fixes: PrismaService dependency injection error in test

PrismaService requires VaultService for credential encryption.
Performance tests now properly provide all required dependencies.

Refs #341 (pipeline test failure)
2026-02-08 22:04:24 -06:00
4545c6dc7a fix(api,orchestrator): fix dependency injection and Docker build issues
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
API:
- Add AuthModule import to RunnerJobsModule
- Fixes: Nest can't resolve dependencies of AuthGuard

Orchestrator:
- Remove --prod flag from dependency installation
- Copy full node_modules tree to production stage
- Align Dockerfile with API pattern for monorepo builds
- Fixes: Cannot find module '@nestjs/core'

Both services now match the working API Dockerfile pattern.
2026-02-08 21:59:19 -06:00
dc551f138a fix(test): Use correct CI detection for Woodpecker
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Woodpecker sets CI=woodpecker and CI_PIPELINE_EVENT, not CI=true.
Updated the CI detection to check for both.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 21:47:53 -06:00
75766a37b4 fix(test): Skip loading .env.test in CI environments
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
The .env.test file was being loaded in CI and overriding the CI-provided
DATABASE_URL, causing tests to try connecting to localhost:5432 instead of
the postgres:5432 service.

Fix: Only load .env.test when NOT in CI (check for CI or WOODPECKER env vars).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 21:44:02 -06:00
0b0666558e fix(test): Fix DATABASE_URL environment setup for integration tests
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Fixes integration test failures caused by missing DATABASE_URL environment variable.

Changes:
- Add dotenv as dev dependency to load .env.test in vitest setup
- Add .env.test to .gitignore to prevent committing test credentials
- Create .env.test.example with warning comments for documentation
- Add conditional test skipping when DATABASE_URL is not available
- Add DATABASE_URL format validation in vitest setup
- Add error handling to test cleanup to prevent silent failures
- Remove filesystem path disclosure from error messages

The fix allows integration tests to:
- Load DATABASE_URL from .env.test locally for developers with database setup
- Skip gracefully if DATABASE_URL is not available (no database running)
- Connect to postgres service in CI where DATABASE_URL is explicitly provided

Tests affected: auth-rls.integration.spec.ts and other integration tests
requiring real database connections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 17:46:59 -06:00
4552c2c460 fix(test): Add ENCRYPTION_KEY to bridge.module.spec.ts and fix API lint errors
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2026-02-07 17:33:32 -06:00
73074932f6 feat(#360): Add federation credential isolation
Implement explicit deny-lists in QueryService and CommandService to prevent
user credentials from leaking across federation boundaries.

## Changes

### Core Implementation
- QueryService: Block all credential-related queries with keyword detection
- CommandService: Block all credential operations (create/update/delete/read)
- Case-insensitive keyword matching for both queries and commands

### Security Features
- Deny-list includes: credential, api_key, secret, token, password, oauth
- Errors returned for blocked operations
- No impact on existing allowed operations (tasks, events, projects, agent commands)

### Testing
- Added 2 unit tests to query.service.spec.ts
- Added 3 unit tests to command.service.spec.ts
- Added 8 integration tests in credential-isolation.integration.spec.ts
- All 377 federation tests passing

### Documentation
- Created comprehensive security doc at docs/security/federation-credential-isolation.md
- Documents 4 security guarantees (G1-G4)
- Includes testing strategy and incident response procedures

## Security Guarantees

1. G1: Credential Confidentiality - Credentials never leave instance in plaintext
2. G2: Cross-Instance Isolation - Compromised key on one instance doesn't affect others
3. G3: Query/Command Isolation - Federated instances cannot query/modify credentials
4. G4: Accidental Exposure Prevention - Credentials cannot leak via messages

## Defense-in-Depth

This implementation adds application-layer protection on top of existing:
- Transit key separation (mosaic-credentials vs mosaic-federation)
- Per-instance OpenBao servers
- Workspace-scoped credential access

Fixes #360

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 16:55:49 -06:00
46d0a06ef5 feat(#356): Build credential CRUD API endpoints
Implement comprehensive CRUD API for managing user credentials with encryption,
RLS, and audit logging following TDD methodology.

Features:
- POST /api/credentials - Create encrypted credential
- GET /api/credentials - List credentials (masked values only)
- GET /api/credentials/:id - Get single credential (masked)
- GET /api/credentials/:id/value - Decrypt plaintext (rate limited 10/min)
- PATCH /api/credentials/:id - Update metadata
- POST /api/credentials/:id/rotate - Rotate credential value
- DELETE /api/credentials/:id - Soft delete

Security:
- All values encrypted via VaultService (TransitKey.CREDENTIALS)
- List/Get endpoints NEVER return plaintext (only maskedValue)
- getValue endpoint rate limited to 10 requests/minute per user
- All operations audit-logged with CREDENTIAL_* ActivityAction
- RLS enforces per-user isolation via getRlsClient() pattern
- Input validation via class-validator DTOs

Testing:
- 26/26 unit tests passing
- 95.71% code coverage (exceeds 85% requirement)
  - Service: 95.16%
  - Controller: 100%
- TypeScript checks pass

Files created:
- apps/api/src/credentials/credentials.service.ts
- apps/api/src/credentials/credentials.service.spec.ts
- apps/api/src/credentials/credentials.controller.ts
- apps/api/src/credentials/credentials.controller.spec.ts
- apps/api/src/credentials/credentials.module.ts
- apps/api/src/credentials/dto/*.dto.ts (5 DTOs)

Files modified:
- apps/api/src/app.module.ts - imported CredentialsModule

Note: Admin credentials endpoints deferred to future issue. Current
implementation covers all user credential endpoints.

Refs #346
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 16:50:02 -06:00
aa2ee5aea3 feat(#359): Encrypt LLM provider API keys in database
Implemented transparent encryption/decryption of LLM provider API keys
stored in llm_provider_instances.config JSON field using OpenBao Transit
encryption.

Implementation:
- Created llm-encryption.middleware.ts with encryption/decryption logic
- Auto-detects format (vault:v1: vs plaintext) for backward compatibility
- Idempotent encryption prevents double-encryption
- Registered middleware in PrismaService
- Created data migration script for active encryption
- Added migrate:encrypt-llm-keys command to package.json

Tests:
- 14 comprehensive unit tests
- 90.76% code coverage (exceeds 85% requirement)
- Tests create, read, update, upsert operations
- Tests error handling and backward compatibility

Migration:
- Lazy migration: New keys encrypted, old keys work until re-saved
- Active migration: pnpm --filter @mosaic/api migrate:encrypt-llm-keys
- No schema changes required
- Zero downtime

Security:
- Uses TransitKey.LLM_CONFIG from OpenBao Transit
- Keys never touch disk in plaintext (in-memory only)
- Transparent to LlmManagerService and providers
- Follows proven pattern from account-encryption.middleware.ts

Files:
- apps/api/src/prisma/llm-encryption.middleware.ts (new)
- apps/api/src/prisma/llm-encryption.middleware.spec.ts (new)
- apps/api/scripts/encrypt-llm-keys.ts (new)
- apps/api/prisma/migrations/20260207_encrypt_llm_api_keys/ (new)
- apps/api/src/prisma/prisma.service.ts (modified)
- apps/api/package.json (modified)

Note: The migration script (encrypt-llm-keys.ts) is not included in
tsconfig.json to avoid rootDir conflicts. It's executed via tsx which
handles TypeScript directly.

Refs #359

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 16:49:37 -06:00
864c23dc94 feat(#355): Create UserCredential model with RLS and encryption support
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Implements secure user credential storage with comprehensive RLS policies
and encryption-ready architecture for Phase 3 of M9-CredentialSecurity.

**Features:**
- UserCredential Prisma model with 19 fields
- CredentialType enum (6 values: API_KEY, OAUTH_TOKEN, etc.)
- CredentialScope enum (USER, WORKSPACE, SYSTEM)
- FORCE ROW LEVEL SECURITY with 3 policies
- Encrypted value storage (OpenBao Transit ready)
- Cascade delete on user/workspace deletion
- Activity logging integration (CREDENTIAL_* actions)
- 28 comprehensive test cases

**Security:**
- RLS owner bypass, user access, workspace admin policies
- SQL injection hardening for is_workspace_admin()
- Encryption version tracking ready
- Full down migration for reversibility

**Testing:**
- 100% enum coverage (all CredentialType + CredentialScope values)
- Unique constraint enforcement
- Foreign key cascade deletes
- Timestamp behavior validation
- JSONB metadata storage

**Files:**
- Migration: 20260207_add_user_credentials (184 lines + 76 line down.sql)
- Security: 20260207163740_fix_sql_injection_is_workspace_admin
- Tests: user-credential.model.spec.ts (28 tests, 544 lines)
- Docs: README.md (228 lines), scratchpad

Fixes #355

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 16:39:15 -06:00
dd171b287f feat(#353): Create VaultService NestJS module for OpenBao Transit
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Implements secure credential encryption using OpenBao Transit API with
automatic fallback to AES-256-GCM when OpenBao is unavailable.

Features:
- AppRole authentication with automatic token renewal at 50% TTL
- Transit encrypt/decrypt with 4 named keys
- Automatic fallback to CryptoService when OpenBao unavailable
- Auto-detection of ciphertext format (vault:v1: vs AES)
- Request timeout protection (5s default)
- Health indicator for monitoring
- Backward compatible with existing AES-encrypted data

Security:
- ERROR-level logging for fallback
- Proper error propagation (no silent failures)
- Request timeouts prevent hung operations
- Secure credential file reading

Migrations:
- Account encryption middleware uses VaultService
- Uses TransitKey.ACCOUNT_TOKENS for OAuth tokens
- Backward compatible with existing encrypted data

Tests: 56 tests passing (36 VaultService + 20 middleware)

Closes #353

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 16:13:05 -06:00
737eb40d18 feat(#352): Encrypt existing plaintext Account tokens
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Implements transparent encryption/decryption of OAuth tokens via Prisma middleware with progressive migration strategy.

Core Implementation:
- Prisma middleware transparently encrypts tokens on write, decrypts on read
- Auto-detects ciphertext format: aes:iv:authTag:encrypted, vault:v1:..., or plaintext
- Uses existing CryptoService (AES-256-GCM) for encryption
- Progressive encryption: tokens encrypted as they're accessed/refreshed
- Zero-downtime migration (schema change only, no bulk data migration)

Security Features:
- Startup key validation prevents silent data loss if ENCRYPTION_KEY changes
- Secure error logging (no stack traces that could leak sensitive data)
- Graceful handling of corrupted encrypted data
- Idempotent encryption prevents double-encryption
- Future-proofed for OpenBao Transit encryption (Phase 2)

Token Fields Encrypted:
- accessToken (OAuth access tokens)
- refreshToken (OAuth refresh tokens)
- idToken (OpenID Connect ID tokens)

Backward Compatibility:
- Existing plaintext tokens readable (encryptionVersion = NULL)
- Progressive encryption on next write
- BetterAuth integration transparent (middleware layer)

Test Coverage:
- 20 comprehensive unit tests (89.06% coverage)
- Encryption/decryption scenarios
- Null/undefined handling
- Corrupted data handling
- Legacy plaintext compatibility
- Future vault format support
- All CRUD operations (create, update, updateMany, upsert)

Files Created:
- apps/api/src/prisma/account-encryption.middleware.ts
- apps/api/src/prisma/account-encryption.middleware.spec.ts
- apps/api/prisma/migrations/20260207_encrypt_account_tokens/migration.sql

Files Modified:
- apps/api/src/prisma/prisma.service.ts (register middleware)
- apps/api/src/prisma/prisma.module.ts (add CryptoService)
- apps/api/src/federation/crypto.service.ts (add key validation)
- apps/api/prisma/schema.prisma (add encryptionVersion)
- .env.example (document ENCRYPTION_KEY)

Fixes #352

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 13:16:43 -06:00
cf9a3dc526 feat(#350): Add RLS policies to auth tables with FORCE enforcement
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Implements Row-Level Security (RLS) policies on accounts and sessions tables with FORCE enforcement.

Core Implementation:
- Added FORCE ROW LEVEL SECURITY to accounts and sessions tables
- Created conditional owner bypass policies (when current_user_id() IS NULL)
- Created user-scoped access policies using current_user_id() helper
- Documented PostgreSQL superuser limitation with production deployment guide

Security Features:
- Prevents cross-user data access at database level
- Defense-in-depth security layer complementing application logic
- Owner bypass allows migrations and BetterAuth operations when no RLS context
- Production requires non-superuser application role (documented in migration)

Test Coverage:
- 22 comprehensive integration tests (9 accounts + 9 sessions + 4 context)
- Complete CRUD coverage: CREATE, READ, UPDATE, DELETE (own + others)
- Superuser detection with fail-fast error message
- Verification that blocked DELETE operations preserve data
- 100% test coverage, all tests passing

Integration:
- Uses RLS context provider from #351 (runWithRlsClient, getRlsClient)
- Parameterized queries using set_config() for security
- Transaction-scoped session variables with SET LOCAL

Files Created:
- apps/api/prisma/migrations/20260207_add_auth_rls_policies/migration.sql
- apps/api/src/auth/auth-rls.integration.spec.ts

Fixes #350

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:49:14 -06:00
93d403807b feat(#351): Implement RLS context interceptor (fix SEC-API-4)
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Implements Row-Level Security (RLS) context propagation via NestJS interceptor and AsyncLocalStorage.

Core Implementation:
- RlsContextInterceptor sets PostgreSQL session variables (app.current_user_id, app.current_workspace_id) within transaction boundaries
- Uses SET LOCAL for transaction-scoped variables, preventing connection pool leakage
- AsyncLocalStorage propagates transaction-scoped Prisma client to services
- Graceful handling of unauthenticated routes
- 30-second transaction timeout with 10-second max wait

Security Features:
- Error sanitization prevents information disclosure to clients
- TransactionClient type provides compile-time safety, prevents invalid method calls
- Defense-in-depth security layer for RLS policy enforcement

Quality Rails Compliance:
- Fixed 154 lint errors in llm-usage module (package-level enforcement)
- Added proper TypeScript typing for Prisma operations
- Resolved all type safety violations

Test Coverage:
- 19 tests (7 provider + 9 interceptor + 3 integration)
- 95.75% overall coverage (100% statements on implementation files)
- All tests passing, zero lint errors

Documentation:
- Comprehensive RLS-CONTEXT-USAGE.md with examples and migration guide

Files Created:
- apps/api/src/common/interceptors/rls-context.interceptor.ts
- apps/api/src/common/interceptors/rls-context.interceptor.spec.ts
- apps/api/src/common/interceptors/rls-context.integration.spec.ts
- apps/api/src/prisma/rls-context.provider.ts
- apps/api/src/prisma/rls-context.provider.spec.ts
- apps/api/src/prisma/RLS-CONTEXT-USAGE.md

Fixes #351

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:25:50 -06:00
Jason Woltje
144495ae6b fix(CQ-API-5): Document throttler in-memory fallback as best-effort
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Add comprehensive JSDoc and inline comments documenting the known race
condition in the in-memory fallback path of ThrottlerValkeyStorageService.
The non-atomic read-modify-write in incrementMemory() is intentionally
left without a mutex because:
- It is only the fallback path when Valkey is unavailable
- The primary Valkey path uses atomic INCR and is race-free
- Adding locking to a rarely-used degraded path adds complexity
  with minimal benefit

Also adds Logger.warn calls when falling back to in-memory mode
at runtime (Redis command failures).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:15:11 -06:00
Jason Woltje
08d077605a fix(SEC-API-28): Replace MCP console.error with NestJS Logger
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Replace all console.error calls in MCP services with NestJS Logger
instances for consistent structured logging in production.

- mcp-hub.service.ts: Add Logger instance, replace console.error in
  onModuleDestroy cleanup
- stdio-transport.ts: Add Logger instance, replace console.error for
  stderr output (as warn) and JSON parse failures (as error)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:11:41 -06:00
Jason Woltje
2e11931ded fix(SEC-API-27): Scope RLS context to transaction boundary
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
createAuthMiddleware was calling SET LOCAL on the raw PrismaClient
outside of any transaction. In PostgreSQL, SET LOCAL without a
transaction acts as a session-level SET, which can leak RLS context
to subsequent requests sharing the same pooled connection, enabling
cross-tenant data access.

Wrapped the setCurrentUser call and downstream handler execution
inside a $transaction block so SET LOCAL is automatically reverted
when the transaction ends (on both success and failure).

Added comprehensive test suite for db-context module verifying:
- RLS context is set on the transaction client, not the raw client
- next() executes inside the transaction boundary
- Authentication errors prevent any transaction from starting
- Errors in downstream handlers propagate correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:07:49 -06:00
Jason Woltje
617df12b52 fix(SEC-API-25+26): Enable strict ValidationPipe + tighten CORS origin
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Set forbidNonWhitelisted: true in ValidationPipe to reject requests
  with unknown DTO properties, preventing mass assignment vulnerabilities
- Reject requests with no Origin header in production (SEC-API-26)
- Restrict localhost:3001 to development mode only
- Update CORS tests to cover production/development origin validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:02:55 -06:00
Jason Woltje
92c310333c fix(SEC-REVIEW-4-7): Address remaining MEDIUM security review findings
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Graceful container shutdown: detect "not running" containers and skip
  force-remove escalation, only SIGKILL for genuine stop failures
- data: URI stripping: add security audit logging via NestJS Logger
  when data: URIs are blocked in markdown links and images
- Orchestrator bootstrap: replace void bootstrap() with .catch() handler
  for clear startup failure logging and clean process.exit(1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 14:51:22 -06:00
Jason Woltje
57441e2e64 fix(SEC-REVIEW-3): Add @MaxLength to SearchQueryDto.q for consistency
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
All other search DTOs (SemanticSearchBodyDto, HybridSearchBodyDto,
BrainQueryDto, BrainSearchDto) already enforce @MaxLength(500) on their
query fields. SearchQueryDto.q was missed, leaving the full-text
knowledge search endpoint accepting arbitrarily long queries.

Adds @MaxLength(500) decorator and validation test coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 14:39:08 -06:00
Jason Woltje
6dd2ce1014 fix(CQ-API-7): Fix N+1 query in knowledge tag lookup
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Replace Promise.all of individual findUnique queries per tag with a
single findMany batch query. Only missing tags are created individually.
Tag associations now use createMany instead of individual creates.
Also deduplicates tags by slug via Map, preventing duplicate entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 13:56:39 -06:00