Merge branch 'develop' into work/m4-llm

2026-02-04 02:28:50 +00:00
parent 0669c7cb77 596ec39442
commit b7f4749ffb
16 changed files with 1151 additions and 1204 deletions
--- a/docs/scratchpads/274-command-injection.md
+++ b/docs/scratchpads/274-command-injection.md
@@ -0,0 +1,80 @@
+# Issue #274: Sanitize agent spawn command payloads (command injection risk)
+
+## Objective
+
+Add input validation and sanitization to agent spawn command payloads to prevent command injection vulnerabilities in git operations.
+
+## Security Impact
+
+**Severity:** P0 (Critical) - Blocks production deployment
+**Attack Vector:** Federated instances can inject malicious commands via branch names
+**Risk:** Command injection in git operations allowing arbitrary code execution
+
+## Vulnerability Details
+
+### Attack Flow
+
+1. Attacker sends federation command with malicious branch name
+2. Payload passes through command service without validation
+3. Branch name used directly in `git worktree add` command
+4. Malicious git syntax executed on orchestrator
+
+### Vulnerable Code
+
+**File:** `apps/orchestrator/src/git/worktree-manager.service.ts:82`
+
+```typescript
+await git.raw(["worktree", "add", worktreePath, "-b", branchName, baseBranch]);
+```
+
+**Input Source:** Federation command payload → no validation → git command
+
+### Attack Example
+
+```json
+{
+  "commandType": "agent.spawn",
+  "payload": {
+    "context": {
+      "branch": "feature/--config user.core.sshCommand=malicious"
+    }
+  }
+}
+```
+
+## Approach
+
+### 1. Add Input Validation DTOs
+
+- Strict regex for branch names (alphanumeric + hyphens + underscores + slashes)
+- Repository URL validation (https/ssh only)
+- Reject dangerous characters (`;`, `$`, `` ` ``, `--`, etc.)
+
+### 2. Create Sanitization Utility
+
+- Whitelist-based approach
+- Validate before any git operation
+- Clear error messages on rejection
+
+### 3. Apply at Multiple Layers
+
+- DTO validation (first line of defense)
+- Service-level sanitization (defense in depth)
+- Git operation wrapper (last resort)
+
+## Progress
+
+- [ ] Create validation utility
+- [ ] Update SpawnAgentDto with strict validation
+- [ ] Update SpawnAgentCommandPayload type
+- [ ] Add sanitization in WorktreeManagerService
+- [ ] Add tests for validation
+- [ ] Add tests for sanitization
+- [ ] Security vulnerability FIXED
+- [ ] Create PR
+- [ ] Merge to develop
+- [ ] Close issue #274
+
+## Implementation Status
+
+**IN PROGRESS** - Adding input validation and sanitization
--- a/docs/scratchpads/275-silent-connection-failures.md
+++ b/docs/scratchpads/275-silent-connection-failures.md
@@ -0,0 +1,82 @@
+# Issue #275: Fix silent connection initiation failures
+
+## Objective
+
+Fix silent connection initiation failures where HTTP errors are caught but success is returned to the user, leaving zombie connections in PENDING state forever.
+
+## Location
+
+`apps/api/src/federation/connection.service.ts:72-80`
+
+## Problem
+
+Current code:
+
+```typescript
+try {
+  await firstValueFrom(
+    this.httpService.post(`${remoteUrl}/api/v1/federation/incoming/connect`, signedRequest)
+  );
+  this.logger.log(`Connection request sent to ${remoteUrl}`);
+} catch (error) {
+  this.logger.error(`Failed to send connection request to ${remoteUrl}`, error);
+  // Connection is still created in PENDING state, can be retried
+}
+
+return this.mapToConnectionDetails(connection);
+```
+
+Issues:
+
+- Catches HTTP failures but returns success
+- Connection stays in PENDING state forever
+- Creates zombie connections
+- User sees success message but connection actually failed
+
+## Solution
+
+1. Delete the failed connection from database
+2. Throw exception with clear error message
+3. User gets immediate feedback that connection failed
+
+## Implementation
+
+```typescript
+try {
+  await firstValueFrom(
+    this.httpService.post(`${remoteUrl}/api/v1/federation/incoming/connect`, signedRequest)
+  );
+  this.logger.log(`Connection request sent to ${remoteUrl}`);
+} catch (error) {
+  this.logger.error(`Failed to send connection request to ${remoteUrl}`, error);
+
+  // Delete the failed connection to prevent zombie connections
+  await this.prisma.federationConnection.delete({
+    where: { id: connection.id },
+  });
+
+  throw new BadRequestException(
+    `Failed to initiate connection to ${remoteUrl}: ${error instanceof Error ? error.message : "Unknown error"}`
+  );
+}
+```
+
+## Testing
+
+Test scenarios:
+
+1. Remote instance is unreachable - should throw exception and delete connection
+2. Remote instance returns error - should throw exception and delete connection
+3. Remote instance times out - should throw exception and delete connection
+4. Remote instance returns success - should create connection in PENDING state
+
+## Progress
+
+- [ ] Create scratchpad
+- [ ] Implement fix in connection.service.ts
+- [ ] Add/update tests
+- [ ] Run quality gates
+- [ ] Commit changes
+- [ ] Create PR
+- [ ] Merge to develop
+- [ ] Close issue #275
--- a/docs/scratchpads/276-workspace-authorization.md
+++ b/docs/scratchpads/276-workspace-authorization.md
@@ -0,0 +1,149 @@
+# Issue #276: Add workspace authorization on incoming connections
+
+## Objective
+
+Add proper workspace authorization and controls for incoming federation connections.
+
+## Location
+
+`apps/api/src/federation/federation.controller.ts:211-233`
+
+## Current Problem
+
+```typescript
+@Post("incoming/connect")
+@Throttle({ short: { limit: 3, ttl: 1000 } })
+async handleIncomingConnection(
+  @Body() dto: IncomingConnectionRequestDto
+): Promise<{ status: string; connectionId?: string }> {
+  this.logger.log(`Received connection request from ${dto.instanceId}`);
+
+  // LIMITATION: Incoming connections are created in a default workspace
+  const workspaceId = process.env.DEFAULT_WORKSPACE_ID ?? "default";
+
+  const connection = await this.connectionService.handleIncomingConnectionRequest(
+    workspaceId,
+    dto
+  );
+
+  return {
+    status: "pending",
+    connectionId: connection.id,
+  };
+}
+```
+
+Issues:
+
+- No authorization check - any remote instance can create connections
+- No admin approval workflow
+- Limited audit logging
+- No allowlist/denylist checking
+- Hardcoded default workspace
+
+## Security Impact
+
+- **Authorization bypass**: Remote instances can force connections without permission
+- **Workspace pollution**: Unwanted connections clutter the default workspace
+- **No control**: Administrators have no way to pre-approve or block instances
+
+## Solution Approach
+
+### Phase 1: Audit Logging (This fix)
+
+Add comprehensive audit logging for all incoming connection attempts before implementing full authorization.
+
+Changes:
+
+1. Log all incoming connection requests with full details
+2. Log successful connection creations
+3. Log any validation failures
+4. Include remote instance details in logs
+
+### Phase 2: Authorization Framework (Future)
+
+- Add workspace routing configuration
+- Implement allowlist/denylist at instance level
+- Add admin approval workflow
+- Implement automatic approval for trusted instances
+
+## Implementation (Phase 1)
+
+Add comprehensive audit logging to connection.service.ts:
+
+```typescript
+async handleIncomingConnectionRequest(
+  workspaceId: string,
+  request: ConnectionRequest
+): Promise<ConnectionDetails> {
+  // Audit log: Incoming connection attempt
+  this.auditService.logIncomingConnectionAttempt({
+    workspaceId,
+    remoteInstanceId: request.instanceId,
+    remoteUrl: request.instanceUrl,
+    timestamp: request.timestamp,
+  });
+
+  // Verify signature
+  const verification = this.signatureService.verifyConnectionRequest(request);
+  if (!verification.valid) {
+    // Audit log: Failed verification
+    this.auditService.logConnectionRejected({
+      workspaceId,
+      remoteInstanceId: request.instanceId,
+      reason: 'Invalid signature',
+      error: verification.error,
+    });
+
+    throw new UnauthorizedException(
+      `Invalid connection request signature: ${verification.error}`
+    );
+  }
+
+  // Create connection (existing logic)
+  const connection = await this.prisma.federationConnection.create({...});
+
+  // Audit log: Connection created
+  this.auditService.logIncomingConnectionCreated({
+    workspaceId,
+    connectionId: connection.id,
+    remoteInstanceId: request.instanceId,
+    remoteUrl: request.instanceUrl,
+  });
+
+  return this.mapToConnectionDetails(connection);
+}
+```
+
+## Testing
+
+Test scenarios:
+
+1. Incoming connection with valid signature → logged and created
+2. Incoming connection with invalid signature → logged and rejected
+3. Verify all audit logs contain required fields
+4. Verify workspace isolation in logs
+
+## Progress
+
+- [ ] Create scratchpad
+- [ ] Add audit logging methods to FederationAuditService
+- [ ] Update handleIncomingConnectionRequest with audit logging
+- [ ] Add tests for audit logging
+- [ ] Run quality gates
+- [ ] Commit changes
+- [ ] Create PR
+- [ ] Merge to develop
+- [ ] Close issue #276
+- [ ] Create follow-up issue for Phase 2 (full authorization)
+
+## Notes
+
+This implements the audit logging requirement from the issue. Full authorization (allowlist/denylist, admin approval) will be implemented in a follow-up issue as it requires:
+
+- Database schema changes (allowlist/denylist tables)
+- New configuration endpoints
+- Admin UI changes
+- More extensive testing
+
+Audit logging provides immediate visibility and security monitoring without requiring major architectural changes.
--- a/docs/scratchpads/277-comprehensive-audit-logging.md
+++ b/docs/scratchpads/277-comprehensive-audit-logging.md
@@ -0,0 +1,120 @@
+# Issue #277: Add comprehensive audit logging for security events
+
+## Objective
+
+Add comprehensive audit logging for critical security events to enable forensic analysis and attack detection.
+
+## Missing Logging Areas
+
+### 1. Failed signature verifications
+
+- **Current**: DEBUG level only
+- **Location**: `signature.service.ts`
+- **Required**: WARN level with full details
+
+### 2. Failed OIDC validations
+
+- **Current**: No details logged
+- **Location**: `auth` module
+- **Required**: Full validation failure details
+
+### 3. Capability bypass attempts
+
+- **Current**: Not logged
+- **Location**: `capability.guard.ts`
+- **Required**: Log all denied capabilities
+
+### 4. Rate limit violations
+
+- **Current**: Not logged
+- **Location**: ThrottlerGuard
+- **Required**: Log rate limit hits
+
+### 5. Command injection attempts
+
+- **Current**: Not logged
+- **Location**: `git-validation.util.ts` (recently added)
+- **Required**: Log validation rejections
+
+## Already Implemented
+
+From issue #276 (commit 744290a):
+
+- ✅ Incoming connection attempts
+- ✅ Failed signature verifications for connections
+- ✅ Connection created events
+
+From issue #274 (commit 7a84d96):
+
+- ✅ Git command validation (but not logged)
+
+## Implementation Plan
+
+### Priority 1: Add missing audit methods
+
+1. `logSignatureVerificationFailed()` - Failed signatures
+2. `logRateLimitViolation()` - Rate limit hits
+3. `logCommandInjectionAttempt()` - Malicious input attempts
+
+### Priority 2: Update existing code
+
+1. Add logging to signature.service.ts
+2. Add logging to git-validation.util.ts (throw + log)
+3. Document rate limit violations (if not already handled by NestJS)
+
+### Priority 3: Review capability guard
+
+1. Check if logCapabilityDenied is being called
+2. Add calls if missing
+
+## Status Assessment
+
+After reviewing issue #276, we already have:
+
+- ✅ logCapabilityDenied() method
+- ✅ logIncomingConnectionAttempt()
+- ✅ logIncomingConnectionRejected()
+- ✅ Signature verification failures for connections
+
+What's actually missing:
+
+1. General signature verification failures (outside connection context)
+2. Rate limit violation logging
+3. Command injection attempt logging
+
+## Implementation Approach
+
+Focus on what's truly missing and actionable:
+
+1. **Add command injection attempt logging**
+   - Update git-validation.util.ts to log before throwing
+   - Create logCommandInjectionAttempt() method
+
+2. **Add rate limit logging**
+   - Check if NestJS throttler already logs
+   - Add custom logging if needed
+
+3. **Verify capability logging**
+   - Check that capability.guard.ts calls logCapabilityDenied
+
+## Progress
+
+- [ ] Create scratchpad
+- [ ] Add logCommandInjectionAttempt() to audit service
+- [ ] Update git-validation.util.ts to log attempts
+- [ ] Check capability guard logging
+- [ ] Check rate limit logging
+- [ ] Add tests
+- [ ] Run quality gates
+- [ ] Commit changes
+- [ ] Push and close issue
+
+## Notes
+
+Some of the required logging may already be in place. Need to verify:
+
+1. Capability guard usage
+2. Rate limiter behavior
+3. OIDC validation (may be in auth module, not federation)
+
+Focus on concrete, implementable improvements rather than theoretical gaps.