stack/docs/3-architecture/guard-rails-capability-permissions.md
Jason Woltje 8c8d065cc2
feat(arch): Add Guard Rails capability-based permission system design
Guard Rails complement Quality Rails by controlling what agents can do:
- Capability-based permissions (resource:action pattern)
- Read/organize/draft allowed by default
- Execute/admin require explicit grants
- Human-in-the-loop approval for sensitive actions

Examples: email (read/draft allowed, send blocked), git (commit allowed, force push blocked)

Also:
- Add .admin-credentials and .env.bak.* to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 00:25:53 -06:00


# Guard Rails: Capability-Based Permission System
## Overview
Mosaic Stack implements two complementary safety systems:
| System | Purpose | Scope |
| ----------------- | -------------------------- | ------------------------------------------- |
| **Quality Rails** | Ensure output quality | Code reviews, linting, tests, token budgets |
| **Guard Rails** | Control agent capabilities | What agents CAN and CANNOT do |
This document describes the **Guard Rails** system—a capability-based permission model that limits what agents, integrations, and plugins can do within the platform.
## Core Principle
> **Prepare freely, execute with approval.**
Agents should be able to read, analyze, organize, and draft—but destructive, irreversible, or sensitive actions require explicit human approval.
## Permission Model
### Capability Structure
Capabilities follow a `resource:action` pattern:
```
<resource>:<action>

Examples:
  email:read
  email:draft
  email:send
  calendar:read
  calendar:create_draft
  calendar:send_invite
  git:commit
  git:push
  git:force_push
```
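The pattern above can be parsed and validated with a small helper. This is a minimal sketch, not part of the documented API: `parseCapability`, `formatCapability`, and the allowed character set (lowercase letters, digits, underscores) are assumptions inferred from the examples.

```typescript
// Hypothetical representation of a parsed capability string.
interface Capability {
  resource: string;
  action: string;
}

// Assumed naming rule: lowercase letters, digits, and underscores,
// which covers names such as `force_push` and `create_draft`.
const CAPABILITY_PATTERN = /^([a-z][a-z0-9_]*):([a-z][a-z0-9_]*)$/;

// Splits "<resource>:<action>" into its parts, rejecting malformed input.
function parseCapability(value: string): Capability {
  const match = CAPABILITY_PATTERN.exec(value);
  if (match === null) {
    throw new Error(`Invalid capability "${value}", expected <resource>:<action>`);
  }
  return { resource: match[1], action: match[2] };
}

// Inverse of parseCapability.
function formatCapability(capability: Capability): string {
  return `${capability.resource}:${capability.action}`;
}
```

Rejecting malformed strings at the boundary keeps later checks from silently comparing against a capability name that can never match.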
### Permission Levels
| Level | Description | Example Actions |
| ------------ | ------------------------- | -------------------------------------- |
| **read** | View/query data | Read emails, view calendar, list files |
| **organize** | Non-destructive mutations | Label, sort, archive, tag |
| **draft** | Create pending items | Compose email drafts, stage commits |
| **execute** | Perform actions | Send email, push code, transfer funds |
| **admin** | Destructive/irreversible | Delete, force push, revoke access |
### Default Stance
By default, agents receive:
- ✅ All `read` permissions for their domain
- ✅ All `organize` permissions for their domain
- ✅ All `draft` permissions for their domain
- ❌ No `execute` permissions (must be explicitly granted)
- ❌ No `admin` permissions (must be explicitly granted with additional confirmation)
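
The default stance can be expressed as a function of the permission level. The sketch below is illustrative only; the type and function names are assumptions, and the extra confirmation step for `admin` grants is not modeled.

```typescript
// The five levels from the Permission Levels table.
type PermissionLevel = "read" | "organize" | "draft" | "execute" | "admin";

// Status an agent receives for a level before any explicit grant.
type DefaultStatus = "granted" | "denied";

// read/organize/draft are allowed by default; execute/admin are not.
function defaultStatus(level: PermissionLevel): DefaultStatus {
  switch (level) {
    case "read":
    case "organize":
    case "draft":
      return "granted";
    case "execute":
    case "admin":
      return "denied";
  }
}
```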
## Example: Email Integration
```yaml
integration: email
agent: jarvis
capabilities:
  granted:
    - email:read # Read inbox, threads, attachments
    - email:search # Search across mailbox
    - email:organize # Label, archive, mark read/unread
    - email:draft # Compose and save drafts
  denied:
    - email:send # Cannot send emails
    - email:delete # Cannot permanently delete
  requires_approval:
    - email:send # Human must click "Send"
    - email:delete # Human must confirm deletion
```
### Workflow Example
```
User: "Reply to John's email about the meeting"

Agent Actions:
  1. email:read   → Reads John's email (allowed)
  2. email:search → Finds related context (allowed)
  3. email:draft  → Composes reply draft (allowed)
  4. email:send   → BLOCKED

Agent Response:
  "I've drafted a reply to John. Review it in your drafts
  and click Send when ready."
  [Link to draft in email client]
```
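The workflow above can be traced in code. The following is a sketch under stated assumptions: `resolveStatus` and `runWorkflow` are hypothetical names, and the precedence used when a capability appears in more than one list (`requires_approval`, then `denied`, then `granted`, with unlisted capabilities denied) is an assumption, since the document does not define one.

```typescript
type Status = "granted" | "denied" | "requires_approval";

interface CapabilityConfig {
  granted: string[];
  denied: string[];
  requires_approval: string[];
}

// Mirrors the email integration example.
const emailConfig: CapabilityConfig = {
  granted: ["email:read", "email:search", "email:organize", "email:draft"],
  denied: ["email:send", "email:delete"],
  requires_approval: ["email:send", "email:delete"],
};

// Assumed precedence: requires_approval > denied > granted; unlisted means denied.
function resolveStatus(config: CapabilityConfig, capability: string): Status {
  if (config.requires_approval.includes(capability)) return "requires_approval";
  if (config.denied.includes(capability)) return "denied";
  if (config.granted.includes(capability)) return "granted";
  return "denied";
}

// Runs steps in order and stops at the first one that is not granted.
function runWorkflow(
  config: CapabilityConfig,
  steps: string[]
): { completed: string[]; blockedAt?: string } {
  const completed: string[] = [];
  for (const step of steps) {
    if (resolveStatus(config, step) !== "granted") {
      return { completed, blockedAt: step };
    }
    completed.push(step);
  }
  return { completed };
}

const outcome = runWorkflow(emailConfig, [
  "email:read",
  "email:search",
  "email:draft",
  "email:send",
]);
```

Here `outcome.completed` holds the three allowed steps and `outcome.blockedAt` is `email:send`, which is the point where the agent hands the draft back to the user.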
## Example: Git Integration
```yaml
integration: git
agent: code-assistant
capabilities:
  granted:
    - git:read # View repos, commits, diffs
    - git:branch # Create/switch branches
    - git:commit # Create commits (local)
    - git:push_feature # Push to feature branches
  denied:
    - git:push_main # Cannot push to main/master
    - git:force_push # Never force push
    - git:delete_branch # Cannot delete branches
  requires_approval:
    - git:push_main # Requires PR approval
    - git:merge # Requires code review
```
## Example: Calendar Integration
```yaml
integration: calendar
agent: jarvis
capabilities:
  granted:
    - calendar:read # View events, availability
    - calendar:analyze # Find conflicts, suggest times
    - calendar:draft # Create draft events
  denied:
    - calendar:send_invite # Cannot send invitations
    - calendar:delete # Cannot delete events
    - calendar:modify # Cannot modify existing events
  requires_approval:
    - calendar:send_invite # Human confirms before sending
    - calendar:accept # Human confirms RSVP
```
## Example: Financial Integration
```yaml
integration: finance
agent: finance-assistant
capabilities:
  granted:
    - finance:read # View transactions, balances
    - finance:categorize # Categorize transactions
    - finance:report # Generate reports
    - finance:draft # Prepare transfer requests
  denied:
    - finance:transfer # Cannot move money
    - finance:pay # Cannot make payments
    - finance:modify # Cannot edit transactions
  requires_approval:
    - finance:transfer # Multi-factor approval required
    - finance:pay # Human must authorize
```
## Example: Home Automation
```yaml
integration: home
agent: jarvis
capabilities:
  granted:
    - home:read # View device states
    - home:climate # Adjust thermostat
    - home:lights # Control lighting
    - home:media # Control entertainment
  denied:
    - home:unlock # Cannot unlock doors
    - home:disarm # Cannot disarm security
    - home:garage # Cannot open garage
  requires_approval:
    - home:unlock # Requires biometric + PIN
    - home:disarm # Requires security code
```
## Implementation Architecture
### Capability Check Flow
```
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│    Agent    │────▶│  Guard Rail  │────▶│  Resource   │
│   Request   │     │   Gateway    │     │   Service   │
└─────────────┘     └──────────────┘     └─────────────┘
                    ┌──────┴──────┐
                    ▼             ▼
               ┌─────────┐  ┌──────────┐
               │ Allowed │  │  Denied  │
               └─────────┘  └──────────┘
                    │             │
                    ▼             ▼
               ┌─────────┐  ┌──────────┐
               │ Execute │  │  Queue   │
               │ Action  │  │ Approval │
               └─────────┘  └──────────┘
```
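A minimal in-memory sketch of this flow follows. It assumes the gateway holds a map from agent ID to granted capability strings and that every non-granted request is queued for approval, as the diagram suggests. `GuardRailGateway`, `handle`, and `approvalQueue` are hypothetical names, and the real gateway would consult the database rather than a `Map`.

```typescript
interface QueuedApproval {
  approvalId: number;
  agentId: string;
  capability: string;
}

type GatewayOutcome<T> =
  | { status: "executed"; value: T }
  | { status: "queued"; approvalId: number };

class GuardRailGateway {
  readonly approvalQueue: QueuedApproval[] = [];
  private readonly grants: Map<string, Set<string>>;

  constructor(grants: Map<string, Set<string>>) {
    this.grants = grants;
  }

  // Runs `action` only when the capability is granted; otherwise queues an approval.
  handle<T>(agentId: string, capability: string, action: () => T): GatewayOutcome<T> {
    const allowed = this.grants.get(agentId)?.has(capability) ?? false;
    if (allowed) {
      return { status: "executed", value: action() };
    }
    const approvalId = this.approvalQueue.length + 1;
    this.approvalQueue.push({ approvalId, agentId, capability });
    return { status: "queued", approvalId };
  }
}

const gateway = new GuardRailGateway(
  new Map([["jarvis", new Set(["email:draft"])]])
);
const drafted = gateway.handle("jarvis", "email:draft", () => "draft saved");
const sent = gateway.handle("jarvis", "email:send", () => "email sent");
```

Passing the action as a callback means the resource service is never touched unless the check passes, which is the property the gateway exists to guarantee.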
### Database Schema
```sql
-- Capability definitions
CREATE TABLE capabilities (
  id UUID PRIMARY KEY,
  resource VARCHAR(100) NOT NULL,
  action VARCHAR(100) NOT NULL,
  level VARCHAR(20) NOT NULL, -- read, organize, draft, execute, admin
  description TEXT,
  risk_level VARCHAR(20), -- low, medium, high, critical
  UNIQUE(resource, action)
);

-- Agent capability grants
CREATE TABLE agent_capabilities (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  status VARCHAR(20) NOT NULL, -- granted, denied, requires_approval
  granted_by UUID REFERENCES users(id),
  granted_at TIMESTAMP,
  expires_at TIMESTAMP,
  conditions JSONB, -- Additional constraints
  UNIQUE(agent_id, capability_id)
);

-- Approval queue for requires_approval capabilities
CREATE TABLE capability_approvals (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  request_context JSONB, -- What the agent wants to do
  status VARCHAR(20), -- pending, approved, denied, expired
  requested_at TIMESTAMP,
  decided_at TIMESTAMP,
  decided_by UUID REFERENCES users(id),
  decision_reason TEXT
);

-- Audit log for all capability checks
CREATE TABLE capability_audit_log (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  result VARCHAR(20), -- allowed, denied, queued
  context JSONB,
  timestamp TIMESTAMP DEFAULT NOW()
);
```
### API Design
```typescript
// Result of a capability check
interface CapabilityResult {
  allowed: boolean;
  reason?: string;
  approvalId?: string;
}

// Check if agent has capability
declare function checkCapability(
  agentId: string,
  resource: string,
  action: string,
  context?: Record<string, unknown>
): Promise<CapabilityResult>;

// Request approval for blocked capability.
// Creates an approval request and notifies the user.
// (The ApprovalRequest type is not defined in this document.)
declare function requestApproval(
  agentId: string,
  resource: string,
  action: string,
  context: Record<string, unknown>
): Promise<ApprovalRequest>;

// Grant capability to agent
declare function grantCapability(
  agentId: string,
  capabilityId: string,
  grantedBy: string,
  options?: {
    expiresAt?: Date;
    conditions?: Record<string, unknown>;
  }
): Promise<void>;
```
## Configuration
### Per-Integration Defaults
Each integration defines sensible defaults:
```yaml
# integrations/email/defaults.yaml
integration: email
default_capabilities:
  granted:
    - email:read
    - email:search
    - email:organize
    - email:draft
  denied:
    - email:send
    - email:delete
  requires_approval:
    - email:send
```
### Per-Agent Overrides
Users can customize capabilities per agent:
```yaml
# agents/jarvis/capabilities.yaml
agent: jarvis
overrides:
  email:
    # Jarvis can send to known contacts
    email:send:
      status: granted
      conditions:
        recipient_in: known_contacts
    # But still needs approval for new recipients
    email:send_new:
      status: requires_approval
```
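One way to combine integration defaults with per-agent overrides is a shallow merge keyed by capability. This sketch assumes the defaults have already been flattened to one status per capability (with `email:send` taken as `requires_approval`); `Rule`, `RuleSet`, and `effectiveRules` are hypothetical names.

```typescript
type Status = "granted" | "denied" | "requires_approval";

interface Rule {
  status: Status;
  conditions?: { recipient_in?: string };
}

type RuleSet = Record<string, Rule>;

// Flattened form of integrations/email/defaults.yaml (assumed shape).
const emailDefaults: RuleSet = {
  "email:read": { status: "granted" },
  "email:search": { status: "granted" },
  "email:organize": { status: "granted" },
  "email:draft": { status: "granted" },
  "email:send": { status: "requires_approval" },
  "email:delete": { status: "denied" },
};

// Flattened form of agents/jarvis/capabilities.yaml.
const jarvisOverrides: RuleSet = {
  "email:send": { status: "granted", conditions: { recipient_in: "known_contacts" } },
  "email:send_new": { status: "requires_approval" },
};

// An override replaces the default rule for the same capability;
// defaults without an override are kept unchanged.
function effectiveRules(defaults: RuleSet, overrides: RuleSet): RuleSet {
  return { ...defaults, ...overrides };
}

const effective = effectiveRules(emailDefaults, jarvisOverrides);
```

Replacing the whole rule, rather than merging field by field, avoids a stale default condition surviving next to a new status.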
## User Experience
### Approval Notifications
When an agent hits a `requires_approval` capability:
1. **Agent informs user** what it wants to do
2. **Draft/preview created** for user review
3. **Notification sent** via preferred channel (app, email, SMS)
4. **User approves/denies** with optional feedback
5. **Agent proceeds or adjusts** based on decision
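
Steps 4 and 5 can be modeled as a small state transition. This is a sketch only: the status values come from the `capability_approvals` table, while `Approval`, `decide`, and `nextAgentStep` are hypothetical names.

```typescript
type ApprovalStatus = "pending" | "approved" | "denied" | "expired";

interface Approval {
  status: ApprovalStatus;
  decidedBy: string | null;
  decisionReason: string | null;
}

// Records a user's decision. Only pending approvals can be decided.
function decide(
  approval: Approval,
  decision: "approved" | "denied",
  userId: string,
  reason: string | null = null
): Approval {
  if (approval.status !== "pending") {
    throw new Error(`Cannot decide an approval that is already ${approval.status}`);
  }
  return { status: decision, decidedBy: userId, decisionReason: reason };
}

// What the agent does next, based on the approval's status.
function nextAgentStep(approval: Approval): "wait" | "proceed" | "adjust" {
  switch (approval.status) {
    case "pending":
      return "wait";
    case "approved":
      return "proceed";
    case "denied":
    case "expired":
      return "adjust";
  }
}
```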
### Approval UI
```
┌────────────────────────────────────────────────┐
│ 🤖 Jarvis needs your approval                  │
├────────────────────────────────────────────────┤
│                                                │
│ Action:  Send email                            │
│ To:      john.smith@example.com                │
│ Subject: Re: Project Update                    │
│                                                │
│ ┌────────────────────────────────────────────┐ │
│ │ Hi John,                                   │ │
│ │                                            │ │
│ │ Thanks for the update. I've reviewed...    │ │
│ │ [Preview truncated - click to expand]      │ │
│ └────────────────────────────────────────────┘ │
│                                                │
│ [Deny]  [Edit Draft]  [✓ Approve & Send]       │
│                                                │
└────────────────────────────────────────────────┘
```
## Security Considerations
### Defense in Depth
Guard Rails are one of several security layers:
1. **Authentication** - Who is the agent?
2. **Authorization** - What can the agent do? (Guard Rails)
3. **Rate Limiting** - How often can they do it?
4. **Audit Logging** - What did they do?
5. **Anomaly Detection** - Is this behavior unusual?
### Capability Escalation Prevention
- Agents cannot grant capabilities to themselves
- Agents cannot grant capabilities to other agents
- Capability grants require human authorization
- Critical capabilities require multi-factor confirmation
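
The rules above amount to a validation step in front of any grant. A minimal sketch, assuming the caller is modeled as either a user or an agent and that multi-factor confirmation is tracked as a boolean on the user; `Principal` and `canGrant` are hypothetical names.

```typescript
type Principal =
  | { kind: "user"; id: string; mfaConfirmed: boolean }
  | { kind: "agent"; id: string };

// Risk levels from the capabilities table.
type RiskLevel = "low" | "medium" | "high" | "critical";

interface GrantDecision {
  ok: boolean;
  reason?: string;
}

// Rejects grants issued by agents (to themselves or to others) and
// critical grants issued without multi-factor confirmation.
function canGrant(grantor: Principal, riskLevel: RiskLevel): GrantDecision {
  if (grantor.kind === "agent") {
    return { ok: false, reason: "agents cannot grant capabilities" };
  }
  if (riskLevel === "critical" && !grantor.mfaConfirmed) {
    return { ok: false, reason: "critical capabilities require multi-factor confirmation" };
  }
  return { ok: true };
}
```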
### Time-Limited Grants
For sensitive operations, capabilities can be time-limited:
```yaml
capability_grant:
  agent: jarvis
  capability: email:send
  expires_in: 1h
  max_uses: 5
  reason: "Processing inbox backlog"
```
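Enforcing such a grant requires checking both the expiry time and the remaining uses on each request. The sketch below keeps the counter in memory for illustration; `TimeLimitedGrant`, `createGrant`, and `tryUse` are hypothetical names, and a real implementation would presumably persist the use count.

```typescript
interface TimeLimitedGrant {
  agent: string;
  capability: string;
  expiresAt: Date;
  maxUses: number;
  uses: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Builds a grant that expires `expiresInMs` after `issuedAt`.
function createGrant(
  agent: string,
  capability: string,
  issuedAt: Date,
  expiresInMs: number,
  maxUses: number
): TimeLimitedGrant {
  return {
    agent,
    capability,
    expiresAt: new Date(issuedAt.getTime() + expiresInMs),
    maxUses,
    uses: 0,
  };
}

// Consumes one use if the grant is still valid and returns whether
// the action may proceed. Expired or exhausted grants are left unchanged.
function tryUse(grant: TimeLimitedGrant, now: Date): boolean {
  const expired = now.getTime() >= grant.expiresAt.getTime();
  const exhausted = grant.uses >= grant.maxUses;
  if (expired || exhausted) {
    return false;
  }
  grant.uses += 1;
  return true;
}
```

For the example above, `createGrant("jarvis", "email:send", new Date(), HOUR_MS, 5)` would allow at most five sends within the hour.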
## Future Enhancements
### Contextual Permissions
Grant capabilities based on context:
```yaml
email:send:
  granted_when:
    - recipient_in: known_contacts
    - thread_initiated_by: user
    - content_reviewed: true
  denied_when:
    - contains_sensitive_data: true
    - recipient_is_external: true
```
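Since this is a future enhancement, the evaluation semantics are not specified. The sketch below assumes one plausible reading: any matching `denied_when` condition blocks the action, all `granted_when` conditions must hold to allow it, and everything else is denied. `Facts`, `ContextualRule`, and `evaluateRule` are hypothetical names.

```typescript
// Request context and rule conditions as flat key/value facts.
type Facts = Record<string, string | boolean>;

interface ContextualRule {
  granted_when: Facts;
  denied_when: Facts;
}

const emailSendRule: ContextualRule = {
  granted_when: {
    recipient_in: "known_contacts",
    thread_initiated_by: "user",
    content_reviewed: true,
  },
  denied_when: {
    contains_sensitive_data: true,
    recipient_is_external: true,
  },
};

// Assumed semantics: deny if ANY denied_when condition matches,
// grant only if ALL granted_when conditions match, deny otherwise.
function evaluateRule(rule: ContextualRule, context: Facts): "granted" | "denied" {
  const blocked = Object.entries(rule.denied_when).some(
    ([key, value]) => context[key] === value
  );
  if (blocked) return "denied";
  const satisfied = Object.entries(rule.granted_when).every(
    ([key, value]) => context[key] === value
  );
  return satisfied ? "granted" : "denied";
}
```

Checking `denied_when` first means a sensitive-data match cannot be outvoted by any combination of grant conditions.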
### Learning Mode
Track which approvals are commonly granted, and use that history to suggest permission adjustments:
```
"You've approved 47 email sends from Jarvis to your team.
Would you like to auto-approve emails to @yourcompany.com?"
```
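A suggestion like this could be derived from approval history. The sketch below is illustrative only: it assumes each decision records the recipient's domain, and it suggests a domain once a capability has been approved at least `minApprovals` times there with no denials. `ApprovalDecision` and `suggestAutoApprove` are hypothetical names, and the threshold policy is an assumption.

```typescript
interface ApprovalDecision {
  capability: string;
  recipientDomain: string;
  approved: boolean;
}

// Returns domains where `capability` was approved at least
// `minApprovals` times and never denied.
function suggestAutoApprove(
  history: ApprovalDecision[],
  capability: string,
  minApprovals: number
): string[] {
  const approvedCount = new Map<string, number>();
  const deniedDomains = new Set<string>();

  for (const decision of history) {
    if (decision.capability !== capability) continue;
    if (decision.approved) {
      const current = approvedCount.get(decision.recipientDomain) ?? 0;
      approvedCount.set(decision.recipientDomain, current + 1);
    } else {
      deniedDomains.add(decision.recipientDomain);
    }
  }

  const suggestions: string[] = [];
  approvedCount.forEach((count, domain) => {
    if (count >= minApprovals && !deniedDomains.has(domain)) {
      suggestions.push(domain);
    }
  });
  return suggestions;
}
```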
### Delegation Chains
Allow users to delegate approval authority:
```yaml
delegation:
  from: jason
  to: melanie
  capabilities:
    - calendar:send_invite
  scope: family_calendar
  expires: 2026-03-01
```
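Checking whether a delegate may approve a request could look like the sketch below. It assumes the expiry date is exclusive and interpreted as midnight UTC, and that `scope` must match exactly; `Delegation` and `canApproveOnBehalf` are hypothetical names.

```typescript
interface Delegation {
  from: string;
  to: string;
  capabilities: string[];
  scope: string;
  expires: Date;
}

// Mirrors the YAML example; the time zone of `expires` is an assumption.
const delegations: Delegation[] = [
  {
    from: "jason",
    to: "melanie",
    capabilities: ["calendar:send_invite"],
    scope: "family_calendar",
    expires: new Date("2026-03-01T00:00:00Z"),
  },
];

// True when `approver` holds an unexpired delegation from `owner`
// covering this capability and scope.
function canApproveOnBehalf(
  all: Delegation[],
  approver: string,
  owner: string,
  capability: string,
  scope: string,
  now: Date
): boolean {
  return all.some(
    (d) =>
      d.from === owner &&
      d.to === approver &&
      d.scope === scope &&
      d.capabilities.includes(capability) &&
      now.getTime() < d.expires.getTime()
  );
}
```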
## Related Documentation
- [Quality Rails Architecture](./quality-rails-orchestration-architecture.md)
- [Agent Security Model](./agent-security-model.md)
- [Integration Development Guide](../2-development/integrations.md)