stack/docs/3-architecture/guard-rails-capability-permissions.md
Jason Woltje 8c8d065cc2
feat(arch): Add Guard Rails capability-based permission system design
Guard Rails complement Quality Rails by controlling what agents can do:
- Capability-based permissions (resource:action pattern)
- Read/organize/draft allowed by default
- Execute/admin require explicit grants
- Human-in-the-loop approval for sensitive actions

Examples: email (read/draft allowed, send blocked), git (commit allowed, force push blocked)

Also:
- Add .admin-credentials and .env.bak.* to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 00:25:53 -06:00


Guard Rails: Capability-Based Permission System

Overview

Mosaic Stack implements two complementary safety systems:

| System | Purpose | Scope |
| --- | --- | --- |
| Quality Rails | Ensure output quality | Code reviews, linting, tests, token budgets |
| Guard Rails | Control agent capabilities | What agents CAN and CANNOT do |

This document describes the Guard Rails system: a capability-based permission model that limits what agents, integrations, and plugins can do within the platform.

Core Principle

Prepare freely, execute with approval.

Agents should be able to read, analyze, organize, and draft freely, but destructive, irreversible, or sensitive actions require explicit human approval.

Permission Model

Capability Structure

Capabilities follow a resource:action pattern:

<resource>:<action>

Examples:
  email:read
  email:draft
  email:send
  calendar:read
  calendar:create_draft
  calendar:send_invite
  git:commit
  git:push
  git:force_push

Permission Levels

| Level | Description | Example Actions |
| --- | --- | --- |
| read | View/query data | Read emails, view calendar, list files |
| organize | Non-destructive mutations | Label, sort, archive, tag |
| draft | Create pending items | Compose email drafts, stage commits |
| execute | Perform actions | Send email, push code, transfer funds |
| admin | Destructive/irreversible | Delete, force push, revoke access |

Default Stance

By default, agents receive:

  • All read permissions for their domain
  • All organize permissions for their domain
  • All draft permissions for their domain
  • No execute permissions (must be explicitly granted)
  • No admin permissions (must be explicitly granted with additional confirmation)
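The default stance can be expressed as a mapping from permission level to initial status. This is a minimal sketch; the type and function names are hypothetical, and only the five levels and their defaults come from the list above.

```typescript
type PermissionLevel = "read" | "organize" | "draft" | "execute" | "admin";
type DefaultStatus = "granted" | "denied";

// Levels an agent receives by default for its domain.
const DEFAULT_GRANTED: ReadonlySet<PermissionLevel> = new Set<PermissionLevel>([
  "read",
  "organize",
  "draft",
]);

// Status a capability starts with before any explicit grant.
function defaultStatus(level: PermissionLevel): DefaultStatus {
  return DEFAULT_GRANTED.has(level) ? "granted" : "denied";
}

// Admin grants need an additional confirmation step on top of the grant.
function needsExtraConfirmation(level: PermissionLevel): boolean {
  return level === "admin";
}
```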

Example: Email Integration

integration: email
agent: jarvis

capabilities:
  granted:
    - email:read # Read inbox, threads, attachments
    - email:search # Search across mailbox
    - email:organize # Label, archive, mark read/unread
    - email:draft # Compose and save drafts

  denied:
    - email:send # Cannot send emails
    - email:delete # Cannot permanently delete

  requires_approval:
    - email:send # Human must click "Send"
    - email:delete # Human must confirm deletion

Workflow Example

User: "Reply to John's email about the meeting"

Agent Actions:
1. email:read     → Reads John's email (allowed)
2. email:search   → Finds related context (allowed)
3. email:draft    → Composes reply draft (allowed)
4. email:send     → BLOCKED

Agent Response:
"I've drafted a reply to John. Review it in your drafts
and click Send when ready."

[Link to draft in email client]

Example: Git Integration

integration: git
agent: code-assistant

capabilities:
  granted:
    - git:read # View repos, commits, diffs
    - git:branch # Create/switch branches
    - git:commit # Create commits (local)
    - git:push_feature # Push to feature branches

  denied:
    - git:push_main # Cannot push to main/master
    - git:force_push # Never force push
    - git:delete_branch # Cannot delete branches

  requires_approval:
    - git:push_main # Requires PR approval
    - git:merge # Requires code review

Example: Calendar Integration

integration: calendar
agent: jarvis

capabilities:
  granted:
    - calendar:read # View events, availability
    - calendar:analyze # Find conflicts, suggest times
    - calendar:draft # Create draft events

  denied:
    - calendar:send_invite # Cannot send invitations
    - calendar:delete # Cannot delete events
    - calendar:modify # Cannot modify existing events

  requires_approval:
    - calendar:send_invite # Human confirms before sending
    - calendar:accept # Human confirms RSVP

Example: Financial Integration

integration: finance
agent: finance-assistant

capabilities:
  granted:
    - finance:read # View transactions, balances
    - finance:categorize # Categorize transactions
    - finance:report # Generate reports
    - finance:draft # Prepare transfer requests

  denied:
    - finance:transfer # Cannot move money
    - finance:pay # Cannot make payments
    - finance:modify # Cannot edit transactions

  requires_approval:
    - finance:transfer # Multi-factor approval required
    - finance:pay # Human must authorize

Example: Home Automation

integration: home
agent: jarvis

capabilities:
  granted:
    - home:read # View device states
    - home:climate # Adjust thermostat
    - home:lights # Control lighting
    - home:media # Control entertainment

  denied:
    - home:unlock # Cannot unlock doors
    - home:disarm # Cannot disarm security
    - home:garage # Cannot open garage

  requires_approval:
    - home:unlock # Requires biometric + PIN
    - home:disarm # Requires security code

Implementation Architecture

Capability Check Flow

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Agent     │────▶│  Guard Rail  │────▶│  Resource   │
│  Request    │     │   Gateway    │     │   Service   │
└─────────────┘     └──────────────┘     └─────────────┘
                           │
                    ┌──────┴──────┐
                    ▼             ▼
              ┌─────────┐   ┌──────────┐
              │ Allowed │   │  Denied  │
              └─────────┘   └──────────┘
                    │             │
                    ▼             ▼
              ┌─────────┐   ┌──────────┐
              │ Execute │   │  Queue   │
              │ Action  │   │ Approval │
              └─────────┘   └──────────┘
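The flow in the diagram can be sketched as an in-memory gateway. Several points here are assumptions rather than documented behavior: the `GuardRailGateway` class and its method names are hypothetical, a capability with no recorded status is treated as denied, and only `requires_approval` capabilities are queued. The three results match the values listed for the audit log (`allowed`, `denied`, `queued`).

```typescript
type GrantStatus = "granted" | "denied" | "requires_approval";
type CheckResult = "allowed" | "denied" | "queued";

interface AuditEntry {
  agentId: string;
  capability: string;
  result: CheckResult;
}

// Hypothetical in-memory gateway; a real one would read from the database.
class GuardRailGateway {
  private readonly grants = new Map<string, GrantStatus>();
  readonly approvalQueue: AuditEntry[] = [];
  readonly auditLog: AuditEntry[] = [];

  setStatus(agentId: string, capability: string, status: GrantStatus): void {
    this.grants.set(`${agentId}|${capability}`, status);
  }

  check(agentId: string, capability: string): CheckResult {
    // Assumption: a capability with no recorded status is denied.
    const status = this.grants.get(`${agentId}|${capability}`) ?? "denied";
    const result: CheckResult =
      status === "granted"
        ? "allowed"
        : status === "requires_approval"
          ? "queued"
          : "denied";
    const entry: AuditEntry = { agentId, capability, result };
    if (result === "queued") {
      this.approvalQueue.push(entry); // waits for a human decision
    }
    this.auditLog.push(entry); // every check is logged, whatever the outcome
    return result;
  }
}
```

Defaulting unknown capabilities to denied keeps the gateway fail-closed: a missing row can never widen what an agent is allowed to do.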

Database Schema

-- Capability definitions
CREATE TABLE capabilities (
  id UUID PRIMARY KEY,
  resource VARCHAR(100) NOT NULL,
  action VARCHAR(100) NOT NULL,
  level VARCHAR(20) NOT NULL,  -- read, organize, draft, execute, admin
  description TEXT,
  risk_level VARCHAR(20),      -- low, medium, high, critical
  UNIQUE(resource, action)
);

-- Agent capability grants
CREATE TABLE agent_capabilities (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  status VARCHAR(20) NOT NULL,  -- granted, denied, requires_approval
  granted_by UUID REFERENCES users(id),
  granted_at TIMESTAMP,
  expires_at TIMESTAMP,
  conditions JSONB,             -- Additional constraints
  UNIQUE(agent_id, capability_id)
);

-- Approval queue for requires_approval capabilities
CREATE TABLE capability_approvals (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  request_context JSONB,        -- What the agent wants to do
  status VARCHAR(20),           -- pending, approved, denied, expired
  requested_at TIMESTAMP,
  decided_at TIMESTAMP,
  decided_by UUID REFERENCES users(id),
  decision_reason TEXT
);

-- Audit log for all capability checks
CREATE TABLE capability_audit_log (
  id UUID PRIMARY KEY,
  agent_id UUID REFERENCES agents(id),
  capability_id UUID REFERENCES capabilities(id),
  result VARCHAR(20),           -- allowed, denied, queued
  context JSONB,
  timestamp TIMESTAMP DEFAULT NOW()
);

API Design

// Result of a capability check
interface CapabilityResult {
  allowed: boolean;
  reason?: string;
  approvalId?: string;
}

// Pending approval; fields mirror the capability_approvals table
interface ApprovalRequest {
  id: string;
  agentId: string;
  capabilityId: string;
  requestContext: Record<string, unknown>;
  status: "pending" | "approved" | "denied" | "expired";
  requestedAt: Date;
}

// Check if agent has capability
declare function checkCapability(
  agentId: string,
  resource: string,
  action: string,
  context?: Record<string, unknown>
): Promise<CapabilityResult>;

// Request approval for blocked capability
// (creates the approval request and notifies the user)
declare function requestApproval(
  agentId: string,
  resource: string,
  action: string,
  context: Record<string, unknown>
): Promise<ApprovalRequest>;

// Grant capability to agent
declare function grantCapability(
  agentId: string,
  capabilityId: string,
  grantedBy: string,
  options?: {
    expiresAt?: Date;
    conditions?: Record<string, unknown>;
  }
): Promise<void>;

Configuration

Per-Integration Defaults

Each integration defines sensible defaults:

# integrations/email/defaults.yaml
integration: email
default_capabilities:
  granted:
    - email:read
    - email:search
    - email:organize
    - email:draft
  denied:
    - email:send
    - email:delete
  requires_approval:
    - email:send

Per-Agent Overrides

Users can customize per agent:

# agents/jarvis/capabilities.yaml
agent: jarvis
overrides:
  email:
    # Jarvis can send to known contacts
    email:send:
      status: granted
      conditions:
        recipient_in: known_contacts
    # But still needs approval for new recipients
    email:send_new:
      status: requires_approval
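One way to combine integration defaults with per-agent overrides is a flat merge keyed by capability. The sketch below makes two assumptions that the configuration files do not state: when a capability appears in more than one default list, `requires_approval` wins over `denied`, which wins over `granted`; and an override replaces the default rule entirely. All type and function names are hypothetical.

```typescript
type GrantStatus = "granted" | "denied" | "requires_approval";

interface CapabilityRule {
  status: GrantStatus;
  conditions?: Record<string, unknown>;
}

type RuleSet = Record<string, CapabilityRule>;

interface DefaultLists {
  granted: string[];
  denied: string[];
  requires_approval: string[];
}

// Flatten the three default lists into one rule per capability.
// Later loops overwrite earlier ones, giving the assumed precedence
// granted < denied < requires_approval.
function rulesFromDefaults(defaults: DefaultLists): RuleSet {
  const rules: RuleSet = {};
  for (const capability of defaults.granted) {
    rules[capability] = { status: "granted" };
  }
  for (const capability of defaults.denied) {
    rules[capability] = { status: "denied" };
  }
  for (const capability of defaults.requires_approval) {
    rules[capability] = { status: "requires_approval" };
  }
  return rules;
}

// Per-agent overrides replace the default rule for the same capability.
function applyOverrides(defaults: RuleSet, overrides: RuleSet): RuleSet {
  return { ...defaults, ...overrides };
}
```

With the email defaults and the Jarvis overrides shown above, `email:send` resolves to `granted` with a `recipient_in` condition, while `email:delete` stays `denied`.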

User Experience

Approval Notifications

When an agent hits a requires_approval capability:

  1. Agent informs user what it wants to do
  2. Draft/preview created for user review
  3. Notification sent via preferred channel (app, email, SMS)
  4. User approves/denies with optional feedback
  5. Agent proceeds or adjusts based on decision
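Step 4 amounts to a one-way state transition on the approval record. The following sketch assumes the four statuses listed for `capability_approvals` (`pending`, `approved`, `denied`, `expired`) and that only a pending request can be decided; the `Approval` shape and the `decide` function are hypothetical.

```typescript
type ApprovalStatus = "pending" | "approved" | "denied" | "expired";

interface Approval {
  id: string;
  agentId: string;
  capability: string;
  status: ApprovalStatus;
  decidedBy: string | null;
  decisionReason: string | null; // optional feedback from the user
}

// Record a user's decision. Returns a new object instead of mutating,
// so the original request can still be written to an audit trail.
function decide(
  approval: Approval,
  decision: "approved" | "denied",
  userId: string,
  reason: string | null = null
): Approval {
  if (approval.status !== "pending") {
    throw new Error(`Approval ${approval.id} is already ${approval.status}`);
  }
  return {
    ...approval,
    status: decision,
    decidedBy: userId,
    decisionReason: reason,
  };
}
```

Rejecting a second decision prevents an approved request from being replayed or flipped after the agent has already acted on it.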

Approval UI

┌────────────────────────────────────────────────┐
│ 🤖 Jarvis needs your approval                  │
├────────────────────────────────────────────────┤
│                                                │
│ Action: Send email                             │
│ To: john.smith@example.com                     │
│ Subject: Re: Project Update                    │
│                                                │
│ ┌────────────────────────────────────────────┐ │
│ │ Hi John,                                   │ │
│ │                                            │ │
│ │ Thanks for the update. I've reviewed...    │ │
│ │ [Preview truncated - click to expand]      │ │
│ └────────────────────────────────────────────┘ │
│                                                │
│  [Deny]  [Edit Draft]  [✓ Approve & Send]     │
│                                                │
└────────────────────────────────────────────────┘

Security Considerations

Defense in Depth

Guard Rails are one layer of security:

  1. Authentication - Who is the agent?
  2. Authorization - What can the agent do? (Guard Rails)
  3. Rate Limiting - How often can they do it?
  4. Audit Logging - What did they do?
  5. Anomaly Detection - Is this behavior unusual?

Capability Escalation Prevention

  • Agents cannot grant capabilities to themselves
  • Agents cannot grant capabilities to other agents
  • Capability grants require human authorization
  • Critical capabilities require multi-factor confirmation
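These four rules can be enforced in a single validation step before any grant is written. The sketch is a hypothetical illustration: the `GrantRequest` shape, the `validateGrant` name, and the idea that the caller already knows whether the grantor is a human or an agent are all assumptions. The `critical` value comes from the `risk_level` column in the schema.

```typescript
type PrincipalKind = "human" | "agent";
type RiskLevel = "low" | "medium" | "high" | "critical";

interface GrantRequest {
  grantorId: string;
  grantorKind: PrincipalKind;
  targetAgentId: string;
  capability: string;
  riskLevel: RiskLevel;
  mfaConfirmed: boolean;
}

interface GrantDecision {
  ok: boolean;
  reason: string;
}

function validateGrant(request: GrantRequest): GrantDecision {
  // Grants must come from a human, which rules out both self-grants
  // and agent-to-agent grants.
  if (request.grantorKind === "agent") {
    const isSelfGrant = request.grantorId === request.targetAgentId;
    return {
      ok: false,
      reason: isSelfGrant
        ? "agents cannot grant capabilities to themselves"
        : "agents cannot grant capabilities to other agents",
    };
  }
  // Critical capabilities need multi-factor confirmation as well.
  if (request.riskLevel === "critical" && !request.mfaConfirmed) {
    return {
      ok: false,
      reason: "critical capabilities require multi-factor confirmation",
    };
  }
  return { ok: true, reason: "authorized by a human" };
}
```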

Time-Limited Grants

For sensitive operations, capabilities can be time-limited:

capability_grant:
  agent: jarvis
  capability: email:send
  expires_in: 1h
  max_uses: 5
  reason: "Processing inbox backlog"
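A grant like this needs two checks on every use: the expiry time and the remaining use count. The sketch below is a rough illustration under assumptions: the duration syntax (`s`, `m`, `h`, `d` suffixes) is a guess extrapolated from `1h`, and the `TimedGrant` shape and function names are hypothetical.

```typescript
interface TimedGrant {
  agent: string;
  capability: string;
  expiresAt: Date;
  maxUses: number;
  uses: number;
  reason: string;
}

// Parse durations such as "1h" or "30m" into milliseconds.
// The supported unit suffixes are an assumption.
function parseDurationMs(text: string): number {
  if (!/^\d+[smhd]$/.test(text)) {
    throw new Error(`Unsupported duration: "${text}"`);
  }
  const amount = Number(text.slice(0, -1));
  switch (text.slice(-1)) {
    case "s":
      return amount * 1000;
    case "m":
      return amount * 60000;
    case "h":
      return amount * 3600000;
    default: // "d", guaranteed by the pattern check above
      return amount * 86400000;
  }
}

function createGrant(
  agent: string,
  capability: string,
  expiresIn: string,
  maxUses: number,
  reason: string,
  now: Date
): TimedGrant {
  const expiresAt = new Date(now.getTime() + parseDurationMs(expiresIn));
  return { agent, capability, expiresAt, maxUses, uses: 0, reason };
}

// Consume one use if the grant is still valid; returns whether it was.
function tryUse(grant: TimedGrant, now: Date): boolean {
  const expired = now.getTime() >= grant.expiresAt.getTime();
  if (expired || grant.uses >= grant.maxUses) {
    return false;
  }
  grant.uses += 1;
  return true;
}
```

Passing `now` explicitly keeps the expiry logic deterministic and easy to test.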

Future Enhancements

Contextual Permissions

Grant capabilities based on context:

email:send:
  granted_when:
    - recipient_in: known_contacts
    - thread_initiated_by: user
    - content_reviewed: true
  denied_when:
    - contains_sensitive_data: true
    - recipient_is_external: true
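The configuration above does not say how the condition lists combine, so the following sketch picks one plausible reading and labels it as such: any matching `denied_when` condition denies, all `granted_when` conditions must hold to grant, and anything else falls back to `requires_approval`. It also assumes the request context has already been reduced to simple facts (for example `recipient_in: "known_contacts"`). All names are hypothetical.

```typescript
type Facts = Record<string, string | boolean>;
type Outcome = "granted" | "denied" | "requires_approval";

interface ContextualRule {
  grantedWhen: Facts;
  deniedWhen: Facts;
}

function evaluateRule(rule: ContextualRule, facts: Facts): Outcome {
  const holds = (conditions: Facts, key: string): boolean =>
    facts[key] === conditions[key];

  // Assumption: a single matching deny condition is enough to deny.
  const denied = Object.keys(rule.deniedWhen).some((key) =>
    holds(rule.deniedWhen, key)
  );
  if (denied) {
    return "denied";
  }
  // Assumption: every grant condition must hold to grant outright.
  const granted = Object.keys(rule.grantedWhen).every((key) =>
    holds(rule.grantedWhen, key)
  );
  // Assumption: anything in between goes to a human.
  return granted ? "granted" : "requires_approval";
}
```

Checking deny conditions first means a sensitive-data flag overrides an otherwise fully satisfied grant.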

Learning Mode

Track which approvals are commonly granted, and use that history to suggest permission adjustments:

"You've approved 47 email sends from Jarvis to your team.
Would you like to auto-approve emails to @yourcompany.com?"
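A suggestion like this could be derived by tallying past decisions. The sketch below is a simplified guess at the idea: it groups by agent and capability only (not by recipient domain, as the example message implies), suggests a pair only if it has never been denied, and takes the approval threshold as a caller-supplied parameter. All names are hypothetical.

```typescript
interface Decision {
  agentId: string;
  capability: string;
  status: "approved" | "denied";
}

interface Suggestion {
  agentId: string;
  capability: string;
  approvals: number;
}

interface Tally extends Suggestion {
  denials: number;
}

// Suggest auto-approval for agent/capability pairs that were approved at
// least `minApprovals` times and never denied.
function suggestAutoApprovals(
  history: Decision[],
  minApprovals: number
): Suggestion[] {
  const tallies = new Map<string, Tally>();
  for (const decision of history) {
    const key = `${decision.agentId}|${decision.capability}`;
    const tally = tallies.get(key) ?? {
      agentId: decision.agentId,
      capability: decision.capability,
      approvals: 0,
      denials: 0,
    };
    if (decision.status === "approved") {
      tally.approvals += 1;
    } else {
      tally.denials += 1;
    }
    tallies.set(key, tally);
  }

  const suggestions: Suggestion[] = [];
  tallies.forEach((tally) => {
    if (tally.denials === 0 && tally.approvals >= minApprovals) {
      suggestions.push({
        agentId: tally.agentId,
        capability: tally.capability,
        approvals: tally.approvals,
      });
    }
  });
  return suggestions;
}
```

The output is only a suggestion; applying it would still go through a human-authorized grant.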

Delegation Chains

Allow users to delegate approval authority:

delegation:
  from: jason
  to: melanie
  capabilities:
    - calendar:send_invite
  scope: family_calendar
  expires: 2026-03-01
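Checking whether a user may approve on someone else's behalf then reduces to matching a delegation record. The sketch below covers a single hop only (it does not follow chains of delegations), treats `expires` as an exclusive UTC cutoff, and uses hypothetical names throughout.

```typescript
interface Delegation {
  from: string;
  to: string;
  capabilities: string[];
  scope: string;
  expires: Date;
}

// True if `userId` may approve `capability` in `scope` on behalf of `ownerId`.
function canApprove(
  userId: string,
  ownerId: string,
  capability: string,
  scope: string,
  delegations: Delegation[],
  now: Date
): boolean {
  if (userId === ownerId) {
    return true; // owners can always approve their own agents' requests
  }
  return delegations.some(
    (delegation) =>
      delegation.from === ownerId &&
      delegation.to === userId &&
      delegation.scope === scope &&
      delegation.capabilities.includes(capability) &&
      now.getTime() < delegation.expires.getTime()
  );
}
```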