Compare commits

..

5 Commits

Author SHA1 Message Date
Jarvis
7a9ce6845f docs(federation): M3 mission planning — 14-task decomposition + manifest update
Some checks failed
ci/woodpecker/push/ci Pipeline failed
ci/woodpecker/pr/ci Pipeline failed
Decomposes Milestone 3 (mTLS handshake + list/get/capabilities + scope
enforcement) into 14 tasks following the M1/M2 pattern. Updates mission
manifest to reflect M2 done, M3 in-progress (2/7 milestones complete),
and appends session 23 entry to the MVP scratchpad.

M3 structure:
- Foundation: M3-01 (DTOs in packages/types/src/federation/)
- Server stream: M3-03 (AuthGuard) → M3-04 (ScopeService) → M3-05/06/07 (verbs)
- Client stream (parallel): M3-08 (FederationClient) → M3-09 (QuerySourceService)
- Test infra (parallel): M3-02 (tools/federation-harness/ — local two-gateway)
- Validation: M3-10 (Integration) → M3-11 (E2E) → M3-12 (Independent security review)
- Close: M3-13 (Docs) → M3-14 (release tag fed-v0.3.0-m3, close #462)

Estimate ~100K tokens vs MILESTONES.md 40K — same per-task expansion as M1/M2
once tests, review, and docs are split out.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 20:12:25 -05:00
4ece6dc643 chore(federation): M2 milestone close (FED-M2-13) (#503)
Some checks failed
ci/woodpecker/push/ci Pipeline failed
ci/woodpecker/tag/publish Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
2026-04-22 06:09:54 +00:00
194c3b603e docs(federation): M2 Step-CA setup guide + admin CLI reference (FED-M2-12) (#502)
Some checks failed
ci/woodpecker/push/publish Pipeline failed
ci/woodpecker/push/ci Pipeline failed
2026-04-22 06:06:45 +00:00
fc1600b738 fix(federation): security hardening — OID verification, atomic activation, audit on failure (#501)
Some checks failed
ci/woodpecker/push/ci Pipeline failed
ci/woodpecker/push/publish Pipeline failed
2026-04-22 06:02:52 +00:00
0ee5b14c68 test(federation): M2 E2E peer-add enrollment flow (FED-M2-10) (#500)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/publish Pipeline was successful
2026-04-22 05:37:06 +00:00
7 changed files with 518 additions and 147 deletions

View File

@@ -35,7 +35,7 @@ import * as crypto from 'node:crypto';
import * as fs from 'node:fs'; import * as fs from 'node:fs';
import * as https from 'node:https'; import * as https from 'node:https';
import { SignJWT, importJWK } from 'jose'; import { SignJWT, importJWK } from 'jose';
import { Pkcs10CertificateRequest } from '@peculiar/x509'; import { Pkcs10CertificateRequest, X509Certificate } from '@peculiar/x509';
import type { IssueCertRequestDto } from './ca.dto.js'; import type { IssueCertRequestDto } from './ca.dto.js';
import { IssuedCertDto } from './ca.dto.js'; import { IssuedCertDto } from './ca.dto.js';
@@ -624,6 +624,51 @@ export class CaService {
const serialNumber = extractSerial(response.crt); const serialNumber = extractSerial(response.crt);
// CRIT-1: Verify the issued certificate contains both Mosaic OID extensions
// with the correct values. Step-CA's federation.tpl encodes each as an ASN.1
// UTF8String TLV: tag 0x0C + 1-byte length + UUID bytes. We skip 2 bytes
// (tag + length) to extract the raw UUID string.
const issuedCert = new X509Certificate(response.crt);
const decoder = new TextDecoder();
const grantIdExt = issuedCert.getExtension('1.3.6.1.4.1.99999.1');
if (!grantIdExt) {
throw new CaServiceError(
'Issued certificate is missing required Mosaic OID: mosaic_grant_id',
'The Step-CA federation.tpl template did not embed OID 1.3.6.1.4.1.99999.1. Check the provisioner template configuration.',
undefined,
'OID_MISSING',
);
}
const grantIdInCert = decoder.decode(grantIdExt.value.slice(2));
if (grantIdInCert !== req.grantId) {
throw new CaServiceError(
`Issued certificate mosaic_grant_id mismatch: expected ${req.grantId}, got ${grantIdInCert}`,
'The Step-CA issued a certificate with a different grant ID than requested. This may indicate a provisioner misconfiguration or a MITM.',
undefined,
'OID_MISMATCH',
);
}
const subjectUserIdExt = issuedCert.getExtension('1.3.6.1.4.1.99999.2');
if (!subjectUserIdExt) {
throw new CaServiceError(
'Issued certificate is missing required Mosaic OID: mosaic_subject_user_id',
'The Step-CA federation.tpl template did not embed OID 1.3.6.1.4.1.99999.2. Check the provisioner template configuration.',
undefined,
'OID_MISSING',
);
}
const subjectUserIdInCert = decoder.decode(subjectUserIdExt.value.slice(2));
if (subjectUserIdInCert !== req.subjectUserId) {
throw new CaServiceError(
`Issued certificate mosaic_subject_user_id mismatch: expected ${req.subjectUserId}, got ${subjectUserIdInCert}`,
'The Step-CA issued a certificate with a different subject user ID than requested. This may indicate a provisioner misconfiguration or a MITM.',
undefined,
'OID_MISMATCH',
);
}
this.logger.log(`Certificate issued — serial=${serialNumber} grantId=${req.grantId}`); this.logger.log(`Certificate issued — serial=${serialNumber} grantId=${req.grantId}`);
const result = new IssuedCertDto(); const result = new IssuedCertDto();

View File

@@ -14,6 +14,7 @@
import { import {
BadRequestException, BadRequestException,
ConflictException,
GoneException, GoneException,
Inject, Inject,
Injectable, Injectable,
@@ -66,6 +67,21 @@ export class EnrollmentService {
*/ */
async createToken(dto: CreateEnrollmentTokenDto): Promise<EnrollmentTokenResult> { async createToken(dto: CreateEnrollmentTokenDto): Promise<EnrollmentTokenResult> {
const ttl = Math.min(dto.ttlSeconds, 900); const ttl = Math.min(dto.ttlSeconds, 900);
// MED-3: Verify the grantId ↔ peerId binding — prevents attacker from
// cross-wiring grants to attacker-controlled peers.
const [grant] = await this.db
.select({ peerId: federationGrants.peerId })
.from(federationGrants)
.where(eq(federationGrants.id, dto.grantId))
.limit(1);
if (!grant) {
throw new NotFoundException(`Grant ${dto.grantId} not found`);
}
if (grant.peerId !== dto.peerId) {
throw new BadRequestException(`peerId does not match the grant's registered peer`);
}
const token = crypto.randomBytes(32).toString('hex'); const token = crypto.randomBytes(32).toString('hex');
const expiresAt = new Date(Date.now() + ttl * 1000); const expiresAt = new Date(Date.now() + ttl * 1000);
@@ -99,16 +115,23 @@ export class EnrollmentService {
* 8. Return { certPem, certChainPem } * 8. Return { certPem, certChainPem }
*/ */
async redeem(token: string, csrPem: string): Promise<RedeemResult> { async redeem(token: string, csrPem: string): Promise<RedeemResult> {
// HIGH-5: Track outcome so we can write a failure audit row on any error.
let outcome: 'allowed' | 'denied' = 'denied';
// row may be undefined if the token is not found — used defensively in catch.
let row: typeof federationEnrollmentTokens.$inferSelect | undefined;
try {
// 1. Fetch token row // 1. Fetch token row
const [row] = await this.db const [fetchedRow] = await this.db
.select() .select()
.from(federationEnrollmentTokens) .from(federationEnrollmentTokens)
.where(eq(federationEnrollmentTokens.token, token)) .where(eq(federationEnrollmentTokens.token, token))
.limit(1); .limit(1);
if (!row) { if (!fetchedRow) {
throw new NotFoundException('Enrollment token not found'); throw new NotFoundException('Enrollment token not found');
} }
row = fetchedRow;
// 2. Already used? // 2. Already used?
if (row.usedAt !== null) { if (row.usedAt !== null) {
@@ -144,7 +167,10 @@ export class EnrollmentService {
.update(federationEnrollmentTokens) .update(federationEnrollmentTokens)
.set({ usedAt: sql`NOW()` }) .set({ usedAt: sql`NOW()` })
.where( .where(
and(eq(federationEnrollmentTokens.token, token), isNull(federationEnrollmentTokens.usedAt)), and(
eq(federationEnrollmentTokens.token, token),
isNull(federationEnrollmentTokens.usedAt),
),
) )
.returning({ token: federationEnrollmentTokens.token }); .returning({ token: federationEnrollmentTokens.token });
@@ -164,8 +190,9 @@ export class EnrollmentService {
ttlSeconds: 300, ttlSeconds: 300,
}); });
} catch (err) { } catch (err) {
// HIGH-4: Log only the first 8 hex chars of the token for correlation — never log the full token.
this.logger.error( this.logger.error(
`issueCert failed after token ${token} was claimed — grant ${row.grantId} is stranded pending`, `issueCert failed after token ${token.slice(0, 8)}... was claimed — grant ${row.grantId} is stranded pending`,
err instanceof Error ? err.stack : String(err), err instanceof Error ? err.stack : String(err),
); );
if (err instanceof FederationScopeError) { if (err instanceof FederationScopeError) {
@@ -177,11 +204,19 @@ export class EnrollmentService {
// 7. Atomically activate grant, update peer record, and write audit log. // 7. Atomically activate grant, update peer record, and write audit log.
const certNotAfter = this.extractCertNotAfter(issued.certPem); const certNotAfter = this.extractCertNotAfter(issued.certPem);
await this.db.transaction(async (tx) => { await this.db.transaction(async (tx) => {
await tx // CRIT-2: Guard activation with WHERE status='pending' to prevent double-activation.
const [activated] = await tx
.update(federationGrants) .update(federationGrants)
.set({ status: 'active' }) .set({ status: 'active' })
.where(eq(federationGrants.id, row.grantId)); .where(and(eq(federationGrants.id, row!.grantId), eq(federationGrants.status, 'pending')))
.returning({ id: federationGrants.id });
if (!activated) {
throw new ConflictException(
`Grant ${row!.grantId} is no longer pending — cannot activate`,
);
}
// CRIT-2: Guard peer update with WHERE state='pending'.
await tx await tx
.update(federationPeers) .update(federationPeers)
.set({ .set({
@@ -190,12 +225,12 @@ export class EnrollmentService {
certNotAfter, certNotAfter,
state: 'active', state: 'active',
}) })
.where(eq(federationPeers.id, row.peerId)); .where(and(eq(federationPeers.id, row!.peerId), eq(federationPeers.state, 'pending')));
await tx.insert(federationAuditLog).values({ await tx.insert(federationAuditLog).values({
requestId: crypto.randomUUID(), requestId: crypto.randomUUID(),
peerId: row.peerId, peerId: row!.peerId,
grantId: row.grantId, grantId: row!.grantId,
verb: 'enrollment', verb: 'enrollment',
resource: 'federation_grant', resource: 'federation_grant',
statusCode: 200, statusCode: 200,
@@ -207,24 +242,40 @@ export class EnrollmentService {
`Enrollment complete — peerId=${row.peerId} grantId=${row.grantId} serial=${issued.serialNumber}`, `Enrollment complete — peerId=${row.peerId} grantId=${row.grantId} serial=${issued.serialNumber}`,
); );
outcome = 'allowed';
// 8. Return cert material // 8. Return cert material
return { return {
certPem: issued.certPem, certPem: issued.certPem,
certChainPem: issued.certChainPem, certChainPem: issued.certChainPem,
}; };
} catch (err) {
// HIGH-5: Best-effort audit write on failure — do not let this throw.
if (outcome === 'denied') {
await this.db
.insert(federationAuditLog)
.values({
requestId: crypto.randomUUID(),
peerId: row?.peerId ?? null,
grantId: row?.grantId ?? null,
verb: 'enrollment',
resource: 'federation_grant',
statusCode:
err instanceof GoneException ? 410 : err instanceof NotFoundException ? 404 : 500,
outcome: 'denied',
})
.catch(() => {});
}
throw err;
}
} }
/** /**
* Extract the notAfter date from a PEM certificate. * Extract the notAfter date from a PEM certificate.
* Falls back to 90 days from now if parsing fails. * HIGH-2: No silent fallback — a cert that cannot be parsed should fail loud.
*/ */
private extractCertNotAfter(certPem: string): Date { private extractCertNotAfter(certPem: string): Date {
try {
const cert = new X509Certificate(certPem); const cert = new X509Certificate(certPem);
return new Date(cert.validTo); return new Date(cert.validTo);
} catch {
// Fallback: 90 days from now
return new Date(Date.now() + 90 * 24 * 60 * 60 * 1000);
}
} }
} }

View File

@@ -0,0 +1,106 @@
# Mosaic Federation — Admin CLI Reference
Available since: FED-M2
## Grant Management
### Create a grant
```bash
mosaic federation grant create --user <userId> --peer <peerId> --scope <scope-file.json>
```
The scope file defines what resources and rows the peer may access:
```json
{
"resources": ["tasks", "notes"],
"excluded_resources": ["credentials"],
"max_rows_per_query": 100
}
```
Valid resource values: `tasks`, `notes`, `credentials`, `teams`, `users`
### List grants
```bash
mosaic federation grant list [--peer <peerId>] [--status pending|active|revoked|expired]
```
Shows all federation grants, optionally filtered by peer or status.
### Show a grant
```bash
mosaic federation grant show <grantId>
```
Display details of a single grant, including its scope, activation timestamp, and status.
### Revoke a grant
```bash
mosaic federation grant revoke <grantId> [--reason "Reason text"]
```
Revoke an active grant immediately. Revoked grants cannot be reactivated. The optional reason is stored in the audit log.
### Generate enrollment token
```bash
mosaic federation grant token <grantId> [--ttl <seconds>]
```
Generate a single-use enrollment token for the grant. The default TTL is 900 seconds (15 minutes); maximum 15 minutes.
Output includes the token and the full enrollment URL for the peer to use.
## Peer Management
### Add a peer (remote enrollment)
```bash
mosaic federation peer add <enrollment-url>
```
Enroll a remote peer using the enrollment URL obtained from a grant token. The command:
1. Generates a P-256 ECDSA keypair locally
2. Creates a certificate signing request (CSR)
3. Submits the CSR to the enrollment URL
4. Verifies the returned certificate includes the correct custom OIDs (grant ID and subject user ID)
5. Seals the private key at rest using `BETTER_AUTH_SECRET`
6. Stores the peer record and sealed key in the local gateway database
Once enrollment completes, the peer can authenticate using the certificate and private key.
### List peers
```bash
mosaic federation peer list
```
Shows all enrolled peers, including their certificate fingerprints and activation status.
## REST API Reference
All CLI commands call the local gateway admin API. Equivalent REST endpoints:
| CLI Command | REST Endpoint | Method |
| ------------ | ------------------------------------------------------------------------------------------- | ----------------- |
| grant create | `/api/admin/federation/grants` | POST |
| grant list | `/api/admin/federation/grants` | GET |
| grant show | `/api/admin/federation/grants/:id` | GET |
| grant revoke | `/api/admin/federation/grants/:id/revoke` | PATCH |
| grant token | `/api/admin/federation/grants/:id/tokens` | POST |
| peer list | `/api/admin/federation/peers` | GET |
| peer add | `/api/admin/federation/peers/keypair` + enrollment + `/api/admin/federation/peers/:id/cert` | POST, POST, PATCH |
## Security Notes
- **Enrollment tokens** are single-use and expire in 15 minutes (not configurable beyond 15 minutes)
- **Peer private keys** are encrypted at rest using AES-256-GCM, keyed from `BETTER_AUTH_SECRET`
- **Custom OIDs** in issued certificates are verified post-issuance: the grant ID and subject user ID must match the certificate extensions
- **Grant activation** is atomic — concurrent enrollment attempts for the same grant are rejected
- **Revoked grants** cannot be activated; peers attempting to use a revoked grant's token will be rejected

View File

@@ -7,11 +7,11 @@
**ID:** federation-v1-20260419 **ID:** federation-v1-20260419
**Statement:** Jarvis operates across 34 workstations in two physical locations (home, USC). The user currently reaches back to a single jarvis-brain checkout from every session; a prior OpenBrain attempt caused cache, latency, and opacity pain. This mission builds asymmetric federation between Mosaic Stack gateways so that a session on a user's home gateway can query their work gateway in real time without data ever persisting across the boundary, with full multi-tenant isolation and standard-PKI (X.509 / Step-CA) trust management. **Statement:** Jarvis operates across 34 workstations in two physical locations (home, USC). The user currently reaches back to a single jarvis-brain checkout from every session; a prior OpenBrain attempt caused cache, latency, and opacity pain. This mission builds asymmetric federation between Mosaic Stack gateways so that a session on a user's home gateway can query their work gateway in real time without data ever persisting across the boundary, with full multi-tenant isolation and standard-PKI (X.509 / Step-CA) trust management.
**Phase:** M2 active — Step-CA + grant schema + admin CLI; parallel test-deploy workstream stood up **Phase:** M3 active — mTLS handshake + list/get/capabilities verbs + scope enforcement
**Current Milestone:** FED-M2 **Current Milestone:** FED-M3
**Progress:** 1 / 7 milestones **Progress:** 2 / 7 milestones
**Status:** active **Status:** active
**Last Updated:** 2026-04-21 (M2 decomposed; mos-test-1/-2 designated as federation E2E test hosts) **Last Updated:** 2026-04-21 (M2 closed via PR #503, tag `fed-v0.2.0-m2`, issue #461 closed; M3 decomposed into 14 tasks)
**Parent Mission:** None — new mission **Parent Mission:** None — new mission
## Test Infrastructure ## Test Infrastructure
@@ -63,8 +63,8 @@ Key design references:
| # | ID | Name | Status | Branch | Issue | Started | Completed | | # | ID | Name | Status | Branch | Issue | Started | Completed |
| --- | ------ | --------------------------------------------- | ----------- | ------------------ | ----- | ---------- | ---------- | | --- | ------ | --------------------------------------------- | ----------- | ------------------ | ----- | ---------- | ---------- |
| 1 | FED-M1 | Federated tier infrastructure | done | (12 PRs #470-#481) | #460 | 2026-04-19 | 2026-04-19 | | 1 | FED-M1 | Federated tier infrastructure | done | (12 PRs #470-#481) | #460 | 2026-04-19 | 2026-04-19 |
| 2 | FED-M2 | Step-CA + grant schema + admin CLI | in-progress | (decomposition) | #461 | 2026-04-21 | | | 2 | FED-M2 | Step-CA + grant schema + admin CLI | done | (PRs #483-#503) | #461 | 2026-04-21 | 2026-04-21 |
| 3 | FED-M3 | mTLS handshake + list/get + scope enforcement | not-started | — | #462 | | — | | 3 | FED-M3 | mTLS handshake + list/get + scope enforcement | in-progress | (decomposition) | #462 | 2026-04-21 | — |
| 4 | FED-M4 | search verb + audit log + rate limit | not-started | — | #463 | — | — | | 4 | FED-M4 | search verb + audit log + rate limit | not-started | — | #463 | — | — |
| 5 | FED-M5 | Cache + offline degradation + OTEL | not-started | — | #464 | — | — | | 5 | FED-M5 | Cache + offline degradation + OTEL | not-started | — | #464 | — | — |
| 6 | FED-M6 | Revocation + auto-renewal + CRL | not-started | — | #465 | — | — | | 6 | FED-M6 | Revocation + auto-renewal + CRL | not-started | — | #465 | — | — |
@@ -86,16 +86,23 @@ Key design references:
## Session History ## Session History
| Session | Date | Runtime | Outcome | | Session | Date | Runtime | Outcome |
| ------- | ---------- | ------- | --------------------------------------------------------------------- | | ------- | ----------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| S1 | 2026-04-19 | claude | PRD authored, MILESTONES decomposed, 7 issues filed | | S1 | 2026-04-19 | claude | PRD authored, MILESTONES decomposed, 7 issues filed |
| S2-S4 | 2026-04-19 | claude | FED-M1 complete: 12 tasks (PRs #470-#481) merged; tag `fed-v0.1.0-m1` | | S2-S4 | 2026-04-19 | claude | FED-M1 complete: 12 tasks (PRs #470-#481) merged; tag `fed-v0.1.0-m1` |
| S5-S22 | 2026-04-19 → 2026-04-21 | claude | FED-M2 complete: 13 tasks (PRs #483-#503) merged; tag `fed-v0.2.0-m2`; issue #461 closed. Step-CA + grant schema + admin CLI shipped. |
| S23 | 2026-04-21 | claude | M3 decomposed into 14 tasks in `docs/federation/TASKS.md`. Manifest M3 row → in-progress. Next: kickoff M3-01. |
## Next Step ## Next Step
FED-M2 active. Decomposition landed in `docs/federation/TASKS.md` (M2-01..M2-13 code workstream + DEPLOY-01..DEPLOY-05 parallel test-deploy workstream, ~88K total). Tracking issue #482. FED-M3 active. Decomposition landed in `docs/federation/TASKS.md` (M3-01..M3-14, ~100K estimate). Tracking issue #462.
Parallel execution plan: Execution plan (parallel where possible):
- **CODE workstream**: M2-01 (DB migration) starts immediately — sonnet subagent on `feat/federation-m2-schema`. Then M2-02 → M2-09 sequentially with M2-04/M2-05/M2-06/M2-07 having interleaved CA/storage/grant dependencies. - **Foundation**: M3-01 (DTOs in `packages/types/src/federation/`) starts immediately — sonnet subagent on `feat/federation-m3-types`. Blocks all server + client work.
- **DEPLOY workstream**: DEPLOY-01 (image verify) → DEPLOY-02 (stack template) → DEPLOY-03/04 (mos-test-1/-2 deploy) → DEPLOY-05 (TEST-INFRA.md). Gated on Portainer wrapper PR (`PORTAINER_INSECURE` flag) merging first. - **Server stream** (after M3-01): M3-03 (AuthGuard) + M3-04 (ScopeService) in series, then M3-05 / M3-06 / M3-07 (verbs) in parallel.
- **Re-converge** at M2-10 (E2E test) once both workstreams ready. - **Client stream** (after M3-01, parallel with server): M3-08 (FederationClient) → M3-09 (QuerySourceService).
- **Harness** (parallel with everything): M3-02 (`tools/federation-harness/`) — needed for M3-11.
- **Test gates**: M3-10 (Integration) → M3-11 (E2E with harness) → M3-12 (Independent security review, two rounds budgeted).
- **Close**: M3-13 (Docs) → M3-14 (release tag `fed-v0.3.0-m3`, close #462).
**Test-bed fallback:** `mos-test-1/-2` deploy is still blocked on `FED-M2-DEPLOY-IMG-FIX`. The harness in M3-02 ships a local two-gateway docker-compose so M3-11 is not blocked. Production-host validation is M7's responsibility (PRD AC-12).

View File

@@ -70,6 +70,96 @@ For JSON output (useful in CI/automation):
mosaic gateway doctor --json mosaic gateway doctor --json
``` ```
## Step 2: Step-CA Bootstrap
Step-CA is a certificate authority that issues X.509 certificates for federation peers. In Mosaic federation, it signs peer certificates with custom OIDs that embed grant and user identities, enforcing authorization at the certificate level.
### Prerequisites for Step-CA
Before starting the CA, you must set up the dev password:
```bash
cp infra/step-ca/dev-password.example infra/step-ca/dev-password
# Edit dev-password and set your CA password (minimum 16 characters)
```
The password is required for the CA to boot and derive the provisioner key used by the gateway.
### Start the Step-CA service
Add the step-ca service to your federated stack:
```bash
docker compose -f docker-compose.federated.yml --profile federated up -d step-ca
```
On first boot, the init script (`infra/step-ca/init.sh`) runs automatically. It:
- Generates the CA root key and certificate in the Docker volume
- Creates the `mosaic-fed` JWK provisioner
- Applies the X.509 template from `infra/step-ca/templates/federation.tpl`
The volume is persistent, so subsequent boots reuse the existing CA keys.
Verify the CA is healthy:
```bash
curl https://localhost:9000/health --cacert /tmp/step-ca-root.crt
```
(If the root cert file doesn't exist yet, see the extraction steps below.)
### Extract credentials for the gateway
The gateway requires two credentials from the running CA:
**1. Provisioner key (for `STEP_CA_PROVISIONER_KEY_JSON`)**
```bash
docker exec $(docker ps -qf name=step-ca) cat /home/step/secrets/mosaic-fed.json > /tmp/step-ca-provisioner.json
```
This JSON file contains the JWK public and private keys for the `mosaic-fed` provisioner. Store it securely and pass its contents to the gateway via the `STEP_CA_PROVISIONER_KEY_JSON` environment variable.
**2. Root certificate (for `STEP_CA_ROOT_CERT_PATH`)**
```bash
docker cp $(docker ps -qf name=step-ca):/home/step/certs/root_ca.crt /tmp/step-ca-root.crt
```
This PEM file is the CA's root certificate, used to verify peer certificates issued by step-ca. Pass its path to the gateway via `STEP_CA_ROOT_CERT_PATH`.
### Custom OID Registry
Federation certificates include custom OIDs in the certificate extension. These encode authorization metadata:
| OID | Name | Description |
| ------------------- | ---------------------- | --------------------- |
| 1.3.6.1.4.1.99999.1 | mosaic_grant_id | Federation grant UUID |
| 1.3.6.1.4.1.99999.2 | mosaic_subject_user_id | Subject user UUID |
These OIDs are verified by the gateway after the CSR is signed, ensuring the certificate was issued with the correct grant and user context.
### Environment Variables
Configure the gateway with the following environment variables before startup:
| Variable | Required | Description |
| ------------------------------ | -------- | --------------------------------------------------------------------------------------------------------- |
| `STEP_CA_URL` | Yes | Base URL of the step-ca instance, e.g. `https://step-ca:9000` (use `https://localhost:9000` in local dev) |
| `STEP_CA_PROVISIONER_KEY_JSON` | Yes | JSON-encoded JWK from `/home/step/secrets/mosaic-fed.json` |
| `STEP_CA_ROOT_CERT_PATH` | Yes | Absolute path to the root CA certificate (e.g. `/tmp/step-ca-root.crt`) |
| `BETTER_AUTH_SECRET` | Yes | Secret used to seal peer private keys at rest; already required for M1 |
Example environment setup:
```bash
export STEP_CA_URL="https://localhost:9000"
export STEP_CA_PROVISIONER_KEY_JSON="$(cat /tmp/step-ca-provisioner.json)"
export STEP_CA_ROOT_CERT_PATH="/tmp/step-ca-root.crt"
export BETTER_AUTH_SECRET="<your-secret>"
```
## Troubleshooting ## Troubleshooting
### Port conflicts ### Port conflicts

View File

@@ -64,20 +64,20 @@ Goal: Two federated-tier gateways stood up on Portainer at `mos-test-1.woltje.co
Goal: An admin can create a federation grant; counterparty enrolls; cert is signed by Step-CA with SAN OIDs for `grantId` + `subjectUserId`. No runtime federation traffic flows yet (that's M3). Goal: An admin can create a federation grant; counterparty enrolls; cert is signed by Step-CA with SAN OIDs for `grantId` + `subjectUserId`. No runtime federation traffic flows yet (that's M3).
| id | status | description | issue | agent | branch | depends_on | estimate | notes | | id | status | description | issue | agent | branch | depends_on | estimate | notes |
| --------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | ------ | ---------------------------------- | ---------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | --------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | ------ | ---------------------------------- | ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| FED-M2-01 | needs-qa | DB migration: `federation_grants`, `federation_peers`, `federation_audit_log` tables + enum types (`grant_status`, `peer_state`). Drizzle schema + migration generation; migration tests. | #461 | sonnet | feat/federation-m2-schema | — | 5K | PR #486 open. First review NEEDS CHANGES (missing DESC indexes + reserved cols). Remediation subagent `a673dd9355dc26f82` in flight in worktree `agent-a4404ac1`. | | FED-M2-01 | done | DB migration: `federation_grants`, `federation_peers`, `federation_audit_log` tables + enum types (`grant_status`, `peer_state`). Drizzle schema + migration generation; migration tests. | #461 | sonnet | feat/federation-m2-schema | — | 5K | Shipped in PR #486. DESC indexes + reserved cols added after first review; migration tests green. |
| FED-M2-02 | not-started | Add Step-CA sidecar to `docker-compose.federated.yml`: official `smallstep/step-ca` image, persistent CA volume, JWK provisioner config baked into init script. | #461 | sonnet | feat/federation-m2-stepca | DEPLOY-02 | 4K | Profile-gated under `federated`. CA password from secret; dev compose uses dev-only password file. | | FED-M2-02 | done | Add Step-CA sidecar to `docker-compose.federated.yml`: official `smallstep/step-ca` image, persistent CA volume, JWK provisioner config baked into init script. | #461 | sonnet | feat/federation-m2-stepca | DEPLOY-02 | 4K | Shipped in PR #494. Profile-gated under `federated`; CA password from secret; dev compose uses dev-only password file. |
| FED-M2-03 | not-started | Scope JSON schema + validator: `resources` allowlist, `excluded_resources`, `include_teams`, `include_personal`, `max_rows_per_query`. Vitest unit tests for valid + invalid scopes. | #461 | sonnet | feat/federation-m2-scope-schema | — | 4K | Validator independent of CA reusable from grant CRUD + (later) M3 scope enforcement. | | FED-M2-03 | done | Scope JSON schema + validator: `resources` allowlist, `excluded_resources`, `include_teams`, `include_personal`, `max_rows_per_query`. Vitest unit tests for valid + invalid scopes. | #461 | sonnet | feat/federation-m2-scope-schema | — | 4K | Shipped in PR #496 (bundled with grants service). Validator independent of CA; reusable from grant CRUD + M3 scope enforcement. |
| FED-M2-04 | not-started | `apps/gateway/src/federation/ca.service.ts`: Step-CA client (CSR submission, OID-bearing cert retrieval). Mocked + integration tests against real Step-CA container. | #461 | sonnet | feat/federation-m2-ca-service | M2-02 | 6K | SAN OIDs: `grantId` (custom OID 1.3.6.1.4.1.99999.1) + `subjectUserId` (1.3.6.1.4.1.99999.2). Document OID assignments in PRD/SETUP. **Acceptance**: must (a) wire `federation.tpl` template into `mosaic-fed` provisioner config and (b) include a unit/integration test asserting issued certs contain BOTH OIDs — fails-loud guard against silent OID stripping (carry-forward from M2-02 review). | | FED-M2-04 | done | `apps/gateway/src/federation/ca.service.ts`: Step-CA client (CSR submission, OID-bearing cert retrieval). Mocked + integration tests against real Step-CA container. | #461 | sonnet | feat/federation-m2-ca-service | M2-02 | 6K | Shipped in PR #494. SAN OIDs 1.3.6.1.4.1.99999.1 (grantId) + 1.3.6.1.4.1.99999.2 (subjectUserId); integration test asserts both OIDs present in issued cert. |
| FED-M2-05 | not-started | Sealed storage for `client_key_pem` reusing existing `provider_credentials` sealing key. Tests prove DB-at-rest is ciphertext, not PEM. Key rotation path documented (deferred impl). | #461 | sonnet | feat/federation-m2-key-sealing | M2-01 | 5K | Separate from M2-06 to keep crypto seam isolated; reviewer focus is sealing only. | | FED-M2-05 | done | Sealed storage for `client_key_pem` reusing existing `provider_credentials` sealing key. Tests prove DB-at-rest is ciphertext, not PEM. Key rotation path documented (deferred impl). | #461 | sonnet | feat/federation-m2-key-sealing | M2-01 | 5K | Shipped in PR #495. Crypto seam isolated; tests confirm ciphertext-at-rest; key rotation deferred to M6. |
| FED-M2-06 | not-started | `grants.service.ts`: CRUD + status transitions (`pending``active``revoked`); integrates M2-03 (scope) + M2-05 (sealing). Unit tests cover all transitions including invalid ones. | #461 | sonnet | feat/federation-m2-grants-service | M2-03, M2-05 | 6K | Business logic only — CSR + cert work delegated to M2-04. Revocation handler is M6. | | FED-M2-06 | done | `grants.service.ts`: CRUD + status transitions (`pending``active``revoked`); integrates M2-03 (scope) + M2-05 (sealing). Unit tests cover all transitions including invalid ones. | #461 | sonnet | feat/federation-m2-grants-service | M2-03, M2-05 | 6K | Shipped in PR #496. All status transitions covered; invalid transition tests green; revocation handler deferred to M6. |
| FED-M2-07 | not-started | `enrollment.controller.ts`: short-lived single-use token endpoint; CSR signing; updates grant `pending``active`; emits enrollment audit (table-only write, M4 tightens). | #461 | sonnet | feat/federation-m2-enrollment | M2-04, M2-06 | 6K | Tokens single-use with 410 on replay; tokens TTL'd at 15min; rate-limited at request layer (M4 introduces guard, M2 uses simple lock). | | FED-M2-07 | done | `enrollment.controller.ts`: short-lived single-use token endpoint; CSR signing; updates grant `pending``active`; emits enrollment audit (table-only write, M4 tightens). | #461 | sonnet | feat/federation-m2-enrollment | M2-04, M2-06 | 6K | Shipped in PR #497. Tokens single-use with 410 on replay; TTL 15min; rate-limited at request layer. |
| FED-M2-08 | not-started | Admin CLI: `mosaic federation grant create/list/show` + `peer add/list`. Integration with grants.service (no API duplication). Help output + machine-readable JSON option. | #461 | sonnet | feat/federation-m2-cli | M2-06, M2-07 | 7K | `peer add <enrollment-url>` is the client-side flow; resolves enrollment URL → CSR → store sealed key + cert. | | FED-M2-08 | done | Admin CLI: `mosaic federation grant create/list/show` + `peer add/list`. Integration with grants.service (no API duplication). Help output + machine-readable JSON option. | #461 | sonnet | feat/federation-m2-cli | M2-06, M2-07 | 7K | Shipped in PR #498. `peer add <enrollment-url>` client-side flow; JSON output flag; admin REST controller co-shipped. |
| FED-M2-09 | not-started | Integration tests covering MILESTONES.md M2 acceptance tests #1, #2, #3, #5, #7, #8 (single-gateway suite). Real Step-CA container; vitest profile gated by `FEDERATED_INTEGRATION=1`. | #461 | sonnet | feat/federation-m2-integration | M2-08 | 8K | Tests #4 (cert OID match) + #6 (two-gateway peer-add) handled separately by M2-10 (E2E). | | FED-M2-09 | done | Integration tests covering MILESTONES.md M2 acceptance tests #1, #2, #3, #5, #7, #8 (single-gateway suite). Real Step-CA container; vitest profile gated by `FEDERATED_INTEGRATION=1`. | #461 | sonnet | feat/federation-m2-integration | M2-08 | 8K | Shipped in PR #499. All 6 acceptance tests green; gated by FEDERATED_INTEGRATION=1. |
| FED-M2-10 | not-started | E2E test against deployed mos-test-1 + mos-test-2 (or local two-gateway docker-compose if Portainer not ready): MILESTONES test #6 `peer add` yields `active` peer record with valid cert + key. | #461 | sonnet | feat/federation-m2-e2e | M2-08, DEPLOY-04 | 6K | Falls back to local docker-compose-two-gateways if remote test hosts not yet available. Documents both paths. | | FED-M2-10 | done | E2E test against deployed mos-test-1 + mos-test-2 (or local two-gateway docker-compose if Portainer not ready): MILESTONES test #6 `peer add` yields `active` peer record with valid cert + key. | #461 | sonnet | feat/federation-m2-e2e | M2-08, DEPLOY-04 | 6K | Shipped in PR #500. Local two-gateway docker-compose path used; `peer add` yields active peer with valid cert + sealed key. |
| FED-M2-11 | not-started | Independent security review (sonnet, not author of M2-04/05/06/07): focus on single-use token replay, sealing leak surfaces, OID match enforcement, scope schema bypass paths. | #461 | sonnet | feat/federation-m2-security-review | M2-10 | 8K | Apply M1 two-round pattern. Reviewer should explicitly attempt enrollment-token replay, OID-spoofing CSR, and key leak in error messages. | | FED-M2-11 | done | Independent security review (sonnet, not author of M2-04/05/06/07): focus on single-use token replay, sealing leak surfaces, OID match enforcement, scope schema bypass paths. | #461 | sonnet | feat/federation-m2-security-review | M2-10 | 8K | Shipped in PR #501. Two-round review; enrollment-token replay, OID-spoofing CSR, and key leak in error messages all verified and hardened. |
| FED-M2-12 | not-started | Docs update: `docs/federation/SETUP.md` Step-CA section; new `docs/federation/ADMIN-CLI.md` with grant/peer commands; scope schema reference; OID registration note. Runbook still M7-deferred. | #461 | haiku | feat/federation-m2-docs | M2-11 | 4K | Adds CA bootstrap section to SETUP.md with `docker compose --profile federated up step-ca` example. | | FED-M2-12 | done | Docs update: `docs/federation/SETUP.md` Step-CA section; new `docs/federation/ADMIN-CLI.md` with grant/peer commands; scope schema reference; OID registration note. Runbook still M7-deferred. | #461 | haiku | feat/federation-m2-docs | M2-11 | 4K | Shipped in PR #502. SETUP.md CA bootstrap section added; ADMIN-CLI.md created; scope schema reference and OID note included. |
| FED-M2-13 | not-started | PR aggregate close, CI green, merge to main, close #461. Release tag `fed-v0.2.0-m2`. Mark deploy stream complete. Update mission manifest M2 row. | #461 | sonnet | feat/federation-m2-close | M2-12 | 3K | Same close pattern as M1-12; queue-guard before merge; tea release-create with notes including deploy-stream PRs. | | FED-M2-13 | done | PR aggregate close, CI green, merge to main, close #461. Release tag `fed-v0.2.0-m2`. Mark deploy stream complete. Update mission manifest M2 row. | #461 | sonnet | chore/federation-m2-close | M2-12 | 3K | Release tag `fed-v0.2.0-m2` created; issue #461 closed; all M2 PRs #494#502 merged to main. |
**M2 code workstream estimate:** ~72K tokens (vs MILESTONES.md 30K — same over-budget pattern as M1, where per-task breakdown including tests/review/docs catches the real cost). **M2 code workstream estimate:** ~72K tokens (vs MILESTONES.md 30K — same over-budget pattern as M1, where per-task breakdown including tests/review/docs catches the real cost).
@@ -85,7 +85,38 @@ Goal: An admin can create a federation grant; counterparty enrolls; cert is sign
## Milestone 3 — mTLS handshake + list/get + scope enforcement (FED-M3) ## Milestone 3 — mTLS handshake + list/get + scope enforcement (FED-M3)
_Deferred. Issue #462._ Goal: Two federated gateways exchange real data over mTLS. Inbound requests pass through cert validation → grant lookup → scope enforcement → native RBAC → response. `list`, `get`, and `capabilities` verbs land. The federation E2E harness (`tools/federation-harness/`) is the new permanent test bed for M3+ and is gated on every milestone going forward.
> **Critical trust boundary.** Every 401/403 path needs a test. Code review is non-negotiable; M3-12 budgets two review rounds.
>
> **Tracking issue:** #462.
| id | status | description | issue | agent | branch | depends_on | estimate | notes |
| --------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | ------ | ------------------------------------ | ---------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| FED-M3-01 | not-started | `packages/types/src/federation/` — request/response DTOs for `list`, `get`, `capabilities` verbs. Wire-format zod schemas + inferred TS types. Includes `FederationRequest`, `FederationListResponse<T>`, `FederationGetResponse<T>`, `FederationCapabilitiesResponse`, error envelope, `_source` tag. | #462 | sonnet | feat/federation-m3-types | — | 4K | Reusable from gateway server + client + harness. Pure types — no I/O, no NestJS. |
| FED-M3-02 | not-started | `tools/federation-harness/` scaffold: `docker-compose.two-gateways.yml` (Server A + Server B + step-CA), `seed.ts` (provisions grants, peers, sample tasks/notes/credentials per scope variant), `harness.ts` helper (boots stack, returns typed clients). README documents harness use. | #462 | sonnet | feat/federation-m3-harness | DEPLOY-04 (soft) | 8K | Falls back to local docker-compose if `mos-test-1/-2` not yet redeployed (DEPLOY chain blocked on IMG-FIX). Permanent test infra used by M3+. |
| FED-M3-03 | not-started | `apps/gateway/src/federation/server/federation-auth.guard.ts` (NestJS guard). Validates inbound client cert from Fastify TLS context, extracts `grantId` + `subjectUserId` from custom OIDs, loads grant from DB, asserts `status='active'`, attaches `FederationContext` to request. | #462 | sonnet | feat/federation-m3-auth-guard | M3-01 | 8K | Reuses OID parsing logic mirrored from `ca.service.ts` post-issuance verification. 401 on malformed/missing OIDs; 403 on revoked/expired/missing grant. |
| FED-M3-04 | not-started | `apps/gateway/src/federation/server/scope.service.ts`. Pipeline: (1) resource allowlist + excluded check, (2) native RBAC eval as `subjectUserId`, (3) scope filter intersection (`include_teams`, `include_personal`), (4) `max_rows_per_query` cap. Pure service — DB calls injected. | #462 | sonnet | feat/federation-m3-scope-service | M3-01 | 10K | Hardest correctness target in M3. Reuses `parseFederationScope` (M2-03). Returns either `{ allowed: true, filter }` or structured deny reason for audit. |
| FED-M3-05 | not-started | `apps/gateway/src/federation/server/verbs/list.controller.ts`. Wires AuthGuard → ScopeService → tasks/notes/memory query layer; applies row cap; tags rows with `_source`. Resource selector via path param. | #462 | sonnet | feat/federation-m3-verb-list | M3-03, M3-04 | 6K | Routes: `POST /api/federation/v1/list/:resource`. No body persistence. Audit write deferred to M4. |
| FED-M3-06 | not-started | `apps/gateway/src/federation/server/verbs/get.controller.ts`. Single-resource fetch by id; same pipeline as list. 404 on not-found, 403 on RBAC/scope deny — both audited the same way. | #462 | sonnet | feat/federation-m3-verb-get | M3-03, M3-04 | 6K | `POST /api/federation/v1/get/:resource/:id`. Mirrors list controller patterns. |
| FED-M3-07 | not-started | `apps/gateway/src/federation/server/verbs/capabilities.controller.ts`. Read-only enumeration: returns `{ resources, excluded_resources, max_rows_per_query, supported_verbs }` derived from grant scope. Always allowed for an active grant — no RBAC eval. | #462 | sonnet | feat/federation-m3-verb-capabilities | M3-03 | 4K | `GET /api/federation/v1/capabilities`. Smallest verb; useful sanity check that mTLS + auth guard work end-to-end. |
| FED-M3-08 | not-started | `apps/gateway/src/federation/client/federation-client.service.ts`. Outbound mTLS dialer: picks `(certPem, sealed clientKey)` from `federation_peers`, unwraps key, builds undici Agent with mTLS, calls peer verb, parses typed response, wraps non-2xx into `FederationClientError`. | #462 | sonnet | feat/federation-m3-client | M3-01 | 8K | Independent of server stream — can land in parallel with M3-03/04. Cert/key cached per-peer; flushed by future M5/M6 logic. |
| FED-M3-09 | not-started | `apps/gateway/src/federation/client/query-source.service.ts`. Accepts `source: "local" \| "federated:<host>" \| "all"` from gateway query layer; for `"all"` fans out to local + each peer in parallel; merges results; tags every row with `_source`. | #462 | sonnet | feat/federation-m3-query-source | M3-08 | 8K | Per-peer failure surfaces as `_partial: true` in response, not hard failure (sets up M5 offline UX). M5 adds caching + circuit breaker on top. |
| FED-M3-10 | not-started | Integration tests for MILESTONES.md M3 acceptance #6 (malformed OIDs → 401; valid cert + revoked grant → 403) and #7 (`max_rows_per_query` cap). Real PG, mocked TLS context (Fastify req shim). | #462 | sonnet | feat/federation-m3-integration | M3-05, M3-06 | 8K | Vitest profile gated by `FEDERATED_INTEGRATION=1`. Single-gateway suite; no harness required. |
| FED-M3-11 | not-started | E2E tests for MILESTONES.md M3 acceptance #1, #2, #3, #4, #5, #8, #9, #10 (8 cases). Uses harness from M3-02; two real gateways, real Step-CA, real mTLS. Each test asserts both happy-path response and audit/no-persist invariants. | #462 | sonnet | feat/federation-m3-e2e | M3-02, M3-09 | 12K | Largest single task. Each acceptance gets its own `it(...)` for clear failure attribution. |
| FED-M3-12 | not-started | Independent security review (sonnet, not author of M3-03/04/05/06/07/08/09): focus on cert-SAN spoofing, OID extraction edge cases, scope-bypass via filter manipulation, RBAC-bypass via subjectUser swap, response leakage when scope deny. | #462 | sonnet | feat/federation-m3-security-review | M3-11 | 10K | Two review rounds budgeted. PRD requires explicit test for every 401/403 path — review verifies coverage. |
| FED-M3-13 | not-started | Docs update: `docs/federation/SETUP.md` mTLS handshake section, new `docs/federation/HARNESS.md` for federation-harness usage, OID reference table in SETUP.md, scope enforcement pipeline diagram. Runbook still M7-deferred. | #462 | haiku | feat/federation-m3-docs | M3-12 | 5K | One ASCII diagram for the auth-guard → scope → RBAC pipeline; helps future reviewers reason about denial paths. |
| FED-M3-14 | not-started | PR aggregate close, CI green, merge to main, close #462. Release tag `fed-v0.3.0-m3`. Update mission manifest M3 row → done; M4 row → in-progress when work begins. | #462 | sonnet | chore/federation-m3-close | M3-13 | 3K | Same close pattern as M1-12 / M2-13. |
**M3 estimate:** ~100K tokens (vs MILESTONES.md 40K — same per-task breakdown pattern as M1/M2: tests, review, and docs split out from implementation cost). Largest milestone in the federation mission.
**Parallelization opportunities:**
- M3-08 (client) can land in parallel with M3-03/M3-04 (server pipeline) — they only share DTOs from M3-01.
- M3-02 (harness) can land in parallel with everything except M3-11.
- M3-05/M3-06/M3-07 (verbs) are independent of each other once M3-03/M3-04 land.
**Test bed fallback:** If `mos-test-1.woltje.com` / `mos-test-2.woltje.com` are still blocked on `FED-M2-DEPLOY-IMG-FIX` when M3-11 is ready to run, the harness's local `docker-compose.two-gateways.yml` is a sufficient stand-in. Production-host validation moves to M7 acceptance suite (PRD AC-12).
## Milestone 4 — search + audit + rate limit (FED-M4) ## Milestone 4 — search + audit + rate limit (FED-M4)

View File

@@ -612,3 +612,44 @@ Independent security review surfaced three high-impact and four medium findings;
7. DEPLOY-03/04 acceptance probes (`mosaic gateway doctor --json`, pgvector `vector(3)` round-trip) 7. DEPLOY-03/04 acceptance probes (`mosaic gateway doctor --json`, pgvector `vector(3)` round-trip)
8. DEPLOY-05: author `docs/federation/TEST-INFRA.md` 8. DEPLOY-05: author `docs/federation/TEST-INFRA.md`
9. M2-02 (Step-CA sidecar) kicks off after image health is green 9. M2-02 (Step-CA sidecar) kicks off after image health is green
### Session 23 — 2026-04-21 — M2 close + M3 decomposition
**Closed at compaction boundary:** all 13 M2 tasks done, PRs #494#503 merged to `main`, tag `fed-v0.2.0-m2` published, Gitea release notes posted, issue #461 closed. Main at `4ece6dc6`.
**M2 hardening landed in PR #501** (security review remediation):
- CRIT-1: post-issuance OID verification in `ca.service.ts` (rejects cert if `mosaic_grant_id` / `mosaic_subject_user_id` extensions missing or mismatched)
- CRIT-2: atomic activation guard `WHERE status='pending'` on grant + `WHERE state='pending'` on peer; throws `ConflictException` if lost race
- HIGH-2: removed try/catch fallback in `extractCertNotAfter` — parse failures propagate as 500 (no silent 90-day default)
- HIGH-4: token slice for logging (`${token.slice(0, 8)}...`) — no full token in stdout
- HIGH-5: `redeem()` wrapped in try/catch with best-effort failure audit; uses `null` (not `'unknown'`) for nullable UUID FK fallback
- MED-3: `createToken` validates `grant.peerId === dto.peerId`; `BadRequestException` on mismatch
**Remaining M2 security findings deferred to M3+:**
- HIGH-1: peerId/subjectUserId tenancy validation on `createGrant` (M3 ScopeService work surfaces this)
- HIGH-3: Step-CA cert SHA-256 fingerprint pinning (M5 cert handling)
- MED-1: token entropy already 32 bytes — wontfix
- MED-2: per-route rate limit on enrollment endpoint (M4 rate limit work)
- MED-4: CSR CN binding to peer's commonName (M3 AuthGuard work)
**M3 decomposition landed in this session:**
- 14 tasks (M3-01..M3-14), ~100K estimate
- Structure mirrors M1/M2 pattern: foundation → server stream + client stream + harness in parallel → integration → E2E → security review → docs → close
- M3-02 ships local two-gateway docker-compose (`tools/federation-harness/`) so M3-11 E2E is not blocked on the Portainer test bed (which is still blocked on `FED-M2-DEPLOY-IMG-FIX`)
**Subagent doctrine retained from M2:**
- All worker subagents use `isolation: "worktree"` to prevent branch-race incidents
- Code review is independent (different subagent, no overlap with author of work)
- `tea pr create --repo mosaicstack/stack --login mosaicstack` is the working PR-create path; `pr-create.sh` has shell-quoting bugs (followup #45 if not already filed)
- Cost tier: foundational implementation = sonnet, docs = haiku, complex multi-file architecture (security review, scope service) = sonnet with two review rounds
**Next concrete step:**
1. PR for the M3 planning artifact (this commit) — branch `docs/federation-m3-planning`
2. After merge, kickoff M3-01 (DTOs) on `feat/federation-m3-types` with sonnet subagent in worktree
3. Once M3-01 lands, fan out: M3-02 (harness) || M3-03 (AuthGuard) → M3-04 (ScopeService) || M3-08 (FederationClient)
4. Re-converge at M3-10 (Integration) → M3-11 (E2E)