# Mission Manifest — Federation v1

> Persistent document tracking full mission scope, status, and session history.
> Updated by the orchestrator at each phase transition and milestone completion.

## Mission

**ID:** federation-v1-20260419
**Statement:** Jarvis operates across 3–4 workstations in two physical locations (home, USC). The user currently reaches back to a single jarvis-brain checkout from every session; a prior OpenBrain attempt caused cache, latency, and opacity pain. This mission builds asymmetric federation between Mosaic Stack gateways so that a session on a user's home gateway can query their work gateway in real time without data ever persisting across the boundary, with full multi-tenant isolation and standard-PKI (X.509 / Step-CA) trust management.
**Phase:** M2 active — Step-CA + grant schema + admin CLI; parallel test-deploy workstream stood up
**Current Milestone:** FED-M2
**Progress:** 1 / 7 milestones
**Status:** active
**Last Updated:** 2026-04-21 (M2 decomposed; mos-test-1/-2 designated as federation E2E test hosts)
**Parent Mission:** None — new mission

## Test Infrastructure

| Host                    | Role                                | Image                                 | Tier      |
| ----------------------- | ----------------------------------- | ------------------------------------- | --------- |
| `mos-test-1.woltje.com` | Federation Server A (querying side) | `gateway:fed-v0.1.0-m1` (M1 baseline) | federated |
| `mos-test-2.woltje.com` | Federation Server B (serving side)  | `gateway:fed-v0.1.0-m1` (M1 baseline) | federated |

These are test hosts for federation E2E (M3+), distinct from the PRD AC-12 production targets (`woltje.com` ↔ `uscllc.com`). The deployment workstream is tracked in `docs/federation/TASKS.md` under FED-M2-DEPLOY-\*.

## Context

Federation addresses the problem that originally drove OpenBrain. The prior attempt coupled every agent session to a remote service, introduced cache/latency/opacity pain, and created a hard dependency that punished offline use. This redesign:

1. Makes federation **gateway-to-gateway**, not agent-to-service
2. Keeps each user's home instance as the source of truth for their data
3. Exposes scoped, read-only data on demand without persisting it across the boundary
4. Uses X.509 mTLS via Step-CA so that rotation, revocation, CRL, and OCSP follow standard PKI practice
5. Supports multi-tenant serving sides (employees on uscllc.com each federating back to their own home gateway) with no cross-user leakage
6. Requires federated-tier instances on both sides (PG + pgvector + Valkey); local/standalone tiers cannot federate
7. Works over public HTTPS (no VPN required); Tailscale is an optional overlay

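As a rough illustration of design point 4, the sketch below builds a server-side TLS context that rejects any peer that does not present a client certificate, using Python's standard `ssl` module. It is not the gateway's implementation, whose language and configuration are not described here. The function name `build_serving_context` and its file-path parameters are hypothetical; in a real deployment the CA bundle would presumably be the Step-CA root.

```python
import ssl
from typing import Optional


def build_serving_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Return a server-side context that requires a client certificate (mTLS)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the peer presents a
    # certificate that chains to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # Hypothetical path to the CA root used to verify peer gateways.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        # This gateway's own certificate and key, presented to the peer.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```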

Key design references:

- `docs/federation/PRD.md` — 16-section product requirements
- `docs/federation/MILESTONES.md` — 7-milestone decomposition with per-milestone acceptance tests
- `docs/federation/TASKS.md` — per-task breakdown (M1 populated; M2-M7 deferred to mission planning)
- `docs/research/mempalace-evaluation/` (in jarvis-brain) — why we didn't adopt MemPalace

## Success Criteria

- [ ] AC-1: Two Mosaic Stack gateways on different hosts can establish a federation grant via CLI-driven onboarding
- [ ] AC-2: Server A can query Server B for `tasks`, `notes`, `memory` respecting scope filters
- [ ] AC-3: User on B with no grant cannot be queried by A, even if A has a valid grant for another user (cross-user isolation)
- [ ] AC-4: Revoking a grant on B causes A's next request to fail with a clear error within one request cycle
- [ ] AC-5: Cert rotation happens automatically at T-7 days; in-progress session survives rotation without user action
- [ ] AC-6: Rate-limit enforcement returns 429 with `Retry-After`; client backs off
- [ ] AC-7: With B unreachable, a session on A completes using local data and surfaces "federation offline for `<peer>`" once per session
- [ ] AC-8: Every federated request appears in B's `federation_audit_log` within 1 second
- [ ] AC-9: Scope excluding `credentials` means credentials are never returned — even via `search` with matching keywords
- [ ] AC-10: `mosaic federation status` shows cert expiry, grant status, last success/failure per peer
- [ ] AC-11: Full 3-employee multi-tenant scenario passes with no cross-user leakage
- [ ] AC-12: Two-gateway production deployment (woltje.com ↔ uscllc.com) operational ≥7 days without incident
- [ ] AC-13: All 7 milestones ship as merged PRs with green CI and closed issues

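The isolation criteria above (AC-2, AC-3, AC-9) can be sketched as a single filter applied on the serving side. This is a minimal illustration under stated assumptions, not the actual grant schema: the `Grant` and `Record` types, the `(peer, user)` lookup key, and the `search` signature are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: one user on B, plus the resource types in scope."""
    subject_user: str
    scope: FrozenSet[str]


@dataclass(frozen=True)
class Record:
    owner: str
    resource_type: str  # e.g. "tasks", "notes", "memory", "credentials"
    text: str


def search(
    records: List[Record],
    grants: Dict[Tuple[str, str], Grant],
    peer: str,
    user: str,
    keyword: str,
) -> List[Record]:
    """Return keyword matches that the (peer, user) grant allows."""
    grant = grants.get((peer, user))
    if grant is None:
        # AC-3: a grant for another user never authorizes this one.
        raise PermissionError(f"no grant for {user!r} from {peer!r}")
    needle = keyword.lower()
    return [
        r
        for r in records
        if r.owner == grant.subject_user      # cross-user isolation
        and r.resource_type in grant.scope    # AC-2 / AC-9: scope filter
        and needle in r.text.lower()
    ]
```

Because the scope check runs before results are returned, a keyword that matches a `credentials` record still yields nothing unless `credentials` is explicitly in scope.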
## Milestones

| #   | ID     | Name                                          | Status      | Branch             | Issue | Started    | Completed  |
| --- | ------ | --------------------------------------------- | ----------- | ------------------ | ----- | ---------- | ---------- |
| 1   | FED-M1 | Federated tier infrastructure                 | done        | (12 PRs #470-#481) | #460  | 2026-04-19 | 2026-04-19 |
| 2   | FED-M2 | Step-CA + grant schema + admin CLI            | in-progress | (decomposition)    | #461  | 2026-04-21 | —          |
| 3   | FED-M3 | mTLS handshake + list/get + scope enforcement | not-started | —                  | #462  | —          | —          |
| 4   | FED-M4 | search verb + audit log + rate limit          | not-started | —                  | #463  | —          | —          |
| 5   | FED-M5 | Cache + offline degradation + OTEL            | not-started | —                  | #464  | —          | —          |
| 6   | FED-M6 | Revocation + auto-renewal + CRL               | not-started | —                  | #465  | —          | —          |
| 7   | FED-M7 | Multi-user RBAC hardening + acceptance suite  | not-started | —                  | #466  | —          | —          |

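For the auto-renewal work in FED-M6, the "T-7 days" rule from AC-5 reduces to a date comparison against the certificate's expiry. The sketch below shows only that check; the name `renewal_due` and the idea of passing `now` explicitly (to keep the function testable) are assumptions, and how the renewal itself is requested from Step-CA is out of scope.

```python
from datetime import datetime, timedelta, timezone

# AC-5: renew once the certificate is within 7 days of expiry.
RENEWAL_WINDOW = timedelta(days=7)


def renewal_due(not_after: datetime, now: datetime) -> bool:
    """True when `now` is inside the renewal window before `not_after`."""
    return now >= not_after - RENEWAL_WINDOW


expiry = datetime(2026, 5, 1, tzinfo=timezone.utc)
print(renewal_due(expiry, datetime(2026, 4, 20, tzinfo=timezone.utc)))  # False
print(renewal_due(expiry, datetime(2026, 4, 24, tzinfo=timezone.utc)))  # True
```

The dates in the usage lines are made up for illustration; 2026-04-24 is exactly seven days before the example expiry, so it is the first instant at which renewal is due.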
## Budget

| Milestone | Est. tokens | Parallelizable?        |
| --------- | ----------- | ---------------------- |
| FED-M1    | 20K         | No (foundation)        |
| FED-M2    | 30K         | No (needs M1)          |
| FED-M3    | 40K         | No (needs M2)          |
| FED-M4    | 20K         | No (needs M3)          |
| FED-M5    | 20K         | Yes (with M6 after M4) |
| FED-M6    | 20K         | Yes (with M5 after M3) |
| FED-M7    | 25K         | No (needs all)         |
| **Total** | **~175K**   |                        |

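As a quick arithmetic check, the per-milestone estimates in the table can be summed to confirm the stated total. The dictionary below copies the table values (in thousands of tokens); the variable names are illustrative only.

```python
# Estimated tokens per milestone, in thousands, copied from the Budget table.
estimates_k = {
    "FED-M1": 20,
    "FED-M2": 30,
    "FED-M3": 40,
    "FED-M4": 20,
    "FED-M5": 20,
    "FED-M6": 20,
    "FED-M7": 25,
}

total_k = sum(estimates_k.values())
print(f"{total_k}K")  # 175K
```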
## Session History

| Session | Date       | Runtime | Outcome                                                               |
| ------- | ---------- | ------- | --------------------------------------------------------------------- |
| S1      | 2026-04-19 | claude  | PRD authored, MILESTONES decomposed, 7 issues filed                   |
| S2-S4   | 2026-04-19 | claude  | FED-M1 complete: 12 tasks (PRs #470-#481) merged; tag `fed-v0.1.0-m1` |

## Next Step

FED-M2 is active. The decomposition landed in `docs/federation/TASKS.md` (M2-01..M2-13 code workstream + DEPLOY-01..DEPLOY-05 parallel test-deploy workstream, ~88K total). Tracking issue: #482.

Parallel execution plan:

- **CODE workstream**: M2-01 (DB migration) starts immediately (sonnet subagent on `feat/federation-m2-schema`). M2-02 → M2-09 then follow sequentially; M2-04/M2-05/M2-06/M2-07 have interleaved CA/storage/grant dependencies.
- **DEPLOY workstream**: DEPLOY-01 (image verify) → DEPLOY-02 (stack template) → DEPLOY-03/04 (mos-test-1/-2 deploy) → DEPLOY-05 (TEST-INFRA.md). Gated on the Portainer wrapper PR (`PORTAINER_INSECURE` flag) merging first.
- **Re-converge** at M2-10 (E2E test) once both workstreams are ready.
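
The parallel plan above can be modeled as a small dependency graph and checked with Python's standard `graphlib`. This is a simplified sketch: it treats M2-01 through M2-09 as strictly sequential (the interleaved dependencies among M2-04..M2-07 are not spelled out here), assumes M2-10 waits on the last task of each workstream (M2-09 and DEPLOY-05), and omits M2-11..M2-13 and the Portainer gate because their ordering is not described.

```python
from graphlib import TopologicalSorter

# CODE workstream: M2-01 .. M2-09, each depending on its predecessor
# (a simplification of the real, partly interleaved dependencies).
code_tasks = [f"M2-{i:02d}" for i in range(1, 10)]
deps = {later: {earlier} for earlier, later in zip(code_tasks, code_tasks[1:])}

# DEPLOY workstream: 01 -> 02 -> (03, 04 in parallel) -> 05.
deps["DEPLOY-02"] = {"DEPLOY-01"}
deps["DEPLOY-03"] = {"DEPLOY-02"}
deps["DEPLOY-04"] = {"DEPLOY-02"}
deps["DEPLOY-05"] = {"DEPLOY-03", "DEPLOY-04"}

# Re-converge: the E2E test needs both workstreams (assumed end points).
deps["M2-10"] = {"M2-09", "DEPLOY-05"}

# static_order() yields one valid execution order and raises CycleError
# if the plan ever becomes circular.
order = list(TopologicalSorter(deps).static_order())
print(order[-1])  # M2-10
```

Any order the sorter produces keeps each workstream's internal sequence intact while leaving the two workstreams free to interleave, which is the property the plan relies on.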