Files
stack/docs/federation/MISSION-MANIFEST.md
Jarvis 7652388c3a
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
docs(federation): decompose FED-M2 + add deploy workstream
- TASKS.md: 13 M2 tasks (Step-CA + grant schema + admin CLI) + 5 deploy tasks (mos-test-1/-2 on Portainer/Swarm)
- MISSION-MANIFEST.md: phase=M2 active; test infra table records mos-test-1/-2 as federation E2E hosts
- Refs new tracking issue #482 (deploy workstream); FED-M2 code workstream still tracks #461

Refs #461 #482
2026-04-21 20:08:12 -05:00

98 lines
7.0 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Mission Manifest — Federation v1
> Persistent document tracking full mission scope, status, and session history.
> Updated by the orchestrator at each phase transition and milestone completion.
## Mission
**ID:** federation-v1-20260419
**Statement:** Jarvis operates across 34 workstations in two physical locations (home, USC). The user currently reaches back to a single jarvis-brain checkout from every session; a prior OpenBrain attempt caused cache, latency, and opacity pain. This mission builds asymmetric federation between Mosaic Stack gateways so that a session on a user's home gateway can query their work gateway in real time without data ever persisting across the boundary, with full multi-tenant isolation and standard-PKI (X.509 / Step-CA) trust management.
**Phase:** M2 active — Step-CA + grant schema + admin CLI; parallel test-deploy workstream stood up
**Current Milestone:** FED-M2
**Progress:** 1 / 7 milestones
**Status:** active
**Last Updated:** 2026-04-21 (M2 decomposed; mos-test-1/-2 designated as federation E2E test hosts)
## Test Infrastructure
| Host | Role | Image | Tier |
| ----------------------- | ----------------------------------- | ------------------------------------- | --------- |
| `mos-test-1.woltje.com` | Federation Server A (querying side) | `gateway:fed-v0.1.0-m1` (M1 baseline) | federated |
| `mos-test-2.woltje.com` | Federation Server B (serving side) | `gateway:fed-v0.1.0-m1` (M1 baseline) | federated |
These are TEST hosts for federation E2E (M3+). Distinct from PRD AC-12 production targets (`woltje.com``uscllc.com`). Deployment workstream tracked in `docs/federation/TASKS.md` under FED-M2-DEPLOY-\*.
**Parent Mission:** None — new mission
## Context
Federation is the solution to what originally drove OpenBrain. The prior attempt coupled every agent session to a remote service, introduced cache/latency/opacity pain, and created a hard dependency that punished offline use. This redesign:
1. Makes federation **gateway-to-gateway**, not agent-to-service
2. Keeps each user's home instance as source of truth for their data
3. Exposes scoped, read-only data on demand without persisting across the boundary
4. Uses X.509 mTLS via Step-CA so rotation/revocation/CRL/OCSP are standard
5. Supports multi-tenant serving sides (employees on uscllc.com each federating back to their own home gateway) with no cross-user leakage
6. Requires federation-tier instances on both sides (PG + pgvector + Valkey) — local/standalone tiers cannot federate
7. Works over public HTTPS (no VPN required); Tailscale is an optional overlay
Key design references:
- `docs/federation/PRD.md` — 16-section product requirements
- `docs/federation/MILESTONES.md` — 7-milestone decomposition with per-milestone acceptance tests
- `docs/federation/TASKS.md` — per-task breakdown (M1 populated; M2-M7 deferred to mission planning)
- `docs/research/mempalace-evaluation/` (in jarvis-brain) — why we didn't adopt MemPalace
## Success Criteria
- [ ] AC-1: Two Mosaic Stack gateways on different hosts can establish a federation grant via CLI-driven onboarding
- [ ] AC-2: Server A can query Server B for `tasks`, `notes`, `memory` respecting scope filters
- [ ] AC-3: User on B with no grant cannot be queried by A, even if A has a valid grant for another user (cross-user isolation)
- [ ] AC-4: Revoking a grant on B causes A's next request to fail with a clear error within one request cycle
- [ ] AC-5: Cert rotation happens automatically at T-7 days; in-progress session survives rotation without user action
- [ ] AC-6: Rate-limit enforcement returns 429 with `Retry-After`; client backs off
- [ ] AC-7: With B unreachable, a session on A completes using local data and surfaces "federation offline for `<peer>`" once per session
- [ ] AC-8: Every federated request appears in B's `federation_audit_log` within 1 second
- [ ] AC-9: Scope excluding `credentials` means credentials are never returned — even via `search` with matching keywords
- [ ] AC-10: `mosaic federation status` shows cert expiry, grant status, last success/failure per peer
- [ ] AC-11: Full 3-employee multi-tenant scenario passes with no cross-user leakage
- [ ] AC-12: Two-gateway production deployment (woltje.com ↔ uscllc.com) operational ≥7 days without incident
- [ ] AC-13: All 7 milestones ship as merged PRs with green CI and closed issues
## Milestones
| # | ID | Name | Status | Branch | Issue | Started | Completed |
| --- | ------ | --------------------------------------------- | ----------- | ------------------ | ----- | ---------- | ---------- |
| 1 | FED-M1 | Federated tier infrastructure | done | (12 PRs #470-#481) | #460 | 2026-04-19 | 2026-04-19 |
| 2 | FED-M2 | Step-CA + grant schema + admin CLI | not-started | — | #461 | — | — |
| 3 | FED-M3 | mTLS handshake + list/get + scope enforcement | not-started | — | #462 | — | — |
| 4 | FED-M4 | search verb + audit log + rate limit | not-started | — | #463 | — | — |
| 5 | FED-M5 | Cache + offline degradation + OTEL | not-started | — | #464 | — | — |
| 6 | FED-M6 | Revocation + auto-renewal + CRL | not-started | — | #465 | — | — |
| 7 | FED-M7 | Multi-user RBAC hardening + acceptance suite | not-started | — | #466 | — | — |
## Budget
| Milestone | Est. tokens | Parallelizable? |
| --------- | ----------- | ---------------------- |
| FED-M1 | 20K | No (foundation) |
| FED-M2 | 30K | No (needs M1) |
| FED-M3 | 40K | No (needs M2) |
| FED-M4 | 20K | No (needs M3) |
| FED-M5 | 20K | Yes (with M6 after M4) |
| FED-M6 | 20K | Yes (with M5 after M3) |
| FED-M7 | 25K | No (needs all) |
| **Total** | **~175K** | |
## Session History
| Session | Date | Runtime | Outcome |
| ------- | ---------- | ------- | --------------------------------------------------------------------- |
| S1 | 2026-04-19 | claude | PRD authored, MILESTONES decomposed, 7 issues filed |
| S2-S4 | 2026-04-19 | claude | FED-M1 complete: 12 tasks (PRs #470-#481) merged; tag `fed-v0.1.0-m1` |
## Next Step
FED-M1 complete (12 PRs #470-#481, tag `fed-v0.1.0-m1`). Federated tier infrastructure is testable end-to-end: see `docs/federation/SETUP.md` and `docs/guides/migrate-tier.md`.
Begin FED-M2 (Step-CA + grant schema + admin CLI) when planning is greenlit. Issue #461 tracks scope; orchestrator decomposes M2 into per-task rows in `docs/federation/TASKS.md` at the start of M2.