stack/tools/federation-harness/README.md
Jarvis b445033c69 feat(federation): two-gateway test harness scaffold (FED-M3-02)
Adds tools/federation-harness/ — the permanent test bed for M3+ federation
E2E tests. Boots two gateways (Server A + Server B) on a shared Docker bridge
network with per-gateway Postgres/pgvector + Valkey and a shared Step-CA.

- docker-compose.two-gateways.yml: gateway-a/b, postgres-a/b, valkey-a/b,
  step-ca; image digest-pinned to sha256:1069117740e... (sha-9f1a081, #491)
- seed.ts: provisions scope variants A/B/C via real admin REST API; walks
  full enrollment flow (peer keypair → grant → token → redeem → cert store)
- harness.ts: bootHarness/tearDownHarness/serverA/serverB/seed helpers for
  vitest; idempotent boot (reuses running stack when both gateways healthy)
- README.md: prereqs, topology, seed usage, vitest integration, port override,
  troubleshooting, image digest note

No production code modified. Quality gates: typecheck ✓ lint ✓ format ✓

Closes #462

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 21:54:46 -05:00

# Federation Test Harness
Local two-gateway federation test infrastructure for Mosaic Stack M3+.
This harness boots two real gateway instances (`gateway-a`, `gateway-b`) on a
shared Docker bridge network, each backed by its own Postgres (pgvector) +
Valkey, sharing a single Step-CA. It is the test bed for all M3+ federation
E2E tests.
## Prerequisites
- Docker with Compose v2 (`docker compose version` ≥ 2.20)
- pnpm (for running via repo scripts)
- `infra/step-ca/dev-password` must exist (copy from `infra/step-ca/dev-password.example`)
## Network Topology
```
Host machine
├── localhost:14001 → gateway-a (Server A — home / requesting)
├── localhost:14002 → gateway-b (Server B — work / serving)
├── localhost:15432 → postgres-a
├── localhost:15433 → postgres-b
├── localhost:16379 → valkey-a
├── localhost:16380 → valkey-b
└── localhost:19000 → step-ca (shared CA)

Docker network: fed-test-net (bridge)
  gateway-a ←──── mTLS ────→ gateway-b
           ↘                ↗
                step-ca
```
Ports are chosen to avoid collision with the base dev stack (5433, 6380, 14242, 9000).
## Starting the Harness
```bash
# From repo root
docker compose -f tools/federation-harness/docker-compose.two-gateways.yml up -d
# Wait for all services to be healthy (~60-90s on first boot due to NestJS cold start)
docker compose -f tools/federation-harness/docker-compose.two-gateways.yml ps
```
## Seeding Test Data
The seed script provisions three grant scope variants (A, B, C) and walks the
full enrollment flow so Server A ends up with active peers pointing at Server B.
```bash
# Assumes stack is already running
pnpm tsx tools/federation-harness/seed.ts
# Or boot + seed in one step
pnpm tsx tools/federation-harness/seed.ts --boot
```
### Scope Variants
| Variant | Resources | Filters | Excluded | Purpose |
| ------- | ------------------ | ---------------------------------- | ----------- | ------------------------------- |
| A | tasks, notes | include_personal: true | (none) | Personal data federation |
| B | tasks | include_teams: ['T1'], no personal | (none) | Team-scoped, no personal |
| C | tasks, credentials | include_personal: true | credentials | Sanity: excluded wins over list |
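Variant C exists to check precedence: a resource that appears in both the granted list and the exclusion list must not be served. A minimal sketch of that rule follows, assuming a hypothetical `GrantScope` shape. The field names `resources` and `excluded` are illustrative and are not taken from the gateway schema.

```typescript
// Hypothetical scope shape for illustration; the real grant schema may differ.
interface GrantScope {
  resources: string[];
  excluded?: string[];
}

// Exclusions take precedence over the resource list.
function effectiveResources(scope: GrantScope): string[] {
  const excluded = new Set(scope.excluded ?? []);
  return scope.resources.filter((resource) => !excluded.has(resource));
}

// Mirrors variant C from the table: credentials is listed but also excluded.
const variantC: GrantScope = {
  resources: ['tasks', 'credentials'],
  excluded: ['credentials'],
};

console.log(effectiveResources(variantC)); // [ 'tasks' ]
```

A federation test for variant C would assert the same outcome against the live gateway: requests for `credentials` are refused even though the grant lists them.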
## Using from Vitest
```ts
import { afterAll, beforeAll, expect, test } from 'vitest';
import {
  bootHarness,
  tearDownHarness,
  serverA,
  serverB,
  seed,
} from '../../tools/federation-harness/harness.js';
import type { HarnessHandle } from '../../tools/federation-harness/harness.js';

let handle: HarnessHandle;

beforeAll(async () => {
  handle = await bootHarness();
}, 180_000); // allow 3 min for Docker pull + NestJS cold start

afterAll(async () => {
  // Only tear down a stack this run started; leave a reused stack running.
  if (handle?.ownedStack) {
    await tearDownHarness(handle);
  }
});

test('variant A: list tasks returns personal tasks', async () => {
  const seedResult = await seed(handle, 'variantA');
  const a = serverA(handle);
  const res = await fetch(`${a.baseUrl}/api/federation/tasks`, {
    headers: { 'x-federation-grant': seedResult.grants.variantA.id },
  });
  expect(res.status).toBe(200);
});
```
The `bootHarness()` function is **idempotent**: if both gateways are already
healthy, it reuses the running stack and returns `ownedStack: false`. Tests
should not call `tearDownHarness` when `ownedStack` is false unless they
explicitly want to shut down a shared stack.
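The reuse rule can be pictured as a small decision function. This is a sketch of the behaviour described above, not the code in `harness.ts`. `planBoot`, `BootPlan`, and `shouldTearDown` are hypothetical names, and the sketch assumes `ownedStack` is `true` whenever the harness started the stack itself.

```typescript
// Hypothetical names; illustrates the documented reuse rule only.
interface BootPlan {
  action: 'reuse' | 'boot';
  ownedStack: boolean;
}

function planBoot(gatewayAHealthy: boolean, gatewayBHealthy: boolean): BootPlan {
  // Reuse only when BOTH gateways are already healthy.
  if (gatewayAHealthy && gatewayBHealthy) {
    return { action: 'reuse', ownedStack: false };
  }
  // Otherwise this caller boots the stack and owns its lifecycle.
  return { action: 'boot', ownedStack: true };
}

// A test suite tears down only what it started.
function shouldTearDown(plan: BootPlan): boolean {
  return plan.ownedStack;
}
```

Keeping ownership in the returned handle lets several test files share one running stack without any of them shutting it down underneath the others.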
## Vitest Config (pnpm test:federation)
Create a dedicated `vitest.federation.config.ts` at the repo root (or merge these settings into the existing `vitest.config.ts`):
```ts
// vitest.federation.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    include: ['**/*.federation.test.ts'],
    testTimeout: 60_000,
    hookTimeout: 180_000,
    reporters: ['verbose'],
  },
});
```
Then add to the `scripts` section of the root `package.json`:
```json
"test:federation": "vitest run --config vitest.federation.config.ts"
```
## Nuking State
```bash
# Remove containers AND volumes (ephemeral state — CA keys, DBs, everything)
docker compose -f tools/federation-harness/docker-compose.two-gateways.yml down -v
```
On next `up`, Step-CA re-initialises from scratch and generates new CA keys.
## Step-CA Root Certificate
The CA root lives in the `fed-harness-step-ca` Docker volume at
`/home/step/certs/root_ca.crt`. To extract it to the host:
```bash
docker run --rm \
  -v fed-harness-step-ca:/home/step \
  alpine cat /home/step/certs/root_ca.crt > /tmp/fed-harness-root-ca.crt
```
## Troubleshooting
### Port conflicts
Default host ports: 14001, 14002, 15432, 15433, 16379, 16380, 19000.
Override via environment variables before `docker compose up`:
```bash
GATEWAY_A_HOST_PORT=14101 GATEWAY_B_HOST_PORT=14102 \
  docker compose -f tools/federation-harness/docker-compose.two-gateways.yml up -d
```
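Tests that reach the gateways from the host need the same port values. A minimal sketch of resolving them follows. `resolveGatewayUrls` and `resolvePort` are hypothetical helpers, not part of `harness.ts`, and the sketch assumes the gateways are reached over plain HTTP on localhost.

```typescript
// Hypothetical helpers. Defaults match the documented host ports.
type Env = Record<string, string | undefined>;

function resolvePort(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} is not a valid port: ${raw}`);
  }
  return port;
}

function resolveGatewayUrls(env: Env): { a: string; b: string } {
  const portA = resolvePort(env, 'GATEWAY_A_HOST_PORT', 14001);
  const portB = resolvePort(env, 'GATEWAY_B_HOST_PORT', 14002);
  // Plain HTTP on localhost is an assumption about the host-facing listener.
  return { a: `http://localhost:${portA}`, b: `http://localhost:${portB}` };
}
```

Passing `process.env` as the argument picks up the same overrides that were exported for `docker compose up`, so the test process and the stack agree on ports.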
### Image pull failures
The gateway image is digest-pinned to:
```
git.mosaicstack.dev/mosaicstack/stack/gateway@sha256:1069117740e00ccfeba357cae38c43f3729fe5ae702740ce474f6512414d7c02
```
(sha-9f1a081, post-#491 IMG-FIX)
If the registry is unreachable, Docker uses the locally cached image if one is
present. If no local image exists, `docker compose up` fails with a pull error.
In that case:
1. Ensure you can reach `git.mosaicstack.dev` (VPN, DNS, etc.).
2. Log in: `docker login git.mosaicstack.dev`
3. Pull manually: `docker pull git.mosaicstack.dev/mosaicstack/stack/gateway@sha256:1069117740e00ccfeba357cae38c43f3729fe5ae702740ce474f6512414d7c02`
### NestJS cold start
Gateway containers take 40-60 seconds to become healthy on first boot (Node.js
module resolution + NestJS DI bootstrap). The `start_period: 60s` in the
compose healthcheck covers this. `bootHarness()` polls for up to 3 minutes.
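The polling that `bootHarness()` performs can be approximated with a generic deadline loop. The sketch below is not the harness implementation. `waitUntilHealthy` and its options are hypothetical, and the health probe is injected by the caller because the gateway's health endpoint path is not stated here.

```typescript
// Hypothetical helper; the probe is supplied by the caller.
interface WaitOptions {
  timeoutMs: number; // e.g. 180_000 to mirror the 3-minute budget
  intervalMs: number; // delay between probes
}

async function waitUntilHealthy(
  probe: () => Promise<boolean>,
  { timeoutMs, intervalMs }: WaitOptions,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // A rejected probe (e.g. connection refused during startup) counts as unhealthy.
    const healthy = await probe().catch(() => false);
    if (healthy) return true;
    // Stop if the next probe would start after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Treating a rejected probe as "not healthy yet" rather than as a failure matters during cold start, when the container is up but the HTTP listener is not.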
### Step-CA startup
Step-CA initialises on first boot (generates CA keys). This takes ~5-10s.
The `start_period: 30s` in the healthcheck covers it. Both gateways wait for
Step-CA to be healthy before starting (`depends_on: step-ca: condition: service_healthy`).
### dev-password missing
The Step-CA container requires `infra/step-ca/dev-password` to be mounted.
Copy the example and set a local password:
```bash
cp infra/step-ca/dev-password.example infra/step-ca/dev-password
# Edit the file to set your preferred dev CA password
```
The file is `.gitignore`d — do not commit it.
## Image Digest Note
The gateway image is pinned to `sha256:1069117740e00ccfeba357cae38c43f3729fe5ae702740ce474f6512414d7c02`
(sha-9f1a081). This is the digest promoted by PR #491 (IMG-FIX). The `latest`
tag is forbidden per Mosaic image policy. When a new gateway build is promoted,
update the digest in `docker-compose.two-gateways.yml` and in this file.
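The pinning policy is easy to check mechanically. A minimal sketch follows, assuming image references use the usual `name@sha256:<64 hex characters>` form. `isDigestPinned` is a hypothetical helper and is not part of the repo tooling.

```typescript
// Hypothetical check: true only for references pinned by a sha256 digest.
function isDigestPinned(imageRef: string): boolean {
  return /@sha256:[0-9a-f]{64}$/.test(imageRef);
}

const pinned =
  'git.mosaicstack.dev/mosaicstack/stack/gateway@sha256:1069117740e00ccfeba357cae38c43f3729fe5ae702740ce474f6512414d7c02';
const floating = 'git.mosaicstack.dev/mosaicstack/stack/gateway:latest';

console.log(isDigestPinned(pinned)); // true
console.log(isDigestPinned(floating)); // false
```

A check like this could run in CI against the `image:` values in the compose file so that a tag-based reference does not slip in when the digest is updated.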
## Permanent Infrastructure
This harness is designed to outlive M3 and be reused by M4+ milestone tests.
It is not a throwaway scaffold — treat it as production test infrastructure:
- Keep it idempotent.
- Do not hardcode test assumptions in the harness layer (put them in tests).
- Update the seed script when new scope variants are needed.
- Keep the README and harness in sync as the federation API evolves.