# PRD: openclaw-openbrain-context

**Version:** 0.0.1
**Status:** Approved

---

## Problem Statement

OpenClaw compacts context when sessions grow long, causing information loss. The new `ContextEngine` plugin interface (merged in PR #22201) allows replacing the default compaction strategy with a persistent, lossless alternative. OpenBrain is a self-hosted pgvector + REST API service that can store and semantically retrieve conversation context indefinitely.

## Solution

An OpenClaw plugin that implements the `ContextEngine` interface and stores all context in OpenBrain using semantic embeddings. Sessions become effectively lossless: context is archived, not discarded. On reassembly, relevant context is retrieved via semantic search.

## Reference Implementation

lossless-claw (https://github.com/Martian-Engineering/lossless-claw) is the primary reference. It uses SQLite; we use OpenBrain (Postgres + pgvector via REST). Our implementation is simpler: no local DB and no migrations, because everything goes through the OpenBrain REST API.

## OpenBrain API

```
Base: <user-configured: OPENBRAIN_URL env var or plugin config.baseUrl; NO DEFAULT>
Auth: Bearer <OPENBRAIN_API_KEY env var or plugin config.apiKey; NO DEFAULT>

POST   /v1/thoughts          { content, source, metadata }
POST   /v1/search            { query, limit }
GET    /v1/thoughts/recent   ?limit=N&source=X
PATCH  /v1/thoughts/:id      { content?, metadata? }
DELETE /v1/thoughts/:id
DELETE /v1/thoughts          ?source=X&metadata_id=Y  (bulk)
```
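
As a rough illustration of how the endpoints above could be wrapped, the sketch below builds request descriptions for three of them without sending anything. The names `buildRequest`, `createThought`, `searchThoughts`, and `recentThoughts` are hypothetical, and any detail beyond what the listing shows (header names other than the Bearer token, response shapes) is an assumption rather than the OpenBrain API contract.

```typescript
// Hypothetical helper types; none of these names come from OpenBrain itself.
interface OpenBrainConfig {
  baseUrl: string;
  apiKey: string;
}

interface RequestSpec {
  url: string;
  method: "GET" | "POST" | "PATCH" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

type Query = Record<string, string | number>;

// Build a request description without sending it, which keeps it easy to test.
function buildRequest(
  config: OpenBrainConfig,
  method: RequestSpec["method"],
  path: string,
  options: { query?: Query; body?: unknown } = {},
): RequestSpec {
  const base = config.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const query = options.query ?? {};
  const params = Object.keys(query)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(query[key]))}`)
    .join("&");
  const headers: Record<string, string> = {
    Authorization: `Bearer ${config.apiKey}`,
  };
  const spec: RequestSpec = {
    url: params ? `${base}${path}?${params}` : `${base}${path}`,
    method,
    headers,
  };
  if (options.body !== undefined) {
    // Assumption: the API expects JSON request bodies.
    headers["Content-Type"] = "application/json";
    spec.body = JSON.stringify(options.body);
  }
  return spec;
}

// POST /v1/thoughts { content, source, metadata }
function createThought(
  config: OpenBrainConfig,
  thought: { content: string; source: string; metadata: Record<string, unknown> },
): RequestSpec {
  return buildRequest(config, "POST", "/v1/thoughts", { body: thought });
}

// POST /v1/search { query, limit }
function searchThoughts(config: OpenBrainConfig, query: string, limit: number): RequestSpec {
  return buildRequest(config, "POST", "/v1/search", { body: { query, limit } });
}

// GET /v1/thoughts/recent ?limit=N&source=X
function recentThoughts(config: OpenBrainConfig, limit: number, source: string): RequestSpec {
  return buildRequest(config, "GET", "/v1/thoughts/recent", { query: { limit, source } });
}
```

Separating request construction from sending is a design choice made for this sketch only. A `RequestSpec` carries the `method`, `headers`, and `body` fields that the global `fetch` in Node 18+ accepts, so a real client could pass `spec.url` and `spec` to it.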

## Plugin Architecture

### ContextEngine interface lifecycle hooks to implement:
- `bootstrap(sessionId)`: init session context, retrieve prior context from OpenBrain
- `ingest(message)`: store each message turn to OpenBrain
- `ingestBatch(messages)`: bulk store (e.g., on session import)
- `afterTurn()`: post-turn hook (optional: summarize/compress)
- `assemble(maxTokens)`: retrieve relevant context for next turn via semantic search
- `compact(params)`: archive current context to OpenBrain, return minimal summary
- `prepareSubagentSpawn(subagentId)`: pass relevant context to spawned subagent
- `onSubagentEnded(subagentId, reason)`: integrate subagent results back
- `dispose()`: cleanup
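
The hook list above can be written down as a TypeScript interface to make the surface area concrete. This is only a sketch: the hook names come from the list, but every parameter type, return type, and the `Message` shape are guesses, not the published `openclaw/plugin-sdk` signatures. `ContextEngineSketch` and `InMemoryEngine` are hypothetical names; the in-memory class is a stand-in for exercising call order, not the OpenBrain-backed engine.

```typescript
// ASSUMPTION: all types below are illustrative guesses, not the real SDK types.
type MaybePromise<T> = T | Promise<T>;

interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface ContextEngineSketch {
  bootstrap(sessionId: string): MaybePromise<void>;
  ingest(message: Message): MaybePromise<void>;
  ingestBatch(messages: Message[]): MaybePromise<void>;
  afterTurn(): MaybePromise<void>;
  assemble(maxTokens: number): MaybePromise<Message[]>;
  compact(params: Record<string, unknown>): MaybePromise<string>;
  prepareSubagentSpawn(subagentId: string): MaybePromise<Message[]>;
  onSubagentEnded(subagentId: string, reason: string): MaybePromise<void>;
  dispose(): MaybePromise<void>;
}

// In-memory stand-in: keeps live messages and an archive in plain arrays.
class InMemoryEngine implements ContextEngineSketch {
  private live: Message[] = [];
  private archive: Message[] = [];

  bootstrap(_sessionId: string): void {
    // A real engine would load prior context here.
  }
  ingest(message: Message): void {
    this.live.push(message);
  }
  ingestBatch(messages: Message[]): void {
    messages.forEach((m) => this.ingest(m));
  }
  afterTurn(): void {
    // Optional post-turn work; nothing to do in memory.
  }
  assemble(_maxTokens: number): Message[] {
    return [...this.live]; // no token budgeting in this stand-in
  }
  compact(_params: Record<string, unknown>): string {
    // Archive instead of discarding, then return a minimal summary.
    const count = this.live.length;
    this.archive.push(...this.live);
    this.live = [];
    return `${count} messages archived`;
  }
  prepareSubagentSpawn(_subagentId: string): Message[] {
    return [...this.live];
  }
  onSubagentEnded(_subagentId: string, _reason: string): void {
    // A real engine would merge subagent results here.
  }
  dispose(): void {
    this.live = [];
  }
}
```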

### Source tagging strategy
Each thought stored in OpenBrain:
- `source`: `openclaw:<agentId>` (e.g., `openclaw:main`)
- `metadata.sessionId`: session key
- `metadata.turn`: turn index
- `metadata.role`: `user` | `assistant` | `tool`
- `metadata.type`: `message` | `summary` | `subagent-result`
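
The tagging fields above map naturally onto a small builder function. The sketch below shows one possible shape; `tagThought`, `TaggedThought`, and the `sourcePrefix` parameter are hypothetical names, and defaulting `type` to `message` is an assumption made for brevity.

```typescript
type Role = "user" | "assistant" | "tool";
type ThoughtType = "message" | "summary" | "subagent-result";

interface ThoughtMetadata {
  sessionId: string;
  turn: number;
  role: Role;
  type: ThoughtType;
}

interface TaggedThought {
  content: string;
  source: string;
  metadata: ThoughtMetadata;
}

// Build the payload for one stored thought, following the tagging list above.
function tagThought(
  agentId: string,
  sessionId: string,
  turn: number,
  role: Role,
  content: string,
  type: ThoughtType = "message", // assumption: plain messages are the common case
  sourcePrefix = "openclaw", // assumption: mirrors the `source` config option
): TaggedThought {
  return {
    content,
    source: `${sourcePrefix}:${agentId}`,
    metadata: { sessionId, turn, role, type },
  };
}
```

For example, `tagThought("main", "session-1", 3, "user", "hello")` yields a payload whose `source` is `openclaw:main`.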

### Retrieval strategy
On `assemble()`:
1. Fetch recent N messages (ordered, for continuity)
2. Semantic search with last user message as query
3. Merge + deduplicate, trim to maxTokens
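
Step 3 leaves the merge and trim policy open. The sketch below shows one possible policy, under stated assumptions: `recent` arrives oldest first, `semantic` arrives best match first, newer recent messages take priority over semantic hits, and tokens are estimated at roughly four characters each. `mergeContext`, `StoredThought`, and `estimateTokens` are hypothetical names, and a real implementation would likely use a proper tokenizer.

```typescript
interface StoredThought {
  id: string;
  content: string;
  turn: number;
}

// ASSUMPTION: crude estimate of ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function mergeContext(
  recent: StoredThought[], // assumed oldest -> newest
  semantic: StoredThought[], // assumed best match first
  maxTokens: number,
): StoredThought[] {
  // Priority order: newest recent messages first, then semantic hits by rank.
  const prioritized = [...recent].reverse().concat(semantic);
  const seen = new Set<string>();
  const kept: StoredThought[] = [];
  let used = 0;

  for (const thought of prioritized) {
    if (seen.has(thought.id)) continue; // deduplicate by id
    seen.add(thought.id);
    const cost = estimateTokens(thought.content);
    if (used + cost > maxTokens) continue; // skip anything that does not fit
    used += cost;
    kept.push(thought);
  }

  // Restore chronological order so the assembled context reads naturally.
  return kept.sort((a, b) => a.turn - b.turn);
}
```

Skipping an oversized item rather than stopping at it lets smaller, lower-priority items still use the remaining budget; stopping at the first miss would be an equally defensible choice.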
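
## Plugin Registration

```typescript
// index.ts
// Assumed: the SDK exports the OpenClawPluginApi type (see the ASSUMPTION section).
import type { OpenClawPluginApi } from "openclaw/plugin-sdk";
import { OpenBrainContextEngine } from "./src/engine.js";

// Register the engine under the id "openbrain" so openclaw.json can select it.
export function register(api: OpenClawPluginApi): void {
  api.registerContextEngine("openbrain", (config) =>
    new OpenBrainContextEngine(config)
  );
}
```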

## Configuration (openclaw.json)

```json
{
  "agents": {
    "defaults": {
      "contextEngine": "openbrain"
    }
  },
  "plugins": {
    "entries": {
      "openclaw-openbrain-context": {
        "enabled": true,
        "config": {
          "baseUrl": "https://your-openbrain-instance.example.com",
          "apiKey": "your-api-key",
          "recentMessages": 20,
          "semanticSearchLimit": 10,
          "source": "openclaw"
        }
      }
    }
  }
}
```

## Tech Stack
- TypeScript strict
- No local DB (pure REST calls to OpenBrain)
- openclaw/plugin-sdk for ContextEngine interface
- pnpm, vitest, ESLint

## ⚠️ Hard Rule: No Hardcoded Instance Values

This plugin will be used by anyone running their own OpenBrain instance. The following are STRICTLY FORBIDDEN in source code, defaults, or fallback logic:
- Any hardcoded URL (no `brain.woltje.com` or any other specific domain)
- Any hardcoded API key
- Any `process.env.OPENBRAIN_URL || 'https://brain.woltje.com'` fallback patterns

Required behavior:
- `baseUrl` and `apiKey` MUST be provided via plugin config or env vars
- If either is missing, throw `OpenBrainConfigError` at `bootstrap()` time with a clear message: "openclaw-openbrain-context: baseUrl and apiKey are required. Set them in your openclaw.json plugin config."
- No silent degradation, no defaults to any specific host
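
The required behavior above can be sketched as a small resolver that `bootstrap()` might call. `resolveConfig`, `PluginConfig`, and `ResolvedConfig` are hypothetical names, and letting plugin config take precedence over env vars is an assumption, since the rule only says either source is acceptable. The env var names and the error message are taken from this document.

```typescript
class OpenBrainConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "OpenBrainConfigError";
    // Keep instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface PluginConfig {
  baseUrl?: string;
  apiKey?: string;
}

interface ResolvedConfig {
  baseUrl: string;
  apiKey: string;
}

const MISSING_CONFIG_MESSAGE =
  "openclaw-openbrain-context: baseUrl and apiKey are required. " +
  "Set them in your openclaw.json plugin config.";

// ASSUMPTION: plugin config wins over env vars when both are set.
function resolveConfig(
  config: PluginConfig,
  env: Record<string, string | undefined>,
): ResolvedConfig {
  // The only fallback is from config to env; there is never a default host or key.
  // Empty strings count as missing.
  const baseUrl = config.baseUrl || env.OPENBRAIN_URL;
  const apiKey = config.apiKey || env.OPENBRAIN_API_KEY;
  if (!baseUrl || !apiKey) {
    throw new OpenBrainConfigError(MISSING_CONFIG_MESSAGE);
  }
  return { baseUrl, apiKey };
}
```

In a Node process the second argument would typically be `process.env`; it is passed in explicitly here so the function stays easy to test.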

## Acceptance Criteria (v0.0.1)
1. Plugin registers as a ContextEngine with id `openbrain`
2. `ingest()` stores messages to OpenBrain with correct metadata
3. `assemble()` retrieves recent + semantically relevant context within token budget
4. `compact()` archives a turn summary and returns a minimal summary for injection into the prompt
5. `bootstrap()` loads prior session context on restart; throws `OpenBrainConfigError` if config missing
6. Tests pass, TypeScript strict, ESLint clean
7. `openclaw.plugin.json` is present with a correct manifest
8. README documents: self-host OpenBrain setup, all config options with types/defaults, env var pattern, example config

## ASSUMPTION: ContextEngine interface shape
The `ContextEngine` interface shape used in this PRD is based on the lossless-claw source and PR #22201. The assumed plugin SDK import path is `openclaw/plugin-sdk`.