Files

Garry Tan 0e9f8814a5 feat: v0.16.0 — durable agent runtime (gbrain agent + subagent handler + plugin loader) (#258 )

* refactor(mcp): extract buildToolDefs helper for subagent tool registry reuse

The inline operations.map(...) block in src/mcp/server.ts became the only
source of truth for agent-facing tool definitions. Extract into a reusable
exported helper so the v0.15 subagent tool registry can call it with a
filtered OPERATIONS subset instead of duplicating the shape.

Byte-for-byte equivalence regression pinned in test/mcp-tool-defs.test.ts —
legacy inline mapping kept verbatim inside the test so any future drift
between the new helper and the pre-extraction MCP schema fails loudly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): subagent-aware OperationContext + put_page namespace

Adds three optional fields to OperationContext:
- jobId?: number — the currently running Minion job id
- subagentId?: number — the owning subagent job id for tool-dispatched calls
- viaSubagent?: boolean — FAIL-CLOSED flag for agent-path gating

put_page now enforces a namespace rule when invoked on the subagent tool
dispatch path (viaSubagent=true): writes MUST target
`wiki/agents/<subagentId>/...`. Anchored, slash-boundary enforced so a
collision like `wiki/agents/12evil/...` can't impersonate subagent 12.

The check runs BEFORE the dry-run short-circuit so preview calls surface
the same rejection. Fail-closed: a missing subagentId with viaSubagent=true
rejects every slug rather than letting a dispatcher bug open a hole.

Existing callers unaffected — all three fields are optional and the legacy
put_page behavior is unchanged when viaSubagent is undefined/false.

12 regression + namespace tests pin:
- local CLI writes (viaSubagent unset) accept arbitrary slugs
- MCP writes (remote=true, viaSubagent unset) accept arbitrary slugs
- subagent-path: anchored prefix accepted, wrong id rejected, prefix-
collision defeated, leading-slash rejected, bare-prefix rejected,
fail-closed on missing/NaN subagentId, permission_denied code emitted

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(schema): v0.15.0 subagent runtime tables + migration orchestrator

Adds three new tables for the durable LLM agent runtime:

subagent_messages — Anthropic message-block persistence.
Parallel tool_use blocks in one assistant
message live in content_blocks JSONB, not
across rows (fixes the (job_id, turn_idx, role)
misdesign codex caught in v0.13 drafting).

subagent_tool_executions — Two-phase tool ledger. INSERT pending before
execute, UPDATE complete/failed after. Replay
re-runs pending rows only if the tool is
idempotent (v1 ships only idempotent tools so
this is preventive).

subagent_rate_leases — Lease-based concurrency cap for outbound
providers (e.g. anthropic:messages). Stale
leases auto-prune on next acquire so crashed
workers can't strand capacity.

All DDL uses CREATE TABLE/INDEX IF NOT EXISTS — order-independent vs
PR #244's initSchema() reorder, and idempotent across fresh-install +
upgrade paths. Shipped in both src/schema.sql (Postgres) and
src/core/pglite-schema.ts (PGLite); schema-embedded.ts regenerated.

Migration orchestrator v0_15_0.ts (phases: schema → verify → record).
v0_14_0.ts is a no-op stub so the registry's version sequence stays
gapless (v0.14.0 shipped shell-jobs — code change, no DB migration).

10 unit tests for registry wiring, ordering, dry-run phase behavior, and
schema-embedded table presence. test/apply-migrations.test.ts updated for
the two new registry entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): emit child_done on every terminal + max_stalled per-job + terminal set fix

Three correctness fixes the v0.15 subagent aggregator spine depends on:

1. child_done emission on ALL terminal transitions, not just success.
- completeJob already emitted on success — now also tags outcome='complete'.
- failJob newly emits on terminal 'failed' or 'dead' (outcome='failed'|'dead',
error=<text>), BEFORE the parent-terminal UPDATE so the EXISTS guard on
the inbox INSERT doesn't skip it on fail_parent paths (codex catch).
- cancelJob now emits outcome='cancelled' per descendant with a parent.
- handleTimeouts now emits outcome='timeout' per timed-out child.
ChildDoneMessage gains optional { outcome, error } — backwards compatible
(legacy writers omitted them; consumers treat absent outcome as 'complete').

2. Parent-resolution terminal set now includes 'failed'.
Pre-v0.15 the `NOT EXISTS (... status NOT IN ('completed','dead','cancelled'))`
guard treated a failed child as still-pending, stranding aggregator parents
that chose on_child_fail='continue' or 'ignore' in waiting-children forever.
Expanded to {completed, failed, dead, cancelled} everywhere parent resolution
reads child status (completeJob inline, failJob remove_dep + continue,
cancelJob sweep, handleTimeouts sweep, and the resolveParent method itself).

3. MinionJobInput.max_stalled threads through MinionQueue.add() on INSERT.
Column exists with default 1 — that is "first stall → dead", which defeats
crash recovery for long-running handlers. Subagent children will set
max_stalled: 3 to survive mid-run worker kills. Second-submitter under an
idempotency-key hit does NOT mutate the existing row (codex-flagged
footgun — first-submit options are load-bearing state).

13 unit tests pin: emission on each of completeJob/failJob/cancelJob/
handleTimeouts, insertion order on fail_parent, terminal-set expansion with
continue policy, max_stalled default + override + idempotency behavior.

E2E tier 1 (Postgres) passes 141 tests unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): rate-leases + waitForCompletion infra for v0.15 subagent

Two infrastructure modules the subagent handler spine depends on:

rate-leases.ts — lease-based concurrency cap for outbound providers
(anthropic:messages, openai:*, etc.). Counter-based limiters leak capacity
on worker crash; leases are owner-tagged rows with expires_at that
auto-prune on the next acquire. Two-phase: txn-scoped pg_advisory_xact_lock
guards the check-then-insert so concurrent acquires can't both win the
"last slot". renewLeaseWithBackoff retries 3x (250/500/1000ms) for mid-
call DB blips — on persistent failure the LLM-loop caller aborts with a
renewable error so the worker re-claims and the rate invariant is
preserved. Owner FK cascades clean up leases on job deletion.

wait-for-completion.ts — poll-until-terminal helper for CLI callers.
Minions' NOTIFY is worker-side only; `gbrain agent run --follow` polls
getJob() until status is {completed, failed, dead, cancelled}. TimeoutError
carries jobId + elapsedMs and does NOT cancel the job — the user can
inspect via `gbrain jobs get <id>` later. Supports AbortSignal for Ctrl-C
without throwing. Default pollMs is 1000 on Postgres, 250 on PGLite (inline
CLI has no network RTT).

21 unit tests cover: single/multi acquire under cap, rejection past cap,
release frees slot, different keys are independent, stale prune, cascade
on owner delete, renew bumps expires_at, renew on missing is false,
backoff path success + pruned short-circuit. waitForCompletion: fast-path
terminal, transitions mid-wait (completed/failed/cancelled), TimeoutError
shape, abort-signal early exit, non-existent job error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): subagent ToolDef types + brain-tool registry (v0.15)

Types first so the handler has a stable contract:
- SubagentHandlerData / AggregatorHandlerData — the two job.data shapes
- ToolCtx (engine, jobId, remote, signal) + ToolDef (name, description,
input_schema, idempotent, execute) — Anthropic-envelope, distinct from
the MCP McpToolDef extraction landed earlier
- ContentBlock discriminated union for subagent_messages.content_blocks
- SubagentStopReason + SubagentResult emitted on terminal completion

brain-allowlist.ts derives one ToolDef per allow-listed OPERATION. Reuses
the ParamDef → JSONSchema shape from the MCP extraction in a local helper
(Anthropic's input_schema field diverges from MCP's inputSchema by a
character). The 11-name allow-list is read-safe + put_page — every
destructive / filesystem / identity-mutating op stays off by default.

put_page gets a namespace-wrapped tool schema: `slug` pattern = anchored
`^wiki/agents/<subagentId>/.+`. The server-side check in put_page op
(shipped in prior commit) is still the authoritative gate — the schema
just helps the model write correct slugs first-try. `subagentId` is
plumbed into the ToolCtx so the viaSubagent=true fail-closed path lights
up on every tool-dispatched put_page.

filterAllowedTools narrows a registry by subagent_def's allowed_tools
frontmatter field. Rejects unknown names at load time (no silent drop —
typos in a skills/subagents/*.md would otherwise ship to prod with a
tool silently missing).

18 tests pin: every allowlist name exists in OPERATIONS (catches upstream
rename), Anthropic name regex, put_page namespace pattern per-subagent,
execute() routes through the op handler with viaSubagent=true, out-of-
namespace put_page throws permission_denied, filter passes prefixed +
unprefixed names, rejects unknowns, deduplicates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): subagent-audit JSONL + transcript renderer

Two small plumbing pieces the v0.15 subagent handler + `gbrain agent logs`
depend on:

subagent-audit.ts — JSONL-rotated audit log mirroring the shell-audit
pattern. Two event flavors: submission (one line per job submit) and
heartbeat (one line per turn boundary — llm_call_started / completed /
tool_called / tool_result / tool_failed). Heartbeats fix the "--follow on
a long Anthropic call shows nothing for 30 seconds" problem codex flagged.
Never logs prompts or tool inputs (PII risk — subagent input_vars may
carry user-supplied free text); DOES log tokens, ms_elapsed, tool_name,
first 200 chars of error text. Rotates weekly via ISO week. `readSubagent
AuditForJob` is the readback path for `gbrain agent logs` — scans the
current + prior week file so job boundaries across weeks still resolve.
`GBRAIN_AUDIT_DIR` overrides the default ~/.gbrain/audit/ for container
deploys.

transcript.ts — renders subagent_messages + subagent_tool_executions to
markdown. Message order is authoritative; tool rows splice under their
owning assistant tool_use by tool_use_id. Handles text, tool_use (with
pending / complete / failed execution rows), tool_result (skipped if
we already rendered the owning tool_use — avoids double-printing), and
unknown block types (fenced JSON dump for diagnostics). Output is
UTF-8-safe truncated at maxOutputBytes.

21 unit tests: ISO week filename rotation (incl. 2027-01-01 → W53-2026
boundary), submission + heartbeat write shapes, 200-char error cap, best-
effort write failure doesn't throw, readback filters by job_id and
sinceIso. Transcript: empty input, ordering, token line, tool_use +
complete/failed/pending execution rendering, truncation, unknown-block
diagnostic dump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): subagent LLM-loop handler with crash-resumable replay

The main event: runs one Anthropic Messages API conversation with tool
use, persists every turn + tool execution, and resumes cleanly after a
worker kill anywhere in the loop.

Design points that carry the v0.15 guarantees:

1. Two-phase tool persistence. INSERT status='pending' before dispatch,
UPDATE to 'complete' or 'failed' after. subagent_messages rows are
the canonical conversation; subagent_tool_executions rows are the
canonical "did this tool run + what did it return". Either DB commit
is atomic, so replay has a single source of truth.

2. Replay reconciliation. If the last persisted message is an assistant
with tool_use blocks AND no following synthesized user message, we
crashed mid-dispatch. On resume, finish those tools first (respecting
idempotent flag for 'pending' rows), synthesize the user turn, and
THEN call the LLM again. Non-idempotent pending rows abort the job
with a clear error — v0.15 ships only idempotent tools so this is
preventive.

3. Rate lease around every LLM call. acquireLease before, releaseLease
after (both success and error paths). acquired=false throws
RateLeaseUnavailableError — the worker treats it as a renewable
error and re-claims later, so a temporary capacity cap doesn't fail
the job terminally.

4. Anthropic prompt caching. system block gets cache_control=ephemeral;
the LAST tool def gets it too (Anthropic caches everything up to and
including the marked block). ~10x cost reduction on multi-turn
agents per the plan.

5. Dual-signal abort. AbortSignal.any merges ctx.signal (timeout / lock
loss / cancel) with ctx.shutdownSignal (worker SIGTERM). Both feed
the Anthropic call's AbortSignal; mid-turn abort bails before the
next LLM call with whatever turns are already persisted. Node ≥ 20
has AbortSignal.any; older runtimes get a manual-merge polyfill.

6. Injectable Anthropic client. The real SDK implements MessagesClient
structurally; tests inject a FakeMessagesClient that scripts
responses.

12 unit tests pin: no-tool happy path, single tool_use complete, tool
throws → failed row + loop continues, unknown tool name rejection,
max_turns cap, crash-then-resume with partial state, replay skips already-
complete tool execs without re-invoking execute, non-idempotent pending
rejects on resume, lease acquire + release roundtrip, RateLeaseUnavailable
under cap-full, missing prompt validation, allowed_tools unknown-name.

NOT in v0.15: refusal detection (stop_reason + content shape), stop_reason
=max_tokens partial recovery, mid-call lease renewal with backoff loop.
All three are documented as P2 items in the plan file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): subagent_aggregator handler with mixed-outcome rendering

Claims AFTER all subagent children resolve — by then Lane 1B's queue
changes have posted one child_done message per terminal transition into
this job's inbox (complete / failed / dead / cancelled / timeout). The
aggregator reads those, builds a deterministic markdown summary, and
returns it as the handler result.

Not an LLM call in v0.15 — output is reproducible concatenation so
fan-out runs stay comparable. v0.16+ can add an LLM synthesis pass
behind an opt-in flag.

Contract:
- empty children_ids → `(no children)` marker
- missing child_done (shouldn't happen under v0.15 invariants but
possible if a terminal-state path slipped past Lane 1B) → counted as
failed with "no child_done message observed" error
- non-complete outcomes: result is null in the output so no payload
leaks alongside a failure label
- children appear in the order children_ids was supplied
- custom aggregate_prompt_template replaces the markdown header

13 unit tests cover: empty input, all-success, mixed outcomes, result
suppression on failure, missing child_done handling, order preservation,
custom template, progress + log emission, stringified JSONB payload
parsing, non-child_done inbox filtering, legacy-writer outcome fallback,
and internal helper edges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): GBRAIN_PLUGIN_PATH loader + plugin-authors guide (v0.15)

Plumbing that makes Wintermute (and future downstream agents) day-1
usable on v0.15. Host repos drop a `gbrain.plugin.json` + `subagents/`
directory somewhere, set GBRAIN_PLUGIN_PATH (colon-separated like \$PATH),
and their custom subagent defs load at worker startup.

Path policy is strict: absolute paths only. Relative, ~-prefixed, and
URL-style (https://, file://) all rejected with warnings — the user
controls where plugins live. Non-existent paths and files (not dirs) are
warned and skipped so a typo doesn't crash worker startup.

Collision policy: left-wins. If two plugins ship a subagent with the same
name, the first one in GBRAIN_PLUGIN_PATH keeps it and the other gets a
warning naming both sources. Deterministic + debuggable.

Trust policy: plugins ship subagent defs ONLY. Cannot declare new tools,
cannot extend the brain allow-list, cannot override safety flags. The
subagent def's `allowed_tools:` frontmatter MUST subset the derived
registry — validation happens at load time (worker startup), not at
dispatch time, so a typo in a skill gives a loud startup error instead
of silently "tool never fires at 3am."

Manifest `plugin_version: "gbrain-plugin-v1"` locks the contract. Unknown
versions rejected. `subagents` field escape attempts (`../../../etc` etc)
rejected. gray-matter handles the markdown frontmatter parse — subagent
defs don't conform to the page schema, so we don't use parseMarkdown.

docs/guides/plugin-authors.md is the Wintermute-facing walkthrough.
Covers the minimum viable plugin shape, the three policies, the
frontmatter fields, known caveats (audit JSONL is local-only, tool calls
always run remote=true, put_page is namespace-scoped).

22 unit tests pin path rejection, missing/invalid manifest, unsupported
version, escape-attempt, basename fallback for missing frontmatter.name,
allowed_tools round-trip, unknown-tool rejection with validAgentToolNames,
empty env, multi-path, collision warning with left-wins, trimmed paths,
manifest-rejection as warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): gbrain agent run + logs + worker registration (v0.15 Lane 4H)

Three integration seams wired:

src/commands/agent.ts — \`gbrain agent run\`. Submits subagent jobs (or a
fan-out of N + aggregator) under the trusted-submit flag so the
PROTECTED_JOB_NAMES guard doesn't reject. Fan-out path creates the
aggregator first (so children can reference its id as parent), submits
each child with on_child_fail='continue' (required by Lane 1B's terminal-
set + child_done machinery), then jsonb_set's the aggregator's
children_ids. Short-circuits a 1-entry manifest to a single subagent
with no aggregator. Follow mode runs agent-logs streaming + waitFor
Completion in parallel and exits on terminal status; detach prints the
job id and exits. Ctrl-C is handled as detach, not cancel — the job
keeps running, consistent with durability invariants.

src/commands/agent-logs.ts — \`gbrain agent logs\`. Merges ~/.gbrain/audit/
subagent-jobs-*.jsonl (heartbeats + submissions) with subagent_messages
(persisted conversation) in one chronological stream. --follow polls at
1s and exits when the job hits terminal. --since accepts ISO-8601 OR
relative shorthand (5m / 1h / 2d). Writes transcript tail (full message
+ tool tree) only for terminal jobs, so mid-run --follow doesn't spam a
half-rendered transcript.

src/commands/jobs.ts registerBuiltinHandlers — matches the shell-handler
opt-in shape. GBRAIN_ALLOW_LLM_JOBS=1 registers the subagent +
subagent_aggregator handlers, then loads plugins from GBRAIN_PLUGIN_PATH
with validAgentToolNames pulled from BRAIN_TOOL_ALLOWLIST. Every plugin
warning + loaded-plugin line prints to stderr, mirroring the openclaw-
seam startup convention.

src/core/minions/protected-names.ts — subagent + subagent_aggregator
join the protected set. MCP submit_job returns permission_denied; only
trusted-CLI callers (with allowProtectedSubmit) can insert these rows.

src/cli.ts — adds 'agent' to CLI_ONLY + dispatches it like 'jobs'.

Test fallout: subagent-handler.test.ts + subagent-transcript.test.ts
helpers now submit under allowProtectedSubmit (they insert rows named
'subagent' directly against the queue). 23 new tests in agent-cli.test.ts
cover: flag parsing (including --detach implies !follow, --tools comma
split, -- terminator, unknown flag throw), --since parse (ISO, relative
5m/2h/1d, unparseable error), protected-name guard for all three names,
trusted-submit gate, and a fan-out integration check that verifies the
aggregator + children shape after --fanout-manifest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): rename max_children test's spawned jobs off the protected 'subagent' name

The spawn-storm test submitted 50 literal-string 'subagent' children to
exercise the max_children row-lock serialization. In v0.15 'subagent' is
a PROTECTED_JOB_NAME (CLI-only; trusted submit required), so the old
literal submission now throws before reaching the row-lock check.

The test is about max_children semantics, not the v0.15 subagent runtime
specifically — rename the child name to 'child_worker' so the test
exercises the exact same queue.add path without tripping the new guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(ship): v0.15.0 — VERSION, CHANGELOG, README, upgrading-agents, CLAUDE.md

Bumps VERSION → 0.15.0 and package.json → 0.15.0 (resolves the pre-existing
drift — on master, VERSION=0.14.0 but package.json=0.13.1; src/version.ts
reads package.json, so this is what the binary prints now).

CHANGELOG lands the release-summary entry in the GStack voice + the full
itemized change list (11 new modules, 3 new tables, queue correctness
fixes, trust-model additions, 159 new unit tests). Voice rules respected
— no em dashes, no AI vocabulary, real file names + real numbers.

README gets a "Durable agents: `gbrain agent` (v0.15)" section next to
the Minions block, with the three canonical CLI shapes (single run,
fanout-manifest, logs --follow) and a pointer to plugin-authors.md.

docs/UPGRADING_DOWNSTREAM_AGENTS.md gets a full v0.15.0 section covering
the four adoption steps downstream agents (Wintermute and similar) need:
(1) worker opt-in via GBRAIN_ALLOW_LLM_JOBS, (2) moving custom subagent
defs to a plugin repo, (3) replacing ephemeral subagent runs with durable
`gbrain agent run`, (4) the put_page namespace rule for agent-driven writes.

CLAUDE.md updated with concise per-file descriptions for every new module:
the handler, aggregator, audit, rate-leases, wait-for-completion,
transcript, plugin-loader, brain-allowlist, tool-defs extraction, agent
CLI + logs CLI, and the registerBuiltinHandlers wiring for subagent
handlers + plugin-loader.

Verified: binary builds (940 modules, 89ms compile), prints `gbrain 0.15.0`,
`gbrain agent --help` shows the new subcommand shape. 170 new tests pass
(full v0.15 surface). Full unit suite passes bar one parallel-load
flake on a pre-existing E2E (graph-quality, passes in isolation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): drop GBRAIN_ALLOW_LLM_JOBS flag — subagent handlers always-on

The env flag was ceremony. Shell jobs need the flag because they execute
arbitrary CLI commands (RCE surface). Subagent jobs don't — they call the
Anthropic API with whatever ANTHROPIC_API_KEY is in env, so the key is
already the cost gate (no key → SDK fails on the first turn). And
who-can-submit is already protected by PROTECTED_JOB_NAMES +
TrustedSubmitOpts: MCP callers get permission_denied; only `gbrain agent
run` with allowProtectedSubmit can insert subagent / subagent_aggregator
rows. The flag added nothing the existing guards didn't already give us.

registerBuiltinHandlers now always registers subagent + subagent_aggregator
and loads GBRAIN_PLUGIN_PATH plugins. Worker startup prints:

[minion worker] subagent handlers enabled

instead of the conditional enabled/disabled pair. Plugin discovery runs
unconditionally — empty PATH is a no-op.

README, CHANGELOG, docs/UPGRADING_DOWNSTREAM_AGENTS, CLAUDE.md, agent CLI
help text, and subagent handler docstring all updated to drop the flag
reference. Shell handler's GBRAIN_ALLOW_SHELL_JOBS gate is untouched —
separate concern (RCE, not billing).

Full suite: 1859 pass, 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: scrub private agent-fork name from all public artifacts

Enforces the rule added to CLAUDE.md (privacy section): never say
`Wintermute` in any CHANGELOG, README, doc, PR, or commit message.
Reader-facing copy says `your OpenClaw` (the term covers every
downstream OpenClaw deployment — Wintermute, Hermes, AlphaClaw — in
one umbrella the reader already recognizes). First-person /
origin-story copy says `Garry's OpenClaw` (honest that this is the
production deployment driving the feature, without exposing the
private agent's name).

Swept across:
CHANGELOG.md (v0.15 entry + 4 historical mentions)
README.md
TODOS.md
docs/UPGRADING_DOWNSTREAM_AGENTS.md
docs/guides/plugin-authors.md (including example plugin names)
docs/guides/plugin-handlers.md
docs/guides/minions-fix.md
docs/designs/KNOWLEDGE_RUNTIME.md (27 refs, mostly analytical)
docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md
skills/migrations/v0.11.0.md
skills/skillpack-check/SKILL.md
scripts/skillify-check.ts
src/commands/doctor.ts
src/commands/migrations/v0_15_0.ts
src/commands/skillpack-check.ts
src/core/enrichment/completeness.ts
src/core/minions/plugin-loader.ts
src/core/operations.ts
src/core/output/scaffold.ts

Intentionally kept (these mentions define/test the rule itself):
CLAUDE.md — the privacy rule section necessarily uses the literal
name to define the restriction and examples
test/plugin-loader.test.ts — fixture name in a plugin-loading test;
renaming risks breaking assertion logic
test/integrations.test.ts — the word appears in a privacy-regex
test that explicitly enforces name redaction
test/doctor-minions-check.test.ts — a comment referencing the rule
CEO plan artifact at ~/.gstack/projects/… — private, not distributed

Binary builds (941 modules), 198/198 relevant tests pass, `gbrain --version`
prints `0.15.0`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: gitignore bun --compile artifacts with a glob, not specific hashes

Each `bun build --compile` emits a fresh hash-named `.*-*.bun-build` file
in cwd. The prior entries listed two specific hashes that were already
stale, so every build after those created a new untracked file requiring
manual cleanup.

Replace the two stale entries with `*.bun-build` so any current or future
compile artifact is ignored automatically.

Verified: ran `bun build --compile`, got two new `.*-*.bun-build` files,
`git status` stays clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(ship): rename v0.15.0 → v0.16.0

gbrain master is at 0.14.2. Other 0.15.x PRs may land before/after
this one — we bump the minor (new capability) and lock to 0.16.0 so
ordering with concurrent work doesn't matter.

Touches:
- VERSION: 0.15.0 → 0.16.0
- package.json: 0.15.0 → 0.16.0
- Rename src/commands/migrations/v0_15_0.ts → v0_16_0.ts (+ all
version strings inside + import in index.ts registry)
- Rename test/migrations-v0_15_0.test.ts → migrations-v0_16_0.test.ts
- test/apply-migrations.test.ts: skippedFuture lists now reference
'0.16.0'
- test/put-page-namespace.test.ts + test/mcp-tool-defs.test.ts: Lane
comment refs updated
- src/schema.sql + src/core/pglite-schema.ts: "v0.15.0" section
comment updated; src/core/schema-embedded.ts regenerated
- CHANGELOG.md: top entry renamed to [0.16.0]; inline v0_15_0 /
v0.15.0 refs swept
- docs/UPGRADING_DOWNSTREAM_AGENTS.md: section heading v0.15.0 → v0.16.0

Verified: `gbrain --version` prints 0.16.0, migration registry /
buildPlan / put_page / mcp-tool-defs / handlers tests all green
(49/49).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: reframe v0.16 durability headline around OpenClaw crashes

"Laptop closed mid-run" framing implied a consumer workflow. Real pain is
OpenClaw subagents dying daily on worker kill, memory blip, or timeout.
Headline + README copy match the body now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt after README copy change

Regen drift guard caught the README edit from 83beec4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-21 21:14:17 -07:00

5.9 KiB

Raw Blame History

Plugin authors guide (v0.15)

gbrain discovers subagent definitions from outside this repo via GBRAIN_PLUGIN_PATH. If you maintain a downstream agent (your OpenClaw deployment, a workflow host, a private tool) and want to ship custom subagents alongside it, drop a plugin directory on that env path.

This guide is for plugin authors. The CLI user doesn't need to read it.

Minimum viable plugin

/path/to/my-plugin/
├── gbrain.plugin.json
└── subagents/
    └── my-summarizer.md

gbrain.plugin.json:

{
  "name": "my-plugin",
  "version": "1.0.0",
  "plugin_version": "gbrain-plugin-v1"
}

subagents/my-summarizer.md:

---
name: my-summarizer
model: claude-sonnet-4-6
allowed_tools:
  - brain_search
  - brain_get_page
---

You are a brain page summarizer. Given a slug, fetch the page and produce
a 3-sentence summary.

Turning it on

export GBRAIN_PLUGIN_PATH="/path/to/my-plugin"
gbrain jobs work           # worker startup prints the plugin load line
gbrain agent run "summarize meetings/2026-04-20" --subagent-def my-summarizer

Multiple plugins: colon-separated, just like $PATH.

export GBRAIN_PLUGIN_PATH="/path/to/plugin-a:/path/to/plugin-b"

Rules (strict by design)

Path policy. Absolute paths only. Relative paths, ~-prefixed paths, and URL-style paths (https://, file://) are rejected with a warning. You control where your plugin lives on disk; gbrain doesn't guess.

Collision policy. If two plugins ship a subagent with the same name, the one listed FIRST in GBRAIN_PLUGIN_PATH wins. The other is dropped with a warning naming both sources.

Trust policy. Plugins ship subagent definitions ONLY in v0.15:

You cannot declare new tools.
You cannot extend the brain tool allow-list.
You cannot override any agentSafe or similar flag.
Your allowed_tools: frontmatter field MUST subset the derived brain tool registry. Names not in the registry are rejected at plugin load time (worker startup), NOT at subagent dispatch time — so a typo in your plugin gives you a loud startup error, not a silent "tool never fires" at 3am.

v0.16+ may open up plugin-declared tools with a separate contract. Don't expect it.

`gbrain.plugin.json`

field	type	required	notes
`name`	string	yes	Human-readable plugin id. Shows up in warnings and collision logs.
`version`	string	yes	Your plugin's semver. Informational.
`plugin_version`	string	yes	Contract lock. Must equal `"gbrain-plugin-v1"` for v0.15.
`subagents`	string	no	Subdir name (default `subagents`). Escape-attempts are rejected.
`description`	string	no	Shown in future `gbrain plugin list`.

Subagent definition files

Plain markdown with YAML frontmatter. The body is the system prompt. The frontmatter controls runtime behavior.

Recognized frontmatter fields:

field	type	required	notes
`name`	string	no	Subagent identifier used as `--subagent-def`. Defaults to the file basename.
`model`	string	no	Anthropic model id. Defaults to the handler default (sonnet).
`max_turns`	number	no	Cap on assistant turns. Defaults to 20.
`allowed_tools`	string[]	no	Whitelist of tool names. Must subset the derived brain registry. Rejected on mismatch.

Unknown frontmatter fields are preserved but ignored by the handler. v0.16 may consume more of them.

Caveats that will bite you

Plugin definitions can't change during a run. The loader reads the disk once at worker startup. Editing a subagent def doesn't re-take effect until you restart the worker. This is deliberate — live reloads would break crash-resumable replay.
~/.gbrain/audit/subagent-jobs-*.jsonl is local only. If your worker runs on a different host than the gbrain agent logs caller, the CLI won't see heartbeats from that worker. v0.16 will unify this; for now assume worker + CLI share a filesystem.
Tool calls always run with ctx.remote = true. Even on local CLI invocation. Tools that gate on remote=true (file_upload's strict confinement, put_page's namespace check) will apply. Good default; a subagent definition that wants local-filesystem reach beyond the brain can't have it.
put_page writes are namespace-scoped. A subagent with id 42 can only write under wiki/agents/42/.... This is enforced both in the tool schema (the slug pattern shown to the model) AND server-side in the put_page operation (fail-closed if viaSubagent=true). Don't try to route around it; you'll get permission_denied.

Example: a downstream-OpenClaw plugin

~/your-openclaw/
└── gbrain-plugin/
    ├── gbrain.plugin.json
    └── subagents/
        ├── meeting-ingestion.md
        ├── signal-detector.md
        └── daily-task-prep.md

~/your-openclaw/gbrain-plugin/gbrain.plugin.json:

{
  "name": "your-openclaw",
  "version": "2026.4.20",
  "plugin_version": "gbrain-plugin-v1",
  "description": "Your OpenClaw's personal-brain subagents"
}

Environment:

export GBRAIN_PLUGIN_PATH="$HOME/your-openclaw/gbrain-plugin"

Then your OpenClaw calls gbrain agent run --subagent-def meeting-ingestion --fanout-by transcript ... and its definitions load automatically.

5.9 KiB Raw Blame History

Plugin authors guide (v0.15)

Minimum viable plugin

Turning it on

Rules (strict by design)

gbrain.plugin.json

Subagent definition files

Caveats that will bite you

Example: a downstream-OpenClaw plugin

5.9 KiB

Raw Blame History

`gbrain.plugin.json`