* refactor(mcp): extract buildToolDefs helper for subagent tool registry reuse The inline operations.map(...) block in src/mcp/server.ts became the only source of truth for agent-facing tool definitions. Extract into a reusable exported helper so the v0.15 subagent tool registry can call it with a filtered OPERATIONS subset instead of duplicating the shape. Byte-for-byte equivalence regression pinned in test/mcp-tool-defs.test.ts — legacy inline mapping kept verbatim inside the test so any future drift between the new helper and the pre-extraction MCP schema fails loudly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(operations): subagent-aware OperationContext + put_page namespace Adds three optional fields to OperationContext: - jobId?: number — the currently running Minion job id - subagentId?: number — the owning subagent job id for tool-dispatched calls - viaSubagent?: boolean — FAIL-CLOSED flag for agent-path gating put_page now enforces a namespace rule when invoked on the subagent tool dispatch path (viaSubagent=true): writes MUST target `wiki/agents/<subagentId>/...`. Anchored, slash-boundary enforced so a collision like `wiki/agents/12evil/...` can't impersonate subagent 12. The check runs BEFORE the dry-run short-circuit so preview calls surface the same rejection. Fail-closed: a missing subagentId with viaSubagent=true rejects every slug rather than letting a dispatcher bug open a hole. Existing callers unaffected — all three fields are optional and the legacy put_page behavior is unchanged when viaSubagent is undefined/false. 
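
The put_page namespace rule described above can be sketched as a plain string check. This is a minimal illustration under stated assumptions, not the code that ships in the put_page operation: `isSlugAllowedForSubagent` is a hypothetical name, and accepting only integer ids is an assumption.

```typescript
// Hypothetical helper illustrating the put_page namespace rule for subagent writes.
// The function name and the integer-only id check are assumptions, not the real implementation.
function isSlugAllowedForSubagent(slug: string, subagentId: number | undefined): boolean {
  // Fail closed: a missing or non-integer id (including NaN) rejects every slug.
  if (subagentId === undefined || !Number.isInteger(subagentId)) return false;

  // Anchored prefix with a trailing slash, so `wiki/agents/12evil/...` cannot match id 12.
  const prefix = `wiki/agents/${subagentId}/`;

  // Require at least one character after the prefix so the bare prefix is rejected too.
  return slug.startsWith(prefix) && slug.length > prefix.length;
}
```

Because the prefix is compared from the start of the slug, a leading slash (`/wiki/agents/12/...`) also fails the check.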
12 regression + namespace tests pin: - local CLI writes (viaSubagent unset) accept arbitrary slugs - MCP writes (remote=true, viaSubagent unset) accept arbitrary slugs - subagent-path: anchored prefix accepted, wrong id rejected, prefix-collision defeated, leading-slash rejected, bare-prefix rejected, fail-closed on missing/NaN subagentId, permission_denied code emitted Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(schema): v0.15.0 subagent runtime tables + migration orchestrator Adds three new tables for the durable LLM agent runtime: subagent_messages — Anthropic message-block persistence. Parallel tool_use blocks in one assistant message live in content_blocks JSONB, not across rows (fixes the (job_id, turn_idx, role) misdesign codex caught in v0.13 drafting). subagent_tool_executions — Two-phase tool ledger. INSERT pending before execute, UPDATE complete/failed after. Replay re-runs pending rows only if the tool is idempotent (v1 ships only idempotent tools so this is preventive). subagent_rate_leases — Lease-based concurrency cap for outbound providers (e.g. anthropic:messages). Stale leases auto-prune on next acquire so crashed workers can't strand capacity. All DDL uses CREATE TABLE/INDEX IF NOT EXISTS — order-independent vs PR #244's initSchema() reorder, and idempotent across fresh-install + upgrade paths. Shipped in both src/schema.sql (Postgres) and src/core/pglite-schema.ts (PGLite); schema-embedded.ts regenerated. Migration orchestrator v0_15_0.ts (phases: schema → verify → record). v0_14_0.ts is a no-op stub so the registry's version sequence stays gapless (v0.14.0 shipped shell-jobs — code change, no DB migration). 10 unit tests for registry wiring, ordering, dry-run phase behavior, and schema-embedded table presence. test/apply-migrations.test.ts updated for the two new registry entries.
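
The two-phase tool ledger described for subagent_tool_executions can be modelled in memory to show the replay rule. This is only a sketch: the class, method, and field names are hypothetical, and the real ledger is a database table, not a Map.

```typescript
// In-memory model of a two-phase tool ledger. All names here are illustrative
// assumptions; the real ledger is the subagent_tool_executions table.
type ExecStatus = 'pending' | 'complete' | 'failed';

interface ToolExecution {
  toolUseId: string;
  toolName: string;
  status: ExecStatus;
  output?: unknown;
  error?: string;
}

class ToolLedger {
  private rows = new Map<string, ToolExecution>();

  // Phase 1: record intent before the tool runs. Re-recording an existing id is a no-op.
  begin(toolUseId: string, toolName: string): void {
    if (!this.rows.has(toolUseId)) {
      this.rows.set(toolUseId, { toolUseId, toolName, status: 'pending' });
    }
  }

  // Phase 2: record the outcome after the tool returns or throws.
  finish(toolUseId: string, result: { output: unknown } | { error: string }): void {
    const row = this.rows.get(toolUseId);
    if (!row) throw new Error(`no pending row for ${toolUseId}`);
    if ('error' in result) {
      row.status = 'failed';
      row.error = result.error;
    } else {
      row.status = 'complete';
      row.output = result.output;
    }
  }

  // Replay: only pending rows are candidates, and only when the tool is idempotent.
  rowsToRerun(isIdempotent: (toolName: string) => boolean): ToolExecution[] {
    return Array.from(this.rows.values()).filter(
      (r) => r.status === 'pending' && isIdempotent(r.toolName),
    );
  }
}
```

A row that was interrupted between the two phases stays `pending`, which is what lets a resumed worker tell "never finished" apart from "finished and recorded".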
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): emit child_done on every terminal + max_stalled per-job + terminal set fix Three correctness fixes the v0.15 subagent aggregator spine depends on: 1. child_done emission on ALL terminal transitions, not just success. - completeJob already emitted on success — now also tags outcome='complete'. - failJob newly emits on terminal 'failed' or 'dead' (outcome='failed'|'dead', error=<text>), BEFORE the parent-terminal UPDATE so the EXISTS guard on the inbox INSERT doesn't skip it on fail_parent paths (codex catch). - cancelJob now emits outcome='cancelled' per descendant with a parent. - handleTimeouts now emits outcome='timeout' per timed-out child. ChildDoneMessage gains optional { outcome, error } — backwards compatible (legacy writers omitted them; consumers treat absent outcome as 'complete'). 2. Parent-resolution terminal set now includes 'failed'. Pre-v0.15 the `NOT EXISTS (... status NOT IN ('completed','dead','cancelled'))` guard treated a failed child as still-pending, stranding aggregator parents that chose on_child_fail='continue' or 'ignore' in waiting-children forever. Expanded to {completed, failed, dead, cancelled} everywhere parent resolution reads child status (completeJob inline, failJob remove_dep + continue, cancelJob sweep, handleTimeouts sweep, and the resolveParent method itself). 3. MinionJobInput.max_stalled threads through MinionQueue.add() on INSERT. Column exists with default 1 — that is "first stall → dead", which defeats crash recovery for long-running handlers. Subagent children will set max_stalled: 3 to survive mid-run worker kills. Second-submitter under an idempotency-key hit does NOT mutate the existing row (codex-flagged footgun — first-submit options are load-bearing state). 
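
Fixes 1 and 2 above can be summarised in a few lines of TypeScript. This is a sketch of the rules as stated in the commit text, not the queue's SQL: `childId`, `outcomeOf`, and `allChildrenTerminal` are hypothetical names introduced for illustration.

```typescript
// Outcome tags listed in the commit message for child_done messages.
type ChildOutcome = 'complete' | 'failed' | 'dead' | 'cancelled' | 'timeout';

// Only `outcome` and `error` come from the commit text; `childId` is a hypothetical field.
interface ChildDoneMessage {
  childId: number;
  outcome?: ChildOutcome;
  error?: string;
}

// Legacy writers omitted `outcome`; consumers treat that as 'complete'.
function outcomeOf(msg: ChildDoneMessage): ChildOutcome {
  return msg.outcome ?? 'complete';
}

// Expanded terminal set: 'failed' now counts as terminal for parent resolution.
const TERMINAL_STATUSES: ReadonlySet<string> = new Set([
  'completed',
  'failed',
  'dead',
  'cancelled',
]);

// A parent can resolve once no child is left in a non-terminal status.
function allChildrenTerminal(childStatuses: readonly string[]): boolean {
  return childStatuses.every((s) => TERMINAL_STATUSES.has(s));
}
```

With the pre-v0.15 set (no `'failed'`), the second function would return false for a parent whose only unfinished child had failed, which is the stranding case the commit describes.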
13 unit tests pin: emission on each of completeJob/failJob/cancelJob/handleTimeouts, insertion order on fail_parent, terminal-set expansion with continue policy, max_stalled default + override + idempotency behavior. E2E tier 1 (Postgres) passes 141 tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): rate-leases + waitForCompletion infra for v0.15 subagent Two infrastructure modules the subagent handler spine depends on: rate-leases.ts — lease-based concurrency cap for outbound providers (anthropic:messages, openai:*, etc.). Counter-based limiters leak capacity on worker crash; leases are owner-tagged rows with expires_at that auto-prune on the next acquire. Two-phase: txn-scoped pg_advisory_xact_lock guards the check-then-insert so concurrent acquires can't both win the "last slot". renewLeaseWithBackoff retries 3x (250/500/1000ms) for mid-call DB blips — on persistent failure the LLM-loop caller aborts with a renewable error so the worker re-claims and the rate invariant is preserved. Owner FK cascades clean up leases on job deletion. wait-for-completion.ts — poll-until-terminal helper for CLI callers. Minions' NOTIFY is worker-side only; `gbrain agent run --follow` polls getJob() until status is {completed, failed, dead, cancelled}. TimeoutError carries jobId + elapsedMs and does NOT cancel the job — the user can inspect via `gbrain jobs get <id>` later. Supports AbortSignal for Ctrl-C without throwing. Default pollMs is 1000 on Postgres, 250 on PGLite (inline CLI has no network RTT). 21 unit tests cover: single/multi acquire under cap, rejection past cap, release frees slot, different keys are independent, stale prune, cascade on owner delete, renew bumps expires_at, renew on missing is false, backoff path success + pruned short-circuit. waitForCompletion: fast-path terminal, transitions mid-wait (completed/failed/cancelled), TimeoutError shape, abort-signal early exit, non-existent job error.
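
The lease idea from rate-leases.ts can be shown with a single-process model. This is a sketch under assumptions: `InMemoryLeases` and its method signatures are hypothetical, time is passed in explicitly so the behaviour is deterministic, and the advisory lock is omitted because nothing here runs concurrently.

```typescript
// Single-process model of lease-based rate limiting. The class and field names are
// illustrative assumptions; the real module stores leases as database rows.
interface Lease {
  key: string;
  ownerJobId: number;
  expiresAt: number; // epoch milliseconds
}

class InMemoryLeases {
  private leases: Lease[] = [];

  // Prune expired leases first, then admit the caller only if the key is under its cap.
  acquire(key: string, ownerJobId: number, cap: number, ttlMs: number, now: number): boolean {
    this.leases = this.leases.filter((l) => l.expiresAt > now);
    const active = this.leases.filter((l) => l.key === key).length;
    if (active >= cap) return false;
    this.leases.push({ key, ownerJobId, expiresAt: now + ttlMs });
    return true;
  }

  // Explicit release frees the slot immediately instead of waiting for expiry.
  release(key: string, ownerJobId: number): void {
    const i = this.leases.findIndex((l) => l.key === key && l.ownerJobId === ownerJobId);
    if (i >= 0) this.leases.splice(i, 1);
  }
}
```

A crashed owner never calls `release`, but its lease still disappears once `expiresAt` passes, which is the property a plain counter lacks.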
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent ToolDef types + brain-tool registry (v0.15) Types first so the handler has a stable contract: - SubagentHandlerData / AggregatorHandlerData — the two job.data shapes - ToolCtx (engine, jobId, remote, signal) + ToolDef (name, description, input_schema, idempotent, execute) — Anthropic-envelope, distinct from the MCP McpToolDef extraction landed earlier - ContentBlock discriminated union for subagent_messages.content_blocks - SubagentStopReason + SubagentResult emitted on terminal completion brain-allowlist.ts derives one ToolDef per allow-listed OPERATION. Reuses the ParamDef → JSONSchema shape from the MCP extraction in a local helper (Anthropic's input_schema field diverges from MCP's inputSchema by a character). The 11-name allow-list is read-safe + put_page — every destructive / filesystem / identity-mutating op stays off by default. put_page gets a namespace-wrapped tool schema: `slug` pattern = anchored `^wiki/agents/<subagentId>/.+`. The server-side check in put_page op (shipped in prior commit) is still the authoritative gate — the schema just helps the model write correct slugs first-try. `subagentId` is plumbed into the ToolCtx so the viaSubagent=true fail-closed path lights up on every tool-dispatched put_page. filterAllowedTools narrows a registry by subagent_def's allowed_tools frontmatter field. Rejects unknown names at load time (no silent drop — typos in a skills/subagents/*.md would otherwise ship to prod with a tool silently missing). 18 tests pin: every allowlist name exists in OPERATIONS (catches upstream rename), Anthropic name regex, put_page namespace pattern per-subagent, execute() routes through the op handler with viaSubagent=true, out-of-namespace put_page throws permission_denied, filter passes prefixed + unprefixed names, rejects unknowns, deduplicates.
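
The filtering behaviour pinned by those tests (prefixed and unprefixed names accepted, unknown names rejected, duplicates collapsed) might look roughly like the sketch below. It is not the real filterAllowedTools: the `brain_` prefix is an assumption inferred from tool names such as `brain_search`, and the function name, signature, and error text are hypothetical.

```typescript
// Minimal shape needed for filtering; the real ToolDef carries more fields.
interface ToolDefLike {
  name: string;
}

// Assumption: registry names carry a `brain_` prefix and callers may omit it.
const TOOL_PREFIX = 'brain_';

// Hypothetical sketch of allow-list narrowing with loud failure on unknown names.
function filterAllowedToolsSketch<T extends ToolDefLike>(
  registry: readonly T[],
  allowed: readonly string[],
): T[] {
  const byName = new Map<string, T>();
  for (const tool of registry) byName.set(tool.name, tool);

  const picked = new Map<string, T>(); // Map keys deduplicate repeated names
  const unknown: string[] = [];
  for (const raw of allowed) {
    const name = raw.startsWith(TOOL_PREFIX) ? raw : TOOL_PREFIX + raw;
    const tool = byName.get(name);
    if (tool === undefined) unknown.push(raw);
    else picked.set(name, tool);
  }

  // Reject instead of silently dropping, so a typo surfaces at load time.
  if (unknown.length > 0) throw new Error(`unknown tools: ${unknown.join(', ')}`);
  return Array.from(picked.values());
}
```

Throwing on the full list of unknown names, instead of the first one, lets an author fix every typo in one pass.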
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent-audit JSONL + transcript renderer Two small plumbing pieces the v0.15 subagent handler + `gbrain agent logs` depend on: subagent-audit.ts — JSONL-rotated audit log mirroring the shell-audit pattern. Two event flavors: submission (one line per job submit) and heartbeat (one line per turn boundary — llm_call_started / completed / tool_called / tool_result / tool_failed). Heartbeats fix the "--follow on a long Anthropic call shows nothing for 30 seconds" problem codex flagged. Never logs prompts or tool inputs (PII risk — subagent input_vars may carry user-supplied free text); DOES log tokens, ms_elapsed, tool_name, first 200 chars of error text. Rotates weekly via ISO week. `readSubagentAuditForJob` is the readback path for `gbrain agent logs` — scans the current + prior week file so job boundaries across weeks still resolve. `GBRAIN_AUDIT_DIR` overrides the default ~/.gbrain/audit/ for container deploys. transcript.ts — renders subagent_messages + subagent_tool_executions to markdown. Message order is authoritative; tool rows splice under their owning assistant tool_use by tool_use_id. Handles text, tool_use (with pending / complete / failed execution rows), tool_result (skipped if we already rendered the owning tool_use — avoids double-printing), and unknown block types (fenced JSON dump for diagnostics). Output is UTF-8-safe truncated at maxOutputBytes. 21 unit tests: ISO week filename rotation (incl. 2027-01-01 → W53-2026 boundary), submission + heartbeat write shapes, 200-char error cap, best-effort write failure doesn't throw, readback filters by job_id and sinceIso. Transcript: empty input, ordering, token line, tool_use + complete/failed/pending execution rendering, truncation, unknown-block diagnostic dump.
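
The year-boundary case mentioned in those tests follows from how ISO weeks are defined: a week belongs to the year that contains its Thursday. The sketch below computes the ISO year and week in UTC. `isoWeekOf` and `auditFileNameFor` are hypothetical names, and the exact filename layout is an assumption (the commit only shows the `subagent-jobs-*.jsonl` pattern).

```typescript
// Compute the ISO-8601 week number and week-based year for a date, in UTC.
function isoWeekOf(date: Date): { year: number; week: number } {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const day = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - day); // move to the Thursday of this ISO week
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const dayOfYear = (d.getTime() - yearStart) / 86_400_000 + 1;
  return { year: d.getUTCFullYear(), week: Math.ceil(dayOfYear / 7) };
}

// Hypothetical filename layout; only the `subagent-jobs-` prefix and `.jsonl`
// suffix are taken from the commit text.
function auditFileNameFor(date: Date): string {
  const { year, week } = isoWeekOf(date);
  return `subagent-jobs-${year}-W${String(week).padStart(2, '0')}.jsonl`;
}
```

For 2027-01-01 (a Friday) the Thursday of that week is 2026-12-31, so the date lands in week 53 of 2026, matching the boundary the test suite pins.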
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent LLM-loop handler with crash-resumable replay The main event: runs one Anthropic Messages API conversation with tool use, persists every turn + tool execution, and resumes cleanly after a worker kill anywhere in the loop. Design points that carry the v0.15 guarantees: 1. Two-phase tool persistence. INSERT status='pending' before dispatch, UPDATE to 'complete' or 'failed' after. subagent_messages rows are the canonical conversation; subagent_tool_executions rows are the canonical "did this tool run + what did it return". Either DB commit is atomic, so replay has a single source of truth. 2. Replay reconciliation. If the last persisted message is an assistant with tool_use blocks AND no following synthesized user message, we crashed mid-dispatch. On resume, finish those tools first (respecting idempotent flag for 'pending' rows), synthesize the user turn, and THEN call the LLM again. Non-idempotent pending rows abort the job with a clear error — v0.15 ships only idempotent tools so this is preventive. 3. Rate lease around every LLM call. acquireLease before, releaseLease after (both success and error paths). acquired=false throws RateLeaseUnavailableError — the worker treats it as a renewable error and re-claims later, so a temporary capacity cap doesn't fail the job terminally. 4. Anthropic prompt caching. system block gets cache_control=ephemeral; the LAST tool def gets it too (Anthropic caches everything up to and including the marked block). ~10x cost reduction on multi-turn agents per the plan. 5. Dual-signal abort. AbortSignal.any merges ctx.signal (timeout / lock loss / cancel) with ctx.shutdownSignal (worker SIGTERM). Both feed the Anthropic call's AbortSignal; mid-turn abort bails before the next LLM call with whatever turns are already persisted. Node ≥ 20 has AbortSignal.any; older runtimes get a manual-merge polyfill. 6. Injectable Anthropic client. 
The real SDK implements MessagesClient structurally; tests inject a FakeMessagesClient that scripts responses. 12 unit tests pin: no-tool happy path, single tool_use complete, tool throws → failed row + loop continues, unknown tool name rejection, max_turns cap, crash-then-resume with partial state, replay skips already-complete tool execs without re-invoking execute, non-idempotent pending rejects on resume, lease acquire + release roundtrip, RateLeaseUnavailable under cap-full, missing prompt validation, allowed_tools unknown-name. NOT in v0.15: refusal detection (stop_reason + content shape), stop_reason=max_tokens partial recovery, mid-call lease renewal with backoff loop. All three are documented as P2 items in the plan file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent_aggregator handler with mixed-outcome rendering Claims AFTER all subagent children resolve — by then Lane 1B's queue changes have posted one child_done message per terminal transition into this job's inbox (complete / failed / dead / cancelled / timeout). The aggregator reads those, builds a deterministic markdown summary, and returns it as the handler result. Not an LLM call in v0.15 — output is reproducible concatenation so fan-out runs stay comparable. v0.16+ can add an LLM synthesis pass behind an opt-in flag.
Contract: - empty children_ids → `(no children)` marker - missing child_done (shouldn't happen under v0.15 invariants but possible if a terminal-state path slipped past Lane 1B) → counted as failed with "no child_done message observed" error - non-complete outcomes: result is null in the output so no payload leaks alongside a failure label - children appear in the order children_ids was supplied - custom aggregate_prompt_template replaces the markdown header 13 unit tests cover: empty input, all-success, mixed outcomes, result suppression on failure, missing child_done handling, order preservation, custom template, progress + log emission, stringified JSONB payload parsing, non-child_done inbox filtering, legacy-writer outcome fallback, and internal helper edges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): GBRAIN_PLUGIN_PATH loader + plugin-authors guide (v0.15) Plumbing that makes Wintermute (and future downstream agents) day-1 usable on v0.15. Host repos drop a `gbrain.plugin.json` + `subagents/` directory somewhere, set GBRAIN_PLUGIN_PATH (colon-separated like \$PATH), and their custom subagent defs load at worker startup. Path policy is strict: absolute paths only. Relative, ~-prefixed, and URL-style (https://, file://) all rejected with warnings — the user controls where plugins live. Non-existent paths and files (not dirs) are warned and skipped so a typo doesn't crash worker startup. Collision policy: left-wins. If two plugins ship a subagent with the same name, the first one in GBRAIN_PLUGIN_PATH keeps it and the other gets a warning naming both sources. Deterministic + debuggable. Trust policy: plugins ship subagent defs ONLY. Cannot declare new tools, cannot extend the brain allow-list, cannot override safety flags. 
The subagent def's `allowed_tools:` frontmatter MUST subset the derived registry — validation happens at load time (worker startup), not at dispatch time, so a typo in a skill gives a loud startup error instead of silently "tool never fires at 3am." Manifest `plugin_version: "gbrain-plugin-v1"` locks the contract. Unknown versions rejected. `subagents` field escape attempts (`../../../etc` etc) rejected. gray-matter handles the markdown frontmatter parse — subagent defs don't conform to the page schema, so we don't use parseMarkdown. docs/guides/plugin-authors.md is the Wintermute-facing walkthrough. Covers the minimum viable plugin shape, the three policies, the frontmatter fields, known caveats (audit JSONL is local-only, tool calls always run remote=true, put_page is namespace-scoped). 22 unit tests pin path rejection, missing/invalid manifest, unsupported version, escape-attempt, basename fallback for missing frontmatter.name, allowed_tools round-trip, unknown-tool rejection with validAgentToolNames, empty env, multi-path, collision warning with left-wins, trimmed paths, manifest-rejection as warning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): gbrain agent run + logs + worker registration (v0.15 Lane 4H) Three integration seams wired: src/commands/agent.ts — `gbrain agent run`. Submits subagent jobs (or a fan-out of N + aggregator) under the trusted-submit flag so the PROTECTED_JOB_NAMES guard doesn't reject. Fan-out path creates the aggregator first (so children can reference its id as parent), submits each child with on_child_fail='continue' (required by Lane 1B's terminal-set + child_done machinery), then jsonb_set's the aggregator's children_ids. Short-circuits a 1-entry manifest to a single subagent with no aggregator. Follow mode runs agent-logs streaming + waitForCompletion in parallel and exits on terminal status; detach prints the job id and exits.
Ctrl-C is handled as detach, not cancel — the job keeps running, consistent with durability invariants. src/commands/agent-logs.ts — `gbrain agent logs`. Merges ~/.gbrain/audit/ subagent-jobs-*.jsonl (heartbeats + submissions) with subagent_messages (persisted conversation) in one chronological stream. --follow polls at 1s and exits when the job hits terminal. --since accepts ISO-8601 OR relative shorthand (5m / 1h / 2d). Writes transcript tail (full message + tool tree) only for terminal jobs, so mid-run --follow doesn't spam a half-rendered transcript. src/commands/jobs.ts registerBuiltinHandlers — matches the shell-handler opt-in shape. GBRAIN_ALLOW_LLM_JOBS=1 registers the subagent + subagent_aggregator handlers, then loads plugins from GBRAIN_PLUGIN_PATH with validAgentToolNames pulled from BRAIN_TOOL_ALLOWLIST. Every plugin warning + loaded-plugin line prints to stderr, mirroring the openclaw-seam startup convention. src/core/minions/protected-names.ts — subagent + subagent_aggregator join the protected set. MCP submit_job returns permission_denied; only trusted-CLI callers (with allowProtectedSubmit) can insert these rows. src/cli.ts — adds 'agent' to CLI_ONLY + dispatches it like 'jobs'. Test fallout: subagent-handler.test.ts + subagent-transcript.test.ts helpers now submit under allowProtectedSubmit (they insert rows named 'subagent' directly against the queue). 23 new tests in agent-cli.test.ts cover: flag parsing (including --detach implies !follow, --tools comma split, -- terminator, unknown flag throw), --since parse (ISO, relative 5m/2h/1d, unparseable error), protected-name guard for all three names, trusted-submit gate, and a fan-out integration check that verifies the aggregator + children shape after --fanout-manifest.
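
The `--since` parsing described above (relative shorthand such as 5m / 1h / 2d, or an ISO-8601 timestamp) can be sketched as follows. `parseSince` is a hypothetical name and the error text is made up. Note that `Date.parse` accepts more than strict ISO-8601 in most engines, so a real implementation may want a tighter check.

```typescript
// Milliseconds per supported relative unit: minutes, hours, days.
const UNIT_MS: Record<string, number> = { m: 60_000, h: 3_600_000, d: 86_400_000 };

// Hypothetical sketch: turn a --since value into an absolute Date.
function parseSince(input: string, now: Date = new Date()): Date {
  const trimmed = input.trim();

  // Relative shorthand: digits followed by one of m / h / d.
  const rel = /^(\d+)([mhd])$/.exec(trimmed);
  if (rel) {
    const amount = Number(rel[1]);
    const unitMs = UNIT_MS[rel[2] ?? ''] ?? 0;
    return new Date(now.getTime() - amount * unitMs);
  }

  // Otherwise fall back to timestamp parsing and fail loudly on garbage.
  const parsed = Date.parse(trimmed);
  if (Number.isNaN(parsed)) throw new Error(`unparseable --since value: ${input}`);
  return new Date(parsed);
}
```

Passing `now` as a parameter keeps the relative branch deterministic in tests.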
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(e2e): rename max_children test's spawned jobs off the protected 'subagent' name The spawn-storm test submitted 50 literal-string 'subagent' children to exercise the max_children row-lock serialization. In v0.15 'subagent' is a PROTECTED_JOB_NAME (CLI-only; trusted submit required), so the old literal submission now throws before reaching the row-lock check. The test is about max_children semantics, not the v0.15 subagent runtime specifically — rename the child name to 'child_worker' so the test exercises the exact same queue.add path without tripping the new guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(ship): v0.15.0 — VERSION, CHANGELOG, README, upgrading-agents, CLAUDE.md Bumps VERSION → 0.15.0 and package.json → 0.15.0 (resolves the pre-existing drift — on master, VERSION=0.14.0 but package.json=0.13.1; src/version.ts reads package.json, so this is what the binary prints now). CHANGELOG lands the release-summary entry in the GStack voice + the full itemized change list (11 new modules, 3 new tables, queue correctness fixes, trust-model additions, 159 new unit tests). Voice rules respected — no em dashes, no AI vocabulary, real file names + real numbers. README gets a "Durable agents: `gbrain agent` (v0.15)" section next to the Minions block, with the three canonical CLI shapes (single run, fanout-manifest, logs --follow) and a pointer to plugin-authors.md. docs/UPGRADING_DOWNSTREAM_AGENTS.md gets a full v0.15.0 section covering the four adoption steps downstream agents (Wintermute and similar) need: (1) worker opt-in via GBRAIN_ALLOW_LLM_JOBS, (2) moving custom subagent defs to a plugin repo, (3) replacing ephemeral subagent runs with durable `gbrain agent run`, (4) the put_page namespace rule for agent-driven writes. 
CLAUDE.md updated with concise per-file descriptions for every new module: the handler, aggregator, audit, rate-leases, wait-for-completion, transcript, plugin-loader, brain-allowlist, tool-defs extraction, agent CLI + logs CLI, and the registerBuiltinHandlers wiring for subagent handlers + plugin-loader. Verified: binary builds (940 modules, 89ms compile), prints `gbrain 0.15.0`, `gbrain agent --help` shows the new subcommand shape. 170 new tests pass (full v0.15 surface). Full unit suite passes bar one parallel-load flake on a pre-existing E2E (graph-quality, passes in isolation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): drop GBRAIN_ALLOW_LLM_JOBS flag — subagent handlers always-on The env flag was ceremony. Shell jobs need the flag because they execute arbitrary CLI commands (RCE surface). Subagent jobs don't — they call the Anthropic API with whatever ANTHROPIC_API_KEY is in env, so the key is already the cost gate (no key → SDK fails on the first turn). And who-can-submit is already protected by PROTECTED_JOB_NAMES + TrustedSubmitOpts: MCP callers get permission_denied; only `gbrain agent run` with allowProtectedSubmit can insert subagent / subagent_aggregator rows. The flag added nothing the existing guards didn't already give us. registerBuiltinHandlers now always registers subagent + subagent_aggregator and loads GBRAIN_PLUGIN_PATH plugins. Worker startup prints: [minion worker] subagent handlers enabled instead of the conditional enabled/disabled pair. Plugin discovery runs unconditionally — empty PATH is a no-op. README, CHANGELOG, docs/UPGRADING_DOWNSTREAM_AGENTS, CLAUDE.md, agent CLI help text, and subagent handler docstring all updated to drop the flag reference. Shell handler's GBRAIN_ALLOW_SHELL_JOBS gate is untouched — separate concern (RCE, not billing). Full suite: 1859 pass, 0 fail. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: scrub private agent-fork name from all public artifacts Enforces the rule added to CLAUDE.md (privacy section): never say `Wintermute` in any CHANGELOG, README, doc, PR, or commit message. Reader-facing copy says `your OpenClaw` (the term covers every downstream OpenClaw deployment — Wintermute, Hermes, AlphaClaw — in one umbrella the reader already recognizes). First-person / origin-story copy says `Garry's OpenClaw` (honest that this is the production deployment driving the feature, without exposing the private agent's name). Swept across: CHANGELOG.md (v0.15 entry + 4 historical mentions) README.md TODOS.md docs/UPGRADING_DOWNSTREAM_AGENTS.md docs/guides/plugin-authors.md (including example plugin names) docs/guides/plugin-handlers.md docs/guides/minions-fix.md docs/designs/KNOWLEDGE_RUNTIME.md (27 refs, mostly analytical) docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md skills/migrations/v0.11.0.md skills/skillpack-check/SKILL.md scripts/skillify-check.ts src/commands/doctor.ts src/commands/migrations/v0_15_0.ts src/commands/skillpack-check.ts src/core/enrichment/completeness.ts src/core/minions/plugin-loader.ts src/core/operations.ts src/core/output/scaffold.ts Intentionally kept (these mentions define/test the rule itself): CLAUDE.md — the privacy rule section necessarily uses the literal name to define the restriction and examples test/plugin-loader.test.ts — fixture name in a plugin-loading test; renaming risks breaking assertion logic test/integrations.test.ts — the word appears in a privacy-regex test that explicitly enforces name redaction test/doctor-minions-check.test.ts — a comment referencing the rule CEO plan artifact at ~/.gstack/projects/… — private, not distributed Binary builds (941 modules), 198/198 relevant tests pass, `gbrain --version` prints `0.15.0`. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: gitignore bun --compile artifacts with a glob, not specific hashes Each `bun build --compile` emits a fresh hash-named `.*-*.bun-build` file in cwd. The prior entries listed two specific hashes that were already stale, so every build after those created a new untracked file requiring manual cleanup. Replace the two stale entries with `*.bun-build` so any current or future compile artifact is ignored automatically. Verified: ran `bun build --compile`, got two new `.*-*.bun-build` files, `git status` stays clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(ship): rename v0.15.0 → v0.16.0 gbrain master is at 0.14.2. Other 0.15.x PRs may land before/after this one — we bump the minor (new capability) and lock to 0.16.0 so ordering with concurrent work doesn't matter. Touches: - VERSION: 0.15.0 → 0.16.0 - package.json: 0.15.0 → 0.16.0 - Rename src/commands/migrations/v0_15_0.ts → v0_16_0.ts (+ all version strings inside + import in index.ts registry) - Rename test/migrations-v0_15_0.test.ts → migrations-v0_16_0.test.ts - test/apply-migrations.test.ts: skippedFuture lists now reference '0.16.0' - test/put-page-namespace.test.ts + test/mcp-tool-defs.test.ts: Lane comment refs updated - src/schema.sql + src/core/pglite-schema.ts: "v0.15.0" section comment updated; src/core/schema-embedded.ts regenerated - CHANGELOG.md: top entry renamed to [0.16.0]; inline v0_15_0 / v0.15.0 refs swept - docs/UPGRADING_DOWNSTREAM_AGENTS.md: section heading v0.15.0 → v0.16.0 Verified: `gbrain --version` prints 0.16.0, migration registry / buildPlan / put_page / mcp-tool-defs / handlers tests all green (49/49). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: reframe v0.16 durability headline around OpenClaw crashes "Laptop closed mid-run" framing implied a consumer workflow. 
Real pain is OpenClaw subagents dying daily on worker kill, memory blip, or timeout. Headline + README copy match the body now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate llms-full.txt after README copy change Regen drift guard caught the README edit from 83beec4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
254 lines
9.0 KiB
TypeScript
/**
 * plugin-loader tests. Exercise the full path/manifest/validation surface
 * using ephemeral tmp dirs so no repo content is touched.
 */

import { describe, test, expect, beforeAll, afterAll, beforeEach } from 'bun:test';
import * as fs from 'node:fs';
import * as path from 'node:path';
import * as os from 'node:os';
import {
  loadPluginsFromEnv,
  loadSinglePlugin,
  SUPPORTED_PLUGIN_VERSION,
  __testing,
} from '../src/core/minions/plugin-loader.ts';

let tmpRoot: string;

beforeAll(() => {
  tmpRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'plugin-loader-test-'));
});

afterAll(() => {
  fs.rmSync(tmpRoot, { recursive: true, force: true });
});

beforeEach(() => {
  for (const f of fs.readdirSync(tmpRoot)) {
    fs.rmSync(path.join(tmpRoot, f), { recursive: true, force: true });
  }
});

// Helper: build a plugin directory with a manifest + a subagents/ tree.
function writePlugin(
  name: string,
  opts: {
    plugin_version?: string;
    subagents?: Record<string, string>;
    subagents_field?: string;
    omit_manifest?: boolean;
    bad_manifest_json?: boolean;
  } = {},
): string {
  const dir = path.join(tmpRoot, name);
  fs.mkdirSync(dir, { recursive: true });

  if (!opts.omit_manifest) {
    const manifest = {
      name,
      version: '1.0.0',
      plugin_version: opts.plugin_version ?? SUPPORTED_PLUGIN_VERSION,
      ...(opts.subagents_field ? { subagents: opts.subagents_field } : {}),
    };
    fs.writeFileSync(
      path.join(dir, 'gbrain.plugin.json'),
      opts.bad_manifest_json ? '{not valid json' : JSON.stringify(manifest, null, 2),
    );
  }

  if (opts.subagents) {
    const sadir = path.join(dir, opts.subagents_field ?? 'subagents');
    fs.mkdirSync(sadir, { recursive: true });
    for (const [file, content] of Object.entries(opts.subagents)) {
      fs.writeFileSync(path.join(sadir, file), content);
    }
  }

  return dir;
}

describe('path policy', () => {
  test('relative paths rejected', () => {
    expect(__testing.rejectIfNotAbsolute('relative/path')).toMatch(/relative path rejected/);
  });

  test('~-prefixed paths rejected (no implicit expansion)', () => {
    expect(__testing.rejectIfNotAbsolute('~/subagents')).toMatch(/~-prefixed/);
  });

  test('remote URLs rejected', () => {
    expect(__testing.rejectIfNotAbsolute('https://example.com/plugins')).toMatch(/remote URL/);
    expect(__testing.rejectIfNotAbsolute('file:///abs/p')).toMatch(/remote URL/);
  });

  test('absolute POSIX path accepted', () => {
    expect(__testing.rejectIfNotAbsolute('/abs/path')).toBeNull();
  });
});

describe('loadSinglePlugin', () => {
|
|
test('loads a minimal manifest + one subagent def', () => {
|
|
const dir = writePlugin('wintermute', {
|
|
subagents: {
|
|
'meeting-ingestion.md': `---\nname: meeting-ingestion\nmodel: sonnet\n---\n\nYou are a meeting ingester.\n`,
|
|
},
|
|
});
|
|
const res = loadSinglePlugin(dir);
|
|
expect('error' in res).toBe(false);
|
|
if ('error' in res) return;
|
|
expect(res.manifest.name).toBe('wintermute');
|
|
expect(res.subagents.length).toBe(1);
|
|
expect(res.subagents[0]!.name).toBe('meeting-ingestion');
|
|
expect(res.subagents[0]!.body.trim()).toBe('You are a meeting ingester.');
|
|
});
|
|
|
|
test('missing manifest returns error', () => {
|
|
const dir = writePlugin('empty', { omit_manifest: true });
|
|
const res = loadSinglePlugin(dir);
|
|
expect('error' in res).toBe(true);
|
|
if ('error' in res) expect(res.error).toMatch(/missing gbrain\.plugin\.json/);
|
|
});
|
|
|
|
test('invalid manifest JSON returns error', () => {
|
|
const dir = writePlugin('bad-json', { bad_manifest_json: true });
|
|
const res = loadSinglePlugin(dir);
|
|
expect('error' in res).toBe(true);
|
|
if ('error' in res) expect(res.error).toMatch(/invalid manifest JSON/);
|
|
});
|
|
|
|
  test('unsupported plugin_version rejected', () => {
    const dir = writePlugin('future', { plugin_version: 'gbrain-plugin-v999' });
    const res = loadSinglePlugin(dir);
    expect('error' in res).toBe(true);
    if ('error' in res) expect(res.error).toMatch(/unsupported plugin_version/);
  });

  test('escape-attempt subagents field rejected', () => {
    const dir = writePlugin('escape', { subagents_field: '../../../etc' });
    const res = loadSinglePlugin(dir);
    expect('error' in res).toBe(true);
    if ('error' in res) expect(res.error).toMatch(/escapes plugin root/);
  });

  test('falls back to file basename when frontmatter.name is missing', () => {
    const dir = writePlugin('nameless', {
      subagents: {
        'implicit-name.md': `---\nmodel: sonnet\n---\nbody\n`,
      },
    });
    const res = loadSinglePlugin(dir);
    if ('error' in res) throw new Error(res.error);
    expect(res.subagents[0]!.name).toBe('implicit-name');
  });

  test('allowed_tools frontmatter list of strings survives round-trip', () => {
    const dir = writePlugin('tools', {
      subagents: {
        'researcher.md': `---\nname: researcher\nallowed_tools:\n - brain_search\n - brain_get_page\n---\nbody\n`,
      },
    });
    const res = loadSinglePlugin(dir);
    if ('error' in res) throw new Error(res.error);
    expect(res.subagents[0]!.allowed_tools).toEqual(['brain_search', 'brain_get_page']);
  });

  test('allowed_tools referencing unknown tool names fails load', () => {
    const dir = writePlugin('rogue', {
      subagents: {
        'typo.md': `---\nname: typo\nallowed_tools:\n - brain_seerch\n---\nbody\n`,
      },
    });
    const res = loadSinglePlugin(dir, {
      validAgentToolNames: new Set(['brain_search', 'brain_get_page']),
    });
    expect('error' in res).toBe(true);
    if ('error' in res) expect(res.error).toMatch(/unknown tools: brain_seerch/);
  });

  test('validation passes when allowed_tools are all in the registry', () => {
    const dir = writePlugin('clean', {
      subagents: {
        'ok.md': `---\nname: ok\nallowed_tools:\n - brain_search\n---\nbody\n`,
      },
    });
    const res = loadSinglePlugin(dir, {
      validAgentToolNames: new Set(['brain_search']),
    });
    expect('error' in res).toBe(false);
  });

  test('skipping validation (no validAgentToolNames) allows any allowed_tools', () => {
    const dir = writePlugin('no-validate', {
      subagents: {
        'anything.md': `---\nname: anything\nallowed_tools:\n - tool_we_have_not_shipped_yet\n---\nbody\n`,
      },
    });
    const res = loadSinglePlugin(dir);
    expect('error' in res).toBe(false);
  });
});

describe('loadPluginsFromEnv', () => {
  test('empty env returns no plugins, no warnings', () => {
    const r = loadPluginsFromEnv({ envPath: '' });
    expect(r.plugins).toEqual([]);
    expect(r.warnings).toEqual([]);
  });

  test('multi-path: colon-separated PATH loads both', () => {
    const a = writePlugin('a', { subagents: { 'x.md': `---\nname: x\n---\nbody` } });
    const b = writePlugin('b', { subagents: { 'y.md': `---\nname: y\n---\nbody` } });
    const r = loadPluginsFromEnv({ envPath: `${a}:${b}` });
    expect(r.plugins.length).toBe(2);
    expect(r.plugins[0]!.manifest.name).toBe('a');
    expect(r.plugins[1]!.manifest.name).toBe('b');
  });

  test('collision: left-wins with a warning', () => {
    const left = writePlugin('left', { subagents: { 'shared.md': `---\nname: shared\n---\nleft body` } });
    const right = writePlugin('right', { subagents: { 'shared.md': `---\nname: shared\n---\nright body` } });
    const r = loadPluginsFromEnv({ envPath: `${left}:${right}` });
    expect(r.plugins.length).toBe(2);
    // Only the left plugin contributes the `shared` subagent.
    const leftSubs = r.plugins[0]!.subagents.map(s => s.name);
    const rightSubs = r.plugins[1]!.subagents.map(s => s.name);
    expect(leftSubs).toContain('shared');
    expect(rightSubs).not.toContain('shared');
    expect(r.warnings.some(w => /collision.*shared/.test(w))).toBe(true);
  });

  test('non-existent path is warned + skipped', () => {
    const r = loadPluginsFromEnv({ envPath: '/definitely/does/not/exist/here' });
    expect(r.plugins.length).toBe(0);
    expect(r.warnings.some(w => /does not exist/.test(w))).toBe(true);
  });

  test('relative path in env is warned + skipped', () => {
    const r = loadPluginsFromEnv({ envPath: 'relative/dir' });
    expect(r.plugins.length).toBe(0);
    expect(r.warnings.some(w => /relative path rejected/.test(w))).toBe(true);
  });

  test('a file (not a directory) is warned + skipped', () => {
    const file = path.join(tmpRoot, 'not-a-dir.txt');
    fs.writeFileSync(file, 'x');
    const r = loadPluginsFromEnv({ envPath: file });
    expect(r.plugins.length).toBe(0);
    expect(r.warnings.some(w => /not a directory/.test(w))).toBe(true);
  });

  test('trims whitespace around paths', () => {
    const a = writePlugin('trimmed', { subagents: { 'x.md': `---\nname: x\n---\nbody` } });
    const r = loadPluginsFromEnv({ envPath: ` ${a} ` });
    expect(r.plugins.length).toBe(1);
  });

  test('manifest rejection shows up as a warning (not a throw)', () => {
    const bad = writePlugin('futurep', { plugin_version: 'gbrain-plugin-v999' });
    const r = loadPluginsFromEnv({ envPath: bad });
    expect(r.plugins.length).toBe(0);
    expect(r.warnings.some(w => /unsupported plugin_version/.test(w))).toBe(true);
  });
});