gbrain

Author	SHA1	Message	Date
Garry Tan	dcd13dd638	feat: v0.16.4 — gbrain check-resolvable CLI + skillify-check wiring (#325 ) * Merge origin/master into garrytan/check-resolvable-v1 Resolves CHANGELOG.md conflict: preserved v0.16.1/v0.16.2/v0.16.3 upstream entries and added v0.16.4 (check-resolvable ship) above them. * refactor: extract findRepoRoot to src/core/repo-root.ts Moves findRepoRoot() from private in doctor.ts to a zero-dependency shared module with a parameterized startDir for test hermeticity. Doctor imports the shared version; no behavior change (default arg matches prior semantics). The new gbrain check-resolvable CLI needs findRepoRoot too; importing from doctor.ts would drag in DB/progress dependencies. * feat: gbrain check-resolvable CLI wrapper Standalone CLI gate over checkResolvable(). Exits 1 on any issue (warnings or errors) per the README:259 contract, stricter than doctor's resolver_health which ignores warnings. Doctor has 15 other checks to lean on; the standalone command has nowhere to hide. - Stable JSON envelope: {ok, skillsDir, report, autoFix, deferred, error, message} - --fix auto-applies DRY fixes via autoFixDryViolations before re-checking - --dry-run with --fix previews without writing; autoFix.fixed shows diff - --verbose prints the deferred-checks note (Checks 5 + 6) - --skills-dir PATH for hermetic test runs - Permissive on unknown flags, matching lint/orphans/publish convention Checks 5 (trigger routing eval) and 6 (brain filing) are tracked as separate GitHub issues and surfaced via the deferred[] field in --json output. Covered by 17 new test cases (flag parsing, JSON envelope shape, exit-code regression gates, --fix wiring, --verbose output). 
* chore: bump version and changelog (v0.16.4) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore: track check-resolvable issue-URL swap in TODOS Defers the filing of GitHub tracking issues for Checks 5 (trigger routing eval) and 6 (brain filing) plus the TBD-check-5/TBD-check-6 URL replacement in src/commands/check-resolvable.ts. Unblocks merging PR #325. * test: fix repo-root CI failure — assert parity, not path contents The 'default arg uses process.cwd()' test asserted the returned path matched /honolulu/, which is the local workspace name but not the CI runner's checkout path (/home/runner/work/gbrain/gbrain). The test's real purpose is behavioral parity: findRepoRoot() === findRepoRoot(cwd). Assert that directly instead of pattern-matching paths. --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-22 02:07:00 -07:00
Garry Tan	418d955fd3	docs: v0.16.1 — minions worker deployment guide (from #287 ) (#317 ) * docs: v0.16.1 — minions worker deployment guide (from #287) New docs/guides/minions-deployment.md covering persistent worker deploy patterns (watchdog cron, inline --follow for cron-only workloads) plus the sharp edges of running gbrain jobs work against Supabase in production. Addresses a real gap: existing minions docs (minions-fix.md, minions-shell-jobs.md) cover schema repair and shell-job security, not deploy patterns. With v0.16.0's durable agent runtime, the persistent worker is now load-bearing for subagent + subagent_aggregator handlers too, so a supervised deploy story matters. Pre-landing accuracy pass corrected five factual bugs against current source: - max_stalled column default (5, not 1 or 3) - stalled-jobs smoke-test query (active, not waiting) - watchdog SIGTERM-to-SIGKILL grace (10s minimum, not 2s) - cron env pattern (crontab env lines, not source ~/.bashrc) - --follow exit semantics (blocks until submitted job is terminal, not until queue is empty) Docs-only. No code changed. Zero migration required. Contributed by a downstream agent fork via #287. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: credit Wintermute correctly in v0.16.1 CHANGELOG Wintermute is gbrain's own OpenClaw instance running in production, not a community contributor. The original CHANGELOG framing ("community contributor @wintermute") understated the funnier truth: the agent built on top of the project wrote the deploy guide for the project after hitting its sharp edges in production. Dogfooding with extra steps. 
Co-Authored-By: Wintermute (OpenClaw) <noreply@anthropic.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: rewrite minions deployment guide for agent line-by-line execution Fixes 12 findings from reading v0.16.1 guide as-an-agent would: Real bugs: - Crontab syntax wrong for user crontabs (6-field format dumped into `crontab -e` got "bad minute" or parsed `user` as the command). Now two labeled blocks: 5-field for `crontab -e`, 6-field for `/etc/crontab`. - Watchdog restart loop (old shutdown lines in unrotated log re-matched every 5 min forever). New `minion-watchdog.sh` writes 2-line PID file (PID + restart epoch) and only considers log lines newer than the epoch. Regex rewritten explicit (mawk rejects `{n}` intervals). - Credentials in world-readable /etc/crontab. Secrets move to /etc/gbrain.env (mode 600), referenced via BASH_ENV in crontab. Structural: - Preconditions block (5 fail-fast checks). - "Which option?" decision tree. - Template variable table (6 vars documented). - Upgrade section (v0.13.x -> v0.16.2 checklist). - Option 3: systemd.service + Procfile + fly.toml.partial snippets. - Uninstall section. - `--follow` example uses `gbrain embed --stale` (a real command) instead of the fictional `gbrain enrich`. - Dead-end "Proposed CLI flags (not yet implemented)" replaced with a "Tune per-job today" callout pointing at flags that exist. - Known Issues rewritten as imperatives. Also wires `docs/guides/minions-deployment.md` into `scripts/llms-config.ts` under the Configuration section so remote agents fetching llms.txt / llms-full.txt see the guide by name. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.16.2) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: sync v0.16.2 CHANGELOG with the actual --follow example in the guide The shipped docs/guides/minions-deployment.md uses `gbrain embed --stale` (a real command) but the v0.16.2 CHANGELOG entry still referenced `gbrain enrich --brain $GBRAIN_WORKSPACE` (the older draft). Bring the CHANGELOG in line with what actually shipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 00:01:08 -07:00
Garry Tan	0e9f8814a5	feat: v0.16.0 — durable agent runtime (gbrain agent + subagent handler + plugin loader) (#258 ) * refactor(mcp): extract buildToolDefs helper for subagent tool registry reuse The inline operations.map(...) block in src/mcp/server.ts became the only source of truth for agent-facing tool definitions. Extract into a reusable exported helper so the v0.15 subagent tool registry can call it with a filtered OPERATIONS subset instead of duplicating the shape. Byte-for-byte equivalence regression pinned in test/mcp-tool-defs.test.ts — legacy inline mapping kept verbatim inside the test so any future drift between the new helper and the pre-extraction MCP schema fails loudly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(operations): subagent-aware OperationContext + put_page namespace Adds three optional fields to OperationContext: - jobId?: number — the currently running Minion job id - subagentId?: number — the owning subagent job id for tool-dispatched calls - viaSubagent?: boolean — FAIL-CLOSED flag for agent-path gating put_page now enforces a namespace rule when invoked on the subagent tool dispatch path (viaSubagent=true): writes MUST target `wiki/agents/<subagentId>/...`. Anchored, slash-boundary enforced so a collision like `wiki/agents/12evil/...` can't impersonate subagent 12. The check runs BEFORE the dry-run short-circuit so preview calls surface the same rejection. Fail-closed: a missing subagentId with viaSubagent=true rejects every slug rather than letting a dispatcher bug open a hole. Existing callers unaffected — all three fields are optional and the legacy put_page behavior is unchanged when viaSubagent is undefined/false. 
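The namespace rule described above can be sketched as a pure predicate. This is a minimal illustration, not the actual put_page code: `WriteContext` and `isSlugAllowed` are hypothetical names, and the real operation reports a `permission_denied` error rather than returning a boolean.

```typescript
// Hypothetical stand-in for the optional OperationContext fields named above.
interface WriteContext {
  viaSubagent?: boolean;
  subagentId?: number;
}

// Returns true when a write to `slug` would be accepted under the rule.
function isSlugAllowed(slug: string, ctx: WriteContext): boolean {
  // Legacy callers (local CLI, MCP) are not on the subagent path: no rule applies.
  if (!ctx.viaSubagent) return true;

  // Fail closed: a missing or NaN subagentId rejects every slug.
  const id = ctx.subagentId;
  if (id === undefined || !Number.isInteger(id)) return false;

  // Anchored prefix with a trailing slash, so `wiki/agents/12evil/...`
  // cannot impersonate subagent 12. Something must follow the prefix.
  const prefix = `wiki/agents/${id}/`;
  return slug.startsWith(prefix) && slug.length > prefix.length;
}
```

Including the trailing slash in the prefix is what enforces the slash boundary; a leading-slash slug fails because the match is anchored at the start of the string.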
12 regression + namespace tests pin: - local CLI writes (viaSubagent unset) accept arbitrary slugs - MCP writes (remote=true, viaSubagent unset) accept arbitrary slugs - subagent-path: anchored prefix accepted, wrong id rejected, prefix-collision defeated, leading-slash rejected, bare-prefix rejected, fail-closed on missing/NaN subagentId, permission_denied code emitted Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(schema): v0.15.0 subagent runtime tables + migration orchestrator Adds three new tables for the durable LLM agent runtime: subagent_messages — Anthropic message-block persistence. Parallel tool_use blocks in one assistant message live in content_blocks JSONB, not across rows (fixes the (job_id, turn_idx, role) misdesign codex caught in v0.13 drafting). subagent_tool_executions — Two-phase tool ledger. INSERT pending before execute, UPDATE complete/failed after. Replay re-runs pending rows only if the tool is idempotent (v1 ships only idempotent tools so this is preventive). subagent_rate_leases — Lease-based concurrency cap for outbound providers (e.g. anthropic:messages). Stale leases auto-prune on next acquire so crashed workers can't strand capacity. All DDL uses CREATE TABLE/INDEX IF NOT EXISTS — order-independent vs PR #244's initSchema() reorder, and idempotent across fresh-install + upgrade paths. Shipped in both src/schema.sql (Postgres) and src/core/pglite-schema.ts (PGLite); schema-embedded.ts regenerated. Migration orchestrator v0_15_0.ts (phases: schema → verify → record). v0_14_0.ts is a no-op stub so the registry's version sequence stays gapless (v0.14.0 shipped shell-jobs — code change, no DB migration). 10 unit tests for registry wiring, ordering, dry-run phase behavior, and schema-embedded table presence. test/apply-migrations.test.ts updated for the two new registry entries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): emit child_done on every terminal + max_stalled per-job + terminal set fix Three correctness fixes the v0.15 subagent aggregator spine depends on: 1. child_done emission on ALL terminal transitions, not just success. - completeJob already emitted on success — now also tags outcome='complete'. - failJob newly emits on terminal 'failed' or 'dead' (outcome='failed'|'dead', error=<text>), BEFORE the parent-terminal UPDATE so the EXISTS guard on the inbox INSERT doesn't skip it on fail_parent paths (codex catch). - cancelJob now emits outcome='cancelled' per descendant with a parent. - handleTimeouts now emits outcome='timeout' per timed-out child. ChildDoneMessage gains optional { outcome, error } — backwards compatible (legacy writers omitted them; consumers treat absent outcome as 'complete'). 2. Parent-resolution terminal set now includes 'failed'. Pre-v0.15 the `NOT EXISTS (... status NOT IN ('completed','dead','cancelled'))` guard treated a failed child as still-pending, stranding aggregator parents that chose on_child_fail='continue' or 'ignore' in waiting-children forever. Expanded to {completed, failed, dead, cancelled} everywhere parent resolution reads child status (completeJob inline, failJob remove_dep + continue, cancelJob sweep, handleTimeouts sweep, and the resolveParent method itself). 3. MinionJobInput.max_stalled threads through MinionQueue.add() on INSERT. Column exists with default 1 — that is "first stall → dead", which defeats crash recovery for long-running handlers. Subagent children will set max_stalled: 3 to survive mid-run worker kills. Second-submitter under an idempotency-key hit does NOT mutate the existing row (codex-flagged footgun — first-submit options are load-bearing state).
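The terminal-set fix in item 2 can be illustrated in memory. The sketch below is an assumption-laden model of the SQL guard, not the queue code itself: `allChildrenTerminal` is a hypothetical helper, and only the status names come from the commit message.

```typescript
// Terminal set before the fix: a 'failed' child counted as still pending.
const PRE_FIX_TERMINAL: ReadonlySet<string> = new Set(["completed", "dead", "cancelled"]);

// Terminal set after the fix, as listed in the commit message.
const TERMINAL: ReadonlySet<string> = new Set(["completed", "failed", "dead", "cancelled"]);

// Hypothetical helper mirroring the NOT EXISTS guard: a parent may resolve
// only when no child has a status outside the terminal set.
function allChildrenTerminal(childStatuses: string[], terminal: ReadonlySet<string>): boolean {
  return childStatuses.every((status) => terminal.has(status));
}

// One child completed, one failed, parent chose on_child_fail='continue'.
const children = ["completed", "failed"];
const strandedBefore = !allChildrenTerminal(children, PRE_FIX_TERMINAL);
const resolvesAfter = allChildrenTerminal(children, TERMINAL);
```

With the old set the parent never leaves waiting-children, because the failed child is indistinguishable from one that is still running.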
13 unit tests pin: emission on each of completeJob/failJob/cancelJob/handleTimeouts, insertion order on fail_parent, terminal-set expansion with continue policy, max_stalled default + override + idempotency behavior. E2E tier 1 (Postgres) passes 141 tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): rate-leases + waitForCompletion infra for v0.15 subagent Two infrastructure modules the subagent handler spine depends on: rate-leases.ts — lease-based concurrency cap for outbound providers (anthropic:messages, openai:, etc.). Counter-based limiters leak capacity on worker crash; leases are owner-tagged rows with expires_at that auto-prune on the next acquire. Two-phase: txn-scoped pg_advisory_xact_lock guards the check-then-insert so concurrent acquires can't both win the "last slot". renewLeaseWithBackoff retries 3x (250/500/1000ms) for mid-call DB blips — on persistent failure the LLM-loop caller aborts with a renewable error so the worker re-claims and the rate invariant is preserved. Owner FK cascades clean up leases on job deletion. wait-for-completion.ts — poll-until-terminal helper for CLI callers. Minions' NOTIFY is worker-side only; `gbrain agent run --follow` polls getJob() until status is {completed, failed, dead, cancelled}. TimeoutError carries jobId + elapsedMs and does NOT cancel the job — the user can inspect via `gbrain jobs get <id>` later. Supports AbortSignal for Ctrl-C without throwing. Default pollMs is 1000 on Postgres, 250 on PGLite (inline CLI has no network RTT). 21 unit tests cover: single/multi acquire under cap, rejection past cap, release frees slot, different keys are independent, stale prune, cascade on owner delete, renew bumps expires_at, renew on missing is false, backoff path success + pruned short-circuit. waitForCompletion: fast-path terminal, transitions mid-wait (completed/failed/cancelled), TimeoutError shape, abort-signal early exit, non-existent job error.
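Why leases recover capacity where counters leak it can be shown with a small in-memory model. This is a sketch under stated assumptions: the real module stores leases as Postgres rows behind an advisory lock, while `InMemoryLeases` and its method signatures are hypothetical and take the clock as an explicit argument to stay deterministic.

```typescript
interface Lease {
  owner: number;     // job id that holds the slot
  expiresAt: number; // epoch ms after which the lease counts as stale
}

// Hypothetical in-memory stand-in for the lease table.
class InMemoryLeases {
  private leases = new Map<string, Lease[]>();
  private maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Prune stale leases first, then take a slot if one is free.
  acquire(key: string, owner: number, ttlMs: number, now: number): boolean {
    const live = (this.leases.get(key) ?? []).filter((l) => l.expiresAt > now);
    const acquired = live.length < this.maxConcurrent;
    if (acquired) live.push({ owner, expiresAt: now + ttlMs });
    this.leases.set(key, live);
    return acquired;
  }

  // Explicit release on the success and error paths.
  release(key: string, owner: number): void {
    const remaining = (this.leases.get(key) ?? []).filter((l) => l.owner !== owner);
    this.leases.set(key, remaining);
  }
}
```

A worker that crashes never calls `release`, but its lease simply ages out and is pruned by the next `acquire`, so the cap cannot be stranded the way an undecremented counter would strand it.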
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> feat(minions): subagent ToolDef types + brain-tool registry (v0.15) Types first so the handler has a stable contract: - SubagentHandlerData / AggregatorHandlerData — the two job.data shapes - ToolCtx (engine, jobId, remote, signal) + ToolDef (name, description, input_schema, idempotent, execute) — Anthropic-envelope, distinct from the MCP McpToolDef extraction landed earlier - ContentBlock discriminated union for subagent_messages.content_blocks - SubagentStopReason + SubagentResult emitted on terminal completion brain-allowlist.ts derives one ToolDef per allow-listed OPERATION. Reuses the ParamDef → JSONSchema shape from the MCP extraction in a local helper (Anthropic's input_schema field diverges from MCP's inputSchema by a character). The 11-name allow-list is read-safe + put_page — every destructive / filesystem / identity-mutating op stays off by default. put_page gets a namespace-wrapped tool schema: `slug` pattern = anchored `^wiki/agents/<subagentId>/.+`. The server-side check in put_page op (shipped in prior commit) is still the authoritative gate — the schema just helps the model write correct slugs first-try. `subagentId` is plumbed into the ToolCtx so the viaSubagent=true fail-closed path lights up on every tool-dispatched put_page. filterAllowedTools narrows a registry by subagent_def's allowed_tools frontmatter field. Rejects unknown names at load time (no silent drop — typos in a skills/subagents/.md would otherwise ship to prod with a tool silently missing). 18 tests pin: every allowlist name exists in OPERATIONS (catches upstream rename), Anthropic name regex, put_page namespace pattern per-subagent, execute() routes through the op handler with viaSubagent=true, out-of-namespace put_page throws permission_denied, filter passes prefixed + unprefixed names, rejects unknowns, deduplicates.
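The "reject unknown names at load time, no silent drop" behavior can be sketched as follows. This is an illustrative approximation only: `ToolDefSketch` and `filterAllowedToolsSketch` are hypothetical names, the prefixed-name handling mentioned in the test list is omitted because the prefix is not given here, and `example_read` is an invented tool name.

```typescript
// Reduced, hypothetical version of the ToolDef shape (no input_schema / execute).
interface ToolDefSketch {
  name: string;
  description: string;
  idempotent: boolean;
}

// Narrow a registry to the names a subagent def lists in allowed_tools.
// Unknown names throw instead of being dropped, so a typo fails loudly.
function filterAllowedToolsSketch(registry: ToolDefSketch[], allowed: string[]): ToolDefSketch[] {
  const byName = new Map<string, ToolDefSketch>();
  for (const tool of registry) byName.set(tool.name, tool);

  const unknown = allowed.filter((name) => !byName.has(name));
  if (unknown.length > 0) {
    throw new Error(`unknown tool name(s): ${unknown.join(", ")}`);
  }

  // Deduplicate while preserving first-seen order.
  const unique = Array.from(new Set(allowed));
  return unique.map((name) => byName.get(name) as ToolDefSketch);
}

const registrySketch: ToolDefSketch[] = [
  { name: "put_page", description: "namespace-scoped write", idempotent: true },
  { name: "example_read", description: "hypothetical read-only tool", idempotent: true },
];
```

Throwing on the first validation pass, before any tool is returned, is what turns a frontmatter typo into a startup error rather than a tool that silently never fires.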
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> feat(minions): subagent-audit JSONL + transcript renderer Two small plumbing pieces the v0.15 subagent handler + `gbrain agent logs` depend on: subagent-audit.ts — JSONL-rotated audit log mirroring the shell-audit pattern. Two event flavors: submission (one line per job submit) and heartbeat (one line per turn boundary — llm_call_started / completed / tool_called / tool_result / tool_failed). Heartbeats fix the "--follow on a long Anthropic call shows nothing for 30 seconds" problem codex flagged. Never logs prompts or tool inputs (PII risk — subagent input_vars may carry user-supplied free text); DOES log tokens, ms_elapsed, tool_name, first 200 chars of error text. Rotates weekly via ISO week. `readSubagentAuditForJob` is the readback path for `gbrain agent logs` — scans the current + prior week file so job boundaries across weeks still resolve. `GBRAIN_AUDIT_DIR` overrides the default ~/.gbrain/audit/ for container deploys. transcript.ts — renders subagent_messages + subagent_tool_executions to markdown. Message order is authoritative; tool rows splice under their owning assistant tool_use by tool_use_id. Handles text, tool_use (with pending / complete / failed execution rows), tool_result (skipped if we already rendered the owning tool_use — avoids double-printing), and unknown block types (fenced JSON dump for diagnostics). Output is UTF-8-safe truncated at maxOutputBytes. 21 unit tests: ISO week filename rotation (incl. 2027-01-01 → W53-2026 boundary), submission + heartbeat write shapes, 200-char error cap, best-effort write failure doesn't throw, readback filters by job_id and sinceIso. Transcript: empty input, ordering, token line, tool_use + complete/failed/pending execution rendering, truncation, unknown-block diagnostic dump.
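The 2027-01-01 → W53-2026 boundary mentioned in the test list follows from the ISO 8601 rule that a week belongs to the year containing its Thursday. A minimal sketch of that calculation, assuming UTC dates; `isoWeek` is a hypothetical helper, not necessarily how subagent-audit.ts computes it:

```typescript
// Returns the ISO 8601 week-numbering year and week for a date (UTC).
function isoWeek(date: Date): { year: number; week: number } {
  // Copy the calendar day at UTC midnight so the input is not mutated.
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));

  // ISO weekdays run Monday=1 .. Sunday=7; getUTCDay() reports Sunday as 0.
  const weekday = d.getUTCDay() || 7;

  // Move to the Thursday of the same ISO week; its year is the ISO year.
  d.setUTCDate(d.getUTCDate() + 4 - weekday);
  const year = d.getUTCFullYear();

  // Week number = how many Thursdays have occurred in that year so far.
  const yearStart = Date.UTC(year, 0, 1);
  const dayOfYear = (d.getTime() - yearStart) / 86_400_000 + 1;
  return { year, week: Math.ceil(dayOfYear / 7) };
}

const boundary = isoWeek(new Date("2027-01-01T00:00:00Z"));
// boundary is { year: 2026, week: 53 }
```

Because 1 January 2027 is a Friday, its week's Thursday is 31 December 2026, so an event written that day still lands in the 2026 week-53 file.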
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent LLM-loop handler with crash-resumable replay The main event: runs one Anthropic Messages API conversation with tool use, persists every turn + tool execution, and resumes cleanly after a worker kill anywhere in the loop. Design points that carry the v0.15 guarantees: 1. Two-phase tool persistence. INSERT status='pending' before dispatch, UPDATE to 'complete' or 'failed' after. subagent_messages rows are the canonical conversation; subagent_tool_executions rows are the canonical "did this tool run + what did it return". Either DB commit is atomic, so replay has a single source of truth. 2. Replay reconciliation. If the last persisted message is an assistant with tool_use blocks AND no following synthesized user message, we crashed mid-dispatch. On resume, finish those tools first (respecting idempotent flag for 'pending' rows), synthesize the user turn, and THEN call the LLM again. Non-idempotent pending rows abort the job with a clear error — v0.15 ships only idempotent tools so this is preventive. 3. Rate lease around every LLM call. acquireLease before, releaseLease after (both success and error paths). acquired=false throws RateLeaseUnavailableError — the worker treats it as a renewable error and re-claims later, so a temporary capacity cap doesn't fail the job terminally. 4. Anthropic prompt caching. system block gets cache_control=ephemeral; the LAST tool def gets it too (Anthropic caches everything up to and including the marked block). ~10x cost reduction on multi-turn agents per the plan. 5. Dual-signal abort. AbortSignal.any merges ctx.signal (timeout / lock loss / cancel) with ctx.shutdownSignal (worker SIGTERM). Both feed the Anthropic call's AbortSignal; mid-turn abort bails before the next LLM call with whatever turns are already persisted. Node ≥ 20 has AbortSignal.any; older runtimes get a manual-merge polyfill. 6. Injectable Anthropic client. 
The real SDK implements MessagesClient structurally; tests inject a FakeMessagesClient that scripts responses. 12 unit tests pin: no-tool happy path, single tool_use complete, tool throws → failed row + loop continues, unknown tool name rejection, max_turns cap, crash-then-resume with partial state, replay skips already-complete tool execs without re-invoking execute, non-idempotent pending rejects on resume, lease acquire + release roundtrip, RateLeaseUnavailable under cap-full, missing prompt validation, allowed_tools unknown-name. NOT in v0.15: refusal detection (stop_reason + content shape), stop_reason=max_tokens partial recovery, mid-call lease renewal with backoff loop. All three are documented as P2 items in the plan file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): subagent_aggregator handler with mixed-outcome rendering Claims AFTER all subagent children resolve — by then Lane 1B's queue changes have posted one child_done message per terminal transition into this job's inbox (complete / failed / dead / cancelled / timeout). The aggregator reads those, builds a deterministic markdown summary, and returns it as the handler result. Not an LLM call in v0.15 — output is reproducible concatenation so fan-out runs stay comparable. v0.16+ can add an LLM synthesis pass behind an opt-in flag.
Contract: - empty children_ids → `(no children)` marker - missing child_done (shouldn't happen under v0.15 invariants but possible if a terminal-state path slipped past Lane 1B) → counted as failed with "no child_done message observed" error - non-complete outcomes: result is null in the output so no payload leaks alongside a failure label - children appear in the order children_ids was supplied - custom aggregate_prompt_template replaces the markdown header 13 unit tests cover: empty input, all-success, mixed outcomes, result suppression on failure, missing child_done handling, order preservation, custom template, progress + log emission, stringified JSONB payload parsing, non-child_done inbox filtering, legacy-writer outcome fallback, and internal helper edges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): GBRAIN_PLUGIN_PATH loader + plugin-authors guide (v0.15) Plumbing that makes Wintermute (and future downstream agents) day-1 usable on v0.15. Host repos drop a `gbrain.plugin.json` + `subagents/` directory somewhere, set GBRAIN_PLUGIN_PATH (colon-separated like $PATH), and their custom subagent defs load at worker startup. Path policy is strict: absolute paths only. Relative, ~-prefixed, and URL-style (https://, file://) all rejected with warnings — the user controls where plugins live. Non-existent paths and files (not dirs) are warned and skipped so a typo doesn't crash worker startup. Collision policy: left-wins. If two plugins ship a subagent with the same name, the first one in GBRAIN_PLUGIN_PATH keeps it and the other gets a warning naming both sources. Deterministic + debuggable. Trust policy: plugins ship subagent defs ONLY. Cannot declare new tools, cannot extend the brain allow-list, cannot override safety flags.
The subagent def's `allowed_tools:` frontmatter MUST subset the derived registry — validation happens at load time (worker startup), not at dispatch time, so a typo in a skill gives a loud startup error instead of silently "tool never fires at 3am." Manifest `plugin_version: "gbrain-plugin-v1"` locks the contract. Unknown versions rejected. `subagents` field escape attempts (`../../../etc` etc) rejected. gray-matter handles the markdown frontmatter parse — subagent defs don't conform to the page schema, so we don't use parseMarkdown. docs/guides/plugin-authors.md is the Wintermute-facing walkthrough. Covers the minimum viable plugin shape, the three policies, the frontmatter fields, known caveats (audit JSONL is local-only, tool calls always run remote=true, put_page is namespace-scoped). 22 unit tests pin path rejection, missing/invalid manifest, unsupported version, escape-attempt, basename fallback for missing frontmatter.name, allowed_tools round-trip, unknown-tool rejection with validAgentToolNames, empty env, multi-path, collision warning with left-wins, trimmed paths, manifest-rejection as warning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): gbrain agent run + logs + worker registration (v0.15 Lane 4H) Three integration seams wired: src/commands/agent.ts — `gbrain agent run`. Submits subagent jobs (or a fan-out of N + aggregator) under the trusted-submit flag so the PROTECTED_JOB_NAMES guard doesn't reject. Fan-out path creates the aggregator first (so children can reference its id as parent), submits each child with on_child_fail='continue' (required by Lane 1B's terminal-set + child_done machinery), then jsonb_set's the aggregator's children_ids. Short-circuits a 1-entry manifest to a single subagent with no aggregator. Follow mode runs agent-logs streaming + waitForCompletion in parallel and exits on terminal status; detach prints the job id and exits.
Ctrl-C is handled as detach, not cancel — the job keeps running, consistent with durability invariants. src/commands/agent-logs.ts — `gbrain agent logs`. Merges ~/.gbrain/audit/subagent-jobs-.jsonl (heartbeats + submissions) with subagent_messages (persisted conversation) in one chronological stream. --follow polls at 1s and exits when the job hits terminal. --since accepts ISO-8601 OR relative shorthand (5m / 1h / 2d). Writes transcript tail (full message + tool tree) only for terminal jobs, so mid-run --follow doesn't spam a half-rendered transcript. src/commands/jobs.ts registerBuiltinHandlers — matches the shell-handler opt-in shape. GBRAIN_ALLOW_LLM_JOBS=1 registers the subagent + subagent_aggregator handlers, then loads plugins from GBRAIN_PLUGIN_PATH with validAgentToolNames pulled from BRAIN_TOOL_ALLOWLIST. Every plugin warning + loaded-plugin line prints to stderr, mirroring the openclaw-seam startup convention. src/core/minions/protected-names.ts — subagent + subagent_aggregator join the protected set. MCP submit_job returns permission_denied; only trusted-CLI callers (with allowProtectedSubmit) can insert these rows. src/cli.ts — adds 'agent' to CLI_ONLY + dispatches it like 'jobs'. Test fallout: subagent-handler.test.ts + subagent-transcript.test.ts helpers now submit under allowProtectedSubmit (they insert rows named 'subagent' directly against the queue). 23 new tests in agent-cli.test.ts cover: flag parsing (including --detach implies !follow, --tools comma split, -- terminator, unknown flag throw), --since parse (ISO, relative 5m/2h/1d, unparseable error), protected-name guard for all three names, trusted-submit gate, and a fan-out integration check that verifies the aggregator + children shape after --fanout-manifest.
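The `--since` behavior described above (ISO-8601 or relative shorthand such as 5m / 1h / 2d, with an error on anything else) could look roughly like this. A sketch only: `parseSince` is a hypothetical name, and the real parser in agent-logs.ts may accept more units or stricter ISO forms.

```typescript
// Milliseconds per supported shorthand unit (minutes, hours, days).
const UNIT_MS: Record<string, number> = { m: 60_000, h: 3_600_000, d: 86_400_000 };

// Resolve a --since value to an absolute instant.
// `now` is injectable so the relative branch is deterministic in tests.
function parseSince(input: string, now: Date = new Date()): Date {
  const value = input.trim();

  // Relative shorthand: an integer followed by m, h, or d.
  const relative = /^(\d+)([mhd])$/.exec(value);
  if (relative) {
    const amount = Number(relative[1]);
    const unit = UNIT_MS[relative[2]];
    return new Date(now.getTime() - amount * unit);
  }

  // Otherwise treat the value as an ISO-8601 timestamp.
  const parsed = Date.parse(value);
  if (Number.isNaN(parsed)) {
    throw new Error(`unparseable --since value: ${input}`);
  }
  return new Date(parsed);
}
```

Checking the shorthand pattern before calling `Date.parse` keeps a value like `5m` from ever reaching the engine's more lenient date parsing.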
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> test(e2e): rename max_children test's spawned jobs off the protected 'subagent' name The spawn-storm test submitted 50 literal-string 'subagent' children to exercise the max_children row-lock serialization. In v0.15 'subagent' is a PROTECTED_JOB_NAME (CLI-only; trusted submit required), so the old literal submission now throws before reaching the row-lock check. The test is about max_children semantics, not the v0.15 subagent runtime specifically — rename the child name to 'child_worker' so the test exercises the exact same queue.add path without tripping the new guard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(ship): v0.15.0 — VERSION, CHANGELOG, README, upgrading-agents, CLAUDE.md Bumps VERSION → 0.15.0 and package.json → 0.15.0 (resolves the pre-existing drift — on master, VERSION=0.14.0 but package.json=0.13.1; src/version.ts reads package.json, so this is what the binary prints now). CHANGELOG lands the release-summary entry in the GStack voice + the full itemized change list (11 new modules, 3 new tables, queue correctness fixes, trust-model additions, 159 new unit tests). Voice rules respected — no em dashes, no AI vocabulary, real file names + real numbers. README gets a "Durable agents: `gbrain agent` (v0.15)" section next to the Minions block, with the three canonical CLI shapes (single run, fanout-manifest, logs --follow) and a pointer to plugin-authors.md. docs/UPGRADING_DOWNSTREAM_AGENTS.md gets a full v0.15.0 section covering the four adoption steps downstream agents (Wintermute and similar) need: (1) worker opt-in via GBRAIN_ALLOW_LLM_JOBS, (2) moving custom subagent defs to a plugin repo, (3) replacing ephemeral subagent runs with durable `gbrain agent run`, (4) the put_page namespace rule for agent-driven writes. 
CLAUDE.md updated with concise per-file descriptions for every new module: the handler, aggregator, audit, rate-leases, wait-for-completion, transcript, plugin-loader, brain-allowlist, tool-defs extraction, agent CLI + logs CLI, and the registerBuiltinHandlers wiring for subagent handlers + plugin-loader. Verified: binary builds (940 modules, 89ms compile), prints `gbrain 0.15.0`, `gbrain agent --help` shows the new subcommand shape. 170 new tests pass (full v0.15 surface). Full unit suite passes bar one parallel-load flake on a pre-existing E2E (graph-quality, passes in isolation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): drop GBRAIN_ALLOW_LLM_JOBS flag — subagent handlers always-on The env flag was ceremony. Shell jobs need the flag because they execute arbitrary CLI commands (RCE surface). Subagent jobs don't — they call the Anthropic API with whatever ANTHROPIC_API_KEY is in env, so the key is already the cost gate (no key → SDK fails on the first turn). And who-can-submit is already protected by PROTECTED_JOB_NAMES + TrustedSubmitOpts: MCP callers get permission_denied; only `gbrain agent run` with allowProtectedSubmit can insert subagent / subagent_aggregator rows. The flag added nothing the existing guards didn't already give us. registerBuiltinHandlers now always registers subagent + subagent_aggregator and loads GBRAIN_PLUGIN_PATH plugins. Worker startup prints: [minion worker] subagent handlers enabled instead of the conditional enabled/disabled pair. Plugin discovery runs unconditionally — empty PATH is a no-op. README, CHANGELOG, docs/UPGRADING_DOWNSTREAM_AGENTS, CLAUDE.md, agent CLI help text, and subagent handler docstring all updated to drop the flag reference. Shell handler's GBRAIN_ALLOW_SHELL_JOBS gate is untouched — separate concern (RCE, not billing). Full suite: 1859 pass, 0 fail. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: scrub private agent-fork name from all public artifacts Enforces the rule added to CLAUDE.md (privacy section): never say `Wintermute` in any CHANGELOG, README, doc, PR, or commit message. Reader-facing copy says `your OpenClaw` (the term covers every downstream OpenClaw deployment — Wintermute, Hermes, AlphaClaw — in one umbrella the reader already recognizes). First-person / origin-story copy says `Garry's OpenClaw` (honest that this is the production deployment driving the feature, without exposing the private agent's name). Swept across: CHANGELOG.md (v0.15 entry + 4 historical mentions) README.md TODOS.md docs/UPGRADING_DOWNSTREAM_AGENTS.md docs/guides/plugin-authors.md (including example plugin names) docs/guides/plugin-handlers.md docs/guides/minions-fix.md docs/designs/KNOWLEDGE_RUNTIME.md (27 refs, mostly analytical) docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md skills/migrations/v0.11.0.md skills/skillpack-check/SKILL.md scripts/skillify-check.ts src/commands/doctor.ts src/commands/migrations/v0_15_0.ts src/commands/skillpack-check.ts src/core/enrichment/completeness.ts src/core/minions/plugin-loader.ts src/core/operations.ts src/core/output/scaffold.ts Intentionally kept (these mentions define/test the rule itself): CLAUDE.md — the privacy rule section necessarily uses the literal name to define the restriction and examples test/plugin-loader.test.ts — fixture name in a plugin-loading test; renaming risks breaking assertion logic test/integrations.test.ts — the word appears in a privacy-regex test that explicitly enforces name redaction test/doctor-minions-check.test.ts — a comment referencing the rule CEO plan artifact at ~/.gstack/projects/… — private, not distributed Binary builds (941 modules), 198/198 relevant tests pass, `gbrain --version` prints `0.15.0`. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: gitignore bun --compile artifacts with a glob, not specific hashes Each `bun build --compile` emits a fresh hash-named `.-.bun-build` file in cwd. The prior entries listed two specific hashes that were already stale, so every build after those created a new untracked file requiring manual cleanup. Replace the two stale entries with `*.bun-build` so any current or future compile artifact is ignored automatically. Verified: ran `bun build --compile`, got two new `.-.bun-build` files, `git status` stays clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore(ship): rename v0.15.0 → v0.16.0 gbrain master is at 0.14.2. Other 0.15.x PRs may land before/after this one — we bump the minor (new capability) and lock to 0.16.0 so ordering with concurrent work doesn't matter. Touches: - VERSION: 0.15.0 → 0.16.0 - package.json: 0.15.0 → 0.16.0 - Rename src/commands/migrations/v0_15_0.ts → v0_16_0.ts (+ all version strings inside + import in index.ts registry) - Rename test/migrations-v0_15_0.test.ts → migrations-v0_16_0.test.ts - test/apply-migrations.test.ts: skippedFuture lists now reference '0.16.0' - test/put-page-namespace.test.ts + test/mcp-tool-defs.test.ts: Lane comment refs updated - src/schema.sql + src/core/pglite-schema.ts: "v0.15.0" section comment updated; src/core/schema-embedded.ts regenerated - CHANGELOG.md: top entry renamed to [0.16.0]; inline v0_15_0 / v0.15.0 refs swept - docs/UPGRADING_DOWNSTREAM_AGENTS.md: section heading v0.15.0 → v0.16.0 Verified: `gbrain --version` prints 0.16.0, migration registry / buildPlan / put_page / mcp-tool-defs / handlers tests all green (49/49). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: reframe v0.16 durability headline around OpenClaw crashes "Laptop closed mid-run" framing implied a consumer workflow. Real pain is OpenClaw subagents dying daily on worker kill, memory blip, or timeout. 
Headline + README copy match the body now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate llms-full.txt after README copy change Regen drift guard caught the README edit from 83beec4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:14:17 -07:00
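The submit-side guard that the subagent-handlers commit above relies on (PROTECTED_JOB_NAMES plus a trusted-submit option) can be sketched roughly as follows. This is an illustration under assumptions, not the repo's code: `assertCanSubmit`, the `PermissionDeniedError` class, and the exact shape of `TrustedSubmitOpts` are hypothetical; only the names `PROTECTED_JOB_NAMES`, `TrustedSubmitOpts`, `allowProtectedSubmit`, the two job names, and the `permission_denied` outcome come from the commit message.

```typescript
// Hypothetical sketch: job names that only a trusted caller may submit.
// The two entries are the ones the commit message names; the real set may differ.
const PROTECTED_JOB_NAMES: ReadonlySet<string> = new Set([
  "subagent",
  "subagent_aggregator",
]);

// Assumed option shape: only the flag name comes from the commit message.
interface TrustedSubmitOpts {
  allowProtectedSubmit?: boolean;
}

// Hypothetical error type carrying the permission_denied code.
class PermissionDeniedError extends Error {
  readonly code = "permission_denied";
  constructor(jobName: string) {
    super(`permission_denied: '${jobName}' is a protected job name`);
    this.name = "PermissionDeniedError";
  }
}

// Throws unless the job name is unprotected or the caller opted in explicitly.
function assertCanSubmit(jobName: string, opts: TrustedSubmitOpts = {}): void {
  if (PROTECTED_JOB_NAMES.has(jobName) && opts.allowProtectedSubmit !== true) {
    throw new PermissionDeniedError(jobName);
  }
}
```

In this sketch an MCP-style caller passes no options and is rejected for `subagent`, while a trusted CLI path passes `{ allowProtectedSubmit: true }`. The guard is independent of any environment flag, which matches the commit's argument that the flag added nothing on top of it.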
Garry Tan	a4df40fe5c	feat: v0.15.2 - bulk-action progress streaming (stderr reporter, agent-visible heartbeats) (#293 ) * feat(progress): step 1 - shared ProgressReporter + CliOptions Adds the foundation for v0.14.2's bulk-action progress streaming work: - src/core/progress.ts: dependency-free reporter with auto/human/json/quiet modes, TTY-aware rendering, time+item rate gating, heartbeat helper for slow single queries, dot-composed child phases, EPIPE defense (both sync throw and async 'error' event), and a singleton module-level signal coordinator so SIGINT/SIGTERM emits abort events for all live phases without leaking per-instance listeners. - src/core/cli-options.ts: parseGlobalFlags() for --quiet / --progress-json / --progress-interval=<ms> (both space and = forms), plus cliOptsToProgressOptions() that resolves to the right mode. Non-TTY default is human-plain one-line-per-event; JSON is explicit opt-in so shell pipelines don't suddenly see structured noise. - test/progress.test.ts (17 cases): mode resolution, rate gating, no-fake- totals on heartbeat paths, EPIPE paths, SIGINT singleton, child phase composition. - test/cli-options.test.ts (14 cases): flag parsing, invalid values, interleaved flags, mode resolution. Follow-ups wire doctor/embed/files/export/extract/import/sync/migrate/ repair-jsonb/backlinks/orphans/lint/integrity/eval/autopilot/jobs plus the apply-migrations orchestrators through this reporter, and route Minion handlers to job.updateProgress instead of stderr. See the plan in ~/.claude/plans/. 1682 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 2 - wire global flags into cli.ts Parse --quiet / --progress-json / --progress-interval from argv BEFORE command dispatch, strip them, stash resolved CliOptions on a module-level singleton (same pattern as Commander's program.opts()) and on every OperationContext created for shared-op dispatch. 
- src/cli.ts: parseGlobalFlags(rawArgs) at the top of main(); setCliOptions once; dispatch sees only the stripped argv. Fixes the "gbrain --progress-json doctor" unknown-command case that Codex flagged. - src/core/cli-options.ts: expose setCliOptions/getCliOptions/ _resetCliOptionsForTest singleton. Commands that want progress call getCliOptions() to construct their reporter. - src/core/operations.ts: OperationContext gains optional cliOpts field so shared-op handlers (and MCP-invoked ops that need a reporter) can read the same settings. MCP callers leave it undefined and consumers default to quiet. - test/cli-options.test.ts: +4 cases covering singleton round-trip and an integration smoke spawning `bun src/cli.ts --progress-json --version` to prove the global flag survives dispatch. 45 relevant unit tests pass (progress + cli-options + cli.test.ts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 3a - doctor + orphans heartbeat streaming Doctor on a 52K-page brain used to sit silent for 10+ minutes while the DB checks ran, then get killed by an agent timeout. Wired through the new reporter so agents see which check is running and the slow ones heartbeat every second. doctor.ts: - Start a single `doctor.db_checks` phase around the DB section, with a per-check heartbeat before each step (connection, pgvector, rls, schema_version, embeddings, graph_coverage, integrity, jsonb_integrity, markdown_body_completeness). - jsonb_integrity now scans 5 targets, not 4: added page_versions.frontmatter so the check surface matches `repair-jsonb` (per Codex review of the plan — the old 4-target scan missed a known repair site). Per-target heartbeat so 50K-row scans show incremental progress. - markdown_body_completeness: wrap the existing query in a 1s heartbeat timer. The regex scan over rd.data ->> 'content' can't be paginated usefully; this just lets agents see life during the sequential scan. 
No fake totals — the LIMIT 100 query has no meaningful total count. - integrity sample: same heartbeat pattern around the 500-page scan. orphans.ts: - findOrphans() wraps the NOT EXISTS anti-join in a 1s heartbeat. Keyset pagination was considered and rejected: without an index on links.to_page_id it's no faster than the full scan, and may re-plan the anti-join per batch. A schema migration adding that index is the right fix and is queued for v0.14.3. Follow-ups: - Step 3b: wire embed/files/export (the \r-only stdout offenders). - Step 5: end-to-end progress test spawning `gbrain doctor --progress-json` against a fixture brain, asserting stderr events and clean stdout. All existing unit tests continue to pass (76/76 in doctor + orphans + progress + cli-options). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 3b - embed + files + export stderr progress Replaces the \r-on-stdout progress pattern in the three worst offenders (embed, files sync, export) with the shared reporter on stderr. Stdout now carries only final summaries, so scripts and tests that grep for counts ("Embedded N chunks", "Files sync complete", "Exported N pages") still work when output is piped. - embed.ts: runEmbedCore accepts an optional onProgress callback. The CLI wrapper builds a reporter and passes reporter.tick(); Minion handlers will pass job.updateProgress in Step 4. Worker-pool is single-threaded JS so no rate-gate race (per Codex review #18). - files.ts syncFiles(): tick per file; summary preserved on stdout. - export.ts: tick per page; summary preserved on stdout. Also fixes a --quiet flag collision. `skillpack-check` has its own --quiet mode (suppress all stdout). parseGlobalFlags strips --quiet globally now, and skillpack-check reads the resolved CliOptions singleton via getCliOptions() instead of re-parsing argv. Test updated to match the stripping behavior. 1686 unit tests pass. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 3c - extract + import + sync reporter streaming Extract, import, and sync now stream per-file progress to stderr through the shared reporter. All three kept their stdout summaries + JSON action-events intact so existing tests + agent scripts are unaffected. - extract.ts (4 paths: links/timeline × fs/db): replaced the ad-hoc `process.stderr.write({event:"progress"...})` lines with reporter ticks. Same channel (stderr), canonical schema now, visible in both text and --json modes. Stdout action-events (`add_link` / `add_timeline`) untouched — tests grep them. - import.ts: the logProgress() function that printed every 100 files to stdout is now a progress.tick() call per file. Rate-gated by the reporter. Stdout still gets the final "Import complete (Xs)" summary and the --json payload. - sync.ts: three new phases (`sync.deletes`, `sync.renames`, `sync.imports`) tick per file, so big syncs show each step rather than a single end-of-run summary. Phase hierarchy ready to be child()-chained into runImport / runEmbed later, per Codex review #26. Updated the #132 nested-transaction regression test in test/sync.test.ts to also accept the new hoisted-loop shape — the guarantee (this loop is not wrapped in engine.transaction) still holds. 1686 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 3d - migrate/repair/backlinks/lint/integrity/eval Wires the remaining bulk commands through the reporter: - migrate-engine: phase starts (migrate.copy_pages, migrate.copy_links), per-page tick. Old \"Progress: N/total\" stdout logs replaced by stderr ticks; final stdout summary preserved. - repair-jsonb: per-column start + a heartbeat timer while each UPDATE runs (minutes on 50K-row tables). CRITICAL: stdout stays clean so migrations/v0_12_2.ts's JSON.parse(child.stdout) still works. Per Codex review #12. 
- backlinks: 1s heartbeat around findBacklinkGaps() (sync double-walk of the brain dir). - lint: tick per page; per-issue stdout output preserved. - integrity auto: tick per page in the main resolver loop. The separate ~/.gbrain/integrity-progress.jsonl resume marker is untouched (its role shifts from live progress reporting to resume-only). - eval: add an onProgress option to core's runEval(), CLI wraps with a reporter. Phases: eval.single / eval.ab. Tick per query. core/search/eval.ts gains a RunEvalOptions type so future callers (MCP eval op, Minion handlers) can also hook in without the reporter. 1686 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 3e - onProgress callbacks on core libs - src/core/embedding.ts: embedBatch() gains an optional EmbedBatchOptions.onBatchComplete callback, fired after each 100-item sub-batch. CLI wrappers pass reporter.tick; Minion handlers can pass job.updateProgress. - src/core/enrichment-service.ts: enrichEntities() config gains onProgress(done, total, name) fired after each entity. Same split: CLI -> reporter, Minion -> DB-backed progress. No CLI behavior change on its own. Wiring these callbacks into the Minion handlers is Step 4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 4 - orchestrators + upgrade + minion handlers - cli-options.ts: childGlobalFlags() returns the flag suffix to append to child gbrain subprocesses. Empty string by default, " --quiet --progress-json" when the parent has them set, so child behavior inherits the parent's progress-mode without scattering string-concat logic across every execSync site. - migrations/v0_12_2.ts: each execSync inherits the parent's global flags. Phase C (repair-jsonb --dry-run --json) pins explicit stdio to ['ignore','pipe','inherit'] so child stderr streams straight through while stdout stays captured for JSON.parse. Per Codex review #12. 
- migrations/v0_12_0.ts + v0_11_0.ts: same childGlobalFlags wiring at each gbrain-subcommand execSync. - upgrade.ts: post-upgrade timeout bumped 300s → 30min (1_800_000 ms) with GBRAIN_POST_UPGRADE_TIMEOUT_MS override. The old 300s cap killed v0.12.0 graph-backfill migrations on 50K+ brains; the heartbeat wiring added in v0.14.2 makes long waits observable, so a generous ceiling no longer means users stare at a silent terminal. - jobs.ts: the embed Minion handler passes job.updateProgress as the onProgress callback, so per-job progress is durable in minion_jobs and readable via `gbrain jobs get <id>`. Primary Minion progress channel is DB-backed — stderr from `jobs work` stays coarse for daemon liveness only. Per Codex review #20. 1686 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 5 - E2E doctor-progress test + CI guard scripts/check-progress-to-stdout.sh greps src/ for the banned `process.stdout.write('\r…')` pattern that v0.14.2 removed from the bulk-action codepaths. Wired into the `bun run test` script so any future regression that puts progress back on stdout fails fast. An empty allowlist documents the position: every known call site was migrated; new exceptions need a rationale in the allowlist. test/e2e/doctor-progress.test.ts (Tier 1, needs Postgres + pgvector): - `gbrain --progress-json doctor --json`: stderr carries JSONL progress events with the canonical {event, phase, ts} shape, starts + finishes for `doctor.db_checks`. Stdout stays parseable JSON — no progress pollution. - `gbrain doctor` (no flag): human-plain progress goes to stderr only, stdout stays free of `[doctor.db_checks]`. - `gbrain --quiet doctor`: reporter emits nothing; doctor still runs to completion. test/cli-options.test.ts: +2 spawning integration tests. One verifies `gbrain --progress-json --version` keeps stdout clean of progress events (single-shot commands that don't use a reporter aren't affected). 
One guards the skillpack-check --quiet regression — --quiet suppresses stdout by reading the resolved CliOptions singleton, not re-parsing argv. Full test matrix: bun run test -> 1726 pass / 184 skipped (no DB) / 0 fail bun run test:e2e -> 136 pass / 13 skipped / 0 fail Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(progress): step 6 - docs + v0.14.2 release bump - VERSION + package.json bumped to 0.14.2. - docs/progress-events.md (new): canonical JSON event schema reference. Stable from v0.14.2, additive only. Lists every phase name shipped in this release, the five event types (start/tick/heartbeat/finish/ abort), the TTY/non-TTY rendering rules, subprocess inheritance semantics, and the Minion DB-backed progress model. - CLAUDE.md: "Bulk-action progress reporting" section under the build instructions; Key files entries for src/core/progress.ts, src/core/cli-options.ts, scripts/check-progress-to-stdout.sh, and docs/progress-events.md; doctor.ts entry updated to note the v0.14.2 5-target jsonb_integrity scan + heartbeat wiring. - CHANGELOG.md v0.14.2: full release summary per project voice rules. The "numbers that matter" table, per-command before/after grid, backward-compat warnings for stdout→stderr moves, and an itemized changes section covering reporter/CLI plumbing/schema/Minion handlers/doctor fixes/upgrade timeout/CI guard/tests. No em dashes. Real file paths, real commands, real numbers. - skills/migrations/v0.14.2.md (new): agent migration note. Mechanical step is "nothing" since v0.14.2 is purely additive. Walks agents through the three new global flags, the 14 wired commands, the event schema cheat sheet, Minion progress via job.updateProgress, and scripts/verification commands. 
Full test matrix: bun run test (unit + guards) -> 1726 pass / 184 skipped / 0 fail bun run test:e2e (Postgres) -> 141 pass / 8 skipped / 0 fail Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version to 0.15.2, restore master's [0.14.2] CHANGELOG entry Master sits at 0.14.2 (reliability wave). This PR lands on top as 0.15.2 (progress streaming wave). Splits the merge-time combined CHANGELOG entry back into two discrete release sections so history stays honest: - [0.15.2] = progress reporter, CliOptions, 14 wired commands, Minion embed handler, doctor jsonb_integrity 5-target fix, upgrade timeout bump, CI guard, progress unit+E2E tests. - [0.14.2] = master's eight root-cause bug fixes, restored verbatim from origin/master. Touched files: - VERSION + package.json: 0.14.2 -> 0.15.2 (next patch off master). - skills/migrations/v0.14.2.md -> skills/migrations/v0.15.2.md (rename + rewrite frontmatter + body to v0.15.2). - CHANGELOG.md: split into two entries; progress-wave refs renamed v0.14.2 -> v0.15.2; reliability-wave entry restored from master. - src/core/progress.ts, src/commands/doctor.ts, src/commands/sync.ts, src/commands/upgrade.ts, docs/progress-events.md, test/sync.test.ts: progress-wave v0.14.2 references -> v0.15.2. The remaining v0.14.2 references in test/e2e/migration-flow.test.ts (Bug 3 context) and CLAUDE.md (reliability-wave key commands, Bug 3 ledger move) correctly point at master's 0.14.2 release. Test matrix after version bump: bun run test -> 1780 pass / 179 skipped / 0 fail Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 17:54:13 -07:00
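The global-flag stripping described in steps 1 and 2 of the progress-streaming commit above can be sketched as below. This is a minimal illustration, not the contents of src/core/cli-options.ts: the `CliOptions` field names, the return shape, and the choice to throw on an invalid interval are assumptions. Only the function name `parseGlobalFlags`, the three flags, and the "both space and = forms" rule come from the commit message.

```typescript
// Assumed option shape; field names are hypothetical.
interface CliOptions {
  quiet: boolean;
  progressJson: boolean;
  progressIntervalMs?: number;
}

// Assumption: an invalid interval is a hard error. The real parser may instead
// ignore it or fall back to a default.
function parseIntervalMs(raw: string | undefined): number {
  const ms = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  if (!Number.isFinite(ms) || ms < 0) {
    throw new Error(`invalid --progress-interval value: ${String(raw)}`);
  }
  return ms;
}

// Pulls the global flags out of argv wherever they appear and returns the
// remaining arguments, so command dispatch never sees them.
function parseGlobalFlags(argv: readonly string[]): { opts: CliOptions; rest: string[] } {
  const opts: CliOptions = { quiet: false, progressJson: false };
  const rest: string[] = [];
  const queue = [...argv];
  const eqPrefix = "--progress-interval=";
  for (let arg = queue.shift(); arg !== undefined; arg = queue.shift()) {
    if (arg === "--quiet") {
      opts.quiet = true;
    } else if (arg === "--progress-json") {
      opts.progressJson = true;
    } else if (arg === "--progress-interval") {
      // Space form: the value is the next argument.
      opts.progressIntervalMs = parseIntervalMs(queue.shift());
    } else if (arg.startsWith(eqPrefix)) {
      // Equals form: the value follows the "=".
      opts.progressIntervalMs = parseIntervalMs(arg.slice(eqPrefix.length));
    } else {
      rest.push(arg);
    }
  }
  return { opts, rest };
}
```

With this shape, `gbrain --progress-json doctor --json` dispatches on `["doctor", "--json"]`, which is the unknown-command case the commit says it fixes. Stripping before dispatch also explains why a subcommand with its own `--quiet` has to read the resolved options rather than re-parse argv.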
Garry Tan	ff10796a00	fix(wave): v0.15.1 - 4 hot issues + scope expansion (#248 ) * fix(wave): 4 hot issues + 3 scope expansions (v0.13.1) Addresses four user-filed regressions after v0.13.0 plus three adjacent footgun closures. * #170 — CREATE INDEX [CONCURRENTLY] IF NOT EXISTS idx_pages_updated_at_desc on pages (updated_at DESC). Engine-aware migration v12 with invalid-index cleanup on Postgres, plain CREATE on PGLite. ~700x on 30k+ row brains. Contributed by @fuleinist (#215). * #219 — Minions schema default max_stalled 1 -> 5. v13 migration ALTERs the default and UPDATEs existing non-terminal rows (waiting/active/ delayed/waiting-children/paused) so live queues get rescued on upgrade. Adds MinionJobInput.max_stalled with [1,100] clamp. New --max-stalled CLI flag on `jobs submit`. Reported by @macbotmini-eng. * #218 — package.json postinstall surfaces errors instead of silencing. trustedDependencies whitelists @electric-sql/pglite. doctor schema_version check fails loudly when migrations never ran and links to #218. README + INSTALL_FOR_AGENTS warn against `bun install -g`. Reported by @gopalpatel. * #223 — @electric-sql/pglite pinned to exactly 0.4.3 (was ^0.4.4). PGLiteEngine.connect() wraps PGlite.create() errors with a message pointing at the issue + gbrain doctor. Does NOT suggest 'missing migrations' as a cause (create-time abort happens before migrations run). Pin is unverified against macOS 26.3; error-wrap is the safety net. Reported by @AndreLYL. * Scope: `gbrain jobs submit` gains --backoff-type/--backoff-delay/ --backoff-jitter/--timeout-ms/--idempotency-key (MinionJobInput audit). * Scope: `gbrain jobs smoke --sigkill-rescue` regression case (opt-in, CI-only) that simulates a killed worker and asserts the new default rescues. * Scope: `gbrain doctor --index-audit` reports zero-scan Postgres indexes as drop candidates (informational; no auto-drop). Infrastructure: * Migration interface extended with sqlFor: { postgres?, pglite? 
} and transaction: boolean. Runner picks the engine-specific branch and bypasses engine.transaction() when transaction:false (required for CONCURRENTLY). BrainEngine.kind readonly discriminator added. * scripts/check-jsonb-pattern.sh CI guard extended to block `max_stalled DEFAULT 1` from regressing. Tests: * 15 new unit tests: v12/v13 structural + behavioral assertions, max_stalled default/clamp/backfill, PGLite error-wrap source guard, engine kind discriminator. * 3 regression tests pinned by IRON RULE. * Full unit suite: 1416 pass. * Full E2E suite against Postgres 16 + pgvector: 126 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.13.1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: sync documentation for v0.13.1 CLAUDE.md "Key files" and "Commands" sections refreshed to match the v0.13.1 fix wave: - Note `BrainEngine.kind` discriminator on engine.ts - Document v0.13.1 connect() error-wrap on pglite-engine.ts - Refresh src/core/minions/ layout (no shell handler, no protected-names, no quiet-hours/stagger — that was v0.13-development scaffolding that did not ship) - Add src/core/migrate.ts entry with `Migration` interface extensions (`sqlFor`, `transaction: false`) - Document new `gbrain jobs submit` flags (--max-stalled, --backoff-type, --backoff-delay, --backoff-jitter, --timeout-ms, --idempotency-key) - Document `gbrain jobs smoke --sigkill-rescue` regression guard - Document `gbrain doctor --index-audit` and the schema_version=0 surface that catches #218 postinstall failures - Extend check-jsonb-pattern.sh note with the max_stalled DEFAULT 1 regression guard - Touch up test file blurbs for migrate.test.ts, pglite-engine.test.ts, minions.test.ts with v0.13.1 coverage Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(e2e): run files sequentially to eliminate shared-DB race The E2E suite was flaky. 
~3 of every 5 runs had 4-10 failures clustered in Links, Timeline, Versions, Minions resilience, Parallel Import, and Page CRUD tests. Symptoms included "expected 16 pages, got 8" (half), "expected 1 link inserted, got 0", timeline entries missing after round-trip, and similar data-shape mismatches. Root cause: bun test runs test FILES in parallel (each in a worker process). 13 E2E files share one DATABASE_URL, and `setupDB()` in `test/e2e/helpers.ts` does `TRUNCATE ... CASCADE` on all tables before each file's `importFixtures()`. File A's TRUNCATE would race with file B's in-flight INSERT stream, producing the observed half-populated or wrong-count states. An earlier attempt used a Postgres advisory lock held on a dedicated single-connection client for the lifetime of each file's run. It broke because bun's default 5000 ms hook timeout fires on queued beforeAll() calls: with 13 files serializing through the lock, files 2-13 would time out waiting for file 1 to finish. This commit switches to sequential file execution at the harness level via scripts/run-e2e.sh, which loops through test/e2e/*.test.ts one at a time, tracks aggregate pass/fail counts, and exits non-zero on the first failing file. No lock, no timeout issues, no changes to any test file. package.json test:e2e points at the new script. Verified: 5 back-to-back runs against the same Postgres container, each completing in ~5 min. Every run: 13 files, 138 tests, 0 fails. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore: bump version to 0.15.1 (fix wave locked to MINOR line) Master v0.14.2 was the last /investigate root-cause wave on the v0.14.x line. 
This fix wave opens v0.15.x: four hot issues (#170, #218, #219, #223) close v0.13.x regressions that v0.14.x didn't cover, so the MINOR bump reflects the semantic shift — new schema migrations (v14, v15), a new CLI surface (`--max-stalled`, `--sigkill-rescue`, `--index-audit`), a new BrainEngine contract (`kind` discriminator + extended `Migration` interface), and a new install-time contract (PGLite 0.4.3 pin + `trustedDependencies`). Locked to 0.15.1 in advance: other work may land before/after this PR, but the version is fixed so reviewers can cite a stable number. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 13:19:23 -07:00
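The engine-aware migration selection described in the fix-wave commit above (`sqlFor: { postgres?, pglite? }` plus `transaction: boolean`) can be sketched roughly like this. It is an illustration under assumptions: `planMigration`, `PlannedStep`, the engine-agnostic `sql` fallback field, and the default of running inside a transaction are hypothetical. The `sqlFor` and `transaction` field names, the two engine kinds, and the index statement come from the commit message.

```typescript
type EngineKind = "postgres" | "pglite";

interface Migration {
  version: number;
  name: string;
  sql?: string; // assumed engine-agnostic fallback
  sqlFor?: { postgres?: string; pglite?: string };
  transaction?: boolean; // assumed to default to true when omitted
}

// Hypothetical result type: what the runner would execute and how.
interface PlannedStep {
  sql: string;
  useTransaction: boolean;
}

// Picks the engine-specific branch when present, otherwise the shared SQL,
// and reports whether the statement may be wrapped in a transaction.
function planMigration(migration: Migration, kind: EngineKind): PlannedStep {
  const sql = migration.sqlFor?.[kind] ?? migration.sql;
  if (sql === undefined) {
    throw new Error(`migration v${migration.version} has no SQL for engine '${kind}'`);
  }
  return { sql, useTransaction: migration.transaction !== false };
}

// Example modeled on the index migration in the commit message.
const exampleIndexMigration: Migration = {
  version: 12,
  name: "pages_updated_at_desc_index",
  sqlFor: {
    postgres:
      "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
    pglite: "CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
  },
  transaction: false,
};
```

The `transaction: false` escape hatch exists because Postgres rejects `CREATE INDEX CONCURRENTLY` inside a transaction block, so the runner has to skip its usual wrapper for that statement.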
Garry Tan	7f156c8873	feat: v0.15.0 llms.txt + llms-full.txt + AGENTS.md (#294 ) * feat: llms.txt + llms-full.txt + AGENTS.md (v0.15.0) Ship three new public artifacts at the repo root so agents that aren't Claude Code can discover GBrain documentation cleanly: - AGENTS.md — ~45-line install + operating protocol for non-Claude agents (Codex, Cursor, OpenClaw, Aider). Covers install, read order, trust boundary, config/debug/migration pointers, fork regeneration. Uses relative links so it survives fork/rename. - llms.txt — llmstxt.org-spec index (H1 + blockquote + Core entry points / Configuration / Debugging / Migrations / Philosophy / Optional H2s). - llms-full.txt — same index with core docs inlined for single-fetch ingestion. ~225KB, well under the 600KB FULL_SIZE_BUDGET. Generator-driven via scripts/build-llms.ts + scripts/llms-config.ts. LLMS_REPO_BASE env var makes it fork-friendly. bun run build:llms regenerates both outputs deterministically. test/build-llms.test.ts has 7 cases: paths resolve on disk, generator idempotent, llms.txt spec shape, checked-in files match generator output (drift guard), content contract (RESOLVER / AGENTS / INSTALL referenced), AGENTS mirrors README + INSTALL_FOR_AGENTS install path, llms-full.txt under size budget. Leverage point per Codex review: README.md + INSTALL_FOR_AGENTS.md install prompts now tell agents to fetch AGENTS.md first. Without this, the new files were invisible. Drive-by fix: INSTALL_FOR_AGENTS.md:136 had `git pull origin main` while the repo's default branch is master (origin/HEAD -> master). Corrected. Plan + reviews: /plan-eng-review CLEARED, /codex adversarial review found 15 issues — 7 folded in directly, 3 user tension decisions, 5 stayed as NOT-in-scope with reasoning. Version bumps to 0.15.0 (new public-artifact feature surface per Step 12 of /ship feature-signal heuristic). 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: normalize VERSION to 3-digit to match master master uses 3-digit semver (0.14.2); my earlier /ship bumped VERSION to the 4-digit gstack format (0.15.0.0). Revert to 0.15.0 to match package.json (already 3-digit) and master's convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 11:51:32 -07:00
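The llms.txt shape named in the commit above (H1, blockquote, then H2 sections of links) can be generated deterministically from a config object. The sketch below is a guess at the general approach and does not reproduce scripts/build-llms.ts: `LlmsConfig`, `renderLlmsTxt`, and the example base URL are hypothetical, and the repo base is passed in as a parameter to stand in for the LLMS_REPO_BASE environment variable.

```typescript
// Hypothetical config shape for an llms.txt index.
interface LlmsLink {
  title: string;
  path: string; // repo-relative path
  description?: string;
}

interface LlmsSection {
  heading: string;
  links: LlmsLink[];
}

interface LlmsConfig {
  title: string;
  summary: string;
  sections: LlmsSection[];
}

// Renders H1 + blockquote + H2 link lists. Output depends only on the inputs,
// so regenerating with the same config yields byte-identical text.
function renderLlmsTxt(config: LlmsConfig, repoBase: string): string {
  const base = repoBase.replace(/\/+$/, ""); // drop trailing slashes
  const lines: string[] = [`# ${config.title}`, "", `> ${config.summary}`];
  for (const section of config.sections) {
    lines.push("", `## ${section.heading}`, "");
    for (const link of section.links) {
      const suffix = link.description ? `: ${link.description}` : "";
      lines.push(`- [${link.title}](${base}/${link.path})${suffix}`);
    }
  }
  return lines.join("\n") + "\n";
}
```

Because the renderer is a pure function, a drift guard like the one the commit describes reduces to comparing the checked-in file against a fresh render, and a fork only needs to change the base URL.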
Garry Tan	c0b621923b	fix: JSONB double-encode + splitBody wiki + parseEmbedding (v0.12.1) (#196 ) * fix: splitBody and inferType for wiki-style markdown content - splitBody now requires explicit timeline sentinel (<!-- timeline -->, --- timeline ---, or --- directly before ## Timeline / ## History). A bare --- in body text is a markdown horizontal rule, not a separator. This fixes the 83% content truncation @knee5 reported on a 1,991-article wiki where 4,856 of 6,680 wikilinks were lost. - serializeMarkdown emits <!-- timeline --> sentinel for round-trip stability. - inferType extended with /writing/, /wiki/analysis/, /wiki/guides/, /wiki/hardware/, /wiki/architecture/, /wiki/concepts/. Path order is most-specific-first so projects/blog/writing/essay.md → writing, not project. - PageType union extended: writing, analysis, guide, hardware, architecture. Updates test/import-file.test.ts to use the new sentinel. Co-Authored-By: @knee5 (PR #187) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: JSONB double-encode bug on Postgres + parseEmbedding NaN scores Two related Postgres-string-typed-data bugs that PGLite hid: 1. JSONB double-encode (postgres-engine.ts:107,668,846 + files.ts:254): ${JSON.stringify(value)}::jsonb in postgres.js v3 stringified again on the wire, storing JSONB columns as quoted string literals. Every frontmatter->>'key' returned NULL on Postgres-backed brains; GIN indexes were inert. Switched to sql.json(value), which is the postgres.js-native JSONB encoder (Parameter with OID 3802). Affected columns: pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata. page_versions.frontmatter is downstream via INSERT...SELECT and propagates the fix. 2. pgvector embeddings returning as strings (utils.ts): getEmbeddingsByChunkIds returned "[0.1,0.2,...]" instead of Float32Array on Supabase, producing [NaN] cosine scores. Adds parseEmbedding() helper handling Float32Array, numeric arrays, and pgvector string format. 
Throws loud on malformed vectors (per Codex's no-silent-NaN requirement); returns null for non-vector strings (treated as "no embedding here"). rowToChunk delegates to parseEmbedding. E2E regression test at test/e2e/postgres-jsonb.test.ts asserts jsonb_typeof = 'object' AND col->>'k' returns expected scalar across all 5 affected columns — the test that should have caught the original bug. Runs in CI via the existing pgvector service. Co-Authored-By: @knee5 (PR #187 — JSONB triple-fix) Co-Authored-By: @leonardsellem (PR #175 — parseEmbedding) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: extract wikilink syntax with ancestor-search slug resolution extractMarkdownLinks now handles [[page]] and [[page\|Display Text]] alongside standard [text](page.md). For wiki KBs where authors omit leading ../ (thinking in wiki-root-relative terms), resolveSlug walks ancestor directories until it finds a matching slug. Without this, wikilinks under tech/wiki/analysis/ targeting [[../../finance/wiki/concepts/foo]] silently dangled when the correct relative depth was 3 × ../ instead of 2. Co-Authored-By: @knee5 (PR #187) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: gbrain repair-jsonb + v0.12.1 migration + CI grep guard - New gbrain repair-jsonb command. Detects rows where jsonb_typeof(col) = 'string' and rewrites them via (col #>> '{}')::jsonb across 5 affected columns: pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata, page_versions.frontmatter. Idempotent — re-running is a no-op. PGLite engines short-circuit cleanly (the bug never affected the parameterized encode path PGLite uses). --dry-run shows what would be repaired; --json for scripting. - New v0_12_1.ts migration orchestrator. Phases: schema → repair → verify. Modeled on v0_12_0 pattern, registered in migrations/index.ts. Runs automatically via gbrain upgrade / apply-migrations. 
- CI grep guard at scripts/check-jsonb-pattern.sh fails the build if anyone reintroduces the ${JSON.stringify(x)}::jsonb interpolation pattern. Wired into bun test via package.json. Best-effort static analysis (multi-line and helper-wrapped variants are caught by the E2E round-trip test instead). - Updates apply-migrations.test.ts expectations to account for the new v0.12.1 entry in the registry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.12.1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.12.1 - CLAUDE.md: document repair-jsonb command, v0_12_1 migration, splitBody sentinel contract, inferType wiki subtypes, CI grep guard, new test files (repair-jsonb, migrations-v0_12_1, markdown) - README.md: add gbrain repair-jsonb to ADMIN command reference - INSTALL_FOR_AGENTS.md: fix verification count (6 -> 7), add v0.12.1 upgrade guidance for Postgres brains - docs/GBRAIN_VERIFY.md: add check #8 for JSONB integrity on Postgres-backed brains - docs/UPGRADING_DOWNSTREAM_AGENTS.md: add v0.12.1 section with migration steps, splitBody contract, wiki subtype inference - skills/migrate/SKILL.md: document native wikilink extraction via gbrain extract links (v0.12.1+) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 07:14:24 +08:00
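The `parseEmbedding()` behavior described in the v0.12.1 commit above (accept Float32Array, numeric arrays, and the pgvector string format; throw on malformed vectors; return null for non-vector strings) can be sketched as follows. This is a reconstruction from the commit message, not the code in utils.ts: the helper `toFloat32`, the bracket-based test for "looks like a vector", and the handling of null and empty input are assumptions.

```typescript
// Converts a list of candidate components, throwing on anything non-finite so
// a bad vector can never turn into silent NaN similarity scores.
function toFloat32(items: readonly unknown[]): Float32Array {
  const out = new Float32Array(items.length);
  items.forEach((item, index) => {
    if (typeof item !== "number" || !Number.isFinite(item)) {
      throw new Error(`malformed embedding component at index ${index}`);
    }
    out[index] = item;
  });
  return out;
}

// Sketch of a tolerant embedding parser. Assumption: a string counts as a
// vector only if it is wrapped in square brackets, as in "[0.1,0.2]".
function parseEmbedding(value: unknown): Float32Array | null {
  if (value === null || value === undefined) return null;
  if (value instanceof Float32Array) return value;
  if (Array.isArray(value)) return toFloat32(value);
  if (typeof value === "string") {
    const trimmed = value.trim();
    if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
      return null; // non-vector string: treated as "no embedding here"
    }
    const inner = trimmed.slice(1, -1).trim();
    if (inner === "") return new Float32Array(0);
    // Number("") is 0, so empty components are mapped to NaN to force a throw.
    const parts = inner.split(",").map((part) => (part.trim() === "" ? NaN : Number(part)));
    return toFloat32(parts);
  }
  return null;
}
```

The split between returning null and throwing mirrors the commit's stated contract: a value that is plainly not a vector is an absence, while a value that claims to be a vector but has a bad component is an error worth surfacing.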
Garry Tan	d8613366a5	Minions v7 + v0.11.1 canonical migration + skillify (#130 ) * feat: add minion_jobs schema, migration v5, and executeRaw to BrainEngine Foundation for the Minions job queue system. Adds: - minion_jobs table (20 columns) with CHECK constraints, partial indexes, and RLS. Inspired by BullMQ's job model, adapted for Postgres. - Migration v5 creates the table for existing databases. - executeRaw<T>() method on BrainEngine interface for raw SQL access, needed by the Minions module for claim queries (FOR UPDATE SKIP LOCKED), token-fenced writes, and atomic stall detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Minions job queue — queue, worker, backoff, types BullMQ-inspired Postgres-native job queue built into GBrain. No Redis. No external dependencies. Postgres transactions replace Lua scripts. - MinionQueue: submit, claim (FOR UPDATE SKIP LOCKED), complete/fail (token-fenced), atomic stall detection (CTE), delayed promotion, parent-child resolution, prune, stats - MinionWorker: handler registry, lock renewal, graceful SIGTERM, exponential backoff with jitter, UnrecoverableError bypass - MinionJobContext: updateProgress(), log(), isActive() for handlers - 8-state machine: waiting/active/completed/failed/delayed/dead/ cancelled/waiting-children Patterns stolen from: BullMQ (lock tokens, stall detection, flows), Sidekiq (dead set, backoff formula), Inngest (checkpoint/resume). 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: 43 tests for Minions job queue Full coverage of the Minions module against PGLite in-memory: - Queue CRUD (9): submit, get, list, remove, cancel, retry, duplicate - State machine (6): waiting→active→completed/failed, retry→delayed→waiting - Backoff (4): exponential, fixed, jitter range, attempts_made=0 edge - Stall detection (3): detect stalled, counter increment, max→dead - Dependencies (5): parent waits, fail_parent, continue, remove_dep, orphan - Worker lifecycle (5): register, start-without-handlers, claim+execute, non-Error throws, UnrecoverableError bypass - Lock management (3): renewal, token mismatch, claim sets lock fields - Claim mechanics (4): empty queue, priority ordering, name filtering, delayed promotion timing - Cancel & retry (2): cancel active, retry dead Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Minions CLI commands and MCP operations Wire Minions into the GBrain CLI and MCP layer: CLI (gbrain jobs): submit <name> [--params JSON] [--follow] [--dry-run] list [--status S] [--queue Q] [--limit N] get <id> — detailed view with attempt history cancel/retry/delete <id> prune [--older-than 30d] stats — job health dashboard work [--queue Q] [--concurrency N] — Postgres-only worker daemon 6 MCP operations (contract-first, auto-exposed via MCP server): submit_job, get_job, list_jobs, cancel_job, retry_job, get_job_progress Built-in handlers: sync, embed, lint, import. --follow runs inline. Worker daemon blocked on PGLite (exclusive file lock). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for Minions job queue CLAUDE.md: added Minions files to key files, updated operation count (36), BrainEngine method count (38), test file count (45), added jobs CLI commands. CHANGELOG.md: added Minions entry to v0.10.0 (background jobs, retry, stall detection, worker daemon). 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Minions v2 — agent orchestration primitives (pause/resume, inbox, tokens, replay) Adds the foundation for Minions as universal agent orchestration infrastructure. GBrain's Postgres-native job queue now supports durable, observable, steerable background agents. The OpenClaw plugin (separate repo) will consume these via library import, not MCP, for zero-latency local integration. ## New capabilities - Concurrent worker — Promise pool replaces sequential loop. Per-job AbortController for cooperative cancellation. Graceful shutdown waits for all in-flight jobs via Promise.allSettled. - Pause/resume — pauseJob clears the lock and fires AbortSignal on active jobs. Handlers check ctx.signal.aborted and exit cleanly. resumeJob returns paused jobs to waiting. Catch block skips failJob when signal.aborted. - Inbox (separate table) — minion_inbox table for sidechannel messages. sendMessage with sender validation (parent job or admin). readInbox is token-fenced and marks read_at atomically. Separate table avoids row bloat from rewriting JSONB on every send. - Token accounting — tokens_input/tokens_output/tokens_cache_read columns. updateTokens accumulates; completeJob rolls child tokens up to parent. USD cost computed at read time (no cost_usd column — pricing too volatile). - Job replay — replayJob clones a terminal job with optional data overrides. New job, fresh attempts, no parent link. ## Handler contract additions MinionJobContext now provides: - `signal: AbortSignal` — cooperative cancellation - `updateTokens(tokens)` — accumulate token usage - `readInbox()` — check for sidechannel messages - `log()` — now accepts string or TranscriptEntry ## MCP operations added pause_job, resume_job, replay_job, send_job_message — all auto-generate CLI commands and MCP server endpoints. 
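The cooperative-cancellation contract described above (handlers check `ctx.signal.aborted` and exit cleanly) can be sketched roughly as below. Only the `signal: AbortSignal` field comes from the commit text; the function name `runSteps`, the step list, and the return shape are hypothetical. The steps are synchronous here to keep the sketch small; a real handler would presumably await between checks.

```typescript
// Hypothetical sketch of a handler loop that honours an AbortSignal.
function runSteps(
  steps: Array<() => void>,
  signal: AbortSignal,
): { completed: number; aborted: boolean } {
  let completed = 0;
  for (const step of steps) {
    // Check between units of work so a pause or cancel stops the loop
    // without throwing and without starting the next step.
    if (signal.aborted) return { completed, aborted: true };
    step();
    completed++;
  }
  return { completed, aborted: false };
}

// Usage: the second step simulates a pause arriving mid-handler.
const controller = new AbortController();
const outcome = runSteps(
  [() => {}, () => controller.abort(), () => {}],
  controller.signal,
);
```

Returning a result instead of throwing matches the stated intent that an aborted handler should not be routed through the failure path.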
## Library exports package.json exports map adds ./minions and ./engine-factory paths so plugins can `import { MinionQueue } from 'gbrain/minions'` for direct library use. ## Instruction layer (the teaching) - skills/minion-orchestrator/SKILL.md — when/how to use Minions, decision matrix, lifecycle management, anti-patterns - skills/conventions/subagent-routing.md — cross-cutting rule: all background work goes through Minions - RESOLVER.md — trigger entries for agent orchestration - manifest.json — registered ## Schema migration v6 Additive: 3 token columns, paused status, minion_inbox table with unread index. Full Postgres + PGLite support. No backfill needed. ## Tests 65 tests (was 43): pause/resume (5), inbox (6), tokens (4), replay (4), concurrent worker context (3), plus all existing coverage. ## What's NOT in this commit Deferred to follow-up PRs: - LISTEN/NOTIFY subscribe (needs real Postgres E2E) - Resource governor (depends on concurrent worker stress testing) - Routing eval harness (needs API keys + benchmark data) - OpenClaw plugin (separate @gbrain/openclaw-minions-plugin repo) See docs/designs/MINIONS_AGENT_ORCHESTRATION.md for full CEO-approved design. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(minions): migration v7 — agent_parity_layer schema Adds columns on minion_jobs (depth, max_children, timeout_ms, timeout_at, remove_on_complete, remove_on_fail, idempotency_key) plus the new minion_attachments table. Three partial indexes for bounded scans: idx_minion_jobs_timeout, idx_minion_jobs_parent_status, and uniq_minion_jobs_idempotency. Check constraints enforce non-negative depth and positive child cap / timeout. Additive migration — existing installs pick it up via ensureSchema on next use. No user action required. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(minions): extend types for v7 parity layer Extends MinionJob with depth/max_children/timeout_ms/timeout_at/ remove_on_complete/remove_on_fail/idempotency_key. Extends MinionJobInput with the same options plus max_spawn_depth override. Adds MinionQueueOpts (maxSpawnDepth default 5, maxAttachmentBytes default 5 MiB). Adds AttachmentInput/Attachment shapes and ChildDoneMessage in the InboxMessage union. rowToMinionJob updated to pick up the new columns. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(minions): attachments validator New module validateAttachment() gates every attachment write. Rejects empty filenames, path traversal (.., /, \), null bytes, oversized content (5 MiB default, per-queue override), invalid base64, and implausible content_type headers. Returns normalized { filename, content_type, content (Buffer), sha256, size } on success. The DB also enforces UNIQUE (job_id, filename) as defense-in-depth for concurrent addAttachment races — JS-only checks are not sufficient. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(minions): queue v7 — depth, child cap, timeouts, cascade, idempotency, child_done Wraps completeJob and failJob in engine.transaction() so parent hook invocations (resolveParent, failParent, removeChildDependency) fold into the same transaction as the child update. A process crash between child and parent can't strand the parent in waiting-children anymore. Adds v7 behaviors: - Depth tracking. add() computes depth = parent.depth + 1 and rejects past maxSpawnDepth (default 5). - Per-parent child cap. add() takes SELECT ... FOR UPDATE on the parent, counts non-terminal children, rejects when count >= max_children. NULL max_children = no cap. - Per-job wall-clock timeout. claim() populates timeout_at when timeout_ms is set. New handleTimeouts() dead-letters expired rows with error_text='timeout exceeded'. Terminal — no retry. - Cascade cancel. 
cancelJob() walks descendants via recursive CTE with depth-100 runaway cap. Returns the root row. Re-parented descendants (parent_job_id NULL) are naturally excluded. - Idempotency. add() uses INSERT ... ON CONFLICT (idempotency_key) DO NOTHING RETURNING; falls back to SELECT when RETURNING is empty. Same key always yields the same job id. - child_done inbox. completeJob inserts {type:'child_done', child_id, job_name, result} into the parent's inbox in the same transaction as the token rollup, guarded by EXISTS so terminal/deleted parents skip without FK violation. New readChildCompletions(parent_id, lock_token, since?) helper; token-fenced like readInbox. - removeOnComplete / removeOnFail. Deletes the row after the parent hook fires, so parent policy sees consistent state. - Attachment methods. addAttachment validates via validateAttachment then INSERTs; UNIQUE (job_id, filename) backs the JS dup check. listAttachments, getAttachment, deleteAttachment round out the API. Fixes pre-existing inverted status bug: add() now puts children in waiting/delayed (not waiting-children) and atomically flips the parent to waiting-children in the same transaction. Tests no longer need manual UPDATE workarounds. Two correctness fixes: - Sibling completion race. Under READ COMMITTED, two grandchildren completing concurrently each saw the other as still-active in the pre-commit snapshot and neither flipped the parent. Fixed by taking SELECT ... FOR UPDATE on the parent row at the start of completeJob and failJob transactions, serializing siblings on the parent lock. - JSONB double-encode. postgres.js conn.unsafe(sql, params) auto- JSON-encodes parameters. Calling JSON.stringify(obj) first stored a JSON string literal (jsonb_typeof=string) and broke payload->>'key' queries silently. Removed JSON.stringify from three call sites (child_done inbox post, updateProgress, sendMessage). PGLite tolerated both forms so unit tests missed it — real-PG E2E caught it. 
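The JSONB double-encode bug described above can be reproduced without a database. In this sketch `driverEncode` is a stand-in (an assumption for illustration) for a driver that JSON-encodes jsonb parameters on its own; it is not the postgres.js API.

```typescript
// Stand-in for a driver that JSON-encodes jsonb parameters itself.
// Hypothetical: real drivers do this internally.
function driverEncode(param: unknown): string {
  return JSON.stringify(param);
}

const payload = { type: "child_done", child_id: 42 };

// Correct: hand the raw object to the driver.
const storedOnce = driverEncode(payload);
// Bug: stringify first, then the driver encodes a second time.
const storedTwice = driverEncode(JSON.stringify(payload));

// What a reader of the stored value would decode in each case.
const decodedOnce = JSON.parse(storedOnce); // an object
const decodedTwice = JSON.parse(storedTwice); // a string holding JSON text
```

In the second case the stored value is a JSON string literal, which is why key lookups such as `payload->>'key'` find nothing even though the write appeared to succeed.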
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(minions): worker — timeout safety net + handleTimeouts tick Worker tick now calls handleStalled() first, then handleTimeouts() — stall requeue wins over timeout dead-letter when both could fire in the same cycle. handleTimeouts() guards on lock_until > now() so stalled jobs take the retryable path. launchJob schedules a per-job setTimeout(timeout_ms) that fires ctx.signal as a best-effort handler interrupt. The timer is always cleared in .finally so process exit isn't delayed by a dangling timer. Handlers that respect AbortSignal stop cleanly; handlers that ignore it still get dead-lettered by the DB-side handleTimeouts. Removed post-completeJob and post-failJob parent-hook calls from the worker — those are now inside the queue method transactions. Worker becomes simpler and crash-safer. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(minions): 33 new unit tests for v7 parity layer Covers depth cap, per-parent child cap, timeout dead-letter, cascade cancel (including the re-parent edge case), removeOnComplete / removeOnFail, idempotency (single + concurrent), child_done inbox (posted in txn + survives child removeOnComplete + since cursor), attachment validation (oversize, path traversal, null byte, duplicates, base64), AbortSignal firing on pause mid-handler, catch-block skipping failJob when aborted, worker in-flight bookkeeping, token-rollup guard when parent already terminal, and setTimeout safety-net cleanup. Existing tests updated to remove the inverted-status manual UPDATE workarounds that the add() fix made obsolete. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(e2e): Minions v7 concurrency + OpenClaw resilience coverage minions-concurrency.test.ts spins two MinionWorker instances against the test Postgres, submits 20 jobs, and asserts zero double-claims (every job runs exactly once). 
This is the only test that actually proves FOR UPDATE SKIP LOCKED under real concurrency — PGLite runs on a single connection and can't exercise the race. minions-resilience.test.ts covers the six OpenClaw daily pains: 1. Spawn storm caps enforce under concurrent submit. 2. Agent stall → handleStalled() requeues; handleTimeouts() skips (lock_until guard). 3. Forgotten dispatches recoverable via child_done inbox. 4. Cascade cancel stops grandchildren mid-flight. 5. Deep tree fan-in (parent → 3 children → 2 grandchildren each) completes with the full inbox chain. 6. Parent crash/recovery resumes from persisted state. helpers.ts extends ALL_TABLES with minion_attachments, minion_inbox, and minion_jobs (FK dependents first) so E2E teardown doesn't leak rows between runs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore: release v0.11.0 — Minions v7 agent orchestration primitives Bumps VERSION / package.json to 0.11.0. Adds CHANGELOG entry covering depth tracking, max_children, per-job timeouts, cascade cancel, idempotency keys, child_done inbox, removeOnComplete/Fail, attachments, migration v7, plus the two correctness fixes (sibling completion race and JSONB double-encode). TODOS.md captures the four v7 follow-ups: per-queue rate limiting, repeat/cron scheduler, worker event emitter, and waitForChildren convenience helpers. 1066 unit + 105 E2E = 1171 tests passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(minions): unify JSONB inserts, tighten nullish coalescing Three non-blocker cleanups from post-ship review of v0.11.0: - queue.ts add() and completeJob(): pre-stringifying with JSON.stringify while other sites pass raw objects with $n::jsonb casts. postgres.js double-encodes if you stringify first — works on PGLite (text→JSONB auto-cast), fails silently on real PG. Unify on raw object + explicit $n::jsonb cast. - queue.ts readChildCompletions: since clause used sent_at > $2 relying on PG's implicit text→TIMESTAMPTZ coercion. 
Explicit $2::timestamptz is safer and clearer. - types.ts rowToMinionJob: parent_job_id used || which coerces 0 to null. Harmless today (SERIAL IDs start at 1) but ?? is semantically correct. All 110 unit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(minions): updateProgress missed $1::jsonb cast in unification Residual from c502b7e — updateProgress was the only remaining JSONB write without the explicit ::jsonb cast. Not broken (implicit cast works) but breaks the convention the prior commit unified everywhere else. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * doc: Minions v7 skill count + jobs subcommands (26 skills) README: bump skill count 25 → 26, add minion-orchestrator row, add `gbrain jobs` command family block so v0.11.0's headline feature is actually discoverable from the top-level commands reference. CLAUDE.md: unit test count 48 → 49 (minions.test.ts expanded), skill count 25 → 26, add minion-orchestrator to Key files + skills categorization, expand MinionQueue one-liner to cover v7 primitives (depth/child-cap, timeouts, idempotency, child_done inbox, removeOnComplete/Fail). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat: Minions adoption UX — smoke test + migration + pain-triggered routing Teach OpenClaw when to reach for Minions vs native subagents. Ship three pieces so upgrading from v0.10.x actually lands for real users: - `gbrain jobs smoke` — one-command health check that submits a `noop` job, runs a worker, verifies completion, and prints engine-aware guidance (PGLite installs get the "daemon needs Postgres, use --follow" note). Fails loud if schema's below v7 so the user knows to `gbrain init`. - `skills/migrations/v0.11.0.md` — post-upgrade migration file the auto-update agent reads.
Six steps: apply schema, run smoke, ask user via AskUserQuestion which mode they want (always / pain_triggered / off), write to `~/.gbrain/preferences.json`, sanity-check handlers, mark done. Completeness scores on each option so the recommendation is explicit. - `skills/conventions/subagent-routing.md` rewritten — was a "MUST use Minions for ALL background work" mandate, now reads preferences.json on every routing decision and branches on three modes. Mode B (pain_triggered) is the default: keep subagents until gateway drops state, parallel > 3, runtime > 5min, or user expresses frustration. Then pitch the switch in-session with a specific script. Rename pass: "Minions v7" → "Minions" in README (JOBS block), TODOS.md (P1 section header + depends-on), CHANGELOG.md v0.11.0 entry. v7 stays as the internal schema version in code/migration contexts. The product name is just Minions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * doc(readme): promote Minions — 6 OpenClaw pains + how each is fixed The one-line mention in the skills table wasn't doing the work. Added a dedicated section between "How It Works" and "Getting Data In" that leads with the six multi-agent failures every OpenClaw user hits daily (spawn storms, hung handlers, forgotten dispatches, unstructured debugging, gateway crashes, runaway grandchildren) and maps each pain to the specific Minions primitive that fixes it. Includes the smoke test command, the adoption default (pain_triggered), and a pointer to skills/minion-orchestrator for the full patterns. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(bench): add harness for Minions vs OpenClaw subagent dispatch Shared harness (openclawDispatch + minionsHandler) using matching claude-haiku-4-5 calls on both sides so the delta measures queue+ dispatch overhead on top of identical LLM work. Includes statsFromResults (p50/p95/p99) and formatStats helpers. 
Uses `openclaw agent --local` embedded mode; does not test gateway multi-agent fan-out (documented in the harness header). * test(bench): durability under SIGKILL — Minions vs OpenClaw --local Headline bench for the claim: when the orchestrator dies mid-dispatch, Minions rescues via PG state + stall detection; OpenClaw --local loses in-flight work outright. Minions side: seed 10 active+expired-lock rows (exact state a SIGKILLed worker leaves) then run a rescue worker. Expect 10/10 completed. OpenClaw side: spawn 10 `openclaw agent --local` in parallel, SIGKILL each at 500ms, count pre-kill delivered output. Expect 0/10 — no persistence layer, nothing to recover. Budget: ~$0 (Minions handlers sleep 10ms; OC calls die at 500ms so partial LLM billing is negligible). * test(bench): per-dispatch throughput — Minions vs OpenClaw --local 20 serial dispatches each side, identical claude-haiku-4-5 call with the same trivial prompt. p50/p95/p99 reported via statsFromResults. Serial (not parallel) so the per-dispatch cost is measured honestly and LLM token spend stays bounded (~$0.08 total). Minions: one queue, one worker, one concurrency. Submit → poll to completion before next submit. OpenClaw: N sequential `openclaw agent --local` spawns. * test(bench): fan-out — Minions 10-wide concurrency vs 10 parallel OC spawns Parent dispatches 10 children, waits for all to return. Minions uses worker concurrency=10 sharing one warm process; OpenClaw parallel `openclaw agent --local` spawns, each boots its own runtime. 3 runs × 10 children per run. Reports ok count and wall time per run plus summary. Honest caveat documented: does not test OC gateway multi-agent fan-out — that needs a custom WS client and LLM-backed parent agent. This measures what users script today. Budget: ~$0.12 LLM spend. * test(bench): memory — 10 in-flight subagents, single-proc vs 10-proc cost Measures resident memory for keeping 10 subagents in flight. 
Minions: one worker process, concurrency=10 with handlers that park on a promise — sample RSS of the test process via process.memoryUsage(). OpenClaw: 10 parallel `openclaw agent --local` processes, sum their RSS via `ps -o rss=`. Handlers are cheap sleeps, no LLM — we want harness memory, not LLM client state. Budget: $0. * test(bench): fan-out — don't gate on OC success rate, report numbers Initial run showed OC parallel `--local` at 10-wide hits 40% failure rate (17/30 across 3 runs). That's the finding, not a test bug — process startup stampede + LLM rate limits. Bench now prints error samples and reports the numbers instead of gating. Minions side still gates at 90% (30/30 observed in practice). * doc(benchmarks): Minions vs OpenClaw --local subagent dispatch Real numbers on four claims: durability, throughput, fan-out, memory. Same claude-haiku-4-5 call on both sides so the delta is queue+dispatch+ process cost on top of identical LLM work. Headline: Minions rescues 10/10 from a SIGKILLed worker in 458ms while OpenClaw --local loses all 10; ~10× faster per dispatch (778ms p50 vs 8086ms p50); ~21× faster at 10-wide fan-out AND 100% reliable vs OC's 43% failure rate; 2 MB vs 814 MB to keep 10 subagents in flight. Honest caveats section covers what this doesn't test (OC gateway multi-agent, load tests, other models). Fully reproducible via test/e2e/bench-vs-openclaw/. * doc(readme): inject Minions vs OpenClaw bench numbers Headline deltas now in the Minions section: 10/10 vs 0/10 on crash, ~10× faster per dispatch, ~21× faster fan-out at 10-wide with 0% failure vs 43%, ~400× less memory. Links to the full bench doc. Prose first said Minions "fixes all six pains." Now it shows the numbers that prove it. * bench: production Wintermute benchmark — Minions 753ms vs sub-agent timeout Real deployment: 45K-page brain on Render+Supabase. Task: pull 99 tweets, write brain page, commit, sync. Minions: 753ms, $0. 
Sub-agent: gateway timeout (>10s, couldn't even spawn under production load). Also: 19,240 tweets backfilled across 36 months in 15 min at $0. Sub-agents would cost $1.08 and fail 40% of spawns. * bench: tweet ingestion — Minions 719ms vs OpenClaw 12.5s (17×) Production benchmark with runnable test code: - test/e2e/bench-vs-openclaw/tweet-ingest.bench.ts (reusable) - docs/benchmarks/2026-04-18-tweet-ingestion.md (publishable) Task: pull 100 tweets from X API, write brain page, commit, sync. Minions: 719ms mean, $0, 100% success. OpenClaw: 12,480ms mean, $0.03/run, 60% success (gateway timeouts). At scale: 36-month backfill, 19K tweets, 15 min, $0 vs est. $1.08. * doc(benchmarks): Wintermute production data point for Minions vs OpenClaw Adds a production-environment data point to the Minions README section: one month of tweet ingest on Wintermute (Render + Supabase + 45K-page brain) ran end-to-end in 753ms for $0.00 via Minions, while the equivalent sessions_spawn hit the 10s gateway timeout and produced nothing. Full methodology + logs in docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(core): preferences.ts + cli-util.ts — foundations for v0.11.1 Adds two foundational modules that apply-migrations (Lane A-4), the v0.11.0 orchestrator (Lane C-1), and the stopgap script (Lane C-4) all depend on. - src/core/preferences.ts: atomic-write ~/.gbrain/preferences.json (mktemp + rename, 0o600, forward-compatible for unknown keys) with validateMinionMode, loadPreferences, savePreferences. Plus appendCompletedMigration + loadCompletedMigrations for the ~/.gbrain/migrations/completed.jsonl log (tolerates malformed lines). Uses process.env.HOME || homedir() so $HOME overrides work in CI and tests; Bun's os.homedir() caches the initial value and ignores later mutations. - src/core/cli-util.ts: promptLine(prompt) helper, extracted from src/commands/init.ts:212-224.
Shared so init, apply-migrations, and the v0.11.0 orchestrator's mode prompt don't each reinvent it. test/preferences.test.ts: 21 unit tests covering load/save atomicity, 0o600 perms, forward-compat for unknown keys, minion_mode validation, completed.jsonl JSONL append idempotence, auto-ts population, malformed- line tolerance in loadCompletedMigrations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(init): add --migrate-only flag (schema-only, no saveConfig) Context: v0.11.0 migration orchestrators need a safe way to re-apply the schema against an existing brain without risking a config flip. Today running bare `gbrain init` with no flags defaults to PGLite and calls saveConfig, which would silently overwrite an existing Postgres database_url — caught by Codex in the v0.11.1 plan review as a show-stopper data-loss bug. The new --migrate-only path: - loadConfig() reads the existing config (does NOT call saveConfig) - errors out with a clear "run gbrain init first" if no config exists - connects via the already-configured engine, calls engine.initSchema(), disconnects - --json emits structured success/error payloads Everything downstream in the v0.11.1 migration chain (apply-migrations, the stopgap bash script, the package.json postinstall hook) will invoke this flag rather than bare gbrain init. test/init-migrate-only.test.ts: 4 tests covering the no-config error path, --json error payload shape, happy-path with a PGLite fixture (verifies config.json content is byte-identical after the call — the real invariant), and idempotent rerun. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(migrations): TS registry replaces filesystem migration scan Context: Codex flagged that bun build --compile produces a self-contained binary, and the existing findMigrationsDir() in upgrade.ts:145 walks skills/migrations/v.md on disk — which fails on a compiled install because the markdown files aren't bundled. 
The plan's fix is a TS registry: migrations are code, imported directly, visible to both source installs and compiled binaries. - src/commands/migrations/types.ts: shared Migration, OrchestratorOpts, OrchestratorResult types. - src/commands/migrations/index.ts: exports the migrations[] array, getMigration(version), and compareVersions() (semver comparator). The feature_pitch data that lived in the MD file frontmatter now lives here as a code constant on each Migration, so runPostUpgrade's post-upgrade pitch printer can consume it without a filesystem read. - src/commands/migrations/v0_11_0.ts: stub orchestrator + pitch. The full phase implementation lands in Lane C-1; for now the stub throws a clear "not yet implemented" so apply-migrations --list (Lane A-4) can still enumerate the migration. test/migrations-registry.test.ts: 9 tests covering ascending-semver ordering, feature_pitch shape invariants, getMigration lookup, and compareVersions edge cases (equal / newer / older / single-digit across major bumps). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): gbrain apply-migrations — migration runner CLI Reads ~/.gbrain/migrations/completed.jsonl, diffs against the TS migration registry, runs pending orchestrators. Resumes status:"partial" entries (the stopgap bash script writes these so v0.11.1 apply-migrations can pick up where it left off). Idempotent: rerunning when up-to-date exits 0. Flags: --list Show applied + partial + pending + future. --dry-run Print the plan; take no action. --yes / --non-interactive Skip prompts (used by runPostUpgrade + postinstall). --mode <a|p|o> Preset minion_mode (bypasses the Phase C TTY prompt). --migration vX.Y.Z Force-run one specific version. --host-dir <path> Include $PWD in host-file walk (default is $HOME/.claude + $HOME/.openclaw only). --no-autopilot-install Skip Phase F. Diff rule (Codex H9): apply when no status:"complete" entry exists AND migration.version ≤ installed VERSION.
Previously proposed rule was "version > currentVersion", which would SKIP v0.11.0 when running v0.11.1; regression test in apply-migrations.test.ts pins the correct semantics. Registered in src/cli.ts CLI_ONLY Set; dispatched before connectEngine so each phase owns its own engine/subprocess lifecycle (no double-connect when the orchestrator shells out to init --migrate-only or jobs smoke). test/apply-migrations.test.ts: 18 unit tests covering parseArgs for every flag, indexCompleted/statusForVersion correctness (including stopgap-then- complete transition), and buildPlan's four buckets (applied / partial / pending / skippedFuture) with the Codex H9 regression pinned. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(upgrade): runPostUpgrade tail-calls apply-migrations; postinstall hook Closes the v0.11.0 mega-bug: migration skills never fired on upgrade. `runPostUpgrade` now does two things: 1. Cosmetic: prints feature_pitch headlines for migrations newer than the prior binary. Uses the TS registry (Codex K) instead of walking skills/migrations/.md on disk — compiled binaries see the same list source installs do. 2. Mechanical: invokes apply-migrations --yes --non-interactive in the same process so Phase F (autopilot install) doesn't hit a subprocess timeout wall. Catches + surfaces errors without failing the upgrade. Also: - Drops the early-return on missing upgrade-state.json (Codex H8). runPostUpgrade now runs apply-migrations unconditionally; it's cheap when nothing is pending. This repairs every broken-v0.11.0 install on their next upgrade attempt. - Bumps the `gbrain post-upgrade` subprocess timeout in runUpgrade from 30s → 300s (Codex H7). A v0.11.0→v0.11.1 migration that has to schema-init + smoke + prefs + host-rewrite + launchd-install exceeds 30s trivially. - Removes now-dead findMigrationsDir + extractFeaturePitch helpers and their filesystem-reading imports (readdirSync, resolve). 
- src/cli.ts post-upgrade dispatch now awaits the async runPostUpgrade. apply-migrations (Lane A-4): - First-install guard: loadConfig() check at the top. No brain configured = exit silently for --yes / --non-interactive (postinstall stays quiet on fresh `bun add gbrain`); explicit message on --list / --dry-run. package.json: - New `postinstall` script: gbrain --version >/dev/null 2>&1 && gbrain apply-migrations --yes --non-interactive 2>/dev/null || true. The --version sanity check guards against a half-written binary (Codex review criticism). || true prevents `bun update gbrain` failure mid-upgrade. Manual smoke verified: fresh $HOME with no config → apply-migrations --yes silently exits 0; --dry-run prints the one-liner "No brain configured... Nothing to migrate." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(commands): extract library-level Core functions that throw not exit Codex architecture finding #5: reusing CLI entry-point functions as Minions handler bodies is wrong. If a Minion invokes runExtract / runEmbed / runBacklinks / runLint and the handler hits a process.exit(1), the ENTIRE WORKER process dies — killing every other in-flight job. Handlers need library-level APIs that throw, and the CLI stays a thin wrapper that catches + exits. Per-command shape: - runXxxCore(opts): throws on validation errors, returns structured result. Handler-safe. - runXxx(args): arg parser; calls Core; catches; process.exit(1) on thrown errors. CLI-safe. Shipped: - runExtractCore({ mode, dir, dryRun?, jsonMode? }) → ExtractResult - runEmbedCore({ slug? | slugs? | all? | stale? }) → void - runBacklinksCore({ action, dir, dryRun? }) → BacklinksResult - runLintCore({ target, fix?, dryRun? }) → LintResult sync.ts is already correct — performSync throws; runSync wraps. No change. import.ts deferred to v0.12.0 (its one process.exit fires only on a missing dir arg; handlers always pass a dir, so worker-kill risk is zero in practice).
Noted in the plan's Out-of-scope. Smoke verified: all four Core functions throw on invalid mode / missing dir / not-found target instead of exiting the process. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(jobs): Tier 1 handlers + autopilot-cycle (the killer handler) registerBuiltinHandlers now registers every operation autopilot needs to dispatch via Minions + the single autopilot-cycle handler the autopilot loop actually submits each interval. Existing handlers (sync, embed, lint) rewired to call library-level Core functions directly instead of the CLI wrappers. CLI wrappers call process.exit(1) on validation errors; if a worker claimed a badly-formed job, the WORKER PROCESS would die — killing every in-flight job. Cores throw, so one bad job fails one job. New handlers: - extract → runExtractCore (mode: links|timeline|all, dir) - backlinks → runBacklinksCore (action: check|fix, dir) - autopilot-cycle → THE killer handler. Runs sync → extract → embed → backlinks inline. Each step wrapped in try/catch; returns { partial: true, failed_steps: [...] } when any step fails. Does NOT throw on partial failure — that would trigger Minion retry, and an intermittent extract bug would block every future cycle. Replaces the 4-job parent-child DAG proposed in early plan drafts (Codex H3/H4: parent/child is NOT a depends_on primitive in Minions). import.ts handler still uses the CLI wrapper (runImport) — import's one process.exit fires only on a missing dir arg and the handler always passes a dir; Core extraction deferred to v0.12.0 when Tier 2 refactors happen. registerBuiltinHandlers promoted from private to exported for testability. test/handlers.test.ts: 4 tests. Asserts every expected handler name registers. Asserts autopilot-cycle against a nonexistent repo returns { partial: true, failed_steps: ['sync', 'extract', 'backlinks'] } — does NOT throw.
Asserts autopilot-cycle against an empty (but real) git repo returns a result with a steps map, never throws. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(autopilot): Minions dispatch + worker spawn supervisor + async shutdown Autopilot now dispatches each cycle as a single `autopilot-cycle` Minion job (with idempotency_key on the cycle slot) instead of running steps inline. A forked `gbrain jobs work` child drains the queue durably, supervised by autopilot. The user runs ONE install step (`gbrain autopilot --install`) and gets sync + extract + embed + backlinks + durable job processing, with no separate worker daemon to manage. Mode selection: - minion_mode=always OR pain_triggered (default), engine=postgres → Minions dispatch. Spawn child, submit autopilot-cycle each interval. - minion_mode=off, OR engine=pglite, OR `--inline` flag → run steps inline in-process, same as pre-v0.11.1. PGLite has an exclusive file lock that blocks a second worker process, so the inline path is the only path that works there. Worker supervision: - spawn(resolveGbrainCliPath(), ['jobs', 'work'], { stdio: 'inherit' }). stdio:'inherit' avoids pipe-buffer blocking (Codex architecture #2). - On worker exit: 10s backoff + restart. Crash counter caps at 5 → autopilot stops with a clear error. - resolveGbrainCliPath() prefers argv[1] (cli.ts / /gbrain), then process.execPath (compiled binary suffix check), then `which gbrain` (installed to $PATH). NEVER blindly uses process.execPath, which on source installs is the Bun runtime, not `gbrain` (Codex architecture #1). Shutdown: - Async SIGTERM/SIGINT handler: sends SIGTERM to worker, awaits its exit for up to 35s (the worker's own drain is 30s; we add buffer for signal-delivery latency), then SIGKILL if still alive. - Drops the old `process.on('exit')` lock-cleanup handler — its callback runs synchronously and can't wait for the worker drain. Lock file cleanup moved inside the async shutdown. 
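The worker supervision and shutdown rules described above (10s backoff, crash cap of 5, SIGTERM followed by SIGKILL after 35s) can be sketched roughly as follows. This is an illustration only: `onWorkerExit`, `shutdownWorker`, and the `WorkerLike` interface are hypothetical names, and treating the fifth crash as the one that stops autopilot is an assumption about how "caps at 5" is counted.

```typescript
// Hypothetical sketch; only the numbers (10s, 5 crashes, 35s) come from the commit message.
const RESTART_BACKOFF_MS = 10_000;
const MAX_CRASHES = 5;

type SupervisorAction =
  | { kind: "restart"; delayMs: number }
  | { kind: "stop"; reason: string };

// crashCount is the number of worker exits seen so far, including this one.
function onWorkerExit(crashCount: number): SupervisorAction {
  if (crashCount >= MAX_CRASHES) {
    return { kind: "stop", reason: `worker exited ${crashCount} times; giving up` };
  }
  return { kind: "restart", delayMs: RESTART_BACKOFF_MS };
}

// Minimal stand-in for a child process handle (an assumption, not a real API).
interface WorkerLike {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  exited: Promise<void>; // resolves once the process has exited
}

// Ask the worker to drain, wait up to graceMs, then force-kill it.
async function shutdownWorker(
  worker: WorkerLike,
  graceMs: number = 35_000,
): Promise<"drained" | "killed"> {
  worker.kill("SIGTERM");
  const outcome = await new Promise<"exited" | "timeout">((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), graceMs);
    void worker.exited.then(() => {
      clearTimeout(timer);
      resolve("exited");
    });
  });
  if (outcome === "timeout") {
    worker.kill("SIGKILL");
    return "killed";
  }
  return "drained";
}
```

Keeping the restart decision a pure function makes the crash cap easy to unit-test without spawning a process; the async shutdown mirrors why a synchronous `process.on('exit')` callback could not wait for the drain.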
Lock-file mtime refresh every cycle (Codex C) so a long-lived autopilot doesn't get declared "stale" by the next cron-fired invocation after 10 minutes. Inline fallback path calls the new Core fns (runExtractCore, runEmbedCore) instead of the CLI wrappers. That way a bad arg from inside the loop can't process.exit() the autopilot itself (matches Codex #5). test/autopilot-resolve-cli.test.ts: 3 tests covering argv[1]-as-gbrain, argv[1]-as-cli.ts, and graceful error when no path resolves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(autopilot): env-aware install + OpenClaw bootstrap injection Expand installDaemon from 2 targets (macOS launchd, Linux crontab) to 4: - macos → launchd plist (unchanged) - linux-systemd → ~/.config/systemd/user/gbrain-autopilot.service with Restart=on-failure, RestartSec=30, and an is-system-running probe to confirm the user bus actually works (Codex architecture #7 hardened — the naive /run/systemd/system existence check was a false-positive magnet) - ephemeral-container → detects RENDER / RAILWAY_ENVIRONMENT / FLY_APP_NAME / /.dockerenv. Crontab is unreliable here (wiped on deploy), so we write ~/.gbrain/start-autopilot.sh and tell the user to source it from their agent's bootstrap - linux-cron → existing crontab path (unchanged) detectInstallTarget() + --target flag for explicit override. Also: - --inject-bootstrap / --no-inject control OpenClaw ensure-services.sh auto-injection. Default is ON when OpenClaw is detected (OPENCLAW_HOME env var, openclaw.json in CWD or $HOME, or an ensure-services.sh found). Injection adds ONE line with a `# gbrain:autopilot v0.11.0` marker and writes .bak.<ISO-timestamp> before touching the file. Idempotent — the marker check prevents double injection. uninstallDaemon mirrors all four targets. A user can now run `gbrain autopilot --uninstall` after moving hosts (macOS laptop → Linux server) and the uninstall will find + remove every artifact. 
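The four install targets listed above could be selected along these lines. This is a sketch under stated assumptions, not the shipped `detectInstallTarget()`: the `HostProbe` shape and the function name are hypothetical, and the precedence (explicit override, then macOS, then ephemeral container, then systemd, then cron) is a guess, since the commit message does not state the order.

```typescript
type InstallTarget = "macos" | "linux-systemd" | "ephemeral-container" | "linux-cron";

// Hypothetical probe result, gathered by the caller so the decision stays pure.
interface HostProbe {
  platform: string;                          // e.g. process.platform
  env: Record<string, string | undefined>;   // e.g. process.env
  hasDockerEnvFile: boolean;                 // whether /.dockerenv exists
  systemdUserBusWorks: boolean;              // outcome of the is-system-running probe
}

// Env vars named in the commit message as ephemeral-container signals.
const EPHEMERAL_ENV_VARS = ["RENDER", "RAILWAY_ENVIRONMENT", "FLY_APP_NAME"];

function detectInstallTargetSketch(probe: HostProbe, override?: InstallTarget): InstallTarget {
  if (override !== undefined) return override; // --target wins
  if (probe.platform === "darwin") return "macos";
  const ephemeral =
    probe.hasDockerEnvFile ||
    EPHEMERAL_ENV_VARS.some((name) => Boolean(probe.env[name]));
  if (ephemeral) return "ephemeral-container"; // crontab is wiped on deploy here
  if (probe.systemdUserBusWorks) return "linux-systemd";
  return "linux-cron";
}
```

Injecting the probe rather than reading `process.env` and the filesystem inside the function is what lets platform and env-var branches be tested on any machine.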
writeWrapperScript now uses resolveGbrainCliPath() instead of blindly baking process.execPath into the wrapper script — on source installs that path is the Bun runtime, not gbrain (Codex architecture #1 fix propagated to the install path too). test/autopilot-install.test.ts: 4 tests covering detectInstallTarget's platform + env-var branches. Deeper E2E coverage (systemd unit file contents, ephemeral start-script contents + exec bit, OpenClaw marker injection + .bak) lives in Task 14's E2E fixture test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(migrations): v0.11.0 orchestrator — phases A through G, full implementation Replaces the stub from commit de027ce. The orchestrator runs all seven phases of the v0.11.0 Minions adoption migration idempotently, resumable from any prior status:"partial" run (the stopgap bash script writes those). Phases: A. Schema — `gbrain init --migrate-only` (NEVER bare `gbrain init`, which defaults to PGLite and clobbers existing configs — Codex H1 show-stopper). B. Smoke — `gbrain jobs smoke`. Abort loudly on non-zero. C. Mode — --mode flag wins. Preserved from prefs on resume. Non-TTY or --yes defaults pain_triggered with explicit print. Interactive: numbered 1/2/3 menu via shared promptLine. D. Prefs — savePreferences({minion_mode, set_at, set_in_version}). E. Host — AGENTS.md marker injection + cron manifest rewrites. For cron entries whose skill matches a gbrain builtin (sync/embed/lint/import/extract/backlinks/autopilot-cycle) rewrites kind:agentTurn → kind:shell with a gbrain jobs submit command. PGLite branch keeps --follow (inline execution, the only path that works without a worker daemon); Postgres branch drops --follow + adds --idempotency-key ${handler}:${slot} so long cron jobs don't stack up (same Codex fix as the autopilot-cycle dispatch). 
For non-builtin handlers (host-specific, like ea-inbox-sweep, frameio-scan, x-dm-triage) emits a structured TODO row to ~/.gbrain/migrations/pending-host-work.jsonl so the host agent can walk through plugin-contract work per skills/migrations/v0.11.0.md. F. Install — `gbrain autopilot --install --yes`. Best-effort (failure doesn't abort; user can run manually). G. Record — append to completed.jsonl. status:"complete" unless pending_host_work > 0, in which case status:"partial" + apply_migrations_pending: true. Safety guards (Codex code-quality tension #3: strict-skip, no rollback): - Scope: $HOME/.claude + $HOME/.openclaw only by default. --host-dir must be explicit to include $PWD or any other path. - Symlink escape: SKIP if the resolved target leaves the scoped root. - >1 MB files: SKIP with warning. - Permission denied: SKIP with warning; other files continue. - Malformed JSON manifest: SKIP with parse error logged; continue. - mtime re-check right before write: bail the file if changed between read + write; other files continue. - Every edit writes a .bak.<ISO-timestamp> sibling first (second- precision so two same-day runs don't collide). - Idempotency: `_gbrain_migrated_by: "v0.11.0"` JSON property marker on each rewritten cron entry (JSON can't have comments — Codex G); AGENTS.md marker `<!-- gbrain:subagent-routing v0.11.0 -->`. - TODO dedupe: JSONL appends deduped by (handler, manifest_path) so reruns don't grow the file. Post-run summary: when pending_host_work > 0, prints a one-liner pointing the user at the JSONL path + the v0.11.0 skill file. The skill (Lane C-3 / C-4) is the host-agent instruction manual. test/migrations-v0_11_0.test.ts: 18 tests covering: - AGENTS.md injection: happy path, .bak creation, idempotent rerun, --dry-run no-op, symlink-escape SKIP, >1MB SKIP. 
- Cron rewrite: builtin handlers rewrite to shell+gbrain jobs submit, non-builtins emit JSONL TODOs without touching the manifest, mixed manifests get both treatments in one pass, idempotent rerun, TODO dedupe, malformed JSON SKIP, no-entries-array SKIP, --dry-run no-op. - findAgentsMdFiles + findCronManifests: scoped walk to $HOME/.claude + $HOME/.openclaw, --host-dir opt-in for $PWD. - BUILTIN_HANDLERS frozen at the canonical 7 names. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(skill): port skillify from Wintermute, pair with check-resolvable Skillify is the "meta skill": turn any raw feature or script into a properly-skilled, tested, resolvable, evaled unit of agent-visible capability. Proven in production on Wintermute; paired with gbrain's existing `check-resolvable` it becomes a user-controllable equivalent of Hermes' auto-skill-creation — you decide when and what, the tooling keeps the checklist honest. Shipped: - skills/skillify/SKILL.md — ported from ~/git/wintermute/workspace/ skills/skillify/SKILL.md. Genericized: * /data/.openclaw/workspace → \${PROJECT_ROOT} (runtime-detected). * services/voice-agent/__tests__/ → test/ (detected from repo). * Manual `grep skills/... AGENTS.md` replaced with a reference to `gbrain check-resolvable`, which does reachability + MECE + DRY + gap detection properly instead of grep-matching a path string. - scripts/skillify-check.ts — ported from ~/git/wintermute/workspace/scripts/skillify-check.mjs. Preserves the --recent flag and --json output shape. Detects project root via package.json walkup; detects test dir (test/ → __tests__/ → tests/ → spec/). Runs the 10-item checklist per target and exits non-zero if any required item is missing. - test/skillify-check.test.ts — 4 CLI tests: happy-path against publish.ts (known-skilled), --json shape + schema, --recent smoke, bogus-target exit code. 
- skills/RESOLVER.md — adds the trigger row ("Skillify this", "is this a skill?", "make this proper") → skills/skillify/SKILL.md. - skills/manifest.json — adds the skillify entry so the conformance test passes. Why the pair: * Hermes auto-creates skills in the background. Fine until you don't know what the agent shipped — checklists decay silently. * gbrain ships the same capability as two user-controlled tools: /skillify builds the checklist, gbrain check-resolvable validates reachability + MECE + DRY across the whole skill tree. * Human keeps judgment. Tooling keeps the checklist honest. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(v0.11.1): cron-via-minions convention, plugin-handlers guide, minions-fix, skill updates New reference docs: - skills/conventions/cron-via-minions.md — the rewrite convention for cron manifests. Shows the Postgres (fire-and-forget + idempotency- key) vs PGLite (--follow inline) branch; explains why builtin-only auto-rewrite is safe + how host-specific handlers get the plugin contract. - docs/guides/plugin-handlers.md — the plugin contract for host- specific Minion handlers. Code-level registration via import + worker.register(), not a data file (Codex D: handlers.json was an RCE surface). Concrete TypeScript skeleton + handler contract (ctx.data, ctx.signal, ctx.inbox) + full migration flow from TODO JSONL to a rewritten cron entry. - docs/guides/minions-fix.md — user-facing troubleshooting for half-migrated v0.11.0 installs. Paste-one-liner for the stopgap, gbrain apply-migrations path for v0.11.1+, verification commands, failure-mode recipes. Rewrites + updates: - skills/migrations/v0.11.0.md — body restored as the host-agent instruction manual. Audience is the host agent reading ~/.gbrain/migrations/pending-host-work.jsonl after the CLI orchestrator has done the mechanical phases. 
Walks each TODO type through the 10-item skillify checklist (plugin contract, ship bootstrap, unit tests, integration tests, LLM evals, resolver trigger, trigger eval, E2E smoke, brain filing, check-resolvable). Reverses the earlier "delete the body" decision (1B) because the body serves a different audience now — host-agent, not CLI documentation. - skills/cron-scheduler/SKILL.md — Phase 4 ("Register with host scheduler") now references cron-via-minions + plugin-handlers. - skills/maintain/SKILL.md — new "Fix a half-migrated install" section with the apply-migrations recipe. - skills/setup/SKILL.md — new Phase C.5 "One-step autopilot + Minions install (v0.11.1+)" explaining the four install targets + the OpenClaw auto-injection default. - docs/GBRAIN_SKILLPACK.md — Operations section adds the three new guides + the subagent-routing and cron-routing SKILLPACK notes (v0.11.0+). All 167 related tests (conformance + resolver + skillify-check + v0_11_0 orchestrator) stay green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(v0.11.1): stopgap script + CLAUDE.md directive + README + CHANGELOG + version bump scripts/fix-v0.11.0.sh — the paste-command for broken-v0.11.0 installs. Released on the v0.11.1 tag so: curl -fsSL https://raw.githubusercontent.com/garrytan/gbrain/v0.11.1/scripts/fix-v0.11.0.sh \| bash always works (master branch could be renamed). 8 steps: schema apply, smoke, mode prompt (non-TTY defaults pain_triggered), atomic write of preferences.json (0o600), append completed.jsonl with status:"partial" and apply_migrations_pending:true so the v0.11.1 apply-migrations run resumes correctly (does NOT poison the permanent migration path — Codex H2 avoidance), AGENTS.md + cron/jobs.json detection with guidance printed as text only (never auto-edits from a curl-piped script), and a closing line telling the user to run `gbrain autopilot --install` as the one-stop finisher. 
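The stopgap itself is a bash script, but the "atomic write of preferences.json (0o600)" step it describes follows a common pattern that can be sketched in TypeScript. The function name and the temp-file naming are hypothetical; the three fields are the ones Phase D is said to save.

```typescript
import { mkdtempSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface Preferences {
  minion_mode: "always" | "pain_triggered" | "off";
  set_at: string;          // ISO timestamp
  set_in_version: string;
}

// Write to a sibling temp file with owner-only permissions, then rename over the
// target. rename() within one directory is atomic on POSIX filesystems, so readers
// see either the old file or the complete new one, never a half-written file.
function writePreferencesAtomically(path: string, prefs: Preferences): void {
  const tmp = `${path}.tmp-${process.pid}`;
  writeFileSync(tmp, JSON.stringify(prefs, null, 2) + "\n", { mode: 0o600 });
  renameSync(tmp, path);
}

// Usage against a throwaway directory (not the real ~/.gbrain).
const demoDir = mkdtempSync(join(tmpdir(), "gbrain-prefs-"));
const demoTarget = join(demoDir, "preferences.json");
writePreferencesAtomically(demoTarget, {
  minion_mode: "pain_triggered",
  set_at: new Date().toISOString(),
  set_in_version: "0.11.1",
});
```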
CLAUDE.md — new "Migration is canonical, not advisory" section pinning the design principle. Any host-repo change (AGENTS.md, cron manifests, launchctl units) is GBrain's responsibility via the migration; the exception is host-specific handler registration, which goes via the code-level plugin contract in docs/guides/plugin-handlers.md. README.md — new sections: - "v0.11.0 migration didn't fire on your upgrade?" with both repair paths (v0.11.1 binary and pre-v0.11.1 stopgap). - "Skillify + check-resolvable: user-controllable auto-skill-creation" explaining why the user-controlled pair beats Hermes-style auto generation. Includes the scripts/skillify-check.ts invocation. CHANGELOG.md — v0.11.1 entry (per CLAUDE.md voice: lead with what the user can now do that they couldn't before; frame as benefits, not files changed). Covers: mega-bug fix + apply-migrations + postinstall + stopgap, autopilot-supervises-worker + single-install-step + env-aware targets, Core fn extraction so handlers don't kill workers, skillify + check-resolvable pair, host-agnostic plugin contract replacing handlers.json (RCE concern), gbrain init --migrate-only, TS migration registry + H8/H9 diff-rule fixes, CLAUDE.md directive. All Codex hard blockers (H1, H3/H4, H5, H6, H7, H8, H9, K) + architecture issues (#1/#2/#4/#5/#7) resolved. package.json — version bump 0.11.0 → 0.11.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(e2e): migration-flow E2E against live Postgres + Bun env quirk fix Ships test/e2e/migration-flow.test.ts — the end-to-end integration test for the v0.11.0 orchestrator. Spins up against a live Postgres (gated on DATABASE_URL per CLAUDE.md lifecycle) and exercises four scenarios: - Fresh install: schema apply (Phase A via `gbrain init --migrate-only`) → smoke (Phase B) → mode resolution (C) → prefs (D) → host rewrite (E, empty fixture) → record (G). 
Asserts preferences.json exists with 0o600, completed.jsonl has a v0.11.0 entry, autopilot install was skipped per --no-autopilot-install. - Idempotent rerun: second orchestrator invocation on a completed install doesn't blow up; mode stays stable. - Host rewrite mixed manifest: 4-entry cron/jobs.json with 2 gbrain- builtin handlers (sync, embed) + 2 non-builtin (ea-inbox-sweep, morning-briefing). Asserts builtins rewrite to `gbrain jobs submit` kind:shell, non-builtins are LEFT on kind:agentTurn, and 2 JSONL TODOs are emitted with correct shape. AGENTS.md gets the marker injected. Status is "partial" because pending-host-work > 0. - Resumable: stopgap writes a partial completed.jsonl row first; orchestrator re-runs successfully against it and appends a new post-orchestrator entry. 1 partial + 1 complete = 2 rows total. Critical fix surfaced by the E2E: src/commands/migrations/v0_11_0.ts's three execSync calls (gbrain init --migrate-only, gbrain jobs smoke, gbrain autopilot --install) now explicitly pass `env: process.env`. Bun's execSync default does NOT propagate post-start `process.env.PATH` mutations to subprocesses — only the initial PATH snapshot. Without the explicit env, any user-side env tweak (e.g. setting GBRAIN_DATABASE_URL in a script before calling the orchestrator) would be invisible to the orchestrator's subprocesses. This is also the reason the E2E needs a PATH shim installed at module-load time to expose the `gbrain` command. test/init-migrate-only.test.ts: subprocess env now strips DATABASE_URL and GBRAIN_DATABASE_URL. The "no config" error-path tests need loadConfig() to return null, which it won't if the env-var fallback at src/core/config.ts:30 fires. Before this fix, running the unit tests with DATABASE_URL set (e.g. during an E2E run) caused false failures because `gbrain init --migrate-only` saw the env var and succeeded. Full test totals with live Postgres: 1265 pass, 0 fail, 3497 expect calls, 67 files, ~95s. 
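The mixed-manifest scenario exercised by the E2E (builtins rewritten to `kind:shell`, non-builtins left alone with a TODO emitted) can be sketched as a pure per-entry rewrite. Treat this as an illustration: the `command` field name, the exact argument order of the `gbrain jobs submit` line, and passing `slot` as a plain string are assumptions; the seven builtin names, the `--follow` / `--idempotency-key` split, and the `_gbrain_migrated_by` marker come from the commit message.

```typescript
const BUILTIN_HANDLERS: ReadonlySet<string> = new Set([
  "sync", "embed", "lint", "import", "extract", "backlinks", "autopilot-cycle",
]);

interface CronEntry {
  kind: string;
  skill: string;
  command?: string;             // hypothetical field name for the shell command
  _gbrain_migrated_by?: string; // idempotency marker
}

interface HostWorkTodo { handler: string; manifest_path: string }

interface RewriteResult { entry: CronEntry; todo?: HostWorkTodo }

function rewriteCronEntry(
  entry: CronEntry,
  engine: "postgres" | "pglite",
  slot: string,
  manifestPath: string,
): RewriteResult {
  // Already migrated: leave untouched so reruns are no-ops.
  if (entry._gbrain_migrated_by !== undefined) return { entry };

  // Host-specific handler: do not touch the manifest, emit a TODO instead.
  if (!BUILTIN_HANDLERS.has(entry.skill)) {
    return { entry, todo: { handler: entry.skill, manifest_path: manifestPath } };
  }

  // PGLite has no separate worker, so it runs inline; Postgres is fire-and-forget
  // with an idempotency key so long-running cron jobs do not stack up.
  const flags =
    engine === "pglite" ? "--follow" : `--idempotency-key ${entry.skill}:${slot}`;
  return {
    entry: {
      ...entry,
      kind: "shell",
      command: `gbrain jobs submit ${entry.skill} ${flags}`,
      _gbrain_migrated_by: "v0.11.0",
    },
  };
}
```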
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump VERSION file to 0.11.1 Commit 5c4cf1d bumped package.json version to 0.11.1 but missed the root VERSION file. src/version.ts reads from package.json so `gbrain --version` prints 0.11.1 correctly, but any tool or script that reads the VERSION file directly (like /ship's idempotency check) saw the stale 0.11.0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(v0.11.1): doctor self-heal check + skillpack-check command for cron health reports Closes the discoverability hole from the v0.11.0 mega-bug: once a user is on v0.11.1 (or later), every `gbrain doctor` invocation immediately surfaces a half-migrated state, and `gbrain skillpack-check` gives host agents (Wintermute's morning-briefing, any OpenClaw cron) a single exit-coded JSON pipe to check from their own skills. gbrain doctor — two new checks: 1. Filesystem-only (fires on every `doctor` invocation, even --fast): if `~/.gbrain/migrations/completed.jsonl` has any status:"partial" entry with no matching status:"complete" for the same version, print `MINIONS HALF-INSTALLED (partial migration: vX.Y.Z). Run: gbrain apply-migrations --yes`. Typical cause is the stopgap wrote a partial record but nobody ran `apply-migrations` afterward. 2. DB-path: if schema version is v7+ (Minions present) AND `~/.gbrain/preferences.json` is missing, print the same banner. Catches installs that never ran the stopgap or apply-migrations at all — the classic v0.11.0 "upgrade landed, migration never fired" state. Both checks status:"fail" so doctor exits non-zero when either fires. Test `test/doctor-minions-check.test.ts` pins the five branches (partial present → FAIL, partial+complete → quiet, no-jsonl → quiet, multiple versions named correctly, human-readable banner contains the exact "MINIONS HALF-INSTALLED" phrase Wintermute's cron can grep for). 
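The filesystem-only doctor check described above reduces to a small set difference over `completed.jsonl`. A minimal sketch, assuming each JSONL row carries `version` and `status` string fields (the `version` field name is an assumption) and that malformed rows are skipped rather than treated as failures:

```typescript
// Returns the versions that have a "partial" record but no "complete" record.
function findHalfMigratedVersions(jsonl: string): string[] {
  const partial = new Set<string>();
  const complete = new Set<string>();

  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed rows (an assumption about the desired behavior)
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const record = parsed as { version?: unknown; status?: unknown };
    if (typeof record.version !== "string") continue;
    if (record.status === "partial") partial.add(record.version);
    if (record.status === "complete") complete.add(record.version);
  }

  return Array.from(partial).filter((version) => !complete.has(version));
}

// Banner text follows the phrase quoted in the commit message.
function halfInstalledBanner(versions: string[]): string | null {
  if (versions.length === 0) return null;
  return `MINIONS HALF-INSTALLED (partial migration: ${versions.join(", ")}). Run: gbrain apply-migrations --yes`;
}
```

Because the check only reads a local file, it can run on every `doctor` invocation, including `--fast`, without a database connection.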
gbrain skillpack-check — new command + skill: - `src/commands/skillpack-check.ts` wraps `doctor --fast --json` + `apply-migrations --list` into one JSON report with `{healthy, summary, actions[], doctor, migrations}`. Exit 0 on healthy, 1 on action-needed, 2 on determine-failure. `--quiet` flag for cron pipes that want exit-code-only behavior. - `actions[]` is the remediation list. Doctor messages of the form `... Run: <cmd>` get their command extracted (regex fixed to match the full remainder of the line, not just the first word). Pending or partial migrations push `gbrain apply-migrations --yes` to the front of actions[]. - `gbrainSpawn()` helper resolves the gbrain invocation correctly on compiled binary installs (`argv[1] = /usr/local/bin/gbrain`) AND source installs (`argv[1] = src/cli.ts`, prefix with `bun run`). Same Codex #1 fix pattern as autopilot's resolveGbrainCliPath. - `skills/skillpack-check/SKILL.md` teaches agents when to run it, what to do with the output, and anti-patterns (don't run without --quiet in a cron that emails; don't ignore exit 2). - Registered in skills/RESOLVER.md and skills/manifest.json. Test `test/skillpack-check.test.ts` (5 tests) covers healthy fresh install, half-migrated exit-1 with apply-migrations in actions[], --quiet suppresses stdout in both states, --help prints usage, summary includes top action when multiple are present. 1192 unit tests pass (+15 new). The 38 failing tests are all DATABASE_URL E2Es — same pre-existing pattern, unchanged by this commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * doc(v0.11.1): reframe README + minions-fix — v0.11.0 was never released v0.11.0 was cut but never released publicly. v0.11.1 is the first public Minions ship, and fixes the upgrade-migration mega-bug so it self-heals on every future `gbrain upgrade` + `bun update gbrain`. The README was wrongly framing the fix as a retrospective for v0.11.0 users — none exist, so remove it. 
README changes: - Delete the "v0.11.0 migration didn't fire on your upgrade?" section. Replace with "Health check and self-heal": the `gbrain doctor`, `gbrain skillpack-check --quiet`, and `gbrain skillpack-check \| jq` recipes that ship in v0.11.1. Still links to docs/guides/minions-fix.md for deeper troubleshooting. - Promote the production benchmark to top billing. The previous section led with the lab benchmark (same LLM, localhost) and buried the production data point as a single follow-up sentence. Real deployment numbers are the stronger signal: * 753ms vs >10s gateway timeout (sub-agent couldn't even spawn) * $0.00 vs ~$0.03 per run * 100% vs 0% success rate under 19-cron production load * 36-month tweet backfill: 19,240 tweets, ~15 min, $0.00 Lab numbers stay (separate table, labeled "controlled environment") so readers can see both layers. - Add the "The routing rule" closer: Deterministic → Minions, Judgment → Sub-agents. This is the clearest framing in the production benchmark doc and belongs in the README so readers leave with the right mental model. `minion_mode: pain_triggered` automates it. docs/guides/minions-fix.md rewrite: - Reframe as: v0.11.0 never released, v0.11.1 is the first ship, `gbrain apply-migrations --yes` is canonical. Stopgap stays documented for pre-v0.11.1 branch builds (e.g. Wintermute's minions-jobs checkout before v0.11.1 tags). - Add the detection + verification commands (doctor + skillpack-check) at the top. - Cross-reference skills/skillpack-check/SKILL.md as the agent-facing health-check pattern. Zero lingering "v0.11.0 released" references in README or minions-fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(doctor): remove "schema v7+ no prefs → FAIL" check (too aggressive) CI failure in Tier 1 Mechanical E2E: (fail) E2E: Doctor Command > gbrain doctor exits 0 on healthy DB Root cause: the doctor half-migration detection added two checks. 
The second check (`schema v7+ AND ~/.gbrain/preferences.json missing → minions_config FAIL`) was too aggressive. It treated a valid fresh- install state as broken. `gbrain init` against Postgres applies schema v7 but doesn't write preferences.json — that's the migration orchestrator's Phase D, which only runs via `apply-migrations`. Between `init` finishing and the user running `apply-migrations`, the install is legitimately in a "schema-applied, no prefs" state. Doctor was exiting 1 on this valid state, breaking the pre-existing CI test that inits + doctors a healthy DB. Fix: drop the check. The filesystem check (step 3 — partial-completed without a matching complete) is sufficient signal for genuine half- migration. Added a regression test pinning the exact CI scenario: no completed.jsonl present, no preferences.json, doctor must not fail any minions_* check. Also removes the now-unused `preferencesPaths` import. Verified against live Postgres: CI-equivalent `gbrain doctor` + `gbrain doctor --json` both pass. Full suite: 1281/1281 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * doc(readme): Minions section — lead with the story, compress the rest The previous section opened with "six daily pains" as a numbered list before the hook, buried the production numbers halfway down, and had a table explaining how each pain gets fixed. Fine for a spec doc; wrong for a README that needs to land the impact fast. Rewrite: - Lead with "your sub-agents won't drop work anymore" — the reason a reader is here. - Production numbers promoted, framed as a story: "Here's my personal OpenClaw deployment: one Render container, Supabase Postgres holding a 45,000-page brain, 19 cron jobs firing on schedule, the X Enterprise API on the wire..." Gives the reader the setup before the punchline. - The routing rule (deterministic → Minions, judgment → sub-agents) survives unchanged. It's the clearest framing in the whole section. 
- Lose the "how each pain gets fixed" table. Compress the six pains + their fixes into one paragraph that names the primitives by name (max_children, timeout_ms, child_done inbox, cascade cancel, idempotency keys, attachment validation). Readers who want depth click through to skills/minion-orchestrator/SKILL.md. - Close with "not incrementally better — categorically different" and the three headline numbers. - Drop the separate Lab Numbers table; the production numbers are stronger and the lab data is one click away via the link. Lines: 75 → 42. Same signal, less scroll. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * doc: scrub X Enterprise API + @garrytan references from user-facing docs User feedback: shouldn't name the specific enterprise-tier API product or the account in the README or benchmark docs. Genericize: - "X Enterprise API on the wire" → drop entirely; the 19-cron load story carries the setup without naming the vendor - "X Enterprise API ($50K/mo firehose)" → "external API" - "@garrytan tweets" → "my social posts" - "Pull ~100 @garrytan tweets" → "Pull ~100 of my social posts" - "X Enterprise API (full-archive)" env var comment → "external API bearer token" Scope: - README.md — the Minions production story line + scaling callout - docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md - docs/benchmarks/2026-04-18-tweet-ingestion.md Plain "X API" references in the tweet-ingestion methodology stay — those describe which public HTTP endpoint was called, not the enterprise-tier product. Benchmark doc filenames (tweet-ingestion.md) stay to preserve inbound links; content is genericized. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * doc(readme): Skillify section — match Minions energy, land the category shift The previous section was competent but undersold what skillify actually is. Rewrite matches the Minions section's shape: lead with the hook, tell the story, land the punchline. 
Key changes: - Title: "your skills tree stops being a black box." Names the thing skillify actually solves. - Open with the problem: Hermes auto-creates skills as a background behavior. Six months later you have an opaque pile nobody's read or tested. Make the liability concrete. - Promote the 10 items by name (SKILL.md + script + unit tests + integration tests + LLM evals + resolver trigger + trigger eval + E2E + brain filing + check-resolvable audit). Showing the list makes the scope of the unlock visible. - New subsection "Why this is the right answer for OpenClaw" names the debugging-the-black-box pain directly. Skillify makes the tree legible: when something breaks, you know which layer (contract, test, eval, trigger, or route) to inspect. When anything goes stale, check-resolvable flags it. - Close with "compounding quality instead of compounding entropy" + "not a nice-to-have. It's the piece that makes the skills tree survive six months." - Expand the code block to include `gbrain check-resolvable` (the other half of the pair) so readers see the whole workflow. Length goes from 17 to 34 lines — still shorter than Minions, still one section. Worth the space because this is a category shift for how agent skills get built, not a feature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: root <root@localhost>	2026-04-18 16:57:38 +08:00
Garry Tan	91ced664b6	feat: Voice v0.8.0 + feature discovery + Edge Function removal (#55 ) * chore: remove Supabase Edge Function MCP deployment The Edge Function never worked reliably. All MCP traffic goes through self-hosted server + ngrok tunnel. Removes deploy-remote.sh, edge-entry.ts, supabase/functions/, .env.production.example, and CHATGPT.md (OAuth not implemented). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite MCP docs for self-hosted + ngrok deployment All per-client guides updated from Edge Function URLs to self-hosted server + ngrok tunnel pattern. DEPLOY.md rewritten with local vs remote paths. ALTERNATIVES.md now shows self-hosted as primary, with ngrok, Tailscale, and Fly.io/Railway comparison. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: voice recipe v0.8.0 — 25 production patterns from real deployment Identity separation, pre-computed bid system, conversation timing fix, proactive advisor mode, radical prompt compression, OpenAI Realtime Prompting Guide structure, auth-before-speech, brain escalation, stuck watchdog, never-hang-up rule, thinking sounds, fallback TwiML, tool set architecture, trusted user auth, caller routing, dynamic VAD, on-screen debug UI, live moment capture, belt-and-suspenders post-call, mandatory 3-step post-call, WebRTC parity, dual API events, report-aware query routing. WebRTC pseudocode updated with native FormData and 6 gotchas. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: post-upgrade feature discovery framework upgrade.ts captures old version before upgrading, then execs gbrain post-upgrade (new binary) to read migration files and print feature pitches. Migration files get YAML frontmatter with feature_pitch field (headline, description, recipe, tiers). CLI prints excited builder tone post-upgrade. v0.8.0 migration offers voice setup with environment detection (server vs local) and 3-tier progressive disclosure. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Voice section to README with WebRTC screenshot + tweet link Her out of the box: voice-to-brain with 25 production patterns. WebRTC client screenshot embedded. Remote MCP section rewritten for self-hosted + ngrok. Setup block genericized. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add recipe validation tests + genericize personal refs 5 new integration tests: secrets completeness, semver version, requires resolution, all-recipes-parse, no-personal-references. Test fixture genericized. CLAUDE.md/TODOS.md/SKILLPACK updated for v0.8.0. build:edge script removed from package.json. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.8.0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 10:52:30 -10:00
Garry Tan	3e21e9b69b	feat: GBrain v0.6.0 — Remote MCP Server + 12 Bug Fixes (#28 ) * fix: 7 bug fixes from Issue #9 and #22 - fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25) - fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11) - fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8) - fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12) - fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2) - fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1) - fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4) - schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support - schema: add access_tokens and mcp_request_log tables via migration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: embed schema.sql at build time, remove fs dependency from initSchema initSchema() previously read schema.sql from disk at runtime via readFileSync, which broke in compiled Bun binaries and Deno Edge Functions. Now uses a generated schema-embedded.ts constant (run `bun run build:schema` to regenerate). - Removes fs and path imports from postgres-engine.ts and db.ts - Adds scripts/build-schema.sh for one-source-of-truth generation - Adds build:schema npm script Fixes Issue #22 Bug #6. 
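Embedding the schema at build time amounts to turning the SQL text into a string constant in a generated module. The real generator is `scripts/build-schema.sh`; the TypeScript below is only a sketch of the same idea, and the `SCHEMA_SQL` constant name is a hypothetical stand-in for whatever `schema-embedded.ts` actually exports.

```typescript
// Render the source text of a generated module that embeds the schema.
// JSON.stringify produces a valid double-quoted JS string literal, escaping
// quotes, backslashes, and newlines so arbitrary SQL survives the round trip.
function renderSchemaModule(sql: string): string {
  return [
    "// AUTO-GENERATED. Do not edit; regenerate with `bun run build:schema`.",
    `export const SCHEMA_SQL = ${JSON.stringify(sql)};`,
    "",
  ].join("\n");
}

// Usage: the generated text would be written to the embedded-schema file by a build step.
const exampleModule = renderSchemaModule(
  'CREATE TABLE pages (id serial PRIMARY KEY, slug text NOT NULL);\n-- "quoted" comment\n',
);
```

With the SQL compiled into the bundle, `initSchema()` no longer needs `fs` or `path`, which is what makes it work inside a compiled binary.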
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: 5 more bug fixes from Issue #22 - fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9) - fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3) - fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10) - fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5) - deps: add @aws-sdk/client-s3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: remote MCP server via Supabase Edge Functions Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase instance. One brain, accessible from Claude Desktop, Claude Code, Cowork, Perplexity Computer, and any MCP client. Zero new infrastructure. New files: - supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK - supabase/functions/gbrain-mcp/deno.json — Deno import map - src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules) - src/commands/auth.ts — standalone token management (create/list/revoke/test) - scripts/deploy-remote.sh — one-script deployment - .env.production.example — 3-value config template Changes: - config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope) - schema.sql: add access_tokens + mcp_request_log tables - package.json: add build:edge script Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable) Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP) Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks) Excluded from remote: sync_brain, file_upload (may exceed 60s timeout) Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: per-client MCP setup guides - docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table - docs/mcp/CLAUDE_CODE.md — claude mcp add command - docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!) - docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths - docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup - docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO) - docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.6.0) GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Remote MCP Server section to README Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: make document-release mandatory in CLAUDE.md, add MCP key files Post-ship requirements section: document-release is NOT optional. Lists every file that must be checked on every ship. A ship without updated docs is incomplete. Also adds remote MCP server files to Key files section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: batch upsertChunks into single statement to prevent deadlocks The per-chunk UPSERT loop caused deadlocks under parallel workers because each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple workers upserting different pages could deadlock on the shared unique index. Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement. One round-trip, one lock acquisition. COALESCE preserves existing embeddings when the new value is NULL. 
Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: advisory lock in initSchema() prevents deadlock on concurrent DDL When multiple processes call initSchema() concurrently (e.g., test setup + CLI subprocess, or parallel workers during E2E tests), the schema SQL's DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on different tables, causing deadlocks. Fix: pg_advisory_lock(42) serializes all initSchema() calls within the same database. The lock is session-scoped and released in a finally block. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add explicit test timeouts for CLI subprocess E2E tests CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import) spawn `bun run src/cli.ts` which takes several seconds to JIT compile + connect. The Bun test framework default 5000ms per-test timeout is too tight for CI. Added 30-60s timeouts matching each subprocess's own timeout to prevent false failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: infinite recursion in config.ts exported getConfigDir/getConfigPath The replace_all refactor created recursive functions: the exported getConfigDir() called the private getConfigDir() which called itself. Renamed exports to configDir()/configPath() to avoid shadowing. Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls work against a real Postgres database. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 15:23:00 -10:00
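The batched upsertChunks fix described in the v0.6.0 commit above (one multi-row INSERT ON CONFLICT, with COALESCE keeping an existing embedding when the new value is NULL) can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: only the content_chunks table and its (page_id, chunk_index) unique index come from the commit message. The chunk_text and embedding column names, the Chunk shape, and the buildBatchUpsert helper are hypothetical.

```typescript
// Hypothetical chunk shape; only page_id and chunk_index are named in the commit message.
interface Chunk {
  chunkIndex: number;
  chunkText: string;
  // Assumed to be a pgvector-style literal such as "[0.1,0.2]", or null if not embedded yet.
  embedding: string | null;
}

interface SqlStatement {
  text: string;
  values: unknown[];
}

// Builds one parameterized multi-row upsert for all chunks of a page,
// instead of issuing one INSERT ... ON CONFLICT per chunk.
function buildBatchUpsert(pageId: number, chunks: Chunk[]): SqlStatement {
  if (chunks.length === 0) {
    throw new Error("buildBatchUpsert needs at least one chunk");
  }
  const values: unknown[] = [];
  const rows = chunks.map((chunk) => {
    const base = values.length; // placeholders are numbered from the current offset
    values.push(pageId, chunk.chunkIndex, chunk.chunkText, chunk.embedding);
    return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4})`;
  });
  const text =
    "INSERT INTO content_chunks (page_id, chunk_index, chunk_text, embedding) VALUES " +
    rows.join(", ") +
    " ON CONFLICT (page_id, chunk_index) DO UPDATE SET" +
    " chunk_text = EXCLUDED.chunk_text," +
    // Keep the stored embedding when the incoming row carries NULL.
    " embedding = COALESCE(EXCLUDED.embedding, content_chunks.embedding)";
  return { text, values };
}
```

The returned text and values would be passed to whatever parameterized query call the database driver provides. Sending every row of a page in a single statement is what reduces the work to one round-trip, as the commit message describes.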

10 Commits