fix(wave): v0.15.1 - 4 hot issues + scope expansion (#248)

* fix(wave): 4 hot issues + 3 scope expansions (v0.13.1)

Addresses four user-filed regressions after v0.13.0 plus three adjacent
footgun closures.

* #170 — CREATE INDEX [CONCURRENTLY] IF NOT EXISTS idx_pages_updated_at_desc
  on pages (updated_at DESC). Engine-aware migration v12 with invalid-index
  cleanup on Postgres, plain CREATE on PGLite. ~700x on 30k+ row brains.
  Contributed by @fuleinist (#215).

* #219 — Minions schema default max_stalled 1 -> 5. v13 migration ALTERs
  the default and UPDATEs existing non-terminal rows (waiting/active/
  delayed/waiting-children/paused) so live queues get rescued on upgrade.
  Adds MinionJobInput.max_stalled with [1,100] clamp. New --max-stalled
  CLI flag on `jobs submit`. Reported by @macbotmini-eng.

* #218 — package.json postinstall surfaces errors instead of silencing.
  trustedDependencies whitelists @electric-sql/pglite. doctor
  schema_version check fails loudly when migrations never ran and links
  to #218. README + INSTALL_FOR_AGENTS warn against `bun install -g`.
  Reported by @gopalpatel.

* #223 — @electric-sql/pglite pinned to exactly 0.4.3 (was ^0.4.4).
  PGLiteEngine.connect() wraps PGlite.create() errors with a message
  pointing at the issue + gbrain doctor. Does NOT suggest 'missing
  migrations' as a cause (create-time abort happens before migrations
  run). Pin is unverified against macOS 26.3; error-wrap is the safety
  net. Reported by @AndreLYL.

* Scope: `gbrain jobs submit` gains --backoff-type/--backoff-delay/
  --backoff-jitter/--timeout-ms/--idempotency-key (MinionJobInput audit).
* Scope: `gbrain jobs smoke --sigkill-rescue` regression case (opt-in,
  CI-only) that simulates a killed worker and asserts the new default
  rescues.
* Scope: `gbrain doctor --index-audit` reports zero-scan Postgres indexes
  as drop candidates (informational; no auto-drop).

Infrastructure:
* Migration interface extended with sqlFor: { postgres?, pglite? } and
  transaction: boolean. Runner picks the engine-specific branch and
  bypasses engine.transaction() when transaction:false (required for
  CONCURRENTLY). BrainEngine.kind readonly discriminator added.
* scripts/check-jsonb-pattern.sh CI guard extended to block
  `max_stalled DEFAULT 1` from regressing.

Tests:
* 15 new unit tests: v12/v13 structural + behavioral assertions,
  max_stalled default/clamp/backfill, PGLite error-wrap source guard,
  engine kind discriminator.
* 3 regression tests pinned by IRON RULE.
* Full unit suite: 1416 pass.
* Full E2E suite against Postgres 16 + pgvector: 126 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync documentation for v0.13.1

CLAUDE.md "Key files" and "Commands" sections refreshed to match the
v0.13.1 fix wave:

- Note `BrainEngine.kind` discriminator on engine.ts
- Document v0.13.1 connect() error-wrap on pglite-engine.ts
- Refresh src/core/minions/ layout (no shell handler, no protected-names,
  no quiet-hours/stagger — that was v0.13-development scaffolding that
  did not ship)
- Add src/core/migrate.ts entry with `Migration` interface extensions
  (`sqlFor`, `transaction: false`)
- Document new `gbrain jobs submit` flags (--max-stalled, --backoff-type,
  --backoff-delay, --backoff-jitter, --timeout-ms, --idempotency-key)
- Document `gbrain jobs smoke --sigkill-rescue` regression guard
- Document `gbrain doctor --index-audit` and the schema_version=0
  surface that catches #218 postinstall failures
- Extend check-jsonb-pattern.sh note with the max_stalled DEFAULT 1
  regression guard
- Touch up test file blurbs for migrate.test.ts, pglite-engine.test.ts,
  minions.test.ts with v0.13.1 coverage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): run files sequentially to eliminate shared-DB race

The E2E suite was flaky. ~3 of every 5 runs had 4-10 failures clustered
in Links, Timeline, Versions, Minions resilience, Parallel Import, and
Page CRUD tests. Symptoms included "expected 16 pages, got 8" (half),
"expected 1 link inserted, got 0", timeline entries missing after
round-trip, and similar data-shape mismatches.

Root cause: bun test runs test FILES in parallel (each in a worker
process). 13 E2E files share one DATABASE_URL, and `setupDB()` in
`test/e2e/helpers.ts` does `TRUNCATE ... CASCADE` on all tables before
each file's `importFixtures()`. File A's TRUNCATE would race with file
B's in-flight INSERT stream, producing the observed half-populated or
wrong-count states.

An earlier attempt used a Postgres advisory lock held on a dedicated
single-connection client for the lifetime of each file's run. It broke
because bun's default 5000 ms hook timeout fires on queued beforeAll()
calls: with 13 files serializing through the lock, files 2-13 would
time out waiting for file 1 to finish.

This commit switches to sequential file execution at the harness level
via scripts/run-e2e.sh, which loops through test/e2e/*.test.ts one at
a time, tracks aggregate pass/fail counts, and exits non-zero on the
first failing file. No lock, no timeout issues, no changes to any test
file. package.json test:e2e points at the new script.

Verified: 5 back-to-back runs against the same Postgres container,
each completing in ~5 min. Every run: 13 files, 138 tests, 0 fails.
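
The sequential strategy described above can be sketched in TypeScript. This is an illustration only: the real harness is the shell script scripts/run-e2e.sh, and `runSequentially`, `FileResult`, and the injected `runFile` callback are hypothetical names that do not appear in the repository.

```typescript
// Hypothetical sketch of "one file at a time, stop at the first failing file".
// The real implementation is a shell loop in scripts/run-e2e.sh.
interface FileResult {
  pass: number;
  fail: number;
}

interface RunSummary {
  exitCode: number; // 0 when every file passed, 1 otherwise
  pass: number;     // aggregate passing tests across files that ran
  fail: number;     // aggregate failing tests across files that ran
  ran: string[];    // files actually executed, in order
}

function runSequentially(
  files: string[],
  runFile: (file: string) => FileResult, // assumed to run one test file to completion
): RunSummary {
  const summary: RunSummary = { exitCode: 0, pass: 0, fail: 0, ran: [] };
  for (const file of files) {
    // Each file finishes before the next starts, so one file's TRUNCATE
    // can never overlap another file's INSERT stream.
    const result = runFile(file);
    summary.ran.push(file);
    summary.pass += result.pass;
    summary.fail += result.fail;
    if (result.fail > 0) {
      summary.exitCode = 1;
      break; // stop at the first failing file
    }
  }
  return summary;
}
```

Because the runner is injected, the loop can be exercised without a database; in the real script each iteration would shell out to `bun test` for a single file.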

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.15.1 (fix wave locked to MINOR line)

Master v0.14.2 was the last /investigate root-cause wave on the
v0.14.x line. This fix wave opens v0.15.x: four hot issues (#170,
#218, #219, #223) close v0.13.x regressions that v0.14.x didn't
cover, so the MINOR bump reflects the semantic shift — new schema
migrations (v14, v15), a new CLI surface (`--max-stalled`,
`--sigkill-rescue`, `--index-audit`), a new BrainEngine contract
(`kind` discriminator + extended `Migration` interface), and a new
install-time contract (PGLite 0.4.3 pin + `trustedDependencies`).

Locked to 0.15.1 in advance: other work may land before/after this
PR, but the version is fixed so reviewers can cite a stable number.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-21 13:19:23 -07:00
committed by GitHub
parent 7f156c8873
commit ff10796a00
25 changed files with 797 additions and 94 deletions


@@ -2,6 +2,83 @@
All notable changes to GBrain will be documented in this file.
## [0.15.1] - 2026-04-21
## **Fix wave: 4 hot issues that blocked real brains, landed together.**
## **PGLite survives macOS 26.3. Minions actually rescues SIGKILL'd jobs. Autopilot dashboards stop the 14.6s seqscan. `bun install -g` tells you when it's broken.**
v0.15.1 is the hotfix wave on top of the v0.14.x stack (shell job type in v0.14.0, doctor DRY + `--fix` in v0.14.1, 8 deferred bug fixes in v0.14.2) plus v0.15.0 (llms.txt + AGENTS.md): four user-filed issues against v0.13.x, fixed and verified together, plus three scope expansions that close adjacent footguns. Upgrade is automatic. If `gbrain upgrade` runs clean, your brain gets faster and more reliable on the next sync cycle.
### The numbers that matter
The four issues this release closes, with measured impact:
| Issue | Before v0.15.1 | After v0.15.1 | Δ |
|-------|----------------|----------------|---|
| #170 `SELECT * FROM pages ORDER BY updated_at DESC` on 31k rows (Postgres) | ~14.6s seqscan | <20ms index scan | ~700x |
| #219 `max_stalled` default on `minion_jobs` | 3 (two rescues before dead, v0.14.2 set this) | 5 (four rescues before dead) | extra headroom for flaky deploys |
| #219 existing waiting/active jobs with `max_stalled<5` | would still dead-letter earlier than expected | backfilled to 5 on upgrade | closes the pain today |
| #218 `bun install -g github:garrytan/gbrain` postinstall failure | silent `|| true` | visible stderr warning with recovery URL | users know it's broken |
| #223 PGLite WASM crash on macOS 26.3 | raw `Aborted()`, no hint | pinned `@electric-sql/pglite` to `0.4.3` + actionable error message naming the issue | users can route to #223 |
### What this means for you
If you run autopilot against a Supabase brain with 30k+ pages, your health/dashboard cycle was silently burning 14.6 seconds on every iteration. The new index drops that to under 20 ms, and it is built without locking writes (Postgres gets `CREATE INDEX CONCURRENTLY` with an invalid-index cleanup DO block; PGLite gets plain `CREATE INDEX` since it has no concurrent writers). Your agent stops blocking on list-pages-by-date queries.
If you use Minions, the "SIGKILL mid-flight, 10/10 rescued" claim is now actually true out-of-the-box with generous headroom. Default `max_stalled=5` means a job whose worker was `kill -9`'d gets picked up by the next worker instead of being dead-lettered early. The v15 migration backfills existing non-terminal rows (`waiting/active/delayed/waiting-children/paused`) so upgrading doesn't leave a queue full of doomed jobs.
If you install via `bun install -g github:...` (not recommended but people try it), you'll now see a loud stderr warning with a link to #218 instead of a broken CLI that fails on next invocation. The real fix is `git clone + bun link`, documented in README and INSTALL_FOR_AGENTS.md.
If you're on macOS 26.3 and PGLite was crashing with `Aborted()`, the pin to 0.4.3 gives us the best shot at avoiding the WASM regression (noting: 0.4.3 is unverified against 26.3 in CI — the error-wrap at `pglite-engine.ts connect()` is the safety net if the pin doesn't hold). Any PGLite init failure now shows the #223 link instead of a raw runtime error.
## To take advantage of v0.15.1
`gbrain upgrade` should do this automatically. If it didn't, or if `gbrain doctor` warns about a partial migration:
1. **Run the orchestrator manually:**
```bash
gbrain apply-migrations --yes
```
2. **Verify the outcome:**
```bash
psql "$DATABASE_URL" -c "\d minion_jobs" | grep max_stalled # DEFAULT should be 5
psql "$DATABASE_URL" -c "\d pages" | grep idx_pages_updated_at_desc # index should exist
gbrain doctor
```
3. **If any step fails or the numbers look wrong,** file an issue with `gbrain doctor` output and the contents of `~/.gbrain/upgrade-errors.jsonl` if it exists. https://github.com/garrytan/gbrain/issues
### Itemized changes
#### Added
- Schema migration **v14** — `CREATE INDEX [CONCURRENTLY] IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)` (engine-aware; Postgres uses CONCURRENTLY with an invalid-index DO-block cleanup, PGLite uses plain CREATE). Closes #170. Contributed by @fuleinist (#215).
- Schema migration **v15** — `ALTER TABLE minion_jobs ALTER COLUMN max_stalled SET DEFAULT 5` (bumps v0.14.2's default of 3 to 5 for extra flaky-deploy headroom) + `UPDATE` backfill scoped to non-terminal statuses (`waiting/active/delayed/waiting-children/paused`) so existing queued work benefits on upgrade. Closes #219. Reported by @macbotmini-eng.
- `MinionJobInput.max_stalled` — new optional field, plumbed through `queue.add()` with `[1, 100]` clamp.
- `gbrain jobs submit --max-stalled N` — CLI flag to set per-job stall tolerance.
- `gbrain jobs submit --backoff-type`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key` — scope-expansion audit exposing existing `MinionJobInput` fields as first-class CLI flags.
- `gbrain jobs smoke --sigkill-rescue` — opt-in regression smoke case that simulates a killed worker and asserts the v0.15.1 default actually rescues.
- `gbrain doctor --index-audit` — new opt-in Postgres check that reports zero-scan indexes from `pg_stat_user_indexes`. Informational only (no auto-drop). PGLite no-ops.
- `BrainEngine.kind` readonly discriminator (`'postgres' | 'pglite'`) — lets migrations and consumers branch on engine without `instanceof` + dynamic imports.
- `package.json trustedDependencies: ["@electric-sql/pglite"]` — lets Bun run PGLite's dep postinstall on global installs.
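
The `[1, 100]` clamp on `MinionJobInput.max_stalled` can be pictured with a small sketch. `clampMaxStalled` and the two constants are hypothetical names used for illustration; the real logic lives in `queue.add()` and may differ in detail, for example in how it treats non-integer or non-finite input (treated as omitted here, which is an assumption).

```typescript
// Hypothetical helper; not the actual queue.add() implementation.
const MAX_STALLED_MIN = 1;
const MAX_STALLED_MAX = 100;

function clampMaxStalled(value: number | undefined): number | undefined {
  // Omitted values fall through so the schema DEFAULT (5) applies.
  if (value === undefined) return undefined;
  // Assumption: non-finite input (NaN, Infinity) is treated as omitted.
  if (!Number.isFinite(value)) return undefined;
  // Truncate to an integer, then clamp into [1, 100].
  const n = Math.trunc(value);
  return Math.min(MAX_STALLED_MAX, Math.max(MAX_STALLED_MIN, n));
}
```

Under this sketch `--max-stalled 0` would be stored as 1 and `--max-stalled 500` as 100, while leaving the flag off defers to the column default.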
#### Changed
- `@electric-sql/pglite` pinned to exactly `0.4.3` (was `^0.4.4`) — best-available mitigation for the macOS 26.3 WASM abort. Reported by @AndreLYL (#223). Flagged as unverified; reproduce on a 26.3 machine and file a follow-up if it still aborts.
- `package.json postinstall` — now warns loudly on stderr with a recovery URL instead of silencing errors with `2>/dev/null || true`. `bun install -g` hitting a migration failure now tells you what to do. Reported by @gopalpatel (#218).
- `src/core/pglite-engine.ts connect()` — wraps `PGlite.create()` with a friendly error pointing at #223 and `gbrain doctor`. Nests the original error for debuggability.
- `doctor` `schema_version` check — now fails loudly when `version=0` (migrations never ran), linking #218.
- `README.md` + `INSTALL_FOR_AGENTS.md` — explicit warning against `bun install -g github:garrytan/gbrain`.
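
The `connect()` error-wrap pattern (friendly message, original error nested) might look roughly like the sketch below. `PgliteInitError`, `wrapPgliteInitError`, and the message wording are hypothetical; only the pointers to #223 and `gbrain doctor` come from the entries above.

```typescript
// Hypothetical error type; the real engine may use a plain Error instead.
class PgliteInitError extends Error {
  readonly original: unknown; // nested original error, kept for debuggability
  constructor(message: string, original: unknown) {
    super(message);
    this.name = "PgliteInitError";
    this.original = original;
  }
}

function wrapPgliteInitError(err: unknown): PgliteInitError {
  const detail = err instanceof Error ? err.message : String(err);
  // Deliberately does not blame missing migrations: a create-time abort
  // happens before migrations run.
  return new PgliteInitError(
    `PGLite failed to initialize: ${detail}. ` +
      "This may be the macOS 26.3 WASM abort tracked in #223; " +
      "run `gbrain doctor` for diagnostics.",
    err,
  );
}

// Intended call site (not executed here):
//   try { db = await PGlite.create(/* options */); }
//   catch (err) { throw wrapPgliteInitError(err); }
```

Keeping the wrapper a pure function makes the message testable without loading the WASM runtime.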
#### Fixed
- **The "SIGKILL mid-flight, 10/10 rescued" claim is now accurate** out-of-the-box with headroom (#219). Schema default 3 → 5.
- **Autopilot dashboards stop blocking on list-pages queries** on 30k+ row Postgres brains (#170).
- **PGLite error on macOS 26.3** is now actionable instead of a raw `Aborted()` (#223).
- **`bun install -g` no longer produces a silently broken CLI** (#218) — postinstall surfaces failures.
#### Internal
- `Migration` interface extended with `sqlFor: { postgres?, pglite? }` + `transaction: boolean` fields. Runner picks the engine-specific SQL branch and (on Postgres only) bypasses `engine.transaction()` when `transaction: false` (required for CONCURRENTLY).
- `scripts/check-jsonb-pattern.sh` extended with a CI guard against `max_stalled DEFAULT 1` regressing.
- ~15 new unit tests covering max_stalled default/clamp/backfill/v14/v15 semantics. 3 regression tests pinned by IRON RULE.
- `test/e2e/` now runs test files sequentially via `scripts/run-e2e.sh` to eliminate shared-DB races that caused ~3/5 runs to have 4-10 flaky fails. Every run post-fix: 13 files, 138 tests, 0 fails.
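
A minimal sketch of how a runner could consume the extended `Migration` interface, assuming the field semantics described above. The `version`, `name`, and `sql` fields and the `resolveSql` / `runsInTransaction` helpers are assumptions for illustration and not the actual code in `src/core/migrate.ts`; the changelog also does not say whether v14 itself uses `sqlFor`, so the example migration is illustrative only.

```typescript
type EngineKind = "postgres" | "pglite"; // mirrors BrainEngine.kind

interface Migration {
  version: number; // assumed field
  name: string;    // assumed field
  sql: string;     // engine-agnostic default
  sqlFor?: { postgres?: string; pglite?: string };
  transaction?: boolean;
}

// Pick the engine-specific SQL when present, else fall back to `sql`.
function resolveSql(m: Migration, kind: EngineKind): string {
  return m.sqlFor?.[kind] ?? m.sql;
}

// Only Postgres bypasses the transaction wrapper when transaction:false.
function runsInTransaction(m: Migration, kind: EngineKind): boolean {
  return !(kind === "postgres" && m.transaction === false);
}

// Illustrative migration in the shape of the updated_at index change.
const exampleIndexMigration: Migration = {
  version: 14,
  name: "pages_updated_at_index",
  sql: "CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
  sqlFor: {
    postgres:
      "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
  },
  transaction: false, // Postgres refuses CONCURRENTLY inside a transaction
};
```

With this shape, PGLite falls back to the plain `sql` and keeps its transaction, while Postgres gets the `CONCURRENTLY` variant outside one.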
## [0.15.0] - 2026-04-21
## **GBrain now talks to LLMs the way modern docs sites do.**
@@ -126,7 +203,7 @@ Your agent's feedback loops tighten. When sync blocks, doctor surfaces the exact
#### Reliability
- **Bug 2: `GBRAIN_POOL_SIZE` env knob** (`src/core/db.ts`, `src/commands/import.ts`). Honored by both the singleton pool and the parallel-import worker pool. Defaults to 10; lower for Supabase transaction pooler. `initPostgres` / `initPGLite` now wrap lifecycle in `try { ... } finally { await engine.disconnect() }`.
- **Bug 3: Migration ledger centralization + wedge cap** (`src/commands/apply-migrations.ts`, `src/core/preferences.ts`). Runner owns all ledger writes. 3 consecutive partials = wedged, skipped with a loud message. New `--force-retry <version>` flag writes a `'retry'` marker without faking success. `complete` status never regresses. `appendCompletedMigration` is idempotent on double-complete.
- **Bug 8: `max_stalled` default 1 → 3** (`src/core/schema-embedded.ts`, `src/core/pglite-schema.ts`, `src/schema.sql`). First lock-lost tick no longer dead-letters. `v0_14_0` Phase A ALTERs existing installs. `autopilot-cycle` handler yields to the event loop between phases so the worker's lock-renewal timer fires.
- **Bug 8: `max_stalled` default 1 → 3** (`src/core/schema-embedded.ts`, `src/core/pglite-schema.ts`, `src/schema.sql`). First lock-lost tick no longer dead-letters. `v0_14_0` Phase A ALTERs existing installs. `autopilot-cycle` handler yields to the event loop between phases so the worker's lock-renewal timer fires. (v0.15.1 further bumps this to 5 and adds a non-terminal row backfill — see #219.)
- **Bug 9: Sync gate + acknowledge mechanism** (`src/commands/sync.ts`, `src/commands/import.ts`, `src/core/sync.ts`). All 3 sync paths (incremental, full via `runImport`, `gbrain import` git continuity) gate `sync.last_commit` on no-failures. Failures append to `~/.gbrain/sync-failures.jsonl` with dedup key. New `gbrain sync --skip-failed` + `--retry-failed` flags. Doctor surfaces unacknowledged failures.
#### Observability


@@ -23,9 +23,9 @@ strict behavior when unset.
## Key files
- `src/core/operations.ts` — Contract-first operation definitions (the foundation). Also exports upload validators: `validateUploadPath`, `validatePageSlug`, `validateFilename`. `OperationContext.remote` flags untrusted callers.
- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`).
- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). As of v0.15.1, `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator so migrations (`src/core/migrate.ts`) and other consumers can branch on engine without `instanceof` + dynamic imports.
- `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`)
- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders.
- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders. As of v0.15.1, `connect()` wraps `PGlite.create()` in a try/catch that emits an actionable error naming the macOS 26.3 WASM bug (#223) and pointing at `gbrain doctor`; the lock is released on failure so the next process can retry cleanly.
- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers)
- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted). `addLinksBatch` / `addTimelineEntriesBatch` use `INSERT ... SELECT FROM unnest($1::text[], ...) JOIN pages ON CONFLICT DO NOTHING RETURNING 1` — 4-5 array params regardless of batch size, sidesteps the 65535-parameter cap. As of v0.12.3, `searchKeyword` / `searchVector` scope `statement_timeout` via `sql.begin` + `SET LOCAL` so the GUC dies with the transaction instead of leaking across the pooled postgres.js connection (contributed by @garagon). `getEmbeddingsByChunkIds` uses `tryParseEmbedding` so one corrupt row skips+warns instead of killing the query.
- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts. Exports `parseEmbedding(value)` (throws on unknown input, used by migration + ingest paths where data integrity matters) and as of v0.12.3 `tryParseEmbedding(value)` (returns `null` + warns once per process, used by search/rescore paths where availability matters more than strictness).
@@ -52,25 +52,27 @@ strict behavior when unset.
- `src/commands/extract.ts` — `gbrain extract links|timeline|all [--source fs|db]`: batch link/timeline extraction. fs walks markdown files, db walks pages from the engine (mutation-immune snapshot iteration; use this for live brains with no local checkout). As of v0.12.1 there is no in-memory dedup pre-load — candidates are buffered 100 at a time and flushed via `addLinksBatch` / `addTimelineEntriesBatch`; `ON CONFLICT DO NOTHING` enforces uniqueness at the DB layer, and the `created` counter returns real rows inserted (truthful on re-runs).
- `src/commands/graph-query.ts` — `gbrain graph-query <slug> [--type T] [--depth N] [--direction in|out|both]`: typed-edge relationship traversal (renders indented tree)
- `src/core/link-extraction.ts` — shared library for the v0.12.0 graph layer. extractEntityRefs (canonical, replaces backlinks.ts duplicate) matches both `[Name](people/slug)` markdown links and Obsidian `[[people/slug|Name]]` wikilinks as of v0.12.3. extractPageLinks, inferLinkType heuristics (attended/works_at/invested_in/founded/advises/source/mentions), parseTimelineEntries, isAutoLinkEnabled config helper. `DIR_PATTERN` covers `people`, `companies`, `deals`, `topics`, `concepts`, `projects`, `entities`, `tech`, `finance`, `personal`, `openclaw`. Used by extract.ts, operations.ts auto-link post-hook, and backlinks.ts.
- `src/core/minions/` — Minions job queue: BullMQ-inspired, Postgres-native (queue, worker, backoff, types)
- `src/core/minions/queue.ts` — MinionQueue class (submit, claim, complete, fail, stall detection, parent-child, depth/child-cap, per-job timeouts, cascade-kill, attachments, idempotency keys, child_done inbox, removeOnComplete/Fail). `add()` takes a 4th `trusted` arg (separate from `opts` to prevent spread leakage); protected names in `PROTECTED_JOB_NAMES` require `{allowProtectedSubmit: true}` and the check runs trim-normalized (whitespace-bypass safe).
- `src/core/minions/` — Minions job queue: BullMQ-inspired, Postgres-native (queue, worker, backoff, types, protected-names, quiet-hours, stagger, handlers/shell).
- `src/core/minions/queue.ts` — MinionQueue class (submit, claim, complete, fail, stall detection, parent-child, depth/child-cap, per-job timeouts, cascade-kill, attachments, idempotency keys, child_done inbox, removeOnComplete/Fail). `add()` takes a 4th `trusted` arg (separate from `opts` to prevent spread leakage); protected names in `PROTECTED_JOB_NAMES` require `{allowProtectedSubmit: true}` and the check runs trim-normalized (whitespace-bypass safe). v0.15.1 #219: `add()` plumbs `max_stalled` through with a `[1, 100]` clamp; omitted values let the schema DEFAULT (5) kick in.
- `src/core/minions/worker.ts` — MinionWorker class (handler registry, lock renewal, graceful shutdown, timeout safety net). v0.14.0 abort-path fix: aborted jobs now call `failJob` with reason (`timeout`/`cancel`/`lock-lost`/`shutdown`) instead of returning silently. `shutdownAbort` (instance field) fires on process SIGTERM/SIGINT and propagates to `ctx.shutdownSignal` — shell handler listens to it; non-shell handlers don't.
- `src/core/minions/types.ts` — `MinionJobInput` + `MinionJobStatus` + handler context types. `MinionJobInput.max_stalled` (new in v0.15.1) is optional; omitted values let the schema DEFAULT (5) kick in, provided values are clamped to `[1, 100]`.
- `src/core/minions/protected-names.ts` — side-effect-free constant module exporting `PROTECTED_JOB_NAMES` + `isProtectedJobName()`. Kept pure so queue core can import without loading handler modules.
- `src/core/minions/handlers/shell.ts` — `shell` job handler. Spawns `/bin/sh -c cmd` (absolute path, PATH-override-safe) or `argv[0] argv[1..]` (no shell). Env allowlist: `PATH, HOME, USER, LANG, TZ, NODE_ENV` + caller `env:` overrides. UTF-8-safe stdout/stderr tail via `string_decoder.StringDecoder`. Abort (either `ctx.signal` or `ctx.shutdownSignal`) fires SIGTERM → 5s grace → SIGKILL on child. Requires `GBRAIN_ALLOW_SHELL_JOBS=1` on worker (gated by `registerBuiltinHandlers`).
- `src/core/minions/handlers/shell-audit.ts` — per-submission JSONL audit trail at `~/.gbrain/audit/shell-jobs-YYYY-Www.jsonl` (ISO-week rotation; override via `GBRAIN_AUDIT_DIR`). Best-effort: `mkdirSync(recursive)` + `appendFileSync`; failures logged to stderr, submission not blocked. Logs cmd (first 80 chars) or argv (JSON array). Never logs env values.
- `src/core/minions/attachments.ts` — Attachment validation (path traversal, null byte, oversize, base64, duplicate detection)
- `src/commands/jobs.ts` — `gbrain jobs` CLI subcommands + `gbrain jobs work` daemon
- `src/commands/jobs.ts` — `gbrain jobs` CLI subcommands + `gbrain jobs work` daemon. v0.15.1 surfaces the full `MinionJobInput` retry/backoff/timeout/idempotency surface as first-class CLI flags on `jobs submit`: `--max-stalled`, `--backoff-type fixed|exponential`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key`. `jobs smoke --sigkill-rescue` is the opt-in regression guard for #219.
- `src/commands/features.ts` — `gbrain features --json --auto-fix`: usage scan + feature adoption salesman
- `src/commands/autopilot.ts` — `gbrain autopilot --install`: self-maintaining brain daemon (sync+extract+embed)
- `src/mcp/server.ts` — MCP stdio server (generated from operations)
- `src/commands/auth.ts` — Standalone token management (create/list/revoke/test)
- `src/commands/upgrade.ts` — Self-update CLI. `runPostUpgrade()` enumerates migrations from the TS registry (src/commands/migrations/index.ts) and tail-calls `runApplyMigrations(['--yes', '--non-interactive'])` so the mechanical side of every outstanding migration runs unconditionally.
- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). `v0_12_2.ts` = JSONB double-encode repair orchestrator (4 phases: schema → repair-jsonb → verify → record). `v0_14_0.ts` = shell-jobs + autopilot cooperative (2 phases: schema ALTER minion_jobs.max_stalled SET DEFAULT 3, pending-host-work ping for skills/migrations/v0.14.0.md). All orchestrators are idempotent and resumable from `partial` status. As of v0.14.2 (Bug 3), the RUNNER owns all ledger writes — orchestrators return `OrchestratorResult` and `apply-migrations.ts` persists a canonical `{version, status, phases}` shape after return. Orchestrators no longer call `appendCompletedMigration` directly. `statusForVersion` prefers `complete` over `partial` (never regresses). 3 consecutive partials → wedged → `--force-retry <version>` writes a `'retry'` reset marker.
- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). `v0_12_2.ts` = JSONB double-encode repair orchestrator (4 phases: schema → repair-jsonb → verify → record). `v0_14_0.ts` = shell-jobs + autopilot cooperative (2 phases: schema ALTER minion_jobs.max_stalled SET DEFAULT 3 — superseded by v0.15.1's schema-level DEFAULT 5 + UPDATE backfill; pending-host-work ping for skills/migrations/v0.14.0.md). All orchestrators are idempotent and resumable from `partial` status. As of v0.14.2 (Bug 3), the RUNNER owns all ledger writes — orchestrators return `OrchestratorResult` and `apply-migrations.ts` persists a canonical `{version, status, phases}` shape after return. Orchestrators no longer call `appendCompletedMigration` directly. `statusForVersion` prefers `complete` over `partial` (never regresses). 3 consecutive partials → wedged → `--force-retry <version>` writes a `'retry'` reset marker. v0.15.1 (fix wave) ships schema-only migrations v14 (`pages_updated_at_index`) + v15 (`minion_jobs_max_stalled_default_5` with UPDATE backfill) via the `MIGRATIONS` array in `src/core/migrate.ts` — no orchestrator phases needed.
- `src/commands/repair-jsonb.ts` — `gbrain repair-jsonb [--dry-run] [--json]`: rewrites `jsonb_typeof='string'` rows in place across 5 affected columns (pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata, page_versions.frontmatter). Fixes v0.12.0 double-encode bug on Postgres; PGLite no-ops. Idempotent.
- `src/commands/orphans.ts` — `gbrain orphans [--json] [--count] [--include-pseudo]`: surfaces pages with zero inbound wikilinks, grouped by domain. Auto-generated/raw/pseudo pages filtered by default. Also exposed as `find_orphans` MCP operation. Shipped in v0.12.3 (contributed by @knee5).
- `src/commands/doctor.ts` — `gbrain doctor [--json] [--fast] [--fix] [--dry-run]`: health checks. v0.12.3 adds two reliability detection checks: `jsonb_integrity` (scans pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata for `jsonb_typeof='string'` rows left over from v0.12.0) and `markdown_body_completeness` (flags pages whose compiled_truth is <30% of raw source when raw has multiple H2/H3 boundaries). Fix hints point at `gbrain repair-jsonb` and `gbrain sync --force`. v0.14.1: `--fix` delegates inlined cross-cutting rules to `> **Convention:** see [path](path).` callouts (pipes DRY violations into `src/core/dry-fix.ts`); `--fix --dry-run` previews without writing.
- `src/commands/doctor.ts` — `gbrain doctor [--json] [--fast] [--fix] [--dry-run] [--index-audit]`: health checks. v0.12.3 added `jsonb_integrity` + `markdown_body_completeness` reliability checks. v0.14.1: `--fix` delegates inlined cross-cutting rules to `> **Convention:** see [path](path).` callouts (pipes DRY violations into `src/core/dry-fix.ts`); `--fix --dry-run` previews without writing. v0.15.1: `schema_version` check fails loudly when `version=0` (migrations never ran — the #218 `bun install -g` signature) and routes users to `gbrain apply-migrations --yes`; new opt-in `--index-audit` flag (Postgres-only) reports zero-scan indexes from `pg_stat_user_indexes` (informational only, no auto-drop). Fix hints point at `gbrain repair-jsonb`, `gbrain sync --force`, and `gbrain apply-migrations`.
- `src/core/migrate.ts` — schema-migration runner. Owns the `MIGRATIONS` array (source of truth for schema DDL). v0.15.1 extended the `Migration` interface with `sqlFor?: { postgres?, pglite? }` (engine-specific SQL overrides `sql`) and `transaction?: boolean` (set to false for `CREATE INDEX CONCURRENTLY`, which Postgres refuses inside a transaction; ignored on PGLite since it has no concurrent writers). Migration v14 (fix wave) uses a handler branching on `engine.kind` to run CONCURRENTLY on Postgres (with a pre-drop of any invalid remnant via `pg_index.indisvalid`) and plain `CREATE INDEX` on PGLite. v15 bumps `minion_jobs.max_stalled` default 3→5 and backfills existing non-terminal rows.
- `src/core/markdown.ts` — Frontmatter parsing + body splitter. `splitBody` requires an explicit timeline sentinel (`<!-- timeline -->`, `--- timeline ---`, or `---` immediately before `## Timeline`/`## History`). Plain `---` in body text is a markdown horizontal rule, not a separator. `inferType` auto-types `/wiki/analysis/` → analysis, `/wiki/guides/` → guide, `/wiki/hardware/` → hardware, `/wiki/architecture/` → architecture, `/writing/` → writing (plus the existing people/companies/deals/etc heuristics).
- `scripts/check-jsonb-pattern.sh` — CI grep guard. Fails the build if anyone reintroduces the `${JSON.stringify(x)}::jsonb` interpolation pattern (which postgres.js v3 double-encodes). Wired into `bun test`.
- `scripts/check-jsonb-pattern.sh` — CI grep guard. Fails the build if anyone reintroduces (a) the `${JSON.stringify(x)}::jsonb` interpolation pattern (postgres.js v3 double-encodes it), or (b) `max_stalled INTEGER NOT NULL DEFAULT 1` in any schema source file (v0.15.1 #219 regression guard — must be DEFAULT 5 to preserve SIGKILL-rescue). Wired into `bun test`.
- `scripts/llms-config.ts` + `scripts/build-llms.ts` — Generator for `llms.txt` (llmstxt.org-spec web index) + `llms-full.txt` (inlined single-fetch bundle). Curated config drives both. Run `bun run build:llms` after adding a new doc. `LLMS_REPO_BASE` env var lets forks regenerate with their own URL base. `FULL_SIZE_BUDGET` (600KB) caps the inline bundle; generator WARNs if exceeded. Committed output is not analogous to `schema-embedded.ts` (no runtime consumer); we commit for GitHub browsing and fork-safe fetching.
- `AGENTS.md` — Local-clone entry point for non-Claude agents (Codex, Cursor, OpenClaw, Aider). Mirrors `CLAUDE.md` intent via relative links. Claude Code keeps using `CLAUDE.md`.
- `docs/UPGRADING_DOWNSTREAM_AGENTS.md` — Patches for downstream agent skill forks to apply when upgrading. Each release appends a new section. v0.10.3 includes diffs for brain-ops, meeting-ingestion, signal-detector, enrich.
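
The `sqlFor` / `transaction` fields described for `src/core/migrate.ts` above can be sketched as follows. This is a minimal illustration, not the project's code: the `version`, `name`, and `sql` fields and the `resolveSql` and `runsInTransaction` helpers are hypothetical. The sketch also assumes that "ignored on PGLite" means PGLite keeps the default transactional path. The description says v14 itself uses a handler, so the sample migration only demonstrates the two fields.

```typescript
// Hypothetical shapes for illustration; the real definitions live in src/core/migrate.ts.
type EngineKind = "postgres" | "pglite";

interface Migration {
  version: number; // assumed field
  name: string; // assumed field
  sql: string;
  sqlFor?: { postgres?: string; pglite?: string };
  transaction?: boolean;
}

// Engine-specific SQL overrides `sql` when present.
function resolveSql(m: Migration, kind: EngineKind): string {
  return m.sqlFor?.[kind] ?? m.sql;
}

// Assumption: PGLite ignores `transaction: false` and stays transactional.
function runsInTransaction(m: Migration, kind: EngineKind): boolean {
  if (kind === "pglite") return true;
  return m.transaction !== false;
}

const pagesUpdatedAtIndex: Migration = {
  version: 14,
  name: "pages_updated_at_index",
  sql: "CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
  sqlFor: {
    postgres:
      "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC)",
  },
  transaction: false, // CONCURRENTLY cannot run inside a Postgres transaction
};

console.log(resolveSql(pagesUpdatedAtIndex, "postgres"));
console.log(runsInTransaction(pagesUpdatedAtIndex, "pglite"));
```

Keeping the fallback in one helper means a migration that needs no engine split can leave `sqlFor` unset and rely on `sql` alone.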
@@ -132,12 +134,13 @@ Key commands added in v0.7:
- `gbrain migrate --to supabase` / `gbrain migrate --to pglite` — bidirectional engine migration
Key commands added for Minions (job queue):
- `gbrain jobs submit <name> [--params JSON] [--follow] [--dry-run]` — submit a background job
- `gbrain jobs submit <name> [--params JSON] [--follow] [--dry-run]` — submit a background job. v0.13.1 adds first-class flags for every `MinionJobInput` tuning knob: `--max-stalled N`, `--backoff-type fixed|exponential`, `--backoff-delay Nms`, `--backoff-jitter 0..1`, `--timeout-ms N`, `--idempotency-key K`.
- `gbrain jobs list [--status S] [--queue Q]` — list jobs with filters
- `gbrain jobs get <id>` — job details with attempt history
- `gbrain jobs cancel/retry/delete <id>` — manage job lifecycle
- `gbrain jobs prune [--older-than 30d]` — clean old completed/dead jobs
- `gbrain jobs stats` — job health dashboard
- `gbrain jobs smoke [--sigkill-rescue]` — health smoke test. `--sigkill-rescue` is the v0.13.1 regression guard for #219: simulates a killed worker and asserts the stalled job is requeued instead of dead-lettered on first stall.
- `gbrain jobs work [--queue Q] [--concurrency N]` — start worker daemon (Postgres only)
Key commands added in v0.12.2:
@@ -154,6 +157,12 @@ Key commands added in v0.14.2:
- `GBRAIN_POOL_SIZE` env var — honored by both the singleton pool (`src/core/db.ts`) and the parallel-import worker pool (`src/commands/import.ts`). Default is 10; lower to 2 for Supabase transaction pooler to avoid MaxClients crashes during `gbrain upgrade` subprocess spawns. Read at call time via `resolvePoolSize()`.
- `gbrain doctor` gains two new checks: `sync_failures` (surfaces unacknowledged parse failures with exact paths + fix hints) and `brain_score` (renders the 5-component breakdown when score < 100: embed coverage / 35, link density / 25, timeline coverage / 15, orphans / 15, dead links / 10 — sum equals total).
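
The `brain_score` arithmetic above (five component maxima that sum to 100, with the breakdown shown only when the score is below 100) might look like the sketch below. The component maxima come from the description; the `ScoreComponent` type, the `renderBrainScore` helper, the output format, and the sample earned values are all made up for illustration.

```typescript
// Illustrative only: output format and helper names are assumptions.
interface ScoreComponent {
  label: string;
  earned: number;
  max: number;
}

function renderBrainScore(components: ScoreComponent[]): string[] {
  // The total is the plain sum of the component scores.
  const total = components.reduce((sum, c) => sum + c.earned, 0);
  const lines = [`brain_score: ${total}/100`];
  // Show the per-component breakdown only for an imperfect score.
  if (total < 100) {
    for (const c of components) {
      lines.push(`  ${c.label}: ${c.earned}/${c.max}`);
    }
  }
  return lines;
}

const example: ScoreComponent[] = [
  { label: "embed coverage", earned: 30, max: 35 },
  { label: "link density", earned: 20, max: 25 },
  { label: "timeline coverage", earned: 15, max: 15 },
  { label: "orphans", earned: 12, max: 15 },
  { label: "dead links", earned: 10, max: 10 },
];

console.log(renderBrainScore(example).join("\n"));
```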
Key commands added in v0.14.3 (fix wave):
- `gbrain doctor --index-audit` — opt-in Postgres-only check reporting zero-scan indexes from `pg_stat_user_indexes`. Informational only; never auto-drops.
- `gbrain doctor` schema_version check fails loudly when `version=0` — catches `bun install -g github:...` postinstall failures (#218) and routes users to `gbrain apply-migrations --yes`.
- `gbrain jobs submit` gains `--max-stalled`, `--backoff-type`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key` — exposing existing `MinionJobInput` fields as first-class CLI flags.
- `gbrain jobs smoke --sigkill-rescue` — opt-in regression smoke case simulating a killed worker; asserts the v0.14.3 schema default (`max_stalled=5`) actually rescues on first stall.
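
As a rough sketch of the `max_stalled` handling behind `--max-stalled`: provided values are clamped to `[1, 100]`, and an omitted value is passed through as `undefined` so the schema default of 5 applies. The `normalizeMaxStalled` helper is hypothetical, and its treatment of non-finite and fractional input is an assumption rather than documented behavior.

```typescript
const MAX_STALLED_MIN = 1;
const MAX_STALLED_MAX = 100;

// Hypothetical helper: returns undefined when the caller omitted the value,
// so the column DEFAULT (5) is used instead of an application-side default.
function normalizeMaxStalled(value: number | undefined): number | undefined {
  if (value === undefined) return undefined;
  // Assumption: non-finite input is treated like an omitted value.
  if (!Number.isFinite(value)) return undefined;
  // Assumption: fractional input is truncated before clamping.
  const whole = Math.trunc(value);
  return Math.min(MAX_STALLED_MAX, Math.max(MAX_STALLED_MIN, whole));
}

console.log(normalizeMaxStalled(0), normalizeMaxStalled(7), normalizeMaxStalled(500));
```

Returning `undefined` rather than 5 keeps the default in one place (the schema), which is the property the `DEFAULT 1` regression guard protects.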
## Testing
`bun test` runs all tests. After the v0.12.1 release: ~75 unit test files + 8 E2E test files (1412 unit pass, 119 E2E when `DATABASE_URL` is set — skip gracefully otherwise). Unit tests run
@@ -165,11 +174,11 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
`test/files.test.ts` (MIME/hash), `test/import-file.test.ts` (import pipeline),
`test/upgrade.test.ts` (schema migrations),
`test/file-migration.test.ts` (file migration), `test/file-resolver.test.ts` (file resolution),
`test/import-resume.test.ts` (import checkpoints), `test/migrate.test.ts` (migration; v8/v9 helper-btree-index SQL structural assertions + 1000-row wall-clock fixtures that guard the O(n²)→O(n log n) fix),
`test/import-resume.test.ts` (import checkpoints), `test/migrate.test.ts` (migration; v8/v9 helper-btree-index SQL structural assertions + 1000-row wall-clock fixtures that guard the O(n²)→O(n log n) fix + v0.13.1 assertions on v12/v13 SQL shape, `sqlFor` + `transaction:false` runner semantics, and the `max_stalled DEFAULT 1` regression guard),
`test/setup-branching.test.ts` (setup flow), `test/slug-validation.test.ts` (slug validation),
`test/storage.test.ts` (storage backends), `test/supabase-admin.test.ts` (Supabase admin),
`test/yaml-lite.test.ts` (YAML parsing), `test/check-update.test.ts` (version check + update CLI),
`test/pglite-engine.test.ts` (PGLite engine, all 40 BrainEngine methods including 11 cases for `addLinksBatch` / `addTimelineEntriesBatch`: empty batch, missing optionals, within-batch dedup via ON CONFLICT, missing-slug rows dropped by JOIN, half-existing batch, batch of 100),
`test/pglite-engine.test.ts` (PGLite engine, all 40 BrainEngine methods including 11 cases for `addLinksBatch` / `addTimelineEntriesBatch`: empty batch, missing optionals, within-batch dedup via ON CONFLICT, missing-slug rows dropped by JOIN, half-existing batch, batch of 100 + v0.13.1 `connect()` error-wrap assertion (original error nested, #223 link in message, lock released)),
`test/engine-factory.test.ts` (engine factory + dynamic imports),
`test/integrations.test.ts` (recipe parsing, CLI routing, recipe validation),
`test/publish.test.ts` (content stripping, encryption, password generation, HTML output),
@@ -190,7 +199,7 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
`test/transcription.test.ts` (provider detection, format validation, API key errors),
`test/enrichment-service.test.ts` (entity slugification, extraction, tier escalation),
`test/data-research.test.ts` (recipe validation, MRR/ARR extraction, dedup, tracker parsing, HTML stripping),
`test/minions.test.ts` (Minions job queue v7: CRUD, state machine, backoff, stall detection, dependencies, worker lifecycle, lock management, claim mechanics, depth/child-cap, timeouts, cascade kill, idempotency, child_done inbox, attachments, removeOnComplete/Fail),
`test/minions.test.ts` (Minions job queue v7: CRUD, state machine, backoff, stall detection, dependencies, worker lifecycle, lock management, claim mechanics, depth/child-cap, timeouts, cascade kill, idempotency, child_done inbox, attachments, removeOnComplete/Fail + v0.13.1 `max_stalled` clamp/default/plumbing coverage),
`test/extract.test.ts` (link extraction, timeline extraction, frontmatter parsing, directory type inference),
`test/extract-db.test.ts` (gbrain extract --source db: typed link inference, idempotency, --type filter, --dry-run JSON output),
`test/extract-fs.test.ts` (gbrain extract --source fs: first-run inserts + second-run reports zero, dry-run dedups candidates across files, second-run perf regression guard — the v0.12.1 N+1 dedup bug),

@@ -26,6 +26,11 @@ bun install && bun link
Verify: `gbrain --version` should print a version number. If `gbrain` is not found,
restart the shell or add the PATH export to the shell profile.
> **Do NOT use `bun install -g github:garrytan/gbrain`.** Bun blocks the top-level
> postinstall hook on global installs, so schema migrations never run and the CLI
> aborts with `Aborted()` when it opens PGLite. Use the `git clone + bun link` path
> above. Tracking issue: [#218](https://github.com/garrytan/gbrain/issues/218).
## Step 2: API Keys
Ask the user for these:

@@ -44,6 +44,11 @@ gbrain import ~/notes/ # index your markdown
gbrain query "what themes show up across my notes?"
```
**Do NOT use `bun install -g github:garrytan/gbrain`.** Bun blocks the top-level
postinstall hook on global installs, so schema migrations never run and the CLI
aborts with `Aborted()` the first time it opens PGLite. Use `git clone + bun install
&& bun link` as shown above. See [#218](https://github.com/garrytan/gbrain/issues/218).
```
3 results (hybrid search, 0.12s):

@@ -1 +1 @@
0.15.0
0.15.1

@@ -7,7 +7,7 @@
"dependencies": {
"@anthropic-ai/sdk": "^0.30.0",
"@aws-sdk/client-s3": "^3.1028.0",
"@electric-sql/pglite": "^0.4.4",
"@electric-sql/pglite": "0.4.3",
"@modelcontextprotocol/sdk": "^1.0.0",
"gray-matter": "^4.0.3",
"marked": "^18.0.0",
@@ -20,6 +20,9 @@
},
},
},
"trustedDependencies": [
"@electric-sql/pglite",
],
"packages": {
"@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.30.1", "", { "dependencies": { "@types/node": "^18.11.18", "@types/node-fetch": "^2.6.4", "abort-controller": "^3.0.0", "agentkeepalive": "^4.2.1", "form-data-encoder": "1.7.2", "formdata-node": "^4.3.2", "node-fetch": "^2.6.7" } }, "sha512-nuKvp7wOIz6BFei8WrTdhmSsx5mwnArYyJgh4+vYu3V4J0Ltb8Xm3odPm51n1aSI0XxNCrDl7O88cxCtUdAkaw=="],
@@ -103,7 +106,7 @@
"@aws/lambda-invoke-store": ["@aws/lambda-invoke-store@0.2.4", "", {}, "sha512-iY8yvjE0y651BixKNPgmv1WrQc+GZ142sb0z4gYnChDDY2YqI4P/jsSopBWrKfAt7LOJAkOXt7rC/hms+WclQQ=="],
"@electric-sql/pglite": ["@electric-sql/pglite@0.4.4", "", {}, "sha512-g/6CWAJ4XOkObWCWAQ2IReZD8VvsDy3poRHSKvpRR2F96F8WJ3HVbjpso3gN7l0q6QPPgvxSSpl/qo5k8a7mkQ=="],
"@electric-sql/pglite": ["@electric-sql/pglite@0.4.3", "", {}, "sha512-ichuWTgtd4mOM1G4SpyGJa5trT03lWbMypDV0fUXUCXg5hiHqVAz/bZyV68NqmkLB7WcYmj1RMJVSp8HV/v/ZQ=="],
"@hono/node-server": ["@hono/node-server@1.19.12", "", { "peerDependencies": { "hono": "^4" } }, "sha512-txsUW4SQ1iilgE0l9/e9VQWmELXifEFvmdA1j6WFh/aFPj99hIntrSsq/if0UWyGVkmrRPKA1wCeP+UCr1B9Uw=="],

@@ -102,9 +102,9 @@ strict behavior when unset.
## Key files
- `src/core/operations.ts` — Contract-first operation definitions (the foundation). Also exports upload validators: `validateUploadPath`, `validatePageSlug`, `validateFilename`. `OperationContext.remote` flags untrusted callers.
- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`).
- `src/core/engine.ts` — Pluggable engine interface (BrainEngine). `clampSearchLimit(limit, default, cap)` takes an explicit cap so per-operation caps can be tighter than `MAX_SEARCH_LIMIT`. Exports `LinkBatchInput` / `TimelineBatchInput` for the v0.12.1 bulk-insert API (`addLinksBatch` / `addTimelineEntriesBatch`). As of v0.13.1, `BrainEngine` has a `readonly kind: 'postgres' | 'pglite'` discriminator so migrations (`src/core/migrate.ts`) and other consumers can branch on engine without `instanceof` + dynamic imports.
- `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`)
- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders.
- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 40 BrainEngine methods. `addLinksBatch` / `addTimelineEntriesBatch` use multi-row `unnest()` with manual `$N` placeholders. As of v0.13.1, `connect()` wraps `PGlite.create()` in a try/catch that emits an actionable error naming the macOS 26.3 WASM bug (#223) and pointing at `gbrain doctor`; the lock is released on failure so the next process can retry cleanly.
- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers)
- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted). `addLinksBatch` / `addTimelineEntriesBatch` use `INSERT ... SELECT FROM unnest($1::text[], ...) JOIN pages ON CONFLICT DO NOTHING RETURNING 1` — 4-5 array params regardless of batch size, sidesteps the 65535-parameter cap. As of v0.12.3, `searchKeyword` / `searchVector` scope `statement_timeout` via `sql.begin` + `SET LOCAL` so the GUC dies with the transaction instead of leaking across the pooled postgres.js connection (contributed by @garagon). `getEmbeddingsByChunkIds` uses `tryParseEmbedding` so one corrupt row skips+warns instead of killing the query.
- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts. Exports `parseEmbedding(value)` (throws on unknown input, used by migration + ingest paths where data integrity matters) and as of v0.12.3 `tryParseEmbedding(value)` (returns `null` + warns once per process, used by search/rescore paths where availability matters more than strictness).
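
The strict/lenient split between `parseEmbedding` and `tryParseEmbedding` described above can be sketched like this. Only the two function names and their contracts (throw vs. return `null` and warn once per process) come from the description. The accepted input shapes, the `Float32Array` return type, and the warning text are assumptions.

```typescript
// Strict variant: throws on anything it cannot interpret.
// Assumed inputs: Float32Array, number[], or a "[1,2,3]"-style string.
function parseEmbedding(value: unknown): Float32Array {
  if (value instanceof Float32Array) return value;
  if (Array.isArray(value) && value.every((v) => typeof v === "number")) {
    return Float32Array.from(value as number[]);
  }
  if (typeof value === "string") {
    const trimmed = value.trim();
    if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
      const parts = trimmed.slice(1, -1).split(",").map((p) => p.trim());
      if (parts.every((p) => p !== "" && Number.isFinite(Number(p)))) {
        return Float32Array.from(parts.map(Number));
      }
    }
  }
  throw new Error("parseEmbedding: unsupported embedding value");
}

let warnedOnce = false;

// Lenient variant: one corrupt row returns null instead of killing the query.
function tryParseEmbedding(value: unknown): Float32Array | null {
  try {
    return parseEmbedding(value);
  } catch (err) {
    if (!warnedOnce) {
      warnedOnce = true; // warn only once per process
      console.warn(`skipping corrupt embedding: ${(err as Error).message}`);
    }
    return null;
  }
}

console.log(parseEmbedding("[1, 2.5, 3]").length, tryParseEmbedding("garbage"));
```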
@@ -131,25 +131,27 @@ strict behavior when unset.
- `src/commands/extract.ts` — `gbrain extract links|timeline|all [--source fs|db]`: batch link/timeline extraction. fs walks markdown files, db walks pages from the engine (mutation-immune snapshot iteration; use this for live brains with no local checkout). As of v0.12.1 there is no in-memory dedup pre-load — candidates are buffered 100 at a time and flushed via `addLinksBatch` / `addTimelineEntriesBatch`; `ON CONFLICT DO NOTHING` enforces uniqueness at the DB layer, and the `created` counter returns real rows inserted (truthful on re-runs).
- `src/commands/graph-query.ts` — `gbrain graph-query <slug> [--type T] [--depth N] [--direction in|out|both]`: typed-edge relationship traversal (renders indented tree)
- `src/core/link-extraction.ts` — shared library for the v0.12.0 graph layer. extractEntityRefs (canonical, replaces backlinks.ts duplicate) matches both `[Name](people/slug)` markdown links and Obsidian `[[people/slug|Name]]` wikilinks as of v0.12.3. extractPageLinks, inferLinkType heuristics (attended/works_at/invested_in/founded/advises/source/mentions), parseTimelineEntries, isAutoLinkEnabled config helper. `DIR_PATTERN` covers `people`, `companies`, `deals`, `topics`, `concepts`, `projects`, `entities`, `tech`, `finance`, `personal`, `openclaw`. Used by extract.ts, operations.ts auto-link post-hook, and backlinks.ts.
- `src/core/minions/` — Minions job queue: BullMQ-inspired, Postgres-native (queue, worker, backoff, types)
- `src/core/minions/queue.ts` — MinionQueue class (submit, claim, complete, fail, stall detection, parent-child, depth/child-cap, per-job timeouts, cascade-kill, attachments, idempotency keys, child_done inbox, removeOnComplete/Fail). `add()` takes a 4th `trusted` arg (separate from `opts` to prevent spread leakage); protected names in `PROTECTED_JOB_NAMES` require `{allowProtectedSubmit: true}` and the check runs trim-normalized (whitespace-bypass safe).
- `src/core/minions/` — Minions job queue: BullMQ-inspired, Postgres-native (queue, worker, backoff, types, protected-names, quiet-hours, stagger, handlers/shell).
- `src/core/minions/queue.ts` — MinionQueue class (submit, claim, complete, fail, stall detection, parent-child, depth/child-cap, per-job timeouts, cascade-kill, attachments, idempotency keys, child_done inbox, removeOnComplete/Fail). `add()` takes a 4th `trusted` arg (separate from `opts` to prevent spread leakage); protected names in `PROTECTED_JOB_NAMES` require `{allowProtectedSubmit: true}` and the check runs trim-normalized (whitespace-bypass safe). v0.14.1 #219: `add()` plumbs `max_stalled` through with a `[1, 100]` clamp; omitted values let the schema DEFAULT (5) kick in.
- `src/core/minions/worker.ts` — MinionWorker class (handler registry, lock renewal, graceful shutdown, timeout safety net). v0.14.0 abort-path fix: aborted jobs now call `failJob` with reason (`timeout`/`cancel`/`lock-lost`/`shutdown`) instead of returning silently. `shutdownAbort` (instance field) fires on process SIGTERM/SIGINT and propagates to `ctx.shutdownSignal` — shell handler listens to it; non-shell handlers don't.
- `src/core/minions/types.ts` — `MinionJobInput` + `MinionJobStatus` + handler context types. `MinionJobInput.max_stalled` (new in v0.14.1) is optional; omitted values let the schema DEFAULT (5) kick in, provided values are clamped to `[1, 100]`.
- `src/core/minions/protected-names.ts` — side-effect-free constant module exporting `PROTECTED_JOB_NAMES` + `isProtectedJobName()`. Kept pure so queue core can import without loading handler modules.
- `src/core/minions/handlers/shell.ts` — `shell` job handler. Spawns `/bin/sh -c cmd` (absolute path, PATH-override-safe) or `argv[0] argv[1..]` (no shell). Env allowlist: `PATH, HOME, USER, LANG, TZ, NODE_ENV` + caller `env:` overrides. UTF-8-safe stdout/stderr tail via `string_decoder.StringDecoder`. Abort (either `ctx.signal` or `ctx.shutdownSignal`) fires SIGTERM → 5s grace → SIGKILL on child. Requires `GBRAIN_ALLOW_SHELL_JOBS=1` on worker (gated by `registerBuiltinHandlers`).
- `src/core/minions/handlers/shell-audit.ts` — per-submission JSONL audit trail at `~/.gbrain/audit/shell-jobs-YYYY-Www.jsonl` (ISO-week rotation; override via `GBRAIN_AUDIT_DIR`). Best-effort: `mkdirSync(recursive)` + `appendFileSync`; failures logged to stderr, submission not blocked. Logs cmd (first 80 chars) or argv (JSON array). Never logs env values.
- `src/core/minions/attachments.ts` — Attachment validation (path traversal, null byte, oversize, base64, duplicate detection)
- `src/commands/jobs.ts` — `gbrain jobs` CLI subcommands + `gbrain jobs work` daemon
- `src/commands/jobs.ts` — `gbrain jobs` CLI subcommands + `gbrain jobs work` daemon. v0.13.1 surfaces the full `MinionJobInput` retry/backoff/timeout/idempotency surface as first-class CLI flags on `jobs submit`: `--max-stalled`, `--backoff-type fixed|exponential`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key`. `jobs smoke --sigkill-rescue` is the opt-in regression guard for #219.
- `src/commands/features.ts` — `gbrain features --json --auto-fix`: usage scan + feature adoption salesman
- `src/commands/autopilot.ts` — `gbrain autopilot --install`: self-maintaining brain daemon (sync+extract+embed)
- `src/mcp/server.ts` — MCP stdio server (generated from operations)
- `src/commands/auth.ts` — Standalone token management (create/list/revoke/test)
- `src/commands/upgrade.ts` — Self-update CLI. `runPostUpgrade()` enumerates migrations from the TS registry (src/commands/migrations/index.ts) and tail-calls `runApplyMigrations(['--yes', '--non-interactive'])` so the mechanical side of every outstanding migration runs unconditionally.
- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). `v0_12_2.ts` = JSONB double-encode repair orchestrator (4 phases: schema → repair-jsonb → verify → record). `v0_14_0.ts` = shell-jobs + autopilot cooperative (2 phases: schema ALTER minion_jobs.max_stalled SET DEFAULT 3, pending-host-work ping for skills/migrations/v0.14.0.md). All orchestrators are idempotent and resumable from `partial` status. As of v0.14.2 (Bug 3), the RUNNER owns all ledger writes — orchestrators return `OrchestratorResult` and `apply-migrations.ts` persists a canonical `{version, status, phases}` shape after return. Orchestrators no longer call `appendCompletedMigration` directly. `statusForVersion` prefers `complete` over `partial` (never regresses). 3 consecutive partials → wedged → `--force-retry <version>` writes a `'retry'` reset marker.
- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). `v0_12_2.ts` = JSONB double-encode repair orchestrator (4 phases: schema → repair-jsonb → verify → record). `v0_14_0.ts` = shell-jobs + autopilot cooperative (2 phases: schema ALTER minion_jobs.max_stalled SET DEFAULT 3 — superseded by v0.14.3's schema-level DEFAULT 5 + UPDATE backfill; pending-host-work ping for skills/migrations/v0.14.0.md). All orchestrators are idempotent and resumable from `partial` status. As of v0.14.2 (Bug 3), the RUNNER owns all ledger writes — orchestrators return `OrchestratorResult` and `apply-migrations.ts` persists a canonical `{version, status, phases}` shape after return. Orchestrators no longer call `appendCompletedMigration` directly. `statusForVersion` prefers `complete` over `partial` (never regresses). 3 consecutive partials → wedged → `--force-retry <version>` writes a `'retry'` reset marker. v0.14.3 (fix wave) ships schema-only migrations v14 (`pages_updated_at_index`) + v15 (`minion_jobs_max_stalled_default_5` with UPDATE backfill) via the `MIGRATIONS` array in `src/core/migrate.ts` — no orchestrator phases needed.
- `src/commands/repair-jsonb.ts` — `gbrain repair-jsonb [--dry-run] [--json]`: rewrites `jsonb_typeof='string'` rows in place across 5 affected columns (pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata, page_versions.frontmatter). Fixes v0.12.0 double-encode bug on Postgres; PGLite no-ops. Idempotent.
- `src/commands/orphans.ts` — `gbrain orphans [--json] [--count] [--include-pseudo]`: surfaces pages with zero inbound wikilinks, grouped by domain. Auto-generated/raw/pseudo pages filtered by default. Also exposed as `find_orphans` MCP operation. Shipped in v0.12.3 (contributed by @knee5).
- `src/commands/doctor.ts` — `gbrain doctor [--json] [--fast] [--fix] [--dry-run]`: health checks. v0.12.3 adds two reliability detection checks: `jsonb_integrity` (scans pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata for `jsonb_typeof='string'` rows left over from v0.12.0) and `markdown_body_completeness` (flags pages whose compiled_truth is <30% of raw source when raw has multiple H2/H3 boundaries). Fix hints point at `gbrain repair-jsonb` and `gbrain sync --force`. v0.14.1: `--fix` delegates inlined cross-cutting rules to `> **Convention:** see [path](path).` callouts (pipes DRY violations into `src/core/dry-fix.ts`); `--fix --dry-run` previews without writing.
- `src/commands/doctor.ts` — `gbrain doctor [--json] [--fast] [--fix] [--dry-run] [--index-audit]`: health checks. v0.12.3 added `jsonb_integrity` + `markdown_body_completeness` reliability checks. v0.14.1: `--fix` delegates inlined cross-cutting rules to `> **Convention:** see [path](path).` callouts (pipes DRY violations into `src/core/dry-fix.ts`); `--fix --dry-run` previews without writing. v0.14.2: `schema_version` check fails loudly when `version=0` (migrations never ran — the #218 `bun install -g` signature) and routes users to `gbrain apply-migrations --yes`; new opt-in `--index-audit` flag (Postgres-only) reports zero-scan indexes from `pg_stat_user_indexes` (informational only, no auto-drop). Fix hints point at `gbrain repair-jsonb`, `gbrain sync --force`, and `gbrain apply-migrations`.
- `src/core/migrate.ts` — schema-migration runner. Owns the `MIGRATIONS` array (source of truth for schema DDL). v0.14.2 extended the `Migration` interface with `sqlFor?: { postgres?, pglite? }` (engine-specific SQL overrides `sql`) and `transaction?: boolean` (set to false for `CREATE INDEX CONCURRENTLY`, which Postgres refuses inside a transaction; ignored on PGLite since it has no concurrent writers). Migration v14 (fix wave) uses a handler branching on `engine.kind` to run CONCURRENTLY on Postgres (with a pre-drop of any invalid remnant via `pg_index.indisvalid`) and plain `CREATE INDEX` on PGLite. v15 bumps `minion_jobs.max_stalled` default 1→5 and backfills existing non-terminal rows.
- `src/core/markdown.ts` — Frontmatter parsing + body splitter. `splitBody` requires an explicit timeline sentinel (`<!-- timeline -->`, `--- timeline ---`, or `---` immediately before `## Timeline`/`## History`). Plain `---` in body text is a markdown horizontal rule, not a separator. `inferType` auto-types `/wiki/analysis/` → analysis, `/wiki/guides/` → guide, `/wiki/hardware/` → hardware, `/wiki/architecture/` → architecture, `/writing/` → writing (plus the existing people/companies/deals/etc heuristics).
- `scripts/check-jsonb-pattern.sh` — CI grep guard. Fails the build if anyone reintroduces the `${JSON.stringify(x)}::jsonb` interpolation pattern (which postgres.js v3 double-encodes). Wired into `bun test`.
- `scripts/check-jsonb-pattern.sh` — CI grep guard. Fails the build if anyone reintroduces (a) the `${JSON.stringify(x)}::jsonb` interpolation pattern (postgres.js v3 double-encodes it), or (b) `max_stalled INTEGER NOT NULL DEFAULT 1` in any schema source file (v0.15.1 #219 regression guard — must be DEFAULT 5 to preserve SIGKILL-rescue). Wired into `bun test`.
- `scripts/llms-config.ts` + `scripts/build-llms.ts` — Generator for `llms.txt` (llmstxt.org-spec web index) + `llms-full.txt` (inlined single-fetch bundle). Curated config drives both. Run `bun run build:llms` after adding a new doc. `LLMS_REPO_BASE` env var lets forks regenerate with their own URL base. `FULL_SIZE_BUDGET` (600KB) caps the inline bundle; generator WARNs if exceeded. Committed output is not analogous to `schema-embedded.ts` (no runtime consumer); we commit for GitHub browsing and fork-safe fetching.
- `AGENTS.md` — Local-clone entry point for non-Claude agents (Codex, Cursor, OpenClaw, Aider). Mirrors `CLAUDE.md` intent via relative links. Claude Code keeps using `CLAUDE.md`.
- `docs/UPGRADING_DOWNSTREAM_AGENTS.md` — Patches for downstream agent skill forks to apply when upgrading. Each release appends a new section. v0.10.3 includes diffs for brain-ops, meeting-ingestion, signal-detector, enrich.
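
The ISO-week rotation used for the shell-job audit trail (`shell-jobs-YYYY-Www.jsonl`) could be computed roughly as below. This is a sketch under assumptions: the `isoWeekFileName` helper is hypothetical, and both the use of UTC and the zero-padded two-digit week are guesses based on the filename pattern, not confirmed behavior of `shell-audit.ts`.

```typescript
// Hypothetical helper: derives the audit filename for a given date.
function isoWeekFileName(date: Date): string {
  // Work on a UTC midnight copy so local time zones do not shift the day.
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  // ISO weekdays run Monday=1 .. Sunday=7.
  const day = d.getUTCDay() || 7;
  // Move to the Thursday of this ISO week; its year is the ISO year.
  d.setUTCDate(d.getUTCDate() + 4 - day);
  const isoYear = d.getUTCFullYear();
  const yearStart = Date.UTC(isoYear, 0, 1);
  const dayOfYear = (d.getTime() - yearStart) / 86400000 + 1;
  const week = Math.ceil(dayOfYear / 7);
  return `shell-jobs-${isoYear}-W${String(week).padStart(2, "0")}.jsonl`;
}

console.log(isoWeekFileName(new Date(Date.UTC(2024, 0, 1))));
```

The Thursday shift matters at year boundaries: a date in early January can belong to the last ISO week of the previous year, so the file name's year is not always the calendar year.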
@@ -211,12 +213,13 @@ Key commands added in v0.7:
- `gbrain migrate --to supabase` / `gbrain migrate --to pglite` — bidirectional engine migration
Key commands added for Minions (job queue):
- `gbrain jobs submit <name> [--params JSON] [--follow] [--dry-run]` — submit a background job
- `gbrain jobs submit <name> [--params JSON] [--follow] [--dry-run]` — submit a background job. v0.13.1 adds first-class flags for every `MinionJobInput` tuning knob: `--max-stalled N`, `--backoff-type fixed|exponential`, `--backoff-delay Nms`, `--backoff-jitter 0..1`, `--timeout-ms N`, `--idempotency-key K`.
- `gbrain jobs list [--status S] [--queue Q]` — list jobs with filters
- `gbrain jobs get <id>` — job details with attempt history
- `gbrain jobs cancel/retry/delete <id>` — manage job lifecycle
- `gbrain jobs prune [--older-than 30d]` — clean old completed/dead jobs
- `gbrain jobs stats` — job health dashboard
- `gbrain jobs smoke [--sigkill-rescue]` — health smoke test. `--sigkill-rescue` is the v0.13.1 regression guard for #219: simulates a killed worker and asserts the stalled job is requeued instead of dead-lettered on first stall.
- `gbrain jobs work [--queue Q] [--concurrency N]` — start worker daemon (Postgres only)
Key commands added in v0.12.2:
@@ -233,6 +236,12 @@ Key commands added in v0.14.2:
- `GBRAIN_POOL_SIZE` env var — honored by both the singleton pool (`src/core/db.ts`) and the parallel-import worker pool (`src/commands/import.ts`). Default is 10; lower to 2 for Supabase transaction pooler to avoid MaxClients crashes during `gbrain upgrade` subprocess spawns. Read at call time via `resolvePoolSize()`.
- `gbrain doctor` gains two new checks: `sync_failures` (surfaces unacknowledged parse failures with exact paths + fix hints) and `brain_score` (renders the 5-component breakdown when score < 100: embed coverage / 35, link density / 25, timeline coverage / 15, orphans / 15, dead links / 10 — sum equals total).
Key commands added in v0.14.3 (fix wave):
- `gbrain doctor --index-audit` — opt-in Postgres-only check reporting zero-scan indexes from `pg_stat_user_indexes`. Informational only; never auto-drops.
- `gbrain doctor` schema_version check fails loudly when `version=0` — catches `bun install -g github:...` postinstall failures (#218) and routes users to `gbrain apply-migrations --yes`.
- `gbrain jobs submit` gains `--max-stalled`, `--backoff-type`, `--backoff-delay`, `--backoff-jitter`, `--timeout-ms`, `--idempotency-key` — exposing existing `MinionJobInput` fields as first-class CLI flags.
- `gbrain jobs smoke --sigkill-rescue` — opt-in regression smoke case simulating a killed worker; asserts the v0.14.3 schema default (`max_stalled=5`) actually rescues on first stall.
## Testing
`bun test` runs all tests. After the v0.12.1 release: ~75 unit test files + 8 E2E test files (1412 unit pass, 119 E2E when `DATABASE_URL` is set — skip gracefully otherwise). Unit tests run
@@ -244,11 +253,11 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
`test/files.test.ts` (MIME/hash), `test/import-file.test.ts` (import pipeline),
`test/upgrade.test.ts` (schema migrations),
`test/file-migration.test.ts` (file migration), `test/file-resolver.test.ts` (file resolution),
`test/import-resume.test.ts` (import checkpoints), `test/migrate.test.ts` (migration; v8/v9 helper-btree-index SQL structural assertions + 1000-row wall-clock fixtures that guard the O(n²)→O(n log n) fix),
`test/import-resume.test.ts` (import checkpoints), `test/migrate.test.ts` (migration; v8/v9 helper-btree-index SQL structural assertions + 1000-row wall-clock fixtures that guard the O(n²)→O(n log n) fix + v0.13.1 assertions on v12/v13 SQL shape, `sqlFor` + `transaction:false` runner semantics, and the `max_stalled DEFAULT 1` regression guard),
`test/setup-branching.test.ts` (setup flow), `test/slug-validation.test.ts` (slug validation),
`test/storage.test.ts` (storage backends), `test/supabase-admin.test.ts` (Supabase admin),
`test/yaml-lite.test.ts` (YAML parsing), `test/check-update.test.ts` (version check + update CLI),
`test/pglite-engine.test.ts` (PGLite engine, all 40 BrainEngine methods including 11 cases for `addLinksBatch` / `addTimelineEntriesBatch`: empty batch, missing optionals, within-batch dedup via ON CONFLICT, missing-slug rows dropped by JOIN, half-existing batch, batch of 100),
`test/pglite-engine.test.ts` (PGLite engine, all 40 BrainEngine methods including 11 cases for `addLinksBatch` / `addTimelineEntriesBatch`: empty batch, missing optionals, within-batch dedup via ON CONFLICT, missing-slug rows dropped by JOIN, half-existing batch, batch of 100 + v0.13.1 `connect()` error-wrap assertion (original error nested, #223 link in message, lock released)),
`test/engine-factory.test.ts` (engine factory + dynamic imports),
`test/integrations.test.ts` (recipe parsing, CLI routing, recipe validation),
`test/publish.test.ts` (content stripping, encryption, password generation, HTML output),
@@ -269,7 +278,7 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
`test/transcription.test.ts` (provider detection, format validation, API key errors),
`test/enrichment-service.test.ts` (entity slugification, extraction, tier escalation),
`test/data-research.test.ts` (recipe validation, MRR/ARR extraction, dedup, tracker parsing, HTML stripping),
`test/minions.test.ts` (Minions job queue v7: CRUD, state machine, backoff, stall detection, dependencies, worker lifecycle, lock management, claim mechanics, depth/child-cap, timeouts, cascade kill, idempotency, child_done inbox, attachments, removeOnComplete/Fail),
`test/minions.test.ts` (Minions job queue v7: CRUD, state machine, backoff, stall detection, dependencies, worker lifecycle, lock management, claim mechanics, depth/child-cap, timeouts, cascade kill, idempotency, child_done inbox, attachments, removeOnComplete/Fail + v0.13.1 `max_stalled` clamp/default/plumbing coverage),
`test/extract.test.ts` (link extraction, timeline extraction, frontmatter parsing, directory type inference),
`test/extract-db.test.ts` (gbrain extract --source db: typed link inference, idempotency, --type filter, --dry-run JSON output),
`test/extract-fs.test.ts` (gbrain extract --source fs: first-run inserts + second-run reports zero, dry-run dedups candidates across files, second-run perf regression guard — the v0.12.1 N+1 dedup bug),
@@ -724,6 +733,11 @@ bun install && bun link
Verify: `gbrain --version` should print a version number. If `gbrain` is not found,
restart the shell or add the PATH export to the shell profile.
> **Do NOT use `bun install -g github:garrytan/gbrain`.** Bun blocks the top-level
> postinstall hook on global installs, so schema migrations never run and the CLI
> aborts with `Aborted()` when it opens PGLite. Use the `git clone + bun link` path
> above. Tracking issue: [#218](https://github.com/garrytan/gbrain/issues/218).
## Step 2: API Keys
Ask the user for these:
@@ -1022,6 +1036,11 @@ gbrain import ~/notes/ # index your markdown
gbrain query "what themes show up across my notes?"
```
**Do NOT use `bun install -g github:garrytan/gbrain`.** Bun blocks the top-level
postinstall hook on global installs, so schema migrations never run and the CLI
aborts with `Aborted()` the first time it opens PGLite. Use `git clone + bun install
&& bun link` as shown above. See [#218](https://github.com/garrytan/gbrain/issues/218).
```
3 results (hybrid search, 0.12s):


@@ -1,6 +1,6 @@
{
"name": "gbrain",
"version": "0.15.0",
"version": "0.15.1",
"description": "Postgres-native personal knowledge brain with hybrid RAG search",
"type": "module",
"main": "src/core/index.ts",
@@ -22,9 +22,9 @@
"build:schema": "bash scripts/build-schema.sh",
"build:llms": "bun run scripts/build-llms.ts",
"test": "scripts/check-jsonb-pattern.sh && bun test",
"test:e2e": "bun test test/e2e/",
"test:e2e": "bash scripts/run-e2e.sh",
"check:jsonb": "scripts/check-jsonb-pattern.sh",
"postinstall": "gbrain --version >/dev/null 2>&1 && gbrain apply-migrations --yes --non-interactive 2>/dev/null || true",
"postinstall": "command -v gbrain >/dev/null 2>&1 && gbrain apply-migrations --yes --non-interactive || echo '[gbrain] postinstall skipped. If installed via bun install -g github:...: run `gbrain doctor` and `gbrain apply-migrations --yes` manually. See https://github.com/garrytan/gbrain/issues/218' 1>&2",
"prepublish:clawhub": "bun run build:all",
"publish:clawhub": "clawhub package publish . --family bundle-plugin"
},
@@ -36,7 +36,7 @@
"dependencies": {
"@anthropic-ai/sdk": "^0.30.0",
"@aws-sdk/client-s3": "^3.1028.0",
"@electric-sql/pglite": "^0.4.4",
"@electric-sql/pglite": "0.4.3",
"@modelcontextprotocol/sdk": "^1.0.0",
"gray-matter": "^4.0.3",
"marked": "^18.0.0",
@@ -47,5 +47,8 @@
"devDependencies": {
"@types/bun": "latest"
},
"trustedDependencies": [
"@electric-sql/pglite"
],
"license": "MIT"
}


@@ -30,3 +30,17 @@ if grep -rEn "$PATTERN" src/ 2>/dev/null; then
fi
echo "OK: no JSON.stringify(x)::jsonb interpolation pattern in src/"
# v0.13.1 #219: guard against max_stalled DEFAULT 1 regressing in any schema
# source file. DEFAULT 1 dead-lettered any SIGKILL'd job on first stall, making
# the "10/10 rescued" claim false for out-of-the-box users. Default is 5 now.
MAX_STALLED_PATTERN='max_stalled\s+INTEGER\s+NOT\s+NULL\s+DEFAULT\s+1\b'
if grep -rEn "$MAX_STALLED_PATTERN" src/schema.sql src/core/migrate.ts src/core/pglite-schema.ts src/core/schema-embedded.ts 2>/dev/null; then
echo
echo "ERROR: max_stalled DEFAULT 1 reintroduced in schema."
echo " Must be DEFAULT 5 to preserve SIGKILL-rescue guarantee. See #219."
exit 1
fi
echo "OK: max_stalled defaults are 5 in all schema sources"

scripts/run-e2e.sh Executable file

@@ -0,0 +1,66 @@
#!/usr/bin/env bash
# Run E2E tests ONE FILE AT A TIME.
#
# Bun's default is to run test files in parallel (each in its own worker).
# Our E2E suite shares one Postgres database across all 13 files, and
# `setupDB()` does TRUNCATE CASCADE + fixture import. When files run in
# parallel, file A's TRUNCATE can race with file B's fixture import,
# producing observed fails like "expected 16 pages, got 8", missing
# links, orphaned timeline entries, etc. The flakiness was visible on
# ~3 of every 5 runs pre-fix.
#
# Running files sequentially eliminates the race entirely. It also costs
# some startup overhead (each file spins up a fresh bun process) but for
# a suite this size that is measured in ~1-2s per file, amortized under
# the natural per-file test time of 5-10s.
#
# Runs every file even if an earlier one fails, then exits non-zero at the
# end if any file failed, so CI reports the full list of failing files.
set -euo pipefail
cd "$(dirname "$0")/.."
pass_files=0
fail_files=0
fail_list=()
total_pass=0
total_fail=0
for f in test/e2e/*.test.ts; do
name=$(basename "$f")
echo ""
echo "=== $name ==="
if output=$(bun test "$f" 2>&1); then
pass_files=$((pass_files + 1))
# Extract pass/fail counts from bun's summary (e.g., "123 pass")
p=$(echo "$output" | grep -oE '[0-9]+ pass' | tail -1 | grep -oE '[0-9]+' || echo 0)
total_pass=$((total_pass + p))
echo "$output" | tail -8
else
fail_files=$((fail_files + 1))
fail_list+=("$name")
p=$(echo "$output" | grep -oE '[0-9]+ pass' | tail -1 | grep -oE '[0-9]+' || echo 0)
fl=$(echo "$output" | grep -oE '[0-9]+ fail' | tail -1 | grep -oE '[0-9]+' || echo 0)
total_pass=$((total_pass + p))
total_fail=$((total_fail + fl))
echo "$output"
echo ""
echo "FAILED: $name"
# Continue so we see all failures; exit nonzero at the end.
fi
done
echo ""
echo "========================================"
echo "E2E SUMMARY (sequential execution)"
echo "========================================"
echo "Files: $((pass_files + fail_files)) total, $pass_files passed, $fail_files failed"
echo "Tests: $total_pass passed, $total_fail failed"
if [ ${#fail_list[@]} -gt 0 ]; then
echo ""
echo "Failing files:"
for f in "${fail_list[@]}"; do
echo " - $f"
done
exit 1
fi


@@ -241,15 +241,30 @@ export async function runDoctor(engine: BrainEngine | null, args: string[], dbSo
checks.push({ name: 'rls', status: 'warn', message: 'Could not check RLS status' });
}
// 6. Schema version
// 6. Schema version — also surfaces the #218 "postinstall silently failed"
// state: if schema_version is 0/missing but the DB connected, migrations
// never ran. That's the same class as a half-migrated install, just from a
// different root cause (Bun blocked our top-level postinstall on global
// install). Message is actionable either way.
let schemaVersion = 0;
try {
const version = await engine.getConfig('version');
schemaVersion = parseInt(version || '0', 10);
if (schemaVersion >= LATEST_VERSION) {
checks.push({ name: 'schema_version', status: 'ok', message: `Version ${schemaVersion} (latest: ${LATEST_VERSION})` });
} else if (schemaVersion === 0) {
checks.push({
name: 'schema_version',
status: 'fail',
message: `No schema version recorded. Migrations never ran. Fix: gbrain apply-migrations --yes. ` +
`If you installed via 'bun install -g github:...', see https://github.com/garrytan/gbrain/issues/218.`,
});
} else {
checks.push({ name: 'schema_version', status: 'warn', message: `Version ${schemaVersion}, latest is ${LATEST_VERSION}. Run gbrain init to migrate.` });
checks.push({
name: 'schema_version',
status: 'warn',
message: `Version ${schemaVersion}, latest is ${LATEST_VERSION}. Fix: gbrain apply-migrations --yes`,
});
}
} catch {
checks.push({ name: 'schema_version', status: 'warn', message: 'Could not check schema version' });
@@ -415,6 +430,51 @@ export async function runDoctor(engine: BrainEngine | null, args: string[], dbSo
checks.push({ name: 'markdown_body_completeness', status: 'ok', message: 'Skipped (raw_data unavailable)' });
}
// 11. Index audit (opt-in via --index-audit). v0.13.1 follow-up to #170.
// Reports indexes with zero recorded scans on Postgres. Informational only;
// we DO NOT auto-drop. On #170's brain, idx_pages_frontmatter and
// idx_pages_trgm showed 0 scans — the suggestion there is "consider
// investigating on YOUR brain," not "drop these globally." Zero scans on a
// fresh install is also normal (nothing has queried yet); the real signal
// is zero scans on a long-running active brain.
if (args.includes('--index-audit')) {
if (engine.kind === 'pglite') {
checks.push({
name: 'index_audit',
status: 'ok',
message: 'Skipped (PGLite: pg_stat_user_indexes scan counts are only audited on Postgres)',
});
} else {
try {
const sql = db.getConnection();
const rows = await sql`
SELECT schemaname, relname AS table, indexrelname AS index,
idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20
`;
if (rows.length === 0) {
checks.push({ name: 'index_audit', status: 'ok', message: 'All public indexes have recorded scans' });
} else {
const list = rows.map((r: any) => `${r.index}(${r.size})`).join(', ');
checks.push({
name: 'index_audit',
status: 'warn',
message: `${rows.length} zero-scan index(es): ${list}. ` +
`Consider investigating whether they're used on YOUR workload (fresh brains naturally show zero scans until queries accumulate). ` +
`Do not drop without confirming.`,
});
}
} catch (e) {
const msg = e instanceof Error ? e.message : String(e);
checks.push({ name: 'index_audit', status: 'warn', message: `Index audit failed: ${msg}` });
}
}
}
const hasFail = outputResults(checks, jsonOutput);
// Features teaser (non-JSON, non-failing only)
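
The schema_version branch above reduces to a three-way classification. A minimal sketch, assuming a hypothetical standalone helper named `classifySchemaVersion` (it is not part of this PR; the real check pushes entries onto `checks` inline):

```typescript
type CheckStatus = 'ok' | 'warn' | 'fail';

// Hypothetical helper: mirrors the ordering of the branches in runDoctor().
// `raw` is whatever engine.getConfig('version') returned.
function classifySchemaVersion(raw: string | null | undefined, latest: number): CheckStatus {
  const version = parseInt(raw || '0', 10);
  if (version >= latest) return 'ok';   // fully migrated
  if (version === 0) return 'fail';     // migrations never ran (the #218 state)
  return 'warn';                        // partially migrated
}

console.log(classifySchemaVersion('15', 15));
console.log(classifySchemaVersion(null, 15));
console.log(classifySchemaVersion('13', 15));
```

The `fail` branch is checked before the generic `warn` so that a never-migrated install is reported as a hard failure rather than as an ordinary version lag.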


@@ -57,8 +57,10 @@ export async function runJobs(engine: BrainEngine, args: string[]): Promise<void
USAGE
gbrain jobs submit <name> [--params JSON] [--follow] [--priority N]
[--delay Nms] [--timeout-ms Nms] [--max-attempts N]
[--queue Q] [--dry-run]
[--delay Nms] [--max-attempts N] [--max-stalled N]
[--backoff-type fixed|exponential] [--backoff-delay Nms]
[--backoff-jitter 0..1] [--timeout-ms Nms]
[--idempotency-key K] [--queue Q] [--dry-run]
gbrain jobs list [--status S] [--queue Q] [--limit N]
gbrain jobs get <id>
gbrain jobs cancel <id>
@@ -104,13 +106,26 @@ HANDLER TYPES (built in)
const priority = parseInt(parseFlag(args, '--priority') ?? '0', 10);
const delay = parseInt(parseFlag(args, '--delay') ?? '0', 10);
const maxAttempts = parseInt(parseFlag(args, '--max-attempts') ?? '3', 10);
const queueName = parseFlag(args, '--queue') ?? 'default';
const maxStalledRaw = parseFlag(args, '--max-stalled');
const maxStalled = maxStalledRaw !== undefined ? parseInt(maxStalledRaw, 10) : undefined;
// v0.13.1 field audit: expose retry/backoff/timeout/idempotency knobs so
// users can tune Minions behavior without dropping into TypeScript.
const backoffTypeRaw = parseFlag(args, '--backoff-type');
const backoffType = backoffTypeRaw === 'fixed' || backoffTypeRaw === 'exponential'
? backoffTypeRaw
: undefined;
const backoffDelayRaw = parseFlag(args, '--backoff-delay');
const backoffDelay = backoffDelayRaw !== undefined ? parseInt(backoffDelayRaw, 10) : undefined;
const backoffJitterRaw = parseFlag(args, '--backoff-jitter');
const backoffJitter = backoffJitterRaw !== undefined ? parseFloat(backoffJitterRaw) : undefined;
const timeoutMsRaw = parseFlag(args, '--timeout-ms');
const timeoutMs = timeoutMsRaw !== undefined ? parseInt(timeoutMsRaw, 10) : undefined;
if (timeoutMsRaw !== undefined && (isNaN(timeoutMs!) || timeoutMs! <= 0)) {
console.error('Error: --timeout-ms must be a positive integer (milliseconds)');
process.exit(1);
}
const idempotencyKey = parseFlag(args, '--idempotency-key');
const queueName = parseFlag(args, '--queue') ?? 'default';
const dryRun = hasFlag(args, '--dry-run');
const follow = hasFlag(args, '--follow');
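
Only `--timeout-ms` is range-checked in the parsing above; the other numeric flags pass `parseInt` / `parseFloat` results through unchecked. A minimal sketch of a shared validator, assuming hypothetical names: `parseFlag` here is a stand-in for the CLI helper (its real implementation is not shown in this diff) and `parsePositiveInt` does not exist in the codebase.

```typescript
// Hypothetical stand-in for the CLI's parseFlag helper: returns the token
// following `name`, or undefined when the flag is absent.
function parseFlag(args: string[], name: string): string | undefined {
  const i = args.indexOf(name);
  return i >= 0 && i + 1 < args.length ? args[i + 1] : undefined;
}

// Hypothetical validator: undefined when the flag is absent, a positive
// integer when valid, and a thrown Error otherwise.
function parsePositiveInt(args: string[], name: string): number | undefined {
  const raw = parseFlag(args, name);
  if (raw === undefined) return undefined;
  const n = parseInt(raw, 10);
  if (isNaN(n) || n <= 0) {
    throw new Error(`${name} must be a positive integer`);
  }
  return n;
}

console.log(parsePositiveInt(['--max-stalled', '5'], '--max-stalled'));
console.log(parsePositiveInt([], '--max-stalled'));
```

Throwing instead of calling `process.exit(1)` keeps the helper testable; the caller can translate the error into an exit code.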
@@ -120,8 +135,13 @@ HANDLER TYPES (built in)
console.log(` Queue: ${queueName}`);
console.log(` Priority: ${priority}`);
console.log(` Max attempts: ${maxAttempts}`);
if (maxStalled !== undefined) console.log(` Max stalled: ${maxStalled}`);
if (backoffType) console.log(` Backoff type: ${backoffType}`);
if (backoffDelay !== undefined) console.log(` Backoff delay: ${backoffDelay}ms`);
if (backoffJitter !== undefined) console.log(` Backoff jitter: ${backoffJitter}`);
if (timeoutMs !== undefined) console.log(` Timeout: ${timeoutMs}ms`);
if (idempotencyKey) console.log(` Idempotency key: ${idempotencyKey}`);
if (delay > 0) console.log(` Delay: ${delay}ms`);
if (timeoutMs) console.log(` Timeout: ${timeoutMs}ms`);
console.log(` Data: ${JSON.stringify(data)}`);
return;
}
@@ -142,8 +162,13 @@ HANDLER TYPES (built in)
priority,
delay: delay > 0 ? delay : undefined,
max_attempts: maxAttempts,
queue: queueName,
max_stalled: maxStalled,
backoff_type: backoffType,
backoff_delay: backoffDelay,
backoff_jitter: backoffJitter,
timeout_ms: timeoutMs,
idempotency_key: idempotencyKey,
queue: queueName,
}, trusted);
// Submission audit log (operational trace, not forensic insurance).
@@ -353,6 +378,8 @@ HANDLER TYPES (built in)
process.exit(1);
}
const sigkillRescue = hasFlag(args, '--sigkill-rescue');
const worker = new MinionWorker(engine, { queue: 'smoke', pollInterval: 100 });
worker.register('noop', async () => ({ ok: true, at: new Date().toISOString() }));
@@ -370,22 +397,64 @@ HANDLER TYPES (built in)
await workerPromise;
const elapsedSec = ((Date.now() - startTime) / 1000).toFixed(2);
if (final?.status === 'completed') {
const cfg = (await import('../core/config.ts')).loadConfig();
const engineLabel = cfg?.engine ?? 'unknown';
console.log(`SMOKE PASS — Minions healthy in ${elapsedSec}s (engine: ${engineLabel})`);
if (engineLabel === 'pglite') {
console.log('Note: the `gbrain jobs work` daemon requires Postgres. PGLite');
console.log('supports inline execution only (`submit --follow`).');
}
try { await queue.removeJob(job.id); } catch { /* non-fatal cleanup */ }
process.exit(0);
} else {
if (final?.status !== 'completed') {
console.error(`SMOKE FAIL — job #${job.id} status: ${final?.status ?? 'timeout'} (${elapsedSec}s elapsed)`);
if (final?.error_text) console.error(` Error: ${final.error_text}`);
process.exit(1);
}
break;
// --sigkill-rescue: regression case for #219. Simulates a SIGKILL
// mid-flight by setting lock_until in the past with a raw UPDATE, then
// running handleStalled().
// Verifies that with the v0.13.1 schema default (max_stalled=5), a
// stalled job is REQUEUED rather than dead-lettered on first stall.
// Full subprocess-level SIGKILL lives in test/e2e/minions.test.ts.
if (sigkillRescue) {
const rescueJob = await queue.add('noop', {}, { queue: 'smoke' });
// Transition to active with a past lock_until, mimicking a worker
// that claimed and then got SIGKILL'd mid-run.
await engine.executeRaw(
`UPDATE minion_jobs
SET status='active',
lock_token='smoke-sigkill-rescue',
lock_until=now() - interval '1 minute',
started_at=now() - interval '2 minute',
attempts_started = attempts_started + 1
WHERE id=$1`,
[rescueJob.id]
);
const result = await queue.handleStalled();
const afterStall = await queue.getJob(rescueJob.id);
if (afterStall?.status === 'dead') {
console.error(
`SMOKE FAIL (--sigkill-rescue) — job #${rescueJob.id} was dead-lettered on first stall. ` +
`This is the #219 regression: schema default max_stalled should rescue, not dead-letter. ` +
`handleStalled: ${JSON.stringify(result)}`
);
process.exit(1);
}
if (afterStall?.status !== 'waiting') {
console.error(
`SMOKE FAIL (--sigkill-rescue) — unexpected status after stall: ${afterStall?.status}. ` +
`Expected 'waiting' (rescued). handleStalled: ${JSON.stringify(result)}`
);
process.exit(1);
}
try { await queue.removeJob(rescueJob.id); } catch { /* non-fatal cleanup */ }
}
const cfg = (await import('../core/config.ts')).loadConfig();
const engineLabel = cfg?.engine ?? 'unknown';
const tag = sigkillRescue ? ' + SIGKILL rescue' : '';
console.log(`SMOKE PASS — Minions healthy${tag} in ${elapsedSec}s (engine: ${engineLabel})`);
if (engineLabel === 'pglite') {
console.log('Note: the `gbrain jobs work` daemon requires Postgres. PGLite');
console.log('supports inline execution only (`submit --follow`).');
}
try { await queue.removeJob(job.id); } catch { /* non-fatal cleanup */ }
process.exit(0);
}
case 'work': {


@@ -50,6 +50,9 @@ export function clampSearchLimit(limit: number | undefined, defaultLimit = 20, c
}
export interface BrainEngine {
/** Discriminator: lets migrations and other consumers branch on engine kind without instanceof + dynamic imports. */
readonly kind: 'postgres' | 'pglite';
// Lifecycle
connect(config: EngineConfig): Promise<void>;
disconnect(): Promise<void>;


@@ -17,7 +17,20 @@ import { slugifyPath } from './sync.ts';
interface Migration {
version: number;
name: string;
/** Engine-agnostic SQL. Used when `sqlFor` is absent. Set to '' for handler-only or sqlFor-only migrations. */
sql: string;
/**
* Engine-specific SQL. If present, overrides `sql` for the matching engine.
* Needed when Postgres wants CONCURRENTLY but PGLite can't honor it.
*/
sqlFor?: { postgres?: string; pglite?: string };
/**
* When false, the runner does NOT wrap the SQL in `engine.transaction()`.
* Required for `CREATE INDEX CONCURRENTLY` (which Postgres refuses inside a transaction).
* Enforced Postgres-only; ignored on PGLite (PGLite has no concurrent writers anyway).
* Defaults to true.
*/
transaction?: boolean;
handler?: (engine: BrainEngine) => Promise<void>;
}
@@ -102,7 +115,7 @@ export const MIGRATIONS: Migration[] = [
backoff_delay INTEGER NOT NULL DEFAULT 1000,
backoff_jitter REAL NOT NULL DEFAULT 0.2,
stalled_counter INTEGER NOT NULL DEFAULT 0,
max_stalled INTEGER NOT NULL DEFAULT 1,
max_stalled INTEGER NOT NULL DEFAULT 5,
lock_token TEXT,
lock_until TIMESTAMPTZ,
delay_until TIMESTAMPTZ,
@@ -355,9 +368,8 @@ export const MIGRATIONS: Migration[] = [
// midnight rollover in the user's TZ naturally creates a new row instead of
// mutating yesterday's. reserved_usd and committed_usd track reservations
// vs actuals so process death between reserve() and commit()/rollback()
// can be cleaned up by TTL scan. status and reserved_at exist for that
// reclaim path. Rollback: DROP TABLE (budget is regenerable from resolver
// call logs; no durable product data lives here).
// can be cleaned up by TTL scan. Rollback: DROP TABLE (regenerable from
// resolver call logs; no durable product data lives here).
sql: `
CREATE TABLE IF NOT EXISTS budget_ledger (
scope TEXT NOT NULL,
@@ -388,16 +400,6 @@ export const MIGRATIONS: Migration[] = [
version: 13,
name: 'minion_quiet_hours_stagger',
// Adds quiet-hours gating + deterministic stagger to Minions.
//
// quiet_hours (JSONB): {start, end, tz, policy} — checked at claim
// time by the worker, not at dispatch. A queued job inside its quiet
// window is released back to 'waiting' and claimed again outside the
// window. 'skip' policy drops the event, 'defer' re-queues.
// stagger_key (TEXT): hashed to a minute-slot offset so jobs with the
// same key don't collide when a cron boundary fires. Optional; NULL
// = no stagger. The hash lives in application code (deterministic,
// ensures same key always lands on same slot) so the column is
// just the key.
sql: `
ALTER TABLE minion_jobs ADD COLUMN IF NOT EXISTS quiet_hours JSONB;
ALTER TABLE minion_jobs ADD COLUMN IF NOT EXISTS stagger_key TEXT;
@@ -405,6 +407,65 @@ export const MIGRATIONS: Migration[] = [
ON minion_jobs(stagger_key) WHERE stagger_key IS NOT NULL;
`,
},
{
version: 14,
name: 'pages_updated_at_index',
// v0.14.1 (fix wave): fixes the 14.6s "list pages newest-first" seqscan on 31k+ row brains.
// Original report: https://github.com/garrytan/gbrain/issues/170 (PR #215).
//
// Engine-aware via handler (not SQL): Postgres uses CREATE INDEX CONCURRENTLY
// to avoid the write-blocking SHARE lock on `pages`. CONCURRENTLY refuses to
// run inside a transaction AND postgres.js's multi-statement `.unsafe()` wraps
// in an implicit transaction, so the handler runs each statement as a separate
// call. A failed CONCURRENTLY leaves an invalid index with the target name;
// the handler pre-drops any invalid remnant via pg_index.indisvalid. PGLite
// has no concurrent writers, so plain CREATE is safe.
sql: '',
handler: async (engine) => {
if (engine.kind === 'postgres') {
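// Look for a leftover invalid index from a failed CONCURRENTLY build.
const invalid = await engine.executeRaw(
`SELECT 1 FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'idx_pages_updated_at_desc' AND NOT i.indisvalid`,
[]
);
if (invalid.length > 0) {
// DROP INDEX CONCURRENTLY cannot run inside a DO block or function,
// so issue it as its own top-level statement.
await engine.runMigration(
14,
`DROP INDEX CONCURRENTLY IF EXISTS idx_pages_updated_at_desc;`
);
}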
await engine.runMigration(
14,
`CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_pages_updated_at_desc
ON pages (updated_at DESC);`
);
} else {
await engine.runMigration(
14,
`CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc
ON pages (updated_at DESC);`
);
}
},
},
{
version: 15,
name: 'minion_jobs_max_stalled_default_5',
// v0.14.1 (fix wave): fixes https://github.com/garrytan/gbrain/issues/219
// Shipped default was 1 — first stall = dead-letter, contradicting the
// "SIGKILL rescued" claim. New default 5. UPDATE backfills existing non-
// terminal rows so upgrading brains don't keep dead-lettering queued work.
// Statuses come from MinionJobStatus in types.ts. Row locks serialize
// against claim()'s FOR UPDATE SKIP LOCKED — race-safe. Idempotent.
sql: `
ALTER TABLE minion_jobs ALTER COLUMN max_stalled SET DEFAULT 5;
UPDATE minion_jobs
SET max_stalled = 5
WHERE status IN ('waiting','active','delayed','waiting-children','paused')
AND max_stalled < 5;
`,
},
];
export const LATEST_VERSION = MIGRATIONS.length > 0
@@ -418,11 +479,23 @@ export async function runMigrations(engine: BrainEngine): Promise<{ applied: num
let applied = 0;
for (const m of MIGRATIONS) {
if (m.version > current) {
// SQL migration (transactional)
if (m.sql) {
await engine.transaction(async (tx) => {
await tx.runMigration(m.version, m.sql);
});
// Pick SQL: engine-specific `sqlFor` wins over engine-agnostic `sql`.
const sql = m.sqlFor?.[engine.kind] ?? m.sql;
if (sql) {
const useTransaction = m.transaction !== false;
// Non-transactional path is Postgres-only: `CREATE INDEX CONCURRENTLY`
// refuses to run inside a transaction. PGLite has no concurrent
// writers, so even if a migration sets transaction:false we wrap it
// anyway (harmless; keeps behavior consistent).
if (useTransaction || engine.kind === 'pglite') {
await engine.transaction(async (tx) => {
await tx.runMigration(m.version, sql);
});
} else {
// Postgres + transaction:false → direct execution, no BEGIN/COMMIT.
await engine.runMigration(m.version, sql);
}
}
// Application-level handler (runs outside transaction for flexibility)
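
The runner's two decisions (which SQL to run, and whether to wrap it in a transaction) can be isolated as a pure function. A minimal sketch, assuming hypothetical names `MigrationLike` and `planMigration`; the real runner makes these decisions inline:

```typescript
type EngineKind = 'postgres' | 'pglite';

// Trimmed-down, hypothetical version of the Migration interface.
interface MigrationLike {
  sql: string;
  sqlFor?: { postgres?: string; pglite?: string };
  transaction?: boolean;
}

// Hypothetical helper: engine-specific SQL wins over the engine-agnostic
// `sql`, and only Postgres with transaction:false skips the wrapper.
function planMigration(m: MigrationLike, kind: EngineKind): { sql: string; wrap: boolean } {
  const sql = m.sqlFor?.[kind] ?? m.sql;
  const wrap = m.transaction !== false || kind === 'pglite';
  return { sql, wrap };
}

const indexMigration: MigrationLike = {
  sql: '',
  sqlFor: {
    postgres: 'CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_demo ON pages (updated_at DESC);',
    pglite: 'CREATE INDEX IF NOT EXISTS idx_demo ON pages (updated_at DESC);',
  },
  transaction: false,
};

console.log(planMigration(indexMigration, 'postgres'));
console.log(planMigration(indexMigration, 'pglite'));
```

Because the selection uses `??`, an empty-string entry in `sqlFor` still overrides `sql`; only a missing entry falls back.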


@@ -134,23 +134,34 @@ export class MinionQueue {
// 3. Insert child. Use ON CONFLICT for idempotency; if a concurrent submit
// raced past the fast-path SELECT, the unique index catches it here.
// v12 adds quiet_hours + stagger_key passed through from opts.
const insertSql = opts?.idempotency_key
? `INSERT INTO minion_jobs (name, queue, status, priority, data, max_attempts, backoff_type,
// v13 quiet_hours + stagger_key always present (null fallback; schema
// stores NULL). v15 max_stalled is conditional: provided values get
// clamped to [1, 100] and included in the INSERT; omitted values
// skip the column so the schema DEFAULT (5 as of v0.14.1) kicks in.
// Keeps the app layer from hardcoding the schema default constant.
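// Non-finite values (e.g. NaN from a malformed CLI flag) fall back to the schema default.
const hasMaxStalled = typeof opts?.max_stalled === 'number' && Number.isFinite(opts.max_stalled);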
const clampedMaxStalled = hasMaxStalled
? Math.max(1, Math.min(100, Math.floor(opts!.max_stalled as number)))
: null;
const baseCols = `name, queue, status, priority, data, max_attempts, backoff_type,
backoff_delay, backoff_jitter, delay_until, parent_job_id, on_child_fail,
depth, max_children, timeout_ms, remove_on_complete, remove_on_fail, idempotency_key,
quiet_hours, stagger_key)
VALUES ($1, $2, $3, $4, $5::jsonb, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19::jsonb, $20)
quiet_hours, stagger_key`;
const baseVals = `$1, $2, $3, $4, $5::jsonb, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19::jsonb, $20`;
const cols = hasMaxStalled ? `${baseCols}, max_stalled` : baseCols;
const vals = hasMaxStalled ? `${baseVals}, $21` : baseVals;
const insertSql = opts?.idempotency_key
? `INSERT INTO minion_jobs (${cols})
VALUES (${vals})
ON CONFLICT (idempotency_key) WHERE idempotency_key IS NOT NULL DO NOTHING
RETURNING *`
: `INSERT INTO minion_jobs (name, queue, status, priority, data, max_attempts, backoff_type,
backoff_delay, backoff_jitter, delay_until, parent_job_id, on_child_fail,
depth, max_children, timeout_ms, remove_on_complete, remove_on_fail, idempotency_key,
quiet_hours, stagger_key)
VALUES ($1, $2, $3, $4, $5::jsonb, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19::jsonb, $20)
: `INSERT INTO minion_jobs (${cols})
VALUES (${vals})
RETURNING *`;
const params = [
const params: unknown[] = [
jobName,
opts?.queue ?? 'default',
childStatus,
@@ -172,6 +183,7 @@ export class MinionQueue {
opts?.quiet_hours ?? null,
opts?.stagger_key ?? null,
];
if (hasMaxStalled) params.push(clampedMaxStalled);
const inserted = await tx.executeRaw<Record<string, unknown>>(insertSql, params);
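
The conditional-column pattern above keeps the column list, placeholder list, and parameter array in step. A minimal sketch, assuming a hypothetical `buildInsert` helper (not part of this PR) and omitting the `::jsonb` casts the real statement carries:

```typescript
// Hypothetical helper: appends max_stalled only when a value is supplied, so
// an omitted value lets the schema DEFAULT apply.
function buildInsert(
  baseCols: string[],
  baseParams: unknown[],
  maxStalled: number | null,
): { sql: string; params: unknown[] } {
  const cols = [...baseCols];
  const params = [...baseParams];
  if (maxStalled !== null) {
    cols.push('max_stalled');
    params.push(maxStalled);
  }
  // Placeholders are numbered from the final params array, so they cannot
  // drift out of sync with the column list.
  const placeholders = params.map((_, i) => `$${i + 1}`).join(', ');
  const sql = `INSERT INTO minion_jobs (${cols.join(', ')}) VALUES (${placeholders}) RETURNING *`;
  return { sql, params };
}

console.log(buildInsert(['name', 'queue'], ['noop', 'default'], null).sql);
console.log(buildInsert(['name', 'queue'], ['noop', 'default'], 5).sql);
```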


@@ -103,6 +103,12 @@ export interface MinionJobInput {
backoff_type?: BackoffType;
backoff_delay?: number;
backoff_jitter?: number;
/**
* Max number of stall windows before dead-letter. Default is the schema
* default (5 as of v0.13.1). Clamped to [1, 100] on insert — values
* outside that range are silently coerced. See migration v15.
*/
max_stalled?: number;
delay?: number; // ms delay before eligible
parent_job_id?: number;
on_child_fail?: ChildFailPolicy;
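
The `[1, 100]` clamp described in the `max_stalled` doc comment can be sketched as a pure function. `clampMaxStalled` is a hypothetical name, and the non-finite guard is an added assumption rather than something the doc comment states:

```typescript
// Hypothetical helper mirroring the clamp applied on insert.
// Returns null to mean "omit the column and let the schema DEFAULT apply".
function clampMaxStalled(value: number | null | undefined): number | null {
  if (value === undefined || value === null || !Number.isFinite(value)) {
    return null;
  }
  return Math.max(1, Math.min(100, Math.floor(value)));
}

console.log(clampMaxStalled(0));         // 1
console.log(clampMaxStalled(7.9));       // 7
console.log(clampMaxStalled(500));       // 100
console.log(clampMaxStalled(undefined)); // null
```

Returning `null` for omitted input, rather than a hardcoded 5, keeps the application layer from duplicating the schema default.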


@@ -24,6 +24,7 @@ import { validateSlug, contentHash, rowToPage, rowToChunk, rowToSearchResult } f
type PGLiteDB = PGlite;
export class PGLiteEngine implements BrainEngine {
readonly kind = 'pglite' as const;
private _db: PGLiteDB | null = null;
private _lock: LockHandle | null = null;
@@ -43,10 +44,32 @@ export class PGLiteEngine implements BrainEngine {
throw new Error('Could not acquire PGLite lock. Another gbrain process is using the database.');
}
this._db = await PGlite.create({
dataDir,
extensions: { vector, pg_trgm },
});
try {
this._db = await PGlite.create({
dataDir,
extensions: { vector, pg_trgm },
});
} catch (err) {
// v0.13.1: any PGlite.create() failure becomes actionable. Most commonly
// this is the macOS 26.3 WASM bug (#223). We deliberately do NOT suggest
// "missing migrations" as a cause — migrations run AFTER create(), so a
// create-time abort has nothing to do with them. Nest the original error
// message so debugging isn't erased.
const original = err instanceof Error ? err.message : String(err);
const wrapped = new Error(
`PGLite failed to initialize its WASM runtime.\n` +
` This is most commonly the macOS 26.3 WASM bug: https://github.com/garrytan/gbrain/issues/223\n` +
` Run \`gbrain doctor\` for a full diagnosis.\n` +
` Original error: ${original}`
);
// Release the lock so a fresh process can try again; leaking the lock
// here turns a recoverable init error into a stuck-brain state.
if (this._lock?.acquired) {
try { await releaseLock(this._lock); } catch { /* ignore cleanup error */ }
this._lock = null;
}
throw wrapped;
}
}
async disconnect(): Promise<void> {


@@ -185,7 +185,7 @@ CREATE TABLE IF NOT EXISTS minion_jobs (
backoff_delay INTEGER NOT NULL DEFAULT 1000,
backoff_jitter REAL NOT NULL DEFAULT 0.2,
stalled_counter INTEGER NOT NULL DEFAULT 0,
max_stalled INTEGER NOT NULL DEFAULT 3,
max_stalled INTEGER NOT NULL DEFAULT 5,
lock_token TEXT,
lock_until TIMESTAMPTZ,
delay_until TIMESTAMPTZ,


@@ -20,6 +20,7 @@ import * as db from './db.ts';
import { validateSlug, contentHash, rowToPage, rowToChunk, rowToSearchResult, parseEmbedding, tryParseEmbedding } from './utils.ts';
export class PostgresEngine implements BrainEngine {
readonly kind = 'postgres' as const;
private _sql: ReturnType<typeof postgres> | null = null;
// Instance connection (for workers) or fall back to module global (backward compat)


@@ -28,6 +28,8 @@ CREATE TABLE IF NOT EXISTS pages (
CREATE INDEX IF NOT EXISTS idx_pages_type ON pages(type);
CREATE INDEX IF NOT EXISTS idx_pages_frontmatter ON pages USING GIN(frontmatter);
CREATE INDEX IF NOT EXISTS idx_pages_trgm ON pages USING GIN(title gin_trgm_ops);
-- v0.13.1 #170: avoids 14.6s seqscan on large brains when listing pages newest-first.
CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC);
-- ============================================================
-- content_chunks: chunked content with embeddings
@@ -280,7 +282,7 @@ CREATE TABLE IF NOT EXISTS minion_jobs (
backoff_delay INTEGER NOT NULL DEFAULT 1000,
backoff_jitter REAL NOT NULL DEFAULT 0.2,
stalled_counter INTEGER NOT NULL DEFAULT 0,
max_stalled INTEGER NOT NULL DEFAULT 3,
max_stalled INTEGER NOT NULL DEFAULT 5,
lock_token TEXT,
lock_until TIMESTAMPTZ,
delay_until TIMESTAMPTZ,


@@ -24,6 +24,8 @@ CREATE TABLE IF NOT EXISTS pages (
CREATE INDEX IF NOT EXISTS idx_pages_type ON pages(type);
CREATE INDEX IF NOT EXISTS idx_pages_frontmatter ON pages USING GIN(frontmatter);
CREATE INDEX IF NOT EXISTS idx_pages_trgm ON pages USING GIN(title gin_trgm_ops);
-- v0.13.1 #170: avoids 14.6s seqscan on large brains when listing pages newest-first.
CREATE INDEX IF NOT EXISTS idx_pages_updated_at_desc ON pages (updated_at DESC);
-- ============================================================
-- content_chunks: chunked content with embeddings
@@ -276,7 +278,7 @@ CREATE TABLE IF NOT EXISTS minion_jobs (
backoff_delay INTEGER NOT NULL DEFAULT 1000,
backoff_jitter REAL NOT NULL DEFAULT 0.2,
stalled_counter INTEGER NOT NULL DEFAULT 0,
max_stalled INTEGER NOT NULL DEFAULT 3,
max_stalled INTEGER NOT NULL DEFAULT 5,
lock_token TEXT,
lock_until TIMESTAMPTZ,
delay_until TIMESTAMPTZ,


@@ -79,6 +79,112 @@ describe('migrations v8 + v9 — structural guard for helper-index fix', () => {
});
});
// v0.14.1 — fix wave structural assertions (migrations renumbered from v12/v13 to
// v14/v15 after master merged budget_ledger (v12) + minion_quiet_hours_stagger (v13)).
describe('migrate v14 — pages_updated_at_index (handler-based, engine-aware)', () => {
const v14 = MIGRATIONS.find(m => m.version === 14);
test('v14 exists and uses a handler (not pure SQL) for engine-aware branching', () => {
expect(v14).toBeDefined();
expect(v14!.name).toBe('pages_updated_at_index');
expect(typeof v14!.handler).toBe('function');
expect(v14!.sql).toBe('');
});
test('v14 handler source contains CONCURRENTLY + invalid-index cleanup for Postgres branch', async () => {
const { readFileSync } = await import('fs');
const src = readFileSync('src/core/migrate.ts', 'utf-8');
const v14Start = src.indexOf("name: 'pages_updated_at_index'");
expect(v14Start).toBeGreaterThan(-1);
const v14Block = src.slice(v14Start, v14Start + 3000);
expect(v14Block).toContain('pg_index');
expect(v14Block).toContain('indisvalid');
expect(v14Block).toContain('DROP INDEX CONCURRENTLY IF EXISTS idx_pages_updated_at_desc');
expect(v14Block).toContain('CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_pages_updated_at_desc');
// Order within the handler body: DROP IF EXISTS must precede CREATE IF NOT EXISTS,
// so a failed prior CONCURRENTLY build is cleaned before re-create. Anchor on the
// explicit "IF EXISTS" / "IF NOT EXISTS" phrases so the header doc-comment
// (which mentions both unqualified) doesn't fool the ordering assertion.
const dropIdx = v14Block.indexOf('DROP INDEX CONCURRENTLY IF EXISTS');
const createIdx = v14Block.indexOf('CREATE INDEX CONCURRENTLY IF NOT EXISTS');
expect(dropIdx).toBeLessThan(createIdx);
expect(v14Block).toContain('engine.kind');
});
});
describe('migrate v15 — minion_jobs_max_stalled_default_5', () => {
const v15 = MIGRATIONS.find(m => m.version === 15);
test('v15 exists and alters max_stalled default to 5', () => {
expect(v15).toBeDefined();
expect(v15!.name).toBe('minion_jobs_max_stalled_default_5');
expect(v15!.sql).toContain('ALTER TABLE minion_jobs ALTER COLUMN max_stalled SET DEFAULT 5');
});
test('v15 backfill UPDATE targets the correct non-terminal statuses', () => {
const sql = v15!.sql;
expect(sql).toContain(`'waiting'`);
expect(sql).toContain(`'active'`);
expect(sql).toContain(`'delayed'`);
expect(sql).toContain(`'waiting-children'`);
expect(sql).toContain(`'paused'`);
expect(sql).not.toContain(`'completed'`);
expect(sql).not.toContain(`'dead'`);
expect(sql).not.toContain(`'cancelled'`);
expect(sql).not.toContain(`'claimed'`);
expect(sql).not.toContain(`'running'`);
expect(sql).not.toContain(`'stalled'`);
});
test('v15 UPDATE clause has the < 5 guard so idempotent re-runs are no-ops', () => {
expect(v15!.sql).toContain('max_stalled < 5');
});
});
describe('migrate — runner behavioral (v14 handler + v15 backfill)', () => {
let engine: PGLiteEngine;
beforeAll(async () => {
engine = new PGLiteEngine();
await engine.connect({});
await engine.initSchema();
});
afterAll(async () => {
await engine.disconnect();
});
test('v14 created idx_pages_updated_at_desc on PGLite via handler branch', async () => {
const rows = await (engine as any).db.query(
`SELECT indexname FROM pg_indexes WHERE indexname = 'idx_pages_updated_at_desc'`
);
expect(rows.rows.length).toBe(1);
});
test('v15 backfilled any max_stalled=1 rows (smoke: schema default is 5)', async () => {
await (engine as any).db.exec(
`INSERT INTO minion_jobs (name, queue, status, max_stalled) VALUES ('test', 'default', 'waiting', 1)`
);
await (engine as any).db.exec(
`UPDATE minion_jobs SET max_stalled = 5
WHERE status IN ('waiting','active','delayed','waiting-children','paused')
AND max_stalled < 5`
);
const rows = await (engine as any).db.query(
`SELECT max_stalled FROM minion_jobs WHERE name = 'test'`
);
expect((rows.rows[0] as any).max_stalled).toBe(5);
await (engine as any).db.exec(
`UPDATE minion_jobs SET max_stalled = 5
WHERE status IN ('waiting','active','delayed','waiting-children','paused')
AND max_stalled < 5`
);
const rows2 = await (engine as any).db.query(
`SELECT max_stalled FROM minion_jobs WHERE name = 'test'`
);
expect((rows2.rows[0] as any).max_stalled).toBe(5);
});
});
describe('migrate: v8 (links_dedup) regression — must be fast on 1K duplicate rows', () => {
let engine: PGLiteEngine;


@@ -88,16 +88,20 @@ describe('Bug 5 — Phase B host-work entry dedup', () => {
});
describe('Bug 8 — max_stalled default bumped in schema files', () => {
-test('schema-embedded.ts has max_stalled DEFAULT 3', async () => {
+// v0.14.2 bumped schema default 1 -> 3 via Bug 8. v0.14.3 (#219 fix wave) further
+// bumps to 5 for extra flaky-deploy headroom, plus adds UPDATE backfill of
+// non-terminal rows via migration v15. These structural assertions track the
+// current schema source state (not historical).
+test('schema-embedded.ts has max_stalled DEFAULT 5', async () => {
 const source = await Bun.file(new URL('../src/core/schema-embedded.ts', import.meta.url)).text();
-expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 3');
+expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 5');
 });
-test('pglite-schema.ts has max_stalled DEFAULT 3', async () => {
+test('pglite-schema.ts has max_stalled DEFAULT 5', async () => {
 const source = await Bun.file(new URL('../src/core/pglite-schema.ts', import.meta.url)).text();
-expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 3');
+expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 5');
 });
-test('schema.sql has max_stalled DEFAULT 3', async () => {
+test('schema.sql has max_stalled DEFAULT 5', async () => {
 const source = await Bun.file(new URL('../src/schema.sql', import.meta.url)).text();
-expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 3');
+expect(source).toContain('max_stalled INTEGER NOT NULL DEFAULT 5');
});
});


@@ -270,6 +270,110 @@ describe('MinionQueue: Stall Detection', () => {
});
});
// --- v0.13.1 #219 — max_stalled default + input surface ---
describe('MinionQueue: v0.13.1 max_stalled schema default (#219)', () => {
test('job submitted with no explicit max_stalled uses schema default of 5', async () => {
const job = await queue.add('noop', {});
expect(job.max_stalled).toBe(5);
});
test('default=5 rescues across 4 consecutive stalls, dead-letters on the 5th', async () => {
const job = await queue.add('noop', {});
// Job starts at max_stalled=5 (schema default).
for (let i = 0; i < 4; i++) {
await queue.claim(`tok-${i}`, 30000, 'default', ['noop']);
await engine.executeRaw(
"UPDATE minion_jobs SET lock_until = now() - interval '1 second' WHERE id = $1",
[job.id]
);
const { requeued, dead } = await queue.handleStalled();
expect(dead.length).toBe(0);
expect(requeued.length).toBe(1);
expect(requeued[0].stalled_counter).toBe(i + 1);
}
// 5th stall dead-letters: the handleStalled gate is stalled_counter + 1 >= max_stalled.
// With stalled_counter now at 4, the next stall gives 4 + 1 = 5 >= 5, so the job goes dead.
await queue.claim('tok-final', 30000, 'default', ['noop']);
await engine.executeRaw(
"UPDATE minion_jobs SET lock_until = now() - interval '1 second' WHERE id = $1",
[job.id]
);
const { dead } = await queue.handleStalled();
expect(dead.length).toBe(1);
expect(dead[0].status).toBe('dead');
});
});
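// Illustrative sketch only, not the real handleStalled: a pure model of the gate
// the test above exercises, assuming it is `stalled_counter + 1 >= max_stalled`.
// `stallOutcome` and `stallsUntilDead` are hypothetical names introduced here.
function stallOutcome(stalledCounter: number, maxStalled: number): 'requeued' | 'dead' {
  // The counter is incremented first; reaching the ceiling dead-letters the job.
  return stalledCounter + 1 >= maxStalled ? 'dead' : 'requeued';
}
// Counts how many stalls a fresh job takes before it is dead-lettered.
// Assumes maxStalled is a finite number; otherwise the loop would not end.
function stallsUntilDead(maxStalled: number): number {
  let counter = 0;
  let stalls = 0;
  while (true) {
    stalls++;
    if (stallOutcome(counter, maxStalled) === 'dead') return stalls;
    counter++;
  }
}
// Under this model the old default of 1 dead-letters on the very first stall,
// while the new default of 5 requeues four times and dead-letters on the fifth.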
describe('MinionQueue: v0.13.1 MinionJobInput.max_stalled plumbing', () => {
test('honored end-to-end when provided', async () => {
const job = await queue.add('noop', {}, { max_stalled: 10 });
expect(job.max_stalled).toBe(10);
});
test('clamps input > 100 to 100', async () => {
const job = await queue.add('noop', {}, { max_stalled: 9999 });
expect(job.max_stalled).toBe(100);
});
test('clamps input < 1 to 1', async () => {
const job = await queue.add('noop', {}, { max_stalled: 0 });
expect(job.max_stalled).toBe(1);
});
test('clamps negative input to 1', async () => {
const job = await queue.add('noop', {}, { max_stalled: -5 });
expect(job.max_stalled).toBe(1);
});
test('non-integer inputs are floored before clamp', async () => {
const job = await queue.add('noop', {}, { max_stalled: 7.9 });
expect(job.max_stalled).toBe(7);
});
test('undefined leaves schema default intact (5)', async () => {
const job = await queue.add('noop', {}, { max_stalled: undefined });
expect(job.max_stalled).toBe(5);
});
});
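// Illustrative sketch only, not the actual MinionQueue code: one way the
// floor-then-clamp behavior asserted above could be written. `normalizeMaxStalled`
// is a hypothetical helper; the [1, 100] bounds are taken from the tests above.
function normalizeMaxStalled(input: number | undefined): number | undefined {
  // undefined means "omit the column so the schema default applies".
  if (input === undefined) return undefined;
  // Floor first so 7.9 becomes 7, then clamp into [1, 100].
  return Math.min(100, Math.max(1, Math.floor(input)));
}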
describe('MinionQueue: v0.13.1 live-queue rescue regression (#219)', () => {
test('a row at max_stalled=1 is rescued by v15 backfill', async () => {
// Simulate a pre-v0.13.1 brain that inserted a row at the old default.
const job = await queue.add('noop', {});
await engine.executeRaw('UPDATE minion_jobs SET max_stalled = 1 WHERE id = $1', [job.id]);
// Run the v15 backfill UPDATE directly (matches the migrate.ts v15 body).
await engine.executeRaw(
`UPDATE minion_jobs SET max_stalled = 5
WHERE status IN ('waiting','active','delayed','waiting-children','paused')
AND max_stalled < 5`
);
const refetched = await queue.getJob(job.id);
expect(refetched!.max_stalled).toBe(5);
});
test('backfill does not touch terminal-status rows', async () => {
const job = await queue.add('noop', {});
// Mark completed and set max_stalled=1 (simulating historical data).
await engine.executeRaw(
`UPDATE minion_jobs SET status = 'completed', max_stalled = 1, finished_at = now() WHERE id = $1`,
[job.id]
);
await engine.executeRaw(
`UPDATE minion_jobs SET max_stalled = 5
WHERE status IN ('waiting','active','delayed','waiting-children','paused')
AND max_stalled < 5`
);
const refetched = await queue.getJob(job.id);
// Terminal rows intentionally untouched; historical data stays as-is.
expect(refetched!.max_stalled).toBe(1);
});
});
// --- Dependencies (5 tests) ---
describe('MinionQueue: Dependencies', () => {


@@ -891,3 +891,40 @@ describe('PGLiteEngine: getHealth graph metrics', () => {
expect(h2.orphan_pages).toBe(1);
});
});
// ─────────────────────────────────────────────────────────────────
// v0.13.1 — PGlite.create() error-wrap (structural guard for #223)
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: v0.13.1 error-wrap on connect() (#223)', () => {
test('pglite-engine.ts source contains the wrap with #223 hint and nested original error', async () => {
const { readFileSync } = await import('fs');
const src = readFileSync('src/core/pglite-engine.ts', 'utf-8');
// Structural: the try/catch block must wrap PGlite.create() (the actual
// abort site, NOT engine-factory.ts). The error message must name the
// issue and suggest gbrain doctor. Must NOT suggest "missing migrations"
// as a cause (that was conflating #218 and #223 — migrations run AFTER
// create()).
expect(src).toContain('this._db = await PGlite.create');
expect(src).toContain('https://github.com/garrytan/gbrain/issues/223');
expect(src).toContain('gbrain doctor');
expect(src).toContain('Original error:');
// Regression guard: the user-visible error MESSAGE must not re-introduce
// the misleading "missing migrations" hint. (A source comment explaining
// *why* we removed it is fine — match only inside the wrapped Error body.)
const wrapStart = src.indexOf('const wrapped = new Error(');
expect(wrapStart).toBeGreaterThan(-1);
const wrapEnd = src.indexOf(');', wrapStart);
const errBody = src.slice(wrapStart, wrapEnd);
expect(errBody).not.toContain('missing migrations');
expect(errBody).not.toContain('apply-migrations');
});
});
// ─────────────────────────────────────────────────────────────────
// v0.13.1 — Engine kind discriminator
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: v0.13.1 kind discriminator', () => {
test('exposes readonly kind = pglite', () => {
expect(engine.kind).toBe('pglite');
});
});