gbrain/scripts/run-e2e.sh
Garry Tan ff10796a00 fix(wave): v0.15.1 - 4 hot issues + scope expansion (#248)
* fix(wave): 4 hot issues + 3 scope expansions (v0.13.1)

Addresses four user-filed regressions after v0.13.0 plus three adjacent
footgun closures.

* #170 — CREATE INDEX [CONCURRENTLY] IF NOT EXISTS idx_pages_updated_at_desc
  on pages (updated_at DESC). Engine-aware migration v12 with invalid-index
  cleanup on Postgres, plain CREATE on PGLite. ~700x on 30k+ row brains.
  Contributed by @fuleinist (#215).

* #219 — Minions schema default max_stalled 1 -> 5. v13 migration ALTERs
  the default and UPDATEs existing non-terminal rows (waiting/active/
  delayed/waiting-children/paused) so live queues get rescued on upgrade.
  Adds MinionJobInput.max_stalled with [1,100] clamp. New --max-stalled
  CLI flag on `jobs submit`. Reported by @macbotmini-eng.

* #218 — package.json postinstall surfaces errors instead of silencing.
  trustedDependencies whitelists @electric-sql/pglite. doctor
  schema_version check fails loudly when migrations never ran and links
  to #218. README + INSTALL_FOR_AGENTS warn against `bun install -g`.
  Reported by @gopalpatel.

* #223 — @electric-sql/pglite pinned to exactly 0.4.3 (was ^0.4.4).
  PGLiteEngine.connect() wraps PGlite.create() errors with a message
  pointing at the issue + gbrain doctor. Does NOT suggest 'missing
  migrations' as a cause (create-time abort happens before migrations
  run). Pin is unverified against macOS 26.3; error-wrap is the safety
  net. Reported by @AndreLYL.

* Scope: `gbrain jobs submit` gains --backoff-type/--backoff-delay/
  --backoff-jitter/--timeout-ms/--idempotency-key (MinionJobInput audit).
* Scope: `gbrain jobs smoke --sigkill-rescue` regression case (opt-in,
  CI-only) that simulates a killed worker and asserts the new default
  rescues.
* Scope: `gbrain doctor --index-audit` reports zero-scan Postgres indexes
  as drop candidates (informational; no auto-drop).

Infrastructure:
* Migration interface extended with sqlFor: { postgres?, pglite? } and
  transaction: boolean. Runner picks the engine-specific branch and
  bypasses engine.transaction() when transaction:false (required for
  CONCURRENTLY). BrainEngine.kind readonly discriminator added.
* scripts/check-jsonb-pattern.sh CI guard extended to block
  `max_stalled DEFAULT 1` from regressing.

Tests:
* 15 new unit tests: v12/v13 structural + behavioral assertions,
  max_stalled default/clamp/backfill, PGLite error-wrap source guard,
  engine kind discriminator.
* 3 regression tests pinned by IRON RULE.
* Full unit suite: 1416 pass.
* Full E2E suite against Postgres 16 + pgvector: 126 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync documentation for v0.13.1

CLAUDE.md "Key files" and "Commands" sections refreshed to match the
v0.13.1 fix wave:

- Note `BrainEngine.kind` discriminator on engine.ts
- Document v0.13.1 connect() error-wrap on pglite-engine.ts
- Refresh src/core/minions/ layout (no shell handler, no protected-names,
  no quiet-hours/stagger — that was v0.13-development scaffolding that
  did not ship)
- Add src/core/migrate.ts entry with `Migration` interface extensions
  (`sqlFor`, `transaction: false`)
- Document new `gbrain jobs submit` flags (--max-stalled, --backoff-type,
  --backoff-delay, --backoff-jitter, --timeout-ms, --idempotency-key)
- Document `gbrain jobs smoke --sigkill-rescue` regression guard
- Document `gbrain doctor --index-audit` and the schema_version=0
  surface that catches #218 postinstall failures
- Extend check-jsonb-pattern.sh note with the max_stalled DEFAULT 1
  regression guard
- Touch up test file blurbs for migrate.test.ts, pglite-engine.test.ts,
  minions.test.ts with v0.13.1 coverage

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): run files sequentially to eliminate shared-DB race

The E2E suite was flaky. ~3 of every 5 runs had 4-10 failures clustered
in Links, Timeline, Versions, Minions resilience, Parallel Import, and
Page CRUD tests. Symptoms included "expected 16 pages, got 8" (half),
"expected 1 link inserted, got 0", timeline entries missing after
round-trip, and similar data-shape mismatches.

Root cause: bun test runs test FILES in parallel (each in a worker
process). 13 E2E files share one DATABASE_URL, and `setupDB()` in
`test/e2e/helpers.ts` does `TRUNCATE ... CASCADE` on all tables before
each file's `importFixtures()`. File A's TRUNCATE would race with file
B's in-flight INSERT stream, producing the observed half-populated or
wrong-count states.
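
The interleaving described above can be replayed deterministically. The sketch below is not the project's harness: a temp file stands in for a shared table, and the hypothetical helpers `insert_rows`, `truncate_table`, and `count_rows` stand in for the fixture import, `TRUNCATE ... CASCADE`, and a row count. It replays one possible ordering, in which file A's truncate lands halfway through file B's 16-row import.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins: a temp file plays the shared table, one line per row.
insert_rows() {    # usage: insert_rows <table-file> <count>
  local table=$1 count=$2 i
  for ((i = 0; i < count; i++)); do
    echo "page" >> "$table"
  done
}

truncate_table() { # usage: truncate_table <table-file>
  : > "$1"
}

count_rows() {     # usage: count_rows <table-file>
  wc -l < "$1" | tr -d ' '
}

# One possible parallel ordering: A's truncate lands mid-import for B.
simulate_interleaved() {
  local table
  table=$(mktemp)
  truncate_table "$table"   # file B: setup truncates
  insert_rows "$table" 8    # file B: first half of a 16-row import
  truncate_table "$table"   # file A: setup truncates while B is importing
  insert_rows "$table" 8    # file B: second half of the import
  count_rows "$table"
  rm -f "$table"
}

# Sequential ordering: A's truncate completes before B starts.
simulate_sequential() {
  local table
  table=$(mktemp)
  truncate_table "$table"   # file A: setup truncates and finishes
  truncate_table "$table"   # file B: setup truncates
  insert_rows "$table" 16   # file B: full import, uninterrupted
  count_rows "$table"
  rm -f "$table"
}

echo "interleaved: $(simulate_interleaved) rows"
echo "sequential:  $(simulate_sequential) rows"
```

The interleaved replay leaves 8 rows and the sequential one leaves 16, matching the "expected 16 pages, got 8" symptom. A real race would not always split the import exactly in half; the ordering here is fixed only to keep the illustration repeatable.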

An earlier attempt used a Postgres advisory lock held on a dedicated
single-connection client for the lifetime of each file's run. It broke
because bun's default 5000 ms hook timeout fires on queued beforeAll()
calls: with 13 files serializing through the lock, files 2-13 would
time out waiting for file 1 to finish.
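
The timeout failure follows from simple queueing arithmetic. A rough sketch, assuming all files start together and each one holds the lock for a fixed 7000 ms (an illustrative figure, not a measurement; only the 5000 ms hook timeout and the 13-file count come from the text above), with hypothetical helper names:

```bash
#!/usr/bin/env bash
set -euo pipefail

HOOK_TIMEOUT_MS=5000   # bun's default hook timeout, per the text above
PER_FILE_MS=7000       # assumed lock hold time per file (illustrative only)
FILE_COUNT=13

# Time the Nth file (1-based) spends queued behind earlier lock holders.
queue_wait_ms() {  # usage: queue_wait_ms <position> <per-file-ms>
  echo $(( ($1 - 1) * $2 ))
}

# Succeeds when the queued wait exceeds the hook timeout.
times_out() {      # usage: times_out <position> <per-file-ms>
  [ "$(queue_wait_ms "$1" "$2")" -gt "$HOOK_TIMEOUT_MS" ]
}

for ((n = 1; n <= FILE_COUNT; n++)); do
  if times_out "$n" "$PER_FILE_MS"; then
    status="beforeAll times out"
  else
    status="acquires lock in time"
  fi
  echo "file $n: waits $(queue_wait_ms "$n" "$PER_FILE_MS") ms, $status"
done
```

Under these assumptions only the first file acquires the lock before the hook timeout; every later file waits at least one full file duration, which already exceeds 5000 ms.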

This commit switches to sequential file execution at the harness level
via scripts/run-e2e.sh, which loops through test/e2e/*.test.ts one at
a time, tracks aggregate pass/fail counts, keeps going after a failing
file so every failure is reported, and exits non-zero at the end if any
file failed. No lock, no timeout issues, no changes to any test file.
package.json test:e2e points at the new script.

Verified: 5 back-to-back runs against the same Postgres container,
each completing in ~5 min. Every run: 13 files, 138 tests, 0 fails.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version to 0.15.1 (fix wave locked to MINOR line)

Master v0.14.2 was the last /investigate root-cause wave on the
v0.14.x line. This fix wave opens v0.15.x: four hot issues (#170,
#218, #219, #223) close v0.13.x regressions that v0.14.x didn't
cover, so the MINOR bump reflects the semantic shift — new schema
migrations (v14, v15), a new CLI surface (`--max-stalled`,
`--sigkill-rescue`, `--index-audit`), a new BrainEngine contract
(`kind` discriminator + extended `Migration` interface), and a new
install-time contract (PGLite 0.4.3 pin + `trustedDependencies`).

Locked to 0.15.1 in advance: other work may land before/after this
PR, but the version is fixed so reviewers can cite a stable number.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 13:19:23 -07:00

#!/usr/bin/env bash
# Run E2E tests ONE FILE AT A TIME.
#
# Bun's default is to run test files in parallel (each in its own worker).
# Our E2E suite shares one Postgres database across all 13 files, and
# `setupDB()` does TRUNCATE CASCADE + fixture import. When files run in
# parallel, file A's TRUNCATE can race with file B's fixture import,
# producing failures like "expected 16 pages, got 8", missing links,
# orphaned timeline entries, etc. The flakiness showed up on ~3 of
# every 5 runs pre-fix.
#
# Running files sequentially eliminates the race entirely. The cost is
# startup overhead, since each file spins up a fresh bun process: roughly
# 1-2s per file, small next to the natural per-file test time of 5-10s.
#
# Every file runs even after a failure so all failures are visible; the
# script exits non-zero at the end if any file failed.
set -euo pipefail
cd "$(dirname "$0")/.."
pass_files=0
fail_files=0
fail_list=()
total_pass=0
total_fail=0
for f in test/e2e/*.test.ts; do
  name=$(basename "$f")
  echo ""
  echo "=== $name ==="
  if output=$(bun test "$f" 2>&1); then
    pass_files=$((pass_files + 1))
    # Extract pass/fail counts from bun's summary (e.g., "123 pass")
    p=$(echo "$output" | grep -oE '[0-9]+ pass' | tail -1 | grep -oE '[0-9]+' || echo 0)
    total_pass=$((total_pass + p))
    echo "$output" | tail -8
  else
    fail_files=$((fail_files + 1))
    fail_list+=("$name")
    p=$(echo "$output" | grep -oE '[0-9]+ pass' | tail -1 | grep -oE '[0-9]+' || echo 0)
    fl=$(echo "$output" | grep -oE '[0-9]+ fail' | tail -1 | grep -oE '[0-9]+' || echo 0)
    total_pass=$((total_pass + p))
    total_fail=$((total_fail + fl))
    echo "$output"
    echo ""
    echo "FAILED: $name"
    # Continue so we see all failures; exit nonzero at the end.
  fi
done
echo ""
echo "========================================"
echo "E2E SUMMARY (sequential execution)"
echo "========================================"
echo "Files: $((pass_files + fail_files)) total, $pass_files passed, $fail_files failed"
echo "Tests: $total_pass passed, $total_fail failed"
if [ ${#fail_list[@]} -gt 0 ]; then
  echo ""
  echo "Failing files:"
  for f in "${fail_list[@]}"; do
    echo " - $f"
  done
  exit 1
fi
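
The count extraction used in the loop can be exercised on its own. A minimal sketch, assuming a summary shaped like the "123 pass" example in the script comment; `extract_count` is a hypothetical helper that is not part of run-e2e.sh, and the sample text is made up for illustration:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print the last "<N> <label>" count found on stdin, or 0 if none is present.
extract_count() {  # usage: extract_count <pass|fail>
  grep -oE "[0-9]+ $1" | tail -1 | grep -oE '[0-9]+' || echo 0
}

# Made-up sample shaped like a test-runner summary.
sample=$' 12 pass\n 1 fail'

echo "pass: $(printf '%s\n' "$sample" | extract_count pass)"
echo "fail: $(printf '%s\n' "$sample" | extract_count fail)"
echo "none: $(printf 'no summary here\n' | extract_count pass)"
```

The trailing `|| echo 0` matters under `set -euo pipefail`: when no summary line is present the grep pipeline fails, and the fallback supplies a 0 so the arithmetic that follows still gets a number and the script does not abort.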