gbrain/test/extract-fs.test.ts
Garry Tan 96178d726e fix(subagent): v0.16.3 — bind Anthropic SDK correctly + enable tsc in CI (#318)
* fix(subagent): bind Anthropic SDK messages.create() correctly

The makeSubagentHandler was casting `new Anthropic()` directly to
MessagesClient, but MessagesClient.create() maps to sdk.messages.create(),
not sdk.create(). Every subagent job immediately died with:

  client.create is not a function

Fix: wrap the SDK instance so .create() delegates to .messages.create()
with proper `this` binding via .bind(sdk.messages).
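
A minimal sketch of the wrapper approach described above. It uses hypothetical stand-in classes (`StubSdk`, `StubMessages`) and a simplified `MessagesClient` shape rather than the real Anthropic SDK; only the idea of delegating `.create()` through `.bind(sdk.messages)` comes from this commit message.

```typescript
// Hypothetical, simplified shapes; the real SDK types are richer and async.
interface MessagesClient {
  create(params: { prompt: string }): string;
}

class StubMessages {
  private readonly prefix = 'echo:';
  // Depends on `this`, as a real SDK resource method would.
  create(params: { prompt: string }): string {
    return `${this.prefix}${params.prompt}`;
  }
}

class StubSdk {
  readonly messages = new StubMessages();
}

// Wrap the SDK so `.create()` delegates to `.messages.create()`,
// with `this` pinned to `sdk.messages`.
function wrapSdk(sdk: StubSdk): MessagesClient {
  return { create: sdk.messages.create.bind(sdk.messages) };
}

const stubSdk = new StubSdk();
const wrappedReply = wrapSdk(stubSdk).create({ prompt: 'hi' });

// For contrast: a detached, unbound method loses `this` and throws,
// because class bodies always run in strict mode.
const detached = stubSdk.messages.create;
let detachedThrew = false;
try {
  detached({ prompt: 'hi' });
} catch {
  detachedThrew = true;
}
```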

Discovered on first production run of gbrain agent against Supabase.

Co-Authored-By: Wintermute <wintermute@openclaw.ai>

* chore(ci): add typescript typecheck to test pipeline + clean up baseline errors

Root-cause infra gap that let the v0.16.0 subagent bug ship: CI ran
only `bun test`, which transpiles TypeScript without type-checking it.
Type errors surfaced only at runtime, in production.

Changes:
- Add `typescript` devDep and a `typecheck` npm script (`tsc --noEmit`).
- Chain `bun run typecheck` into `bun run test` so developers get the
  same pipeline locally that CI runs.
- Flip `.github/workflows/test.yml` to invoke `bun run test` (the npm
  script, including typecheck) instead of `bun test` (runner only).
- Clean up 100+ pre-existing type errors across 30+ files so the first
  run of `tsc --noEmit` is green. Root causes were:
  - `databaseUrl` → `database_url` rename drift in test fixtures (9 files)
  - `PageType` union missing `'meeting'` / `'note'` entries that are
    already used in both src and tests (link-extraction.ts comments
    acknowledged the gap)
  - `GBrainConfig.storage` field never declared despite being read in
    files.ts and operations.ts
  - `ErrorCode` union missing `'permission_denied'`
  - `OrchestratorOpts` shape changed; test callers not updated
  - Dead-code comparisons in migration orchestrators against narrowed
    status types
  - postgres.js `Row`-callback type drift on several `.map()` calls
  - Buffer-as-BodyInit assignment in supabase.ts (real but non-fatal
    runtime bug; Uint8Array slice works and is type-correct)
  - Various `as X` single-step casts that now need `as unknown as X`
    per TS's stricter structural-conversion rules
- Bump `beforeAll` hook timeout to 30s on four PGLite-heavy tests that
  were flaky under parallel test execution: wait-for-completion,
  extract-fs, e2e/search-quality, e2e/graph-quality. All pass in
  isolation; timeouts only happened when dozens of PGLite instances
  init'd simultaneously.
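
The last root cause in the list above (single-step `as X` casts that now need `as unknown as X`) can be illustrated with a small sketch. `EngineLike` and `stub` are hypothetical names chosen for the example, not types from this repository.

```typescript
// Hypothetical interface standing in for whatever type a test wants to fake.
interface EngineLike {
  query(sql: string): string[];
}

const stub = { calls: 0 };

// A single-step cast is rejected because the two shapes do not overlap:
//   const engine = stub as EngineLike; // error TS2352
// Going through `unknown` tells the compiler the conversion is deliberate.
const stubEngine = stub as unknown as EngineLike;

// The cast changes only the static type; the runtime object is untouched.
const sameObject = (stubEngine as unknown) === stub;
```

The double cast silences the checker without adding any runtime behavior, so it suits test doubles better than production code.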

The new CI pipeline now fails on any type error across src/ or test/,
giving us the compile-time regression guard the subagent fix depends on.

* fix(subagent): bind Anthropic SDK messages.create() correctly

Shipped bug: v0.16.0 cast `new Anthropic()` to `MessagesClient`, but
`.create()` lives at `sdk.messages.create`, not on the top-level client.
Every subagent job in production died on first LLM call with
`client.create is not a function`. Discovered on the first `gbrain agent
run` against Supabase.

Fix: assign `sdk.messages` directly to the `MessagesClient` slot.
`sdk.messages` IS the object with a callable `.create()`; the original
bug was picking the wrong entry point on the SDK. No helper, no
wrapper, no `.bind()` — JS method-call semantics preserve `this` at
the call site because `subagent.ts:336` invokes `client.create(...)`
with `client === sdk.messages`.

The one-line assignment also typechecks cleanly against the existing
`MessagesClient` interface (SDK's first `create` overload:
`(MessageCreateParamsNonStreaming, Core.RequestOptions?) =>
APIPromise<Message>` is assignable structurally). This gives us
compile-time regression protection: anyone reverting to
`new Anthropic()` would fail tsc because `Anthropic` has no top-level
`.create`. (The companion chore commit puts `tsc --noEmit` in CI so
this guard is enforced.)
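
A minimal sketch of both points above: direct assignment keeps `this` intact, and the type checker rejects the original mistake. `FakeSdk` and `FakeMessages` are hypothetical stand-ins, and the `MessagesClient` shape is simplified from whatever the real interface declares.

```typescript
// Hypothetical, simplified shapes; names are illustrative only.
interface MessagesClient {
  create(params: { prompt: string }): string;
}

class FakeMessages {
  private readonly prefix = 'echo:';
  create(params: { prompt: string }): string {
    return `${this.prefix}${params.prompt}`;
  }
}

class FakeSdk {
  readonly messages = new FakeMessages();
}

const sdk = new FakeSdk();

// Direct assignment: `client === sdk.messages`, so `client.create(...)`
// is an ordinary method call and `this` is `sdk.messages`.
const client: MessagesClient = sdk.messages;
const reply = client.create({ prompt: 'hi' });

// Reverting to the top-level object is rejected at compile time,
// because `FakeSdk` has no top-level `create`.
// @ts-expect-error Property 'create' is missing in type 'FakeSdk'.
const wrong: MessagesClient = sdk;
```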

Also adds a `makeAnthropic?: () => Anthropic` dep-injection seam so
the factory default construction branch is testable without real API
calls. Regression test drives one handler turn through a fake SDK,
asserting `sdk.messages.create` is actually called. If someone later
reverts to `new Anthropic()`, both guards fire: tsc fails AND the test
fails.
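
The dependency-injection seam could look roughly like the sketch below. Everything here is an assumption made for illustration: `SdkLike` stands in for `Anthropic`, `makeHandler` for `makeSubagentHandler`, and the real handler does far more than forward a prompt.

```typescript
// Hypothetical, simplified shapes; not the repository's actual types.
interface AsyncMessagesClient {
  create(params: { prompt: string }): Promise<string>;
}

interface SdkLike {
  messages: AsyncMessagesClient;
}

interface HandlerDeps {
  // Optional factory; tests pass a fake, production omits it.
  makeAnthropic?: () => SdkLike;
}

function defaultMakeSdk(): SdkLike {
  // The real default would construct the SDK; omitted in this sketch.
  throw new Error('real SDK construction is out of scope for this sketch');
}

function makeHandler(deps: HandlerDeps = {}) {
  const sdkInstance = (deps.makeAnthropic ?? defaultMakeSdk)();
  const messages: AsyncMessagesClient = sdkInstance.messages;
  return (prompt: string) => messages.create({ prompt });
}

// Fake SDK that records every call instead of hitting a network.
const recordedPrompts: string[] = [];
const fakeSdk: SdkLike = {
  messages: {
    create: async (params) => {
      recordedPrompts.push(params.prompt);
      return 'ok';
    },
  },
};

const handler = makeHandler({ makeAnthropic: () => fakeSdk });
const pendingReply = handler('hello');
```

Because the fake records the prompt before its first `await`, the test can assert on `recordedPrompts` without any real API call.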

Co-Authored-By: Wintermute <wintermute@garrytan.com>

* chore(tests): add bunfig.toml + 60s hook timeouts to stabilize PGLite-heavy suites

After turning on tsc in CI (previous commit), running the full `bun run test`
suite in one shot triggered flaky `beforeEach/afterEach hook timed out`
failures on 8+ test files. Every failure traced to PGLite WASM init
contention when many test files spin up fresh PGLite instances in parallel;
each one alone passes in isolation.

- `bunfig.toml` sets the global test hook timeout to 60s (default is 5s),
  covering every test file without per-file edits.
- Individual `beforeAll(fn, 60_000)` / `beforeEach(fn, 15_000)` calls on
  the 8 tests that flaked most stay in place as explicit safety nets so
  a future bunfig config change doesn't silently re-introduce the flake.

Result: 1997 pass, 0 fail on `bun run test` (117 tests added since the
prior baseline by picking up typecheck-gated passes). No infrastructure
flake tolerated in CI.

* chore: bump version and changelog (v0.16.3)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Wintermute <wintermute@openclaw.ai>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:34:22 -07:00

/**
 * Tests for `gbrain extract --source fs` (the default, FS-walking path).
 *
 * Companion to test/extract-db.test.ts. Specifically guards against the
 * v0.12.0 N+1 hang: extractLinksFromDir / extractTimelineFromDir used to
 * pre-load the entire dedup set with one engine.getLinks() per page across
 * engine.listPages(), which on a 47K-page brain meant 47K sequential
 * round-trips before any work happened.
 *
 * Verifies:
 *  1. Single run extracts the expected links + timeline entries.
 *  2. Second run reports `created: 0` (proves DO NOTHING in batch + accurate
 *     counter via RETURNING).
 *  3. --dry-run prints each distinct (from, to, link_type) link exactly
 *     once, even when the same target is linked from multiple files
 *     (proves the dry-run-only dedup Set works).
 *  4. Second run wall-clock < 2s (regression guard against any future change
 *     that re-introduces the N+1 read pre-load).
 */
import { describe, test, expect, beforeAll, afterAll, beforeEach } from 'bun:test';
import { mkdtempSync, writeFileSync, mkdirSync, rmSync } from 'fs';
import { dirname, join } from 'path';
import { tmpdir } from 'os';
import { PGLiteEngine } from '../src/core/pglite-engine.ts';
import { runExtract } from '../src/commands/extract.ts';
import type { PageInput } from '../src/core/types.ts';

let engine: PGLiteEngine;
let brainDir: string;

beforeAll(async () => {
  engine = new PGLiteEngine();
  await engine.connect({});
  await engine.initSchema();
}, 60_000);

afterAll(async () => {
  // Remove the last temp brain directory so runs do not leak files.
  if (brainDir) rmSync(brainDir, { recursive: true, force: true });
  await engine.disconnect();
});

// Empty every table so each test starts from a clean database.
async function truncateAll() {
  for (const t of ['content_chunks', 'links', 'tags', 'raw_data', 'timeline_entries', 'page_versions', 'ingest_log', 'pages']) {
    await (engine as any).db.exec(`DELETE FROM ${t}`);
  }
}

const personPage = (title: string, body = ''): PageInput => ({
  type: 'person', title, compiled_truth: body, timeline: '',
});

const companyPage = (title: string, body = ''): PageInput => ({
  type: 'company', title, compiled_truth: body, timeline: '',
});

beforeEach(async () => {
  await truncateAll();
  // Drop the previous test's temp directory before creating a fresh one.
  if (brainDir) rmSync(brainDir, { recursive: true, force: true });
  brainDir = mkdtempSync(join(tmpdir(), 'gbrain-extract-fs-'));
}, 15_000);

// Write a file under brainDir, creating parent directories as needed.
function writeFile(rel: string, content: string) {
  const full = join(brainDir, rel);
  mkdirSync(dirname(full), { recursive: true });
  writeFileSync(full, content);
}

describe('gbrain extract links --source fs', () => {
  test('first run inserts links, second run reports 0 (idempotent + truthful counter)', async () => {
    // Set up brain in DB matching the file structure
    await engine.putPage('people/alice', personPage('Alice'));
    await engine.putPage('people/bob', personPage('Bob'));
    await engine.putPage('companies/acme', companyPage('Acme'));

    // Set up matching markdown files on disk
    writeFile('people/alice.md', '---\ntitle: Alice\n---\n\n[Bob](../people/bob.md) is a friend.\n');
    writeFile('people/bob.md', '---\ntitle: Bob\n---\n\nWorks at [Acme](../companies/acme.md).\n');
    writeFile('companies/acme.md', '---\ntitle: Acme\n---\n\nFounded by [Alice](../people/alice.md).\n');

    // First run: write batch path
    await runExtract(engine, ['links', '--dir', brainDir]);
    const linksAfter1 = (await engine.getLinks('people/alice'))
      .concat(await engine.getLinks('people/bob'))
      .concat(await engine.getLinks('companies/acme'));
    expect(linksAfter1.length).toBeGreaterThanOrEqual(3);

    // Second run: must dedup via ON CONFLICT and report 0 new (truthful counter)
    const start = Date.now();
    await runExtract(engine, ['links', '--dir', brainDir]);
    const elapsedMs = Date.now() - start;
    const linksAfter2 = (await engine.getLinks('people/alice'))
      .concat(await engine.getLinks('people/bob'))
      .concat(await engine.getLinks('companies/acme'));
    expect(linksAfter2.length).toBe(linksAfter1.length);

    // Perf regression guard: re-run on tiny fixture must not loop through
    // listPages + per-page getLinks. This 3-file fixture should complete in
    // well under 2s even on a slow CI box.
    expect(elapsedMs).toBeLessThan(2000);
  });

  test('--dry-run dedups duplicate candidates across files (printed once, not N times)', async () => {
    await engine.putPage('people/alice', personPage('Alice'));
    await engine.putPage('companies/acme', companyPage('Acme'));

    // Same link target appears in 3 different files. The target file must
    // exist on disk so the FS extractor's allSlugs Set includes it.
    writeFile('companies/acme.md', '---\ntitle: Acme\n---\n');
    writeFile('a.md', '[Acme](companies/acme.md)\n');
    writeFile('b.md', '[Acme](companies/acme.md)\n');
    writeFile('c.md', '[Acme](companies/acme.md)\n');

    // Capture stdout to check print frequency
    const lines: string[] = [];
    const origLog = console.log;
    console.log = (...args: unknown[]) => { lines.push(args.join(' ')); };
    try {
      await runExtract(engine, ['links', '--dry-run', '--dir', brainDir]);
    } finally {
      console.log = origLog;
    }

    // Each (from, to, link_type) tuple should print at most once.
    // Three distinct from_slugs (a, b, c) all link to companies/acme, so
    // we expect 3 link lines (one per source file), not 9.
    const linkLines = lines.filter(l => l.includes('→') && l.includes('companies/acme'));
    expect(linkLines.length).toBe(3);

    // No actual writes happened
    const links = await engine.getLinks('companies/acme');
    expect(links.length).toBe(0);
  });
});

describe('gbrain extract timeline --source fs', () => {
  test('first run inserts entries, second run reports 0 (idempotent + truthful counter)', async () => {
    await engine.putPage('people/alice', personPage('Alice'));
    writeFile('people/alice.md', `---
title: Alice
---
## Timeline
- **2024-01-15** | source — Founded NovaMind
- **2024-06-01** | source — Raised seed round
`);
    await runExtract(engine, ['timeline', '--dir', brainDir]);
    const after1 = await engine.getTimeline('people/alice');
    expect(after1.length).toBe(2);

    const start = Date.now();
    await runExtract(engine, ['timeline', '--dir', brainDir]);
    const elapsedMs = Date.now() - start;
    const after2 = await engine.getTimeline('people/alice');
    expect(after2.length).toBe(2);
    expect(elapsedMs).toBeLessThan(2000);
  });
});
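
The dry-run dedup that the second links test exercises can be sketched as a Set keyed on the (from, to, link_type) tuple. This is an illustrative reconstruction under assumptions, not the extractor's actual code: `LinkCandidate` and `dedupCandidates` are hypothetical names.

```typescript
// Hypothetical candidate shape; the real extractor's type may differ.
interface LinkCandidate {
  from: string;
  to: string;
  linkType: string;
}

// Keep the first occurrence of each (from, to, linkType) tuple.
function dedupCandidates(candidates: LinkCandidate[]): LinkCandidate[] {
  const seen = new Set<string>();
  const unique: LinkCandidate[] = [];
  for (const c of candidates) {
    // JSON-encode the tuple so slugs containing separators cannot collide.
    const key = JSON.stringify([c.from, c.to, c.linkType]);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(c);
  }
  return unique;
}

const found: LinkCandidate[] = [
  { from: 'a', to: 'companies/acme', linkType: 'mentions' },
  { from: 'b', to: 'companies/acme', linkType: 'mentions' },
  { from: 'c', to: 'companies/acme', linkType: 'mentions' },
  { from: 'a', to: 'companies/acme', linkType: 'mentions' }, // repeat of the first
];
const printed = dedupCandidates(found);
```

Three source files linking to the same target stay as three entries because their `from` values differ; only the exact repeat is dropped.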