* fix(subagent): bind Anthropic SDK messages.create() correctly

The makeSubagentHandler was casting `new Anthropic()` directly to MessagesClient, but MessagesClient.create() maps to sdk.messages.create(), not sdk.create(). Every subagent job immediately died with:

    client.create is not a function

Fix: wrap the SDK instance so .create() delegates to .messages.create() with proper `this` binding via .bind(sdk.messages).

Discovered on first production run of gbrain agent against Supabase.

Co-Authored-By: Wintermute <wintermute@openclaw.ai>

* chore(ci): add typescript typecheck to test pipeline + clean up baseline errors

Root cause infra gap that let the v0.16.0 subagent bug ship: CI ran only `bun test`, which transpiles types without checking them. Type errors only surfaced at runtime, in production.

Changes:

- Add `typescript` devDep and a `typecheck` npm script (`tsc --noEmit`).
- Chain `bun run typecheck` into `bun run test` so developers get the same pipeline locally that CI runs.
- Flip `.github/workflows/test.yml` to invoke `bun run test` (the npm script, including typecheck) instead of `bun test` (runner only).
- Clean up 100+ pre-existing type errors across 30+ files so the first run of `tsc --noEmit` is green. Root causes were:
  - `databaseUrl` → `database_url` rename drift in test fixtures (9 files)
  - `PageType` union missing `'meeting'` / `'note'` entries that are already used in both src and tests (link-extraction.ts comments acknowledged the gap)
  - `GBrainConfig.storage` field never declared despite being read in files.ts and operations.ts
  - `ErrorCode` union missing `'permission_denied'`
  - `OrchestratorOpts` shape changed; test callers not updated
  - Dead-code comparisons in migration orchestrators against narrowed status types
  - postgres.js `Row`-callback type drift on several `.map()` calls
  - Buffer-as-BodyInit assignment in supabase.ts (real but non-fatal runtime bug; Uint8Array slice works and is type-correct)
  - Various `as X` single-step casts that now need `as unknown as X` per TS's stricter structural-conversion rules
- Bump `beforeAll` hook timeout to 30s on four PGLite-heavy tests that were flaky under parallel test execution: wait-for-completion, extract-fs, e2e/search-quality, e2e/graph-quality. All pass in isolation; timeouts only happened when dozens of PGLite instances init'd simultaneously.

The new CI pipeline now fails on any type error across src/ or test/, giving us the compile-time regression guard the subagent fix depends on.

* fix(subagent): bind Anthropic SDK messages.create() correctly

Shipped bug: v0.16.0 cast `new Anthropic()` to `MessagesClient`, but `.create()` lives at `sdk.messages.create`, not on the top-level client. Every subagent job in production died on first LLM call with `client.create is not a function`. Discovered on the first `gbrain agent run` against Supabase.

Fix: assign `sdk.messages` directly to the `MessagesClient` slot. `sdk.messages` IS the object with a callable `.create()`; the original bug was picking the wrong entry point on the SDK. No helper, no wrapper, no `.bind()`: JS method-call semantics preserve `this` at the call site because `subagent.ts:336` invokes `client.create(...)` with `client === sdk.messages`.

The one-line assignment also typechecks cleanly against the existing `MessagesClient` interface (SDK's first `create` overload: `(MessageCreateParamsNonStreaming, Core.RequestOptions?) => APIPromise<Message>` is assignable structurally). This gives us compile-time regression protection: anyone reverting to `new Anthropic()` would fail tsc because `Anthropic` has no top-level `.create`. (The companion chore commit puts `tsc --noEmit` in CI so this guard is enforced.)

Also adds a `makeAnthropic?: () => Anthropic` dep-injection seam so the factory default construction branch is testable without real API calls. Regression test drives one handler turn through a fake SDK, asserting `sdk.messages.create` is actually called. If someone later reverts to `new Anthropic()`, both guards fire: tsc fails AND the test fails.

Co-Authored-By: Wintermute <wintermute@garrytan.com>

* chore(tests): add bunfig.toml + 60s hook timeouts to stabilize PGLite-heavy suites

After turning on tsc in CI (previous commit), running the full `bun run test` suite in one shot triggered flaky `beforeEach/afterEach hook timed out` failures on 8+ test files. Every failure traced to PGLite WASM init contention when many test files spin up fresh PGLite instances in parallel; each one alone passes in isolation.

- `bunfig.toml` sets the global test hook timeout to 60s (default is 5s), covering every test file without per-file edits.
- Individual `beforeAll(fn, 60_000)` / `beforeEach(fn, 15_000)` calls on the 8 tests that flaked most stay in place as explicit safety nets so a future bunfig config change doesn't silently re-introduce the flake.

Result: 1997 pass, 0 fail on `bun run test` (117 tests added since the prior baseline by picking up typecheck-gated passes). No infrastructure flake tolerated in CI.

* chore: bump version and changelog (v0.16.3)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Wintermute <wintermute@openclaw.ai>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
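
The binding bug and the one-line fix described in the subagent commits can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the code in `subagent.ts`: `FakeAnthropic`, `FakeMessages`, `makeClient`, `makeBrokenClient`, and the simplified `MessagesClient`, `CreateParams`, and `Message` shapes are all hypothetical stand-ins, and the real SDK types are richer.

```typescript
// Hypothetical stand-ins; none of these names come from the real SDK or subagent.ts.
interface CreateParams { model: string; prompt: string }
interface Message { text: string }

// Simplified shape of the interface the handler depends on (assumption).
interface MessagesClient {
  create(params: CreateParams): Promise<Message>;
}

// Mimics the SDK layout: `create` lives on `.messages`, not on the client,
// and it touches `this`, so it only works when called as a method.
class FakeMessages {
  calls = 0;
  async create(params: CreateParams): Promise<Message> {
    this.calls += 1;
    return { text: `echo: ${params.prompt}` };
  }
}

class FakeAnthropic {
  messages = new FakeMessages();
}

// Dep-injection seam: tests pass a fake factory; the default branch
// constructs the SDK itself.
function makeClient(makeSdk: () => FakeAnthropic = () => new FakeAnthropic()): MessagesClient {
  const sdk = makeSdk();
  // The fix: hand over `sdk.messages` itself. It is structurally assignable,
  // so no cast is needed, and `client.create(...)` keeps `this` bound.
  return sdk.messages;
}

// The shipped bug: a cast hides the fact that the top-level client
// has no `.create` at all.
function makeBrokenClient(): MessagesClient {
  return new FakeAnthropic() as unknown as MessagesClient;
}

async function demo(): Promise<{ fixed: string; brokenThrew: boolean; calls: number }> {
  const sdk = new FakeAnthropic();
  const client = makeClient(() => sdk);
  const fixed = (await client.create({ model: 'm', prompt: 'hi' })).text;

  let brokenThrew = false;
  try {
    await makeBrokenClient().create({ model: 'm', prompt: 'hi' });
  } catch (err) {
    // Calling an undefined property raises a TypeError ("... is not a function").
    brokenThrew = err instanceof TypeError;
  }
  return { fixed, brokenThrew, calls: sdk.messages.calls };
}
```

Removing the cast in `makeBrokenClient` would make the type checker reject the return statement, since `FakeAnthropic` has no top-level `create`. That is the compile-time guard the commit message relies on once `tsc --noEmit` runs in CI.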
229 lines
8.1 KiB
TypeScript
/**
 * `gbrain agent` CLI tests. Covers arg parsing, --since parser, and the
 * submit path end-to-end against PGLite so we verify trusted submission,
 * protected-name guard, and fan-out wiring.
 *
 * The full handler-run loop is NOT exercised here (tested in
 * subagent-handler.test.ts). This file checks the CLI's submission +
 * orchestration glue.
 */

import { describe, test, expect, beforeAll, afterAll, beforeEach } from 'bun:test';
import * as fs from 'node:fs';
import * as path from 'node:path';
import * as os from 'node:os';
import { PGLiteEngine } from '../src/core/pglite-engine.ts';
import { MinionQueue } from '../src/core/minions/queue.ts';
import { __testing as agentTesting } from '../src/commands/agent.ts';
import { parseSince } from '../src/commands/agent-logs.ts';
import { isProtectedJobName, PROTECTED_JOB_NAMES } from '../src/core/minions/protected-names.ts';

let engine: PGLiteEngine;
let queue: MinionQueue;

beforeAll(async () => {
  engine = new PGLiteEngine();
  await engine.connect({ database_url: '' });
  await engine.initSchema();
  queue = new MinionQueue(engine);
});

afterAll(async () => {
  await engine.disconnect();
});

beforeEach(async () => {
  await engine.executeRaw('DELETE FROM minion_jobs');
});

describe('parseRunFlags', () => {
  test('follow defaults off when stdout is non-TTY (test env)', () => {
    const { flags, rest } = agentTesting.parseRunFlags(['hello', 'world']);
    expect(flags.follow).toBe(process.stdout.isTTY === true);
    expect(rest).toEqual(['hello', 'world']);
  });

  test('flags before prompt are parsed, unknown token ends flag parsing', () => {
    const { flags, rest } = agentTesting.parseRunFlags([
      '--model', 'claude-opus-4-7', '--max-turns', '30', 'summarize', 'everything',
    ]);
    expect(flags.model).toBe('claude-opus-4-7');
    expect(flags.maxTurns).toBe(30);
    expect(rest).toEqual(['summarize', 'everything']);
  });

  test('--tools comma-split', () => {
    const { flags } = agentTesting.parseRunFlags(['--tools', 'brain_search, brain_get_page', 'prompt']);
    expect(flags.tools).toEqual(['brain_search', 'brain_get_page']);
  });

  test('--detach implies !follow', () => {
    const { flags } = agentTesting.parseRunFlags(['--detach', 'x']);
    expect(flags.detach).toBe(true);
    expect(flags.follow).toBe(false);
  });

  test('double-dash ends flag parsing explicitly', () => {
    const { flags, rest } = agentTesting.parseRunFlags(['--model', 'm', '--', '--not-a-flag']);
    expect(flags.model).toBe('m');
    expect(rest).toEqual(['--not-a-flag']);
  });

  test('unknown flag throws', () => {
    expect(() => agentTesting.parseRunFlags(['--what', 'x'])).toThrow(/unknown flag/);
  });

  test('--subagent-def + --timeout-ms parsed', () => {
    const { flags } = agentTesting.parseRunFlags([
      '--subagent-def', 'researcher', '--timeout-ms', '60000', 'hello',
    ]);
    expect(flags.subagentDef).toBe('researcher');
    expect(flags.timeoutMs).toBe(60000);
  });

  test('--fanout-manifest parsed', () => {
    const { flags } = agentTesting.parseRunFlags(['--fanout-manifest', '/tmp/m.json']);
    expect(flags.fanoutManifest).toBe('/tmp/m.json');
  });
});

describe('parseSince', () => {
  test('returns undefined on empty input', () => {
    expect(parseSince(undefined)).toBeUndefined();
    expect(parseSince('')).toBeUndefined();
  });

  test('parses ISO-8601 timestamps', () => {
    const iso = '2026-04-20T12:00:00.000Z';
    expect(parseSince(iso)).toBe(iso);
  });

  test('parses relative 5m', () => {
    const out = parseSince('5m')!;
    const parsed = new Date(out).getTime();
    const now = Date.now();
    expect(now - parsed).toBeGreaterThanOrEqual(5 * 60 * 1000 - 1000);
    expect(now - parsed).toBeLessThan(5 * 60 * 1000 + 1000);
  });

  test('parses relative 2h', () => {
    const out = parseSince('2h')!;
    const delta = Date.now() - new Date(out).getTime();
    expect(delta).toBeGreaterThanOrEqual(2 * 3600 * 1000 - 1000);
  });

  test('parses relative 1d', () => {
    const out = parseSince('1d')!;
    const delta = Date.now() - new Date(out).getTime();
    expect(delta).toBeGreaterThanOrEqual(86_400_000 - 1000);
  });

  test('throws on unparseable input', () => {
    expect(() => parseSince('not-a-date')).toThrow(/could not parse/);
  });
});

describe('protected-name guard includes subagent + aggregator', () => {
  test('shell stays protected', () => {
    expect(isProtectedJobName('shell')).toBe(true);
    expect(PROTECTED_JOB_NAMES.has('shell')).toBe(true);
  });

  test('subagent is protected (v0.15)', () => {
    expect(isProtectedJobName('subagent')).toBe(true);
  });

  test('subagent_aggregator is protected (v0.15)', () => {
    expect(isProtectedJobName('subagent_aggregator')).toBe(true);
  });

  test('a random non-protected name is not protected', () => {
    expect(isProtectedJobName('sync')).toBe(false);
  });

  test('trim normalization still blocks " subagent "', () => {
    expect(isProtectedJobName(' subagent ')).toBe(true);
  });
});

describe('queue.add trusted-submit gate for subagent', () => {
  test('subagent without allowProtectedSubmit throws', async () => {
    await expect(queue.add('subagent', { prompt: 'hi' })).rejects.toThrow();
  });

  test('subagent with allowProtectedSubmit succeeds', async () => {
    const job = await queue.add('subagent', { prompt: 'hi' }, {}, { allowProtectedSubmit: true });
    expect(job.name).toBe('subagent');
    expect(job.status).toBe('waiting');
  });

  test('subagent_aggregator gated the same way', async () => {
    await expect(queue.add('subagent_aggregator', { children_ids: [] })).rejects.toThrow();
    const ok = await queue.add('subagent_aggregator', { children_ids: [1] }, {}, {
      allowProtectedSubmit: true,
    });
    expect(ok.name).toBe('subagent_aggregator');
  });
});

describe('fan-out manifest shape (integration)', () => {
  test('fanout-manifest with 3 entries creates 3 subagent children + 1 aggregator', async () => {
    // Manually replicate what runAgentRun does for --fanout-manifest > 1.
    // We don't invoke runAgentRun (it calls process.exit on error); we
    // assert that the plumbing works via direct queue calls with the
    // same flags it uses.
    const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'fanout-'));
    try {
      const manifestPath = path.join(tmp, 'm.json');
      fs.writeFileSync(manifestPath, JSON.stringify([
        { prompt: 'chunk 1' }, { prompt: 'chunk 2' }, { prompt: 'chunk 3' },
      ]));

      // Aggregator first.
      const agg = await queue.add(
        'subagent_aggregator',
        { children_ids: [] },
        { max_stalled: 3 },
        { allowProtectedSubmit: true },
      );
      const kids: number[] = [];
      for (const p of ['chunk 1', 'chunk 2', 'chunk 3']) {
        const c = await queue.add(
          'subagent',
          { prompt: p },
          { parent_job_id: agg.id, on_child_fail: 'continue', max_stalled: 3 },
          { allowProtectedSubmit: true },
        );
        kids.push(c.id);
      }
      await engine.executeRaw(
        `UPDATE minion_jobs SET data = jsonb_set(data, '{children_ids}', $1::jsonb) WHERE id = $2`,
        [JSON.stringify(kids), agg.id],
      );

      // Aggregator should be in waiting-children since kids were submitted
      // with parent_job_id = agg.id (Lane 1B behavior).
      const aggNow = await queue.getJob(agg.id);
      expect(aggNow?.status).toBe('waiting-children');

      // Aggregator's data.children_ids reflects the spawned children.
      const dataRow = await engine.executeRaw<{ data: unknown }>(
        `SELECT data FROM minion_jobs WHERE id = $1`, [agg.id],
      );
      const data = typeof dataRow[0]!.data === 'string'
        ? JSON.parse(dataRow[0]!.data as string)
        : dataRow[0]!.data as Record<string, unknown>;
      expect(data.children_ids).toEqual(kids);

      // Each child should have on_child_fail = 'continue'.
      const childRows = await engine.executeRaw<{ on_child_fail: string }>(
        `SELECT on_child_fail FROM minion_jobs WHERE parent_job_id = $1`, [agg.id],
      );
      expect(childRows.length).toBe(3);
      expect(childRows.every(r => r.on_child_fail === 'continue')).toBe(true);
    } finally {
      fs.rmSync(tmp, { recursive: true, force: true });
    }
  });
});