* feat(v0.17.0 step 1/9): sources primitive — additive-only multi-source foundation
Lane A of the multi-repo plan. Installs the sources table and seeds a
'default' row that inherits sync.repo_path/last_commit from existing
config. This is the bisectable foundation every later step builds on;
the breaking schema changes (composite UNIQUE, files FK rewrite,
resolution_type, ingest_log.source_id) land with their paired code
rewrites in Steps 2/4/5/7 so no single commit breaks the engine.
- migration v16 (sources_table_additive) + v0_17_0 orchestrator skeleton
- sort-by-version guard in runMigrations (array insertion order can
never cause a later migration to skip a lower one again)
- default source seeded with config '{"federated": true}' so pre-v0.17
brains keep single-namespace search semantics after upgrade
- orchestrator phase B detects absence of file_migration_ledger and
no-ops until Step 7 lands it
- 8 new structural tests in test/migrate.test.ts (shape, idempotency,
scope-guard that nothing else was smuggled into v16)
- apply-migrations tests include v0.17.0 in the registered list
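The sort-by-version guard can be pictured roughly as follows. This is a minimal sketch, not the actual runMigrations code: the `Migration` shape and the `pendingMigrations` helper are hypothetical names introduced only for illustration.

```typescript
// Hypothetical migration record; the real shape in src/core/migrate.ts may differ.
interface Migration {
  version: number;
  name: string;
}

// Return the migrations still to apply, ordered by version rather than by
// their position in the registry array.
function pendingMigrations(registry: Migration[], currentVersion: number): Migration[] {
  return registry
    .slice()
    .sort((a, b) => a.version - b.version)
    .filter((m) => m.version > currentVersion);
}

// Registry deliberately listed out of order.
const registry: Migration[] = [
  { version: 17, name: "pages_source_id_composite_unique" },
  { version: 16, name: "sources_table_additive" },
];

const pending = pendingMigrations(registry, 15).map((m) => m.version);
```

Sorting a copy keeps the registry itself untouched, so insertion order stops mattering without changing how migrations are declared.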
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 2/9): pages.source_id + composite UNIQUE (Lane B)
Migration v17 adds pages.source_id with DEFAULT 'default' and swaps the
global UNIQUE(slug) for composite UNIQUE(source_id, slug). Ships atomically
with the engine's ON CONFLICT rewrite so the constraint swap and the code
that writes under it land in the same commit — no window where the engine
sees one shape and the schema has another.
Minimum-surface engine change: only putPage's ON CONFLICT target needs
re-targeting. Other slug-based queries work unchanged because single-
source brains (the only brain shape pre-Step-5) have exactly one source
'default', so slug remains effectively unique within it. Step 5+ will
surface an explicit sourceId param on putPage for cross-source sync.
- migration v17 (pages_source_id_composite_unique) in src/core/migrate.ts
- pages.source_id + composite UNIQUE added to schema.sql + pglite-schema.ts
for fresh installs
- ON CONFLICT (slug) → ON CONFLICT (source_id, slug) in both pglite-engine
and postgres-engine putPage
- DEFAULT 'default' closes the Codex-flagged race where an INSERT between
ADD COLUMN and SET NOT NULL could leave source_id NULL
- 5 new v17 structural tests (29 pass / 0 fail in migrate.test.ts)
- Full suite: 1979 pass / 3 fail (same as baseline — no regressions)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 6/9): sources CLI + source-resolver (Lane C)
Adds the CLI surface for multi-source management. Users can now register,
list, rename, remove, federate or unfederate a source, and attach one to
a directory. The source-resolver is the shared 6-priority helper that
Steps 4/5 will use when they surface an explicit --source flag on
sync/extract/query.
Commands:
gbrain sources add <id> --path <p> [--name <n>] [--federated|--no-federated]
gbrain sources list [--json]
gbrain sources remove <id> [--yes] [--dry-run] [--keep-storage]
gbrain sources rename <id> <new-name>
gbrain sources default <id>
gbrain sources attach <id> — writes .gbrain-source in CWD
gbrain sources detach
gbrain sources federate <id> / unfederate <id>
Resolution priority (source-resolver.ts) — highest first:
1. --source flag
2. GBRAIN_SOURCE env
3. .gbrain-source dotfile walk-up
4. longest-prefix match on registered local_path (Codex #2 fix)
5. sources.default config
6. fallback 'default'
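The priority order above can be sketched as a pure function. This is an illustration under stated assumptions, not the contents of source-resolver.ts: `ResolveInput`, `resolveSourceSketch`, and every field name are hypothetical, and the dotfile walk-up is assumed to have already produced an id (or nothing).

```typescript
// All names here are illustrative; the real helper lives in source-resolver.ts.
interface ResolveInput {
  flag?: string;          // value of --source
  env?: string;           // value of GBRAIN_SOURCE
  dotfile?: string;       // id found by walking up for .gbrain-source
  cwd: string;            // current working directory
  registered: { id: string; localPath: string }[];
  configDefault?: string; // sources.default config
}

function resolveSourceSketch(input: ResolveInput): string {
  if (input.flag) return input.flag;
  if (input.env) return input.env;
  if (input.dotfile) return input.dotfile;

  // Longest-prefix match: the most specific registered path containing cwd wins.
  let best: { id: string; localPath: string } | undefined;
  for (const s of input.registered) {
    const root = s.localPath.endsWith("/") ? s.localPath : s.localPath + "/";
    const inside = input.cwd === s.localPath || input.cwd.startsWith(root);
    if (inside && (!best || s.localPath.length > best.localPath.length)) best = s;
  }
  if (best) return best.id;

  return input.configDefault ?? "default";
}
```

Comparing against the path plus a trailing slash keeps `/repos/wiki` from matching a sibling directory such as `/repos/wikipedia`.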
- add: validates id format (kebab-case alnum, 1-32), rejects overlapping
paths (eng review §4 finding 4.1), supports federated default opt-in
- remove: guards against --yes omission + refuses to remove 'default',
supports --dry-run, reports cascade page count
- attach/detach: matches kubectl/terraform context-pinning semantics
- Throws on overlap rather than process.exit() so the CLI error wrapper
reports it consistently (also makes unit testing clean)
28 new tests across sources.test.ts (dispatcher + validation + overlap
guard) and source-resolver.test.ts (full 6-priority coverage including
longest-prefix). Full suite: 2012 pass / 3 fail (pre-existing PGLite
infra timeouts).
NOT in scope for Step 6 (deferred):
- import-from-github (SSRF + clone integration)
- prune (retention/TTL, lands v0.18)
- MCP tool-defs regen for source-scoping on read ops (Step 5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(v0.17.0 step 8/9): getting-started guide + migration skill + citation rule
Step 8 (Lane F) documents what Steps 1+2+6 have shipped and sets up
the agent-facing rules for multi-source.
New files:
- skills/migrations/v0.17.0.md — migration skill read by host agents
after `gbrain apply-migrations`. Covers the v16+v17 chain, what's
in v0.17.0 vs what lands later (v0.17.1 ACL, v0.18 sessions), and
the new sources CLI surface. Cites docs/guides/multi-source-brains.md
as the recipe.
- docs/guides/multi-source-brains.md — getting-started for end users.
Three canonical scenarios (unified wiki+gstack / purpose-separated
yc-media+garrys-list / mixed), full resolution priority, federation
flag semantics, command reference, and citation format.
skills/brain-ops/SKILL.md — new "Cross-source citation format"
section mandating `[source-id:slug]` when the brain has multiple
sources. Matches the contract the /plan-devex-review DX review
pinned down (DX Finding 5: surface source_id in every page payload
+ citation contract). Key must be sources.id (immutable), never
sources.name.
No behavior change — this is pure documentation for what already
exists in the binary. 144 skills conformance tests still pass.
NOT in this commit (deferred to later steps):
- docs/guides/repo-architecture.md rewrite (lands with the full
v0.17.0 PR description + release notes)
- skills/_brain-filing-rules.md "which source to file into"
guidance (lands with Step 5 when sync surfaces --source)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 5/9): sync --source <id> routes through sources table (Lane D)
Adds the --source flag to `gbrain sync`. When set, sync reads local_path
+ last_commit from the matching sources(id) row instead of the global
sync.repo_path / sync.last_commit config keys, and writes last_commit +
last_sync_at back to the same row. Backward compat: --source omitted =
pre-v0.17 behavior exactly, global config path unchanged.
- SyncOpts.sourceId threaded through performSync + performFullSync
- readSyncAnchor/writeSyncAnchor helpers centralize the sources-vs-config
branch so every read/write goes through one decision point. Makes
Step 5's later per-source sync-failures tracking a one-file change.
- --source resolved via src/core/source-resolver.ts (Step 6), so any
command that shell-exposes resolveSourceId gets env var + dotfile
walk-up + longest-prefix for free.
- Error message for missing source local_path is actionable:
Source "gstack" has no local_path. Run: gbrain sources add gstack --path <path>
- last_sync_at auto-updates on every last_commit advance so `gbrain
sources list` shows real recency.
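The sources-vs-config branch described above might look roughly like this. It is a sketch under assumptions: the in-memory `sources` array and `config` record stand in for the database, the function name is hypothetical, and only the "has no local_path" message is taken from this commit.

```typescript
interface SyncAnchor {
  repoPath: string;
  lastCommit: string | null;
}

interface SourceRow {
  id: string;
  local_path: string | null;
  last_commit: string | null;
}

// Single decision point: per-source row when sourceId is set, global config otherwise.
function readSyncAnchorSketch(
  sourceId: string | undefined,
  sources: SourceRow[],
  config: Record<string, string | undefined>,
): SyncAnchor {
  if (sourceId === undefined) {
    // Pre-v0.17 path: global config keys.
    const repoPath = config["sync.repo_path"];
    if (!repoPath) throw new Error("sync.repo_path is not configured"); // assumed wording
    return { repoPath, lastCommit: config["sync.last_commit"] ?? null };
  }
  const row = sources.find((s) => s.id === sourceId);
  if (!row) throw new Error(`Source "${sourceId}" is not registered`); // assumed wording
  if (!row.local_path) {
    throw new Error(
      `Source "${sourceId}" has no local_path. Run: gbrain sources add ${sourceId} --path <path>`,
    );
  }
  return { repoPath: row.local_path, lastCommit: row.last_commit };
}
```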
No regression: 2012 pass / 3 fail (same as baseline).
NOT in this commit (deferred per plan):
- Per-source failure tracking (~/.gbrain/sources/<id>/sync-failures.jsonl)
- runImport source-awareness (import.ts path — Step 5 continuation)
- Partial-success semantics when walking N sources — single-source flow
today, multi-walk lands when the top-level `gbrain sync` without
--source starts iterating all sources.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 4/9): qualified [[source:slug]] + links.resolution_type (Lane B)
Adds source-pinned wikilink syntax and records the resolution kind on
each edge so `gbrain extract --refresh-unqualified` (future) can
re-resolve bare references when the source topology changes.
Wikilink syntax extension:
[[concepts/ai]] — unqualified; resolves via local-first fallback
[[wiki:concepts/ai]] — qualified; target pinned to sources.id='wiki'
[[gstack:projects/foo|Display]] — qualified + display name
The qualified regex runs first and masks matched spans so the
unqualified pass can't double-emit. Source id format enforced to match
the sources CLI validation: [a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?
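A two-pass extraction along these lines is sketched below. It is not the project's link-extraction code: `WikiRef` and `extractWikilinks` are hypothetical names, and masking with spaces is one assumed way to keep the unqualified pass from re-matching a qualified span. Only the source id pattern is taken from the text above.

```typescript
interface WikiRef {
  slug: string;
  sourceId?: string; // only set on qualified links
  display?: string;
}

function extractWikilinks(text: string): WikiRef[] {
  const SOURCE_ID_PATTERN = "[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?";
  const qualified = new RegExp(
    "\\[\\[(" + SOURCE_ID_PATTERN + "):([^\\]|]+)(?:\\|([^\\]]+))?\\]\\]",
    "g",
  );
  const unqualified = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;
  const refs: WikiRef[] = [];

  // Pass 1: qualified links. Each match is replaced by spaces of equal length,
  // so the second pass cannot emit the same span again.
  const masked = text.replace(
    qualified,
    (match: string, src: string, slug: string, display?: string) => {
      refs.push(display ? { slug, sourceId: src, display } : { slug, sourceId: src });
      return " ".repeat(match.length);
    },
  );

  // Pass 2: whatever remains is unqualified.
  let m: RegExpExecArray | null;
  while ((m = unqualified.exec(masked)) !== null) {
    refs.push(m[2] ? { slug: m[1], display: m[2] } : { slug: m[1] });
  }
  return refs;
}
```

Note that this sketch returns qualified references first rather than in document order.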
Schema:
- migration v18 adds links.resolution_type TEXT with CHECK constraint
('qualified'|'unqualified' or NULL for legacy/manual/frontmatter edges)
- schema.sql + pglite-schema.ts updated for fresh installs
EntityRef type:
- sourceId is OPTIONAL (only set on qualified wikilinks). Markdown
[Name](path) and unqualified wikilinks omit it, so the strict toEqual
assertions written before v0.17 keep passing (69 existing tests still pass).
Tests:
- 5 new qualified-wikilink extraction tests + 1 migration v18 structural
assertion. 75 tests in test/link-extraction.test.ts (up from 69).
- Full suite: 2018 pass / 3 fail (pre-existing PGLite infra timeouts).
NOT in this commit (deferred to Step 3 / Step 5 continuation):
- Writing resolution_type to the DB (addLink / addLinksBatch don't
carry the field yet — that's the plumb-through that lands with
Step 3 when search/dedup also needs source-aware result keys).
- `gbrain extract --refresh-unqualified` re-resolver.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 3/9): source-aware search dedup composite keys (Lane B)
Search dedup now keys on (source_id, slug) instead of slug alone. The
pre-v0.17 dedup would collapse two same-slug pages from different sources
into one, destroying cross-source recall. Codex outside-voice review
flagged this as regression-critical, so this commit ships the fix plus
tests that lock the invariant in.
Dedup pipeline (src/core/search/dedup.ts):
- pageKey(r) helper — one canonical composite-key derivation. Falls
back to source_id='default' for pre-v0.17 rows so single-source
brains behave identically to before.
- Layer 1 (dedupBySource): group-by composite key.
- Layer 4 (capPerPage): count-by composite key.
- guaranteeCompiledTruth: swap scoped to matching (source_id, slug),
so wiki:topics/ai can't accidentally pull gstack:topics/ai's
compiled_truth chunk.
SearchResult type gains optional source_id — populated by SQL JOINs
in both engines, falls through as 'default' for legacy callers.
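The composite-key idea can be illustrated with a small sketch. The `SearchHit` shape and `dedupByPage` are simplified, hypothetical stand-ins for the real layers in src/core/search/dedup.ts; only the pageKey fallback to 'default' comes from the description above.

```typescript
interface SearchHit {
  slug: string;
  source_id?: string; // absent on pre-v0.17 rows
  score: number;
}

// One canonical composite key; legacy rows fall back to 'default'.
function pageKey(r: SearchHit): string {
  return `${r.source_id ?? "default"}:${r.slug}`;
}

// Keep the best-scoring hit per (source_id, slug).
function dedupByPage(hits: SearchHit[]): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const h of hits) {
    const k = pageKey(h);
    const prev = best.get(k);
    if (!prev || h.score > prev.score) best.set(k, h);
  }
  return Array.from(best.values());
}
```

Because source ids are restricted to lowercase alphanumerics and hyphens, the first `:` in the key separates source from slug unambiguously.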
Engine SQL:
- pglite-engine.ts + postgres-engine.ts: search SELECTs add p.source_id
- rowToSearchResult (utils.ts): maps row.source_id → result.source_id
when present. Shape stays backward compatible (field optional).
Tests — 4 new in test/dedup.test.ts:
- same-slug-different-source does NOT collapse (the critical regression
guard Codex called out)
- same-slug-same-source DOES still collapse (no over-correction)
- missing source_id falls back to 'default' for pre-v0.17 compat
- compiled_truth guarantee scopes to composite key (Codex second pass
caught that this specific path would otherwise leak)
Full suite: 2022 pass / 3 fail (3 pre-existing PGLite infra timeouts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 7/9): file_migration_ledger + phase-B storage backfill (Lane E)
Adds files.source_id + files.page_id + the file_migration_ledger
state machine that drives storage object rewrites. Each per-file
transition is its own transaction so crash-point recovery is a
ledger read, not a filesystem inspection. Codex second-pass review
flagged that "skip if already has source prefix" was an unsafe
heuristic — the ledger replaces it with explicit state tracking.
Schema:
- migration v19 (files_source_id_page_id_ledger): handler-only
(PGLite has no files table; Postgres-only gate). ADDs
source_id + page_id to files, backfills page_id from page_slug
scoped to source_id='default', creates file_migration_ledger
with PK on file_id (Codex: not storage_path_old — two sources
can share an old path during migration).
- schema.sql updated for fresh Postgres installs; file_migration_ledger
gets RLS alongside other tables.
Runtime:
- src/commands/migrations/v0_17_0-storage-backfill.ts: drives the
ledger state machine pending → copy_done → db_updated → complete.
Idempotent per row: re-running resumes from whichever state
crashed. Old objects preserved (no delete) so operators can
verify the soak window before a future cleanup release.
- phase B in v0_17_0.ts orchestrator: wires the storage backend
(Supabase/S3/local) through createStorage, runs runStorageBackfill,
reports per-state counts + first-three error details.
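The resume-from-any-state behavior can be sketched as below. This is an in-memory illustration only: the real backfill commits each transition in its own transaction, and `LedgerRow`, `Steps`, and `advance` are hypothetical names. The `failed` state is assumed from the test description ("failed rows don't auto-retry").

```typescript
type LedgerState = "pending" | "copy_done" | "db_updated" | "complete" | "failed";

interface LedgerRow {
  fileId: number;
  state: LedgerState;
}

// Hypothetical callbacks standing in for the storage copy and the DB rewrite.
interface Steps {
  copyObject(fileId: number): void;
  updateDb(fileId: number): void;
}

// Drive one row forward from whatever state it was left in.
function advance(row: LedgerRow, steps: Steps): LedgerRow {
  let state: LedgerState = row.state;
  if (state === "pending") {
    steps.copyObject(row.fileId);
    state = "copy_done";
  }
  if (state === "copy_done") {
    steps.updateDb(row.fileId);
    state = "db_updated";
  }
  if (state === "db_updated") {
    state = "complete";
  }
  // 'complete' and 'failed' rows fall through untouched.
  return { fileId: row.fileId, state };
}
```

A row that crashed after the copy re-enters at `copy_done`, so the copy step is not repeated on resume.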
Tests — 13 new in test/storage-backfill.test.ts:
- pending → copy_done → db_updated → complete happy path
- 3 crash-point recovery tests (resume from copy_done, resume from
db_updated, failed rows don't auto-retry)
- already-complete rows are skipped with zero side effects
- idempotent re-upload (exists-check skips redundant upload)
- dry-run mode (no storage, reports counts without mutating)
Plus 5 new migrate.test.ts assertions for v19 structure (handler-
only, PGLite gate, source_id + page_id + ledger DDL, default-source
backfill scope, state machine values).
Full suite: 2035 pass / 3 fail (3 pre-existing PGLite infra
timeouts).
NOT in this commit (explicitly deferred):
- DROP old page_slug column — kept for backward compat until
operators have time to verify page_id everywhere.
- DROP old UNIQUE(storage_path) in favor of UNIQUE(source_id,
storage_path) — same reason, deferred to later cleanup.
- Actual cleanup phase that deletes old objects post-soak.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(v0.17.0 step 9/9): full multi-source PGLite integration suite (Lane G)
End-to-end exercise of every v0.17.0 surface against real PGLite
(in-memory, fast — no DATABASE_URL needed). The migration chain
v2→v19 runs start-to-finish and the test asserts each Step's
invariants hold together.
16 new integration tests across 7 describes:
1. Migration-installed state:
- sources('default') exists with federated=true config
- pages.source_id column has DEFAULT 'default'
- composite UNIQUE (source_id, slug) is installed
2. Default-source write path:
- putPage without explicit source → source_id='default' via schema
default clause (no engine API change needed for single-source brains)
3. Composite UNIQUE regression guards (Codex-flagged):
- Same slug in two different sources coexists
- Third insert with same (source_id, slug) hits the UNIQUE constraint
4. sources CLI round-trip:
- federate / unfederate flips config.federated
- rename changes display, id stays immutable
5. Source resolution priority (integration):
- Explicit flag > env var > fallback to default
- Unregistered explicit source errors with actionable message
6. Cascade semantics:
- sources remove cascades to pages; default source untouched
7. links.resolution_type (Step 4):
- Qualified/unqualified values accepted
- CHECK constraint rejects invalid values
All 16 tests pass. Full suite: 2042 pass / 4 fail (4 pre-existing
PGLite beforeEach timeouts in test/wait-for-completion,
test/extract-fs, test/e2e/search-quality, test/e2e/graph-quality
— count fluctuated 3-5 on baseline from variance alone).
Total new tests across Steps 1-9: ~85 unit + integration tests
(sources, source-resolver, migrate v16/v17/v18/v19 structural,
link-extraction qualified wikilinks, dedup regression-critical,
storage-backfill state machine + crash recovery, full
multi-source PGLite integration).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: bump to v0.18.0 + CHANGELOG entry (multi-source brains)
One-viewport release summary + itemized changes covering all 9 steps
of the multi-source primitive. Notes the v0.17 → v0.18 version bump
rationale (master shipped gbrain dream as v0.17 while this branch was
in flight).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): v0_18_0 orchestrator TS narrow + mechanical test ON CONFLICT
Two CI failures on PR #337:
1. tsc TS2367 at src/commands/migrations/v0_18_0.ts:190 —
after the early-return on `a.status === 'failed'` (line 179),
TypeScript narrows `a.status` to `'skipped' | 'complete'`, so the
subsequent `a.status === 'failed' ? 'failed' :` branch was dead
code and refused to compile. Dropped the redundant check.
2. E2E `file_list LIMIT enforcement` at test/e2e/mechanical.test.ts:636 —
the test pre-seeded a pages row with `ON CONFLICT (slug) DO NOTHING`
but v21 swapped the global UNIQUE for `UNIQUE (source_id, slug)`, so
Postgres rejects with "no unique or exclusion constraint matching".
Updated the conflict target to the composite key.
Tier-1 E2E had only this one failing test; everything else passed.
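The narrowing behind the first failure can be reproduced in isolation. This is a minimal sketch; the `Status` type and `summarize` function are hypothetical and only mirror the three status values named above.

```typescript
type Status = "failed" | "skipped" | "complete";

function summarize(a: { status: Status }): string {
  if (a.status === "failed") return "failed";
  // From here on TypeScript has narrowed a.status to 'skipped' | 'complete',
  // so a second `a.status === 'failed'` comparison would be rejected (TS2367).
  return a.status === "skipped" ? "skipped" : "complete";
}
```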
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(e2e): v0.18.0 multi-source against real Postgres (v20-v23 schema + cascade + sync)
Closes the three biggest confidence gaps the author flagged in the
self-audit of PR #337:
1. No real Postgres E2E — PGLite has no files table, so v23's
files.source_id + files.page_id rewrite + file_migration_ledger
seed was NEVER executed against the real DB. This file covers it.
2. `gbrain sync --source <id>` had zero direct tests. Now has two:
one that asserts performSync({sourceId}) reads local_path from the
sources row (not the global config), one that asserts no-sourceId
falls back to the global sync.repo_path.
3. Cascade delete coverage — previously verified only pages count
after source removal. Now verifies pages + content_chunks +
timeline_entries + links + files ALL cascade-delete when a source
is removed.
6 describes, 16 tests total:
- Schema shape (fresh install): 6 tests confirming sources('default'),
pages.source_id NOT NULL with DEFAULT, composite UNIQUE pages
(source_id, slug) replaces global UNIQUE(slug), links.resolution_type
column + CHECK, files.source_id + page_id columns, file_migration_ledger
table + status CHECK.
- Composite UNIQUE semantics: 3 tests confirming same-slug in two
sources coexists (Codex-critical regression guard), duplicate
(source_id, slug) hits the UNIQUE, putPage targets default source
by schema DEFAULT.
- Cascade delete: 1 test building a fully populated source (2 pages,
chunks, timeline, links, files) then removing it + asserting every
dependent row is gone.
- Sync routing: 2 tests confirming performSync({sourceId}) reads
per-source local_path vs global config.
- Sources surface: 3 tests for federate/unfederate flipping + rename
preserving id.
- Storage backfill: 1 end-to-end test seeding ledger + running
runStorageBackfill against a stub StorageBackend, asserting
pending → complete transition and files.storage_path rewrite.
Gated by DATABASE_URL per CLAUDE.md E2E lifecycle. Each describe's
beforeAll defensively DELETEs non-default sources + file_migration_ledger
rows so reruns are hermetic (sources isn't in helpers.ALL_TABLES).
Verified: 16/16 pass on first run AND second run (residual-state fix
holds). Full E2E suite still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): TS2352 in multi-source E2E — cast postgres.js RowList via unknown
tsc rejects the direct
`(rows as { column_name: string }[]).map(...)`
cast because postgres.js RowList rows have an iterable-row shape that
doesn't overlap with the plain-object target. Standard fix: cast via
`unknown` first so the narrowing is explicit.
Verified: `bunx tsc --noEmit` clean (ignoring the pre-existing baseUrl
deprecation warning).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(v0.18.0): addLinksBatch + addTimelineEntriesBatch source-aware JOINs
Batch APIs JOINed on pages.slug globally, so two pages sharing the same
slug across sources would silently fan out — addLinksBatch(['a->b']) in
a brain with 'a' in both 'default' and 'alt' wrote 2 edges instead of 1.
Same bug on addTimelineEntriesBatch.
Fix:
- LinkBatchInput + TimelineBatchInput gain optional source_id fields
(from_source_id, to_source_id, origin_source_id for links; source_id
for timeline). All default to 'default' so existing callers are
backward-compatible on single-source brains.
- pglite-engine + postgres-engine batch JOINs now composite-key on
(slug, source_id). Postgres adds 3 more unnest arrays for links + 1
for timeline — still one bind per column, no 65535-param cap risk.
- LEFT JOIN for origin pages also source-qualified so frontmatter-
provenance edges don't cross-pollinate across sources.
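The fan-out and its fix can be modelled in memory. This is a sketch, not the engine SQL: `PageRow`, `LinkInput`, and `resolveEdges` are hypothetical, and nested loops stand in for the JOIN. Only the optional from_source_id / to_source_id fields and their 'default' fallback come from the description above.

```typescript
interface PageRow {
  id: number;
  slug: string;
  source_id: string;
}

interface LinkInput {
  from: string;
  to: string;
  from_source_id?: string;
  to_source_id?: string;
}

// Resolve each link to exactly the page ids matching (slug, source_id).
function resolveEdges(pages: PageRow[], links: LinkInput[]): [number, number][] {
  const edges: [number, number][] = [];
  for (const l of links) {
    const fromSrc = l.from_source_id ?? "default";
    const toSrc = l.to_source_id ?? "default";
    for (const f of pages) {
      // Matching on slug alone here is what produced the duplicate edge.
      if (f.slug !== l.from || f.source_id !== fromSrc) continue;
      for (const t of pages) {
        if (t.slug === l.to && t.source_id === toSrc) edges.push([f.id, t.id]);
      }
    }
  }
  return edges;
}
```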
Regression coverage:
- test/pglite-engine.test.ts: 5 new tests covering default-path isolation,
explicit alt-source writes, and cross-source edges.
- test/e2e/multi-source.test.ts: 4 new tests against real Postgres so
postgres-js's unnest() bind path is exercised (structurally different
from PGLite's).
Gap #4 from the PR self-audit — latent bug, not previously reachable
because every existing caller wrote to the default source only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1044 lines
44 KiB
TypeScript
/**
 * PGLite Engine Tests — validates all 37 BrainEngine methods against PGLite (in-memory).
 *
 * No Docker, no DATABASE_URL, no external dependencies. Runs instantly in CI.
 */

import { describe, test, expect, beforeAll, afterAll, beforeEach } from 'bun:test';
import { PGLiteEngine } from '../src/core/pglite-engine.ts';
import type { BrainEngine } from '../src/core/engine.ts';
import type { PageInput, ChunkInput } from '../src/core/types.ts';

let engine: PGLiteEngine;

beforeAll(async () => {
  engine = new PGLiteEngine();
  await engine.connect({}); // in-memory
  await engine.initSchema();
});

afterAll(async () => {
  await engine.disconnect();
});

// Helper to reset data between test groups
async function truncateAll() {
  const tables = [
    'content_chunks', 'links', 'tags', 'raw_data',
    'timeline_entries', 'page_versions', 'ingest_log', 'pages',
  ];
  for (const t of tables) {
    await (engine as any).db.exec(`DELETE FROM ${t}`);
  }
}

const testPage: PageInput = {
  type: 'concept',
  title: 'Test Page',
  compiled_truth: 'This is a test page about NovaMind AI agents.',
  timeline: '2024-01-15: Founded NovaMind',
};

// ─────────────────────────────────────────────────────────────────
// Pages CRUD
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Pages', () => {
  beforeEach(truncateAll);

  test('putPage + getPage round trip', async () => {
    const page = await engine.putPage('test/hello', testPage);
    expect(page.slug).toBe('test/hello');
    expect(page.title).toBe('Test Page');
    expect(page.type).toBe('concept');
    expect(page.compiled_truth).toContain('NovaMind');

    const fetched = await engine.getPage('test/hello');
    expect(fetched).not.toBeNull();
    expect(fetched!.title).toBe('Test Page');
    expect(fetched!.content_hash).toBeTruthy();
  });

  test('putPage upserts on conflict', async () => {
    await engine.putPage('test/upsert', testPage);
    const updated = await engine.putPage('test/upsert', {
      ...testPage,
      title: 'Updated Title',
    });
    expect(updated.title).toBe('Updated Title');

    const all = await engine.listPages();
    const matches = all.filter(p => p.slug === 'test/upsert');
    expect(matches.length).toBe(1);
  });

  test('getPage returns null for missing slug', async () => {
    const result = await engine.getPage('nonexistent/slug');
    expect(result).toBeNull();
  });

  test('deletePage removes page', async () => {
    await engine.putPage('test/delete-me', testPage);
    await engine.deletePage('test/delete-me');
    const result = await engine.getPage('test/delete-me');
    expect(result).toBeNull();
  });

  test('listPages with type filter', async () => {
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('concepts/rag', { ...testPage, type: 'concept', title: 'RAG' });

    const people = await engine.listPages({ type: 'person' });
    expect(people.length).toBe(1);
    expect(people[0].title).toBe('Alice');
  });

  test('listPages with tag filter', async () => {
    await engine.putPage('test/tagged', testPage);
    await engine.addTag('test/tagged', 'special');

    const tagged = await engine.listPages({ tag: 'special' });
    expect(tagged.length).toBe(1);
    expect(tagged[0].slug).toBe('test/tagged');
  });

  test('resolveSlugs exact match', async () => {
    await engine.putPage('test/exact', testPage);
    const slugs = await engine.resolveSlugs('test/exact');
    expect(slugs).toEqual(['test/exact']);
  });

  test('resolveSlugs fuzzy match via pg_trgm', async () => {
    await engine.putPage('people/sarah-chen', { ...testPage, title: 'Sarah Chen' });
    const slugs = await engine.resolveSlugs('sarah');
    expect(slugs.length).toBeGreaterThan(0);
    expect(slugs).toContain('people/sarah-chen');
  });

  test('updateSlug renames page', async () => {
    await engine.putPage('test/old-name', testPage);
    await engine.updateSlug('test/old-name', 'test/new-name');
    expect(await engine.getPage('test/old-name')).toBeNull();
    expect((await engine.getPage('test/new-name'))?.title).toBe('Test Page');
  });

  test('validateSlug rejects path traversal', async () => {
    expect(() => engine.putPage('../etc/passwd', testPage)).toThrow();
  });

  test('validateSlug rejects leading slash', async () => {
    expect(() => engine.putPage('/absolute/path', testPage)).toThrow();
  });

  test('validateSlug normalizes to lowercase', async () => {
    const page = await engine.putPage('Test/UPPER', testPage);
    expect(page.slug).toBe('test/upper');
  });
});

// ─────────────────────────────────────────────────────────────────
// Search (tsvector triggers + FTS)
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Search', () => {
  beforeAll(async () => {
    await truncateAll();
    await engine.putPage('companies/novamind', {
      type: 'company', title: 'NovaMind',
      compiled_truth: 'NovaMind builds AI agents for enterprise automation.',
    });
    await engine.upsertChunks('companies/novamind', [
      { chunk_index: 0, chunk_text: 'NovaMind builds AI agents for enterprise', chunk_source: 'compiled_truth' },
    ]);
    await engine.putPage('concepts/rag', {
      type: 'concept', title: 'Retrieval-Augmented Generation',
      compiled_truth: 'RAG combines retrieval with generation for better answers.',
    });
    await engine.upsertChunks('concepts/rag', [
      { chunk_index: 0, chunk_text: 'RAG combines retrieval with generation', chunk_source: 'compiled_truth' },
    ]);
  });

  test('searchKeyword returns results for matching term', async () => {
    const results = await engine.searchKeyword('NovaMind');
    expect(results.length).toBeGreaterThan(0);
    expect(results[0].slug).toBe('companies/novamind');
  });

  test('searchKeyword returns empty for non-matching term', async () => {
    const results = await engine.searchKeyword('xyznonexistent');
    expect(results.length).toBe(0);
  });

  test('tsvector trigger populates search_vector on insert', async () => {
    // Verify the PL/pgSQL trigger fires and search_vector is populated
    const results = await engine.searchKeyword('enterprise automation');
    expect(results.length).toBeGreaterThan(0);
  });

  test('searchVector returns empty when no embeddings', async () => {
    const fakeEmbedding = new Float32Array(1536);
    const results = await engine.searchVector(fakeEmbedding);
    expect(results.length).toBe(0);
  });
});

// ─────────────────────────────────────────────────────────────────
// Chunks
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Chunks', () => {
  beforeEach(truncateAll);

  test('upsertChunks + getChunks round trip', async () => {
    await engine.putPage('test/chunks', testPage);
    await engine.upsertChunks('test/chunks', [
      { chunk_index: 0, chunk_text: 'Chunk zero', chunk_source: 'compiled_truth' },
      { chunk_index: 1, chunk_text: 'Chunk one', chunk_source: 'compiled_truth' },
    ]);
    const chunks = await engine.getChunks('test/chunks');
    expect(chunks.length).toBe(2);
    expect(chunks[0].chunk_text).toBe('Chunk zero');
    expect(chunks[1].chunk_text).toBe('Chunk one');
  });

  test('upsertChunks removes orphan chunks', async () => {
    await engine.putPage('test/orphan', testPage);
    await engine.upsertChunks('test/orphan', [
      { chunk_index: 0, chunk_text: 'Keep', chunk_source: 'compiled_truth' },
      { chunk_index: 1, chunk_text: 'Remove', chunk_source: 'compiled_truth' },
    ]);
    // Re-upsert with only index 0
    await engine.upsertChunks('test/orphan', [
      { chunk_index: 0, chunk_text: 'Updated', chunk_source: 'compiled_truth' },
    ]);
    const chunks = await engine.getChunks('test/orphan');
    expect(chunks.length).toBe(1);
    expect(chunks[0].chunk_text).toBe('Updated');
  });

  test('upsertChunks throws for missing page', async () => {
    await expect(
      engine.upsertChunks('nonexistent/page', [
        { chunk_index: 0, chunk_text: 'test', chunk_source: 'compiled_truth' },
      ])
    ).rejects.toThrow('Page not found');
  });

  test('deleteChunks removes all chunks for page', async () => {
    await engine.putPage('test/delete-chunks', testPage);
    await engine.upsertChunks('test/delete-chunks', [
      { chunk_index: 0, chunk_text: 'Gone', chunk_source: 'compiled_truth' },
    ]);
    await engine.deleteChunks('test/delete-chunks');
    const chunks = await engine.getChunks('test/delete-chunks');
    expect(chunks.length).toBe(0);
  });

  test('getChunksWithEmbeddings returns embedding data', async () => {
    await engine.putPage('test/embed', testPage);
    const embedding = new Float32Array(1536).fill(0.1);
    await engine.upsertChunks('test/embed', [
      { chunk_index: 0, chunk_text: 'With embedding', chunk_source: 'compiled_truth', embedding },
    ]);
    const chunks = await engine.getChunksWithEmbeddings('test/embed');
    expect(chunks.length).toBe(1);
    expect(chunks[0].embedding).not.toBeNull();
  });
});

// ─────────────────────────────────────────────────────────────────
// Links + Graph
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Links', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'ACME' });
    await engine.putPage('companies/beta', { ...testPage, type: 'company', title: 'Beta' });
  });

  test('addLink + getLinks', async () => {
    await engine.addLink('people/alice', 'companies/acme', 'works at', 'employment');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(1);
    expect(links[0].to_slug).toBe('companies/acme');
  });

  test('getBacklinks', async () => {
    await engine.addLink('people/alice', 'companies/acme');
    const backlinks = await engine.getBacklinks('companies/acme');
    expect(backlinks.length).toBe(1);
    expect(backlinks[0].from_slug).toBe('people/alice');
  });

  test('removeLink', async () => {
    await engine.addLink('people/alice', 'companies/acme');
    await engine.removeLink('people/alice', 'companies/acme');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(0);
  });

  test('traverseGraph with depth', async () => {
    await engine.addLink('people/alice', 'companies/acme');
    await engine.addLink('companies/acme', 'companies/beta');

    const graph = await engine.traverseGraph('people/alice', 2);
    expect(graph.length).toBeGreaterThanOrEqual(2);
    const slugs = graph.map(n => n.slug);
    expect(slugs).toContain('people/alice');
    expect(slugs).toContain('companies/acme');
  });
});

// ─────────────────────────────────────────────────────────────────
// Tags
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Tags', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('test/tags', testPage);
  });

  test('addTag + getTags', async () => {
    await engine.addTag('test/tags', 'alpha');
    await engine.addTag('test/tags', 'beta');
    const tags = await engine.getTags('test/tags');
    expect(tags).toEqual(['alpha', 'beta']);
  });

  test('removeTag', async () => {
    await engine.addTag('test/tags', 'remove-me');
    await engine.removeTag('test/tags', 'remove-me');
    const tags = await engine.getTags('test/tags');
    expect(tags).not.toContain('remove-me');
  });

  test('duplicate tag is idempotent', async () => {
    await engine.addTag('test/tags', 'dup');
    await engine.addTag('test/tags', 'dup');
    const tags = await engine.getTags('test/tags');
    expect(tags.filter(t => t === 'dup').length).toBe(1);
  });
});

// ─────────────────────────────────────────────────────────────────
// Timeline
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Timeline', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('test/timeline', testPage);
  });

  test('addTimelineEntry + getTimeline', async () => {
    await engine.addTimelineEntry('test/timeline', {
      date: '2024-01-15', summary: 'Founded', detail: 'Company founded',
    });
    const entries = await engine.getTimeline('test/timeline');
    expect(entries.length).toBe(1);
    expect(entries[0].summary).toBe('Founded');
  });

  test('getTimeline with date range', async () => {
    await engine.addTimelineEntry('test/timeline', { date: '2024-01-01', summary: 'Jan' });
    await engine.addTimelineEntry('test/timeline', { date: '2024-06-01', summary: 'Jun' });
    await engine.addTimelineEntry('test/timeline', { date: '2024-12-01', summary: 'Dec' });

    const filtered = await engine.getTimeline('test/timeline', {
      after: '2024-03-01', before: '2024-09-01',
    });
    expect(filtered.length).toBe(1);
    expect(filtered[0].summary).toBe('Jun');
  });
});

// ─────────────────────────────────────────────────────────────────
// Batch methods (addLinksBatch / addTimelineEntriesBatch)
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: addLinksBatch', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('a', { type: 'concept', title: 'A', compiled_truth: '', timeline: '' });
    await engine.putPage('b', { type: 'concept', title: 'B', compiled_truth: '', timeline: '' });
    await engine.putPage('c', { type: 'concept', title: 'C', compiled_truth: '', timeline: '' });
  });

  test('empty batch returns 0 with no DB call', async () => {
    expect(await engine.addLinksBatch([])).toBe(0);
  });

  test('batch of 1 with missing optional fields inserts row with empty defaults', async () => {
    const inserted = await engine.addLinksBatch([{ from_slug: 'a', to_slug: 'b' }]);
    expect(inserted).toBe(1);
    const links = await engine.getLinks('a');
    expect(links.length).toBe(1);
    expect(links[0].context).toBe('');
    expect(links[0].link_type).toBe('');
  });

  test('within-batch duplicates are deduped via ON CONFLICT (no 21000 error)', async () => {
    const inserted = await engine.addLinksBatch([
      { from_slug: 'a', to_slug: 'b', link_type: 'mention' },
      { from_slug: 'a', to_slug: 'b', link_type: 'mention' },
      { from_slug: 'a', to_slug: 'c', link_type: 'mention' },
    ]);
    expect(inserted).toBe(2);
  });

  test('rows with missing slug are silently dropped by JOIN', async () => {
    const inserted = await engine.addLinksBatch([
      { from_slug: 'doesnt-exist', to_slug: 'b' },
      { from_slug: 'a', to_slug: 'b' },
    ]);
    expect(inserted).toBe(1);
  });

  test('half-existing batch returns count of new only', async () => {
    await engine.addLink('a', 'b', '', 'mention');
    const inserted = await engine.addLinksBatch([
      { from_slug: 'a', to_slug: 'b', link_type: 'mention' },
      { from_slug: 'a', to_slug: 'c', link_type: 'mention' },
    ]);
    expect(inserted).toBe(1);
  });

  test('batch of 100 fresh rows returns 100', async () => {
    // Create 100 target pages
    for (let i = 0; i < 100; i++) {
      await engine.putPage(`target/${i}`, { type: 'concept', title: `T${i}`, compiled_truth: '', timeline: '' });
    }
    const batch = Array.from({ length: 100 }, (_, i) => ({
      from_slug: 'a', to_slug: `target/${i}`, link_type: 'mention',
    }));
    expect(await engine.addLinksBatch(batch)).toBe(100);
  });
});

describe('PGLiteEngine: addTimelineEntriesBatch', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('p1', { type: 'concept', title: 'P1', compiled_truth: '', timeline: '' });
    await engine.putPage('p2', { type: 'concept', title: 'P2', compiled_truth: '', timeline: '' });
  });

  test('empty batch returns 0', async () => {
    expect(await engine.addTimelineEntriesBatch([])).toBe(0);
  });

  test('batch of 1 with missing optionals inserts with empty defaults', async () => {
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'p1', date: '2024-01-15', summary: 'Founded' },
    ]);
    expect(inserted).toBe(1);
    const entries = await engine.getTimeline('p1');
    expect(entries.length).toBe(1);
    expect(entries[0].source).toBe('');
    expect(entries[0].detail).toBe('');
  });

  test('within-batch duplicates are deduped via ON CONFLICT', async () => {
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'p1', date: '2024-01-15', summary: 'Founded' },
      { slug: 'p1', date: '2024-01-15', summary: 'Founded' },
      { slug: 'p1', date: '2024-02-01', summary: 'Launched' },
    ]);
    expect(inserted).toBe(2);
  });

  test('rows with missing slug are silently dropped by JOIN', async () => {
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'no-such-page', date: '2024-01-15', summary: 'Phantom' },
      { slug: 'p1', date: '2024-01-15', summary: 'Real' },
    ]);
    expect(inserted).toBe(1);
  });

  test('mix of new + existing returns count of new only', async () => {
    await engine.addTimelineEntry('p1', { date: '2024-01-15', summary: 'Founded' });
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'p1', date: '2024-01-15', summary: 'Founded' },
      { slug: 'p1', date: '2024-02-01', summary: 'Launched' },
      { slug: 'p2', date: '2024-03-01', summary: 'Spun off' },
    ]);
    expect(inserted).toBe(2);
  });
});

// v0.18.0: regression guards for the cross-source JOIN fan-out.
// Before the fix, addLinksBatch/addTimelineEntriesBatch JOINed on pages.slug
// only — so a page with the same slug in two sources would fan out and
// silently create duplicate edges / entries. Source-id-qualified JOINs
// eliminate the fan-out.
describe('PGLiteEngine: batch ops source-awareness (v0.18.0)', () => {
  beforeEach(async () => {
    await truncateAll();
    // Register a second source and populate the same slugs in both.
    const db = (engine as any).db;
    await db.query(
      `INSERT INTO sources (id, name) VALUES ('alt', 'alt')
       ON CONFLICT (id) DO NOTHING`
    );
    // default-source rows via putPage (schema DEFAULT 'default').
    await engine.putPage('topics/ai', { type: 'concept', title: 'AI (default)', compiled_truth: '', timeline: '' });
    await engine.putPage('topics/ml', { type: 'concept', title: 'ML (default)', compiled_truth: '', timeline: '' });
    // alt-source rows with the same slugs, inserted via raw SQL.
    await db.query(
      `INSERT INTO pages (slug, type, title, compiled_truth, timeline, frontmatter, content_hash, source_id, updated_at)
       VALUES ('topics/ai', 'concept', 'AI (alt)', '', '', '{}'::jsonb, 'h1', 'alt', now()),
              ('topics/ml', 'concept', 'ML (alt)', '', '', '{}'::jsonb, 'h2', 'alt', now())`
    );
  });

  test('addLinksBatch default source_id does NOT fan out across sources', async () => {
    const inserted = await engine.addLinksBatch([
      { from_slug: 'topics/ai', to_slug: 'topics/ml', link_type: 'mention' },
    ]);
    // Exactly one edge, not two. Before the fix this was 2.
    expect(inserted).toBe(1);
    const db = (engine as any).db;
    const { rows } = await db.query(
      `SELECT f.source_id AS from_src, t.source_id AS to_src
       FROM links l
       JOIN pages f ON f.id = l.from_page_id
       JOIN pages t ON t.id = l.to_page_id`
    );
    expect(rows.length).toBe(1);
    expect(rows[0].from_src).toBe('default');
    expect(rows[0].to_src).toBe('default');
  });

  test('addLinksBatch with explicit alt source_id lands in alt only', async () => {
    const inserted = await engine.addLinksBatch([
      {
        from_slug: 'topics/ai', to_slug: 'topics/ml', link_type: 'mention',
        from_source_id: 'alt', to_source_id: 'alt',
      },
    ]);
    expect(inserted).toBe(1);
    const db = (engine as any).db;
    const { rows } = await db.query(
      `SELECT f.source_id AS from_src, t.source_id AS to_src
       FROM links l
       JOIN pages f ON f.id = l.from_page_id
       JOIN pages t ON t.id = l.to_page_id`
    );
    expect(rows.length).toBe(1);
    expect(rows[0].from_src).toBe('alt');
    expect(rows[0].to_src).toBe('alt');
  });

  test('addLinksBatch supports cross-source edges', async () => {
    const inserted = await engine.addLinksBatch([
      {
        from_slug: 'topics/ai', to_slug: 'topics/ml', link_type: 'mention',
        from_source_id: 'default', to_source_id: 'alt',
      },
    ]);
    expect(inserted).toBe(1);
    const db = (engine as any).db;
    const { rows } = await db.query(
      `SELECT f.source_id AS from_src, t.source_id AS to_src
       FROM links l
       JOIN pages f ON f.id = l.from_page_id
       JOIN pages t ON t.id = l.to_page_id`
    );
    expect(rows.length).toBe(1);
    expect(rows[0].from_src).toBe('default');
    expect(rows[0].to_src).toBe('alt');
  });

  test('addTimelineEntriesBatch default source_id does NOT fan out across sources', async () => {
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'topics/ai', date: '2024-01-15', summary: 'Founded' },
    ]);
    // Exactly one entry (default source), not two. Before the fix this was 2.
    expect(inserted).toBe(1);
    const db = (engine as any).db;
    const { rows } = await db.query(
      `SELECT p.source_id FROM timeline_entries te
       JOIN pages p ON p.id = te.page_id`
    );
    expect(rows.length).toBe(1);
    expect(rows[0].source_id).toBe('default');
  });

  test('addTimelineEntriesBatch with explicit alt source_id lands in alt only', async () => {
    const inserted = await engine.addTimelineEntriesBatch([
      { slug: 'topics/ai', date: '2024-01-15', summary: 'Founded', source_id: 'alt' },
    ]);
    expect(inserted).toBe(1);
    const db = (engine as any).db;
    const { rows } = await db.query(
      `SELECT p.source_id FROM timeline_entries te
       JOIN pages p ON p.id = te.page_id`
    );
    expect(rows.length).toBe(1);
    expect(rows[0].source_id).toBe('alt');
  });
});

// ─────────────────────────────────────────────────────────────────
// Raw Data, Versions, Config, IngestLog
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: RawData', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('test/raw', testPage);
  });

  test('putRawData + getRawData', async () => {
    await engine.putRawData('test/raw', 'crunchbase', { funding: '$10M' });
    const data = await engine.getRawData('test/raw', 'crunchbase');
    expect(data.length).toBe(1);
    expect((data[0].data as any).funding).toBe('$10M');
  });
});

describe('PGLiteEngine: Versions', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('test/version', testPage);
  });

  test('createVersion + getVersions', async () => {
    const v = await engine.createVersion('test/version');
    expect(v.compiled_truth).toBe(testPage.compiled_truth);

    const versions = await engine.getVersions('test/version');
    expect(versions.length).toBe(1);
  });

  test('revertToVersion restores content', async () => {
    await engine.createVersion('test/version');
    await engine.putPage('test/version', { ...testPage, compiled_truth: 'Changed' });

    const versions = await engine.getVersions('test/version');
    await engine.revertToVersion('test/version', versions[0].id);

    const page = await engine.getPage('test/version');
    expect(page!.compiled_truth).toBe(testPage.compiled_truth);
  });
});

describe('PGLiteEngine: Config', () => {
  test('getConfig + setConfig', async () => {
    await engine.setConfig('test_key', 'test_value');
    const val = await engine.getConfig('test_key');
    expect(val).toBe('test_value');
  });

  test('getConfig returns null for missing key', async () => {
    const val = await engine.getConfig('nonexistent_key');
    expect(val).toBeNull();
  });
});

describe('PGLiteEngine: IngestLog', () => {
  test('logIngest + getIngestLog', async () => {
    await engine.logIngest({
      source_type: 'git', source_ref: '/tmp/test-repo',
      pages_updated: ['test/a', 'test/b'], summary: 'Imported 2 pages',
    });
    const log = await engine.getIngestLog({ limit: 10 });
    expect(log.length).toBeGreaterThan(0);
    expect(log[0].source_type).toBe('git');
  });
});

// ─────────────────────────────────────────────────────────────────
// Stats + Health
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Stats & Health', () => {
  beforeAll(async () => {
    await truncateAll();
    await engine.putPage('test/stats', testPage);
    await engine.upsertChunks('test/stats', [
      { chunk_index: 0, chunk_text: 'chunk', chunk_source: 'compiled_truth' },
    ]);
    await engine.addTag('test/stats', 'stat-tag');
  });

  test('getStats returns correct counts', async () => {
    const stats = await engine.getStats();
    expect(stats.page_count).toBe(1);
    expect(stats.chunk_count).toBe(1);
    expect(stats.tag_count).toBe(1);
    expect(stats.pages_by_type.concept).toBe(1);
  });

  test('getHealth returns coverage metrics', async () => {
    const health = await engine.getHealth();
    expect(health.page_count).toBe(1);
    expect(health.missing_embeddings).toBe(1); // chunk has no embedding
    expect(health.embed_coverage).toBe(0);
  });
});

// ─────────────────────────────────────────────────────────────────
// Transactions
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Transactions', () => {
  beforeEach(truncateAll);

  test('transaction commits on success', async () => {
    await engine.transaction(async (tx) => {
      await tx.putPage('test/tx-ok', testPage);
    });
    const page = await engine.getPage('test/tx-ok');
    expect(page).not.toBeNull();
  });

  test('transaction rolls back on error', async () => {
    try {
      await engine.transaction(async (tx) => {
        await tx.putPage('test/tx-fail', testPage);
        throw new Error('Deliberate rollback');
      });
    } catch { /* expected */ }

    const page = await engine.getPage('test/tx-fail');
    expect(page).toBeNull();
  });
});

// ─────────────────────────────────────────────────────────────────
// Cascade deletes
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: Cascade deletes', () => {
  test('deleting a page cascades to chunks, tags, links', async () => {
    await engine.putPage('test/cascade', testPage);
    await engine.upsertChunks('test/cascade', [
      { chunk_index: 0, chunk_text: 'cascade chunk', chunk_source: 'compiled_truth' },
    ]);
    await engine.addTag('test/cascade', 'cascade-tag');

    await engine.deletePage('test/cascade');

    const chunks = await engine.getChunks('test/cascade');
    expect(chunks.length).toBe(0);
    const tags = await engine.getTags('test/cascade');
    expect(tags.length).toBe(0);
  });
});

// ─────────────────────────────────────────────────────────────────
// v0.10.1: Knowledge graph layer
// ─────────────────────────────────────────────────────────────────

describe('PGLiteEngine: getAllSlugs', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('people/bob', { ...testPage, type: 'person', title: 'Bob' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'Acme' });
  });

  test('returns Set of all page slugs', async () => {
    const slugs = await engine.getAllSlugs();
    expect(slugs).toBeInstanceOf(Set);
    expect(slugs.size).toBe(3);
    expect(slugs.has('people/alice')).toBe(true);
    expect(slugs.has('companies/acme')).toBe(true);
  });

  test('empty brain returns empty Set', async () => {
    await truncateAll();
    const slugs = await engine.getAllSlugs();
    expect(slugs.size).toBe(0);
  });
});

describe('PGLiteEngine: listPages updated_after filter', () => {
  beforeEach(async () => {
    await truncateAll();
  });

  test('filters pages by updated_at > given date', async () => {
    await engine.putPage('test/old', testPage);
    // Sleep briefly so the second page has a strictly later updated_at.
    await new Promise(r => setTimeout(r, 10));
    const cutoff = new Date().toISOString();
    await new Promise(r => setTimeout(r, 10));
    await engine.putPage('test/new', testPage);

    const recent = await engine.listPages({ updated_after: cutoff, limit: 100 });
    const recentSlugs = recent.map(p => p.slug);
    expect(recentSlugs).toContain('test/new');
    expect(recentSlugs).not.toContain('test/old');
  });

  test('without updated_after, returns all pages (regression)', async () => {
    await engine.putPage('test/a', testPage);
    await engine.putPage('test/b', testPage);
    const all = await engine.listPages({ limit: 100 });
    expect(all.length).toBe(2);
  });
});

describe('PGLiteEngine: Multi-type links (v5 migration)', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'Acme' });
  });

  test('same (from, to) with different link_types both stored', async () => {
    await engine.addLink('people/alice', 'companies/acme', 'CEO', 'works_at');
    await engine.addLink('people/alice', 'companies/acme', 'on the board', 'advises');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(2);
    const types = links.map(l => l.link_type).sort();
    expect(types).toEqual(['advises', 'works_at']);
  });

  test('upsert on same (from, to, type) updates context', async () => {
    await engine.addLink('people/alice', 'companies/acme', 'old context', 'works_at');
    await engine.addLink('people/alice', 'companies/acme', 'new context', 'works_at');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(1);
    expect(links[0].context).toBe('new context');
  });

  test('removeLink without linkType removes ALL types for the pair (regression)', async () => {
    await engine.addLink('people/alice', 'companies/acme', 'a', 'works_at');
    await engine.addLink('people/alice', 'companies/acme', 'b', 'advises');
    await engine.removeLink('people/alice', 'companies/acme');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(0);
  });

  test('removeLink with linkType removes only that type', async () => {
    await engine.addLink('people/alice', 'companies/acme', 'a', 'works_at');
    await engine.addLink('people/alice', 'companies/acme', 'b', 'advises');
    await engine.removeLink('people/alice', 'companies/acme', 'works_at');
    const links = await engine.getLinks('people/alice');
    expect(links.length).toBe(1);
    expect(links[0].link_type).toBe('advises');
  });
});

describe('PGLiteEngine: Timeline dedup constraint (v6 migration)', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('test/timeline-dedup', testPage);
  });

  test('inserting same (date, summary) twice is silent no-op (idempotent)', async () => {
    await engine.addTimelineEntry('test/timeline-dedup', { date: '2026-01-15', summary: 'Event A' });
    await engine.addTimelineEntry('test/timeline-dedup', { date: '2026-01-15', summary: 'Event A' });
    const entries = await engine.getTimeline('test/timeline-dedup');
    expect(entries.length).toBe(1);
  });

  test('different summary on same date: both inserted', async () => {
    await engine.addTimelineEntry('test/timeline-dedup', { date: '2026-01-15', summary: 'Morning' });
    await engine.addTimelineEntry('test/timeline-dedup', { date: '2026-01-15', summary: 'Evening' });
    const entries = await engine.getTimeline('test/timeline-dedup');
    expect(entries.length).toBe(2);
  });

  test('throws on missing page (default behavior preserved)', async () => {
    await expect(engine.addTimelineEntry('does/not-exist', { date: '2026-01-15', summary: 'X' }))
      .rejects.toThrow();
  });

  test('skipExistenceCheck=true: silent no-op on missing page', async () => {
    // No throw, but also nothing inserted (subquery returns no rows).
    await engine.addTimelineEntry(
      'does/not-exist',
      { date: '2026-01-15', summary: 'X' },
      { skipExistenceCheck: true },
    );
    // No assertion needed beyond "did not throw".
  });
});

describe('PGLiteEngine: getBacklinkCounts', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('people/bob', { ...testPage, type: 'person', title: 'Bob' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'Acme' });
  });

  test('returns Map<slug, count> for given slugs', async () => {
    await engine.addLink('people/alice', 'companies/acme', '', 'works_at');
    await engine.addLink('people/bob', 'companies/acme', '', 'invested_in');
    const counts = await engine.getBacklinkCounts(['companies/acme', 'people/alice']);
    expect(counts.get('companies/acme')).toBe(2);
    expect(counts.get('people/alice')).toBe(0);
  });

  test('empty input -> empty Map', async () => {
    const counts = await engine.getBacklinkCounts([]);
    expect(counts.size).toBe(0);
  });

  test('slugs with zero links: present in Map with 0', async () => {
    const counts = await engine.getBacklinkCounts(['people/alice']);
    expect(counts.get('people/alice')).toBe(0);
  });
});

describe('PGLiteEngine: traversePaths (v0.10.1)', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('people/bob', { ...testPage, type: 'person', title: 'Bob' });
    await engine.putPage('people/carol', { ...testPage, type: 'person', title: 'Carol' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'Acme' });
    await engine.putPage('meetings/standup', { ...testPage, type: 'meeting', title: 'Standup' });
    // Build a small typed graph
    await engine.addLink('meetings/standup', 'people/alice', '', 'attended');
    await engine.addLink('meetings/standup', 'people/bob', '', 'attended');
    await engine.addLink('meetings/standup', 'people/carol', '', 'attended');
    await engine.addLink('people/alice', 'companies/acme', '', 'works_at');
    await engine.addLink('people/bob', 'companies/acme', '', 'invested_in');
  });

  test('out direction (default): follows from->to edges', async () => {
    const paths = await engine.traversePaths('meetings/standup', { depth: 1 });
    expect(paths.length).toBe(3);
    expect(new Set(paths.map(p => p.to_slug))).toEqual(new Set(['people/alice', 'people/bob', 'people/carol']));
    expect(paths.every(p => p.link_type === 'attended')).toBe(true);
  });

  test('in direction: follows to->from edges', async () => {
    const paths = await engine.traversePaths('companies/acme', { depth: 1, direction: 'in' });
    expect(paths.length).toBe(2);
    expect(new Set(paths.map(p => p.from_slug))).toEqual(new Set(['people/alice', 'people/bob']));
  });

  test('linkType per-edge filter: only follows matching edges', async () => {
    const paths = await engine.traversePaths('companies/acme', {
      depth: 1, direction: 'in', linkType: 'works_at',
    });
    expect(paths.length).toBe(1);
    expect(paths[0].from_slug).toBe('people/alice');
  });

  test('depth 2: multi-hop traversal', async () => {
    const paths = await engine.traversePaths('meetings/standup', { depth: 2 });
    // alice/bob/carol direct + alice->acme + bob->acme
    expect(paths.length).toBeGreaterThanOrEqual(5);
    const acmePaths = paths.filter(p => p.to_slug === 'companies/acme');
    expect(acmePaths.length).toBe(2);
    expect(acmePaths.every(p => p.depth === 2)).toBe(true);
  });

  test('non-existent slug returns empty', async () => {
    const paths = await engine.traversePaths('does/not-exist', { depth: 5 });
    expect(paths).toEqual([]);
  });
});

describe('PGLiteEngine: traverseGraph cycle prevention', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/a', { ...testPage, type: 'person', title: 'A' });
    await engine.putPage('people/b', { ...testPage, type: 'person', title: 'B' });
    // Create a 2-cycle: A -> B -> A
    await engine.addLink('people/a', 'people/b', '', 'mentions');
    await engine.addLink('people/b', 'people/a', '', 'mentions');
  });

  test('does not amplify on cyclic graphs', async () => {
    // Without cycle prevention, depth 5 on a 2-cycle would loop indefinitely
    // (or at least produce many duplicate nodes). The visited array bounds
    // how often each node can be emitted.
    const graph = await engine.traverseGraph('people/a', 5);
    const slugs = graph.map(n => n.slug);
    // Each slug should appear at most twice: once at depth 0 and possibly once
    // more at a deeper level via the cycle, before the visited check stops it.
    const counts = new Map<string, number>();
    for (const s of slugs) counts.set(s, (counts.get(s) ?? 0) + 1);
    for (const count of counts.values()) {
      expect(count).toBeLessThanOrEqual(2); // tolerate root + 1 traversal entry
    }
  });
});

describe('PGLiteEngine: getHealth graph metrics', () => {
  beforeEach(async () => {
    await truncateAll();
    await engine.putPage('people/alice', { ...testPage, type: 'person', title: 'Alice' });
    await engine.putPage('people/bob', { ...testPage, type: 'person', title: 'Bob' });
    await engine.putPage('companies/acme', { ...testPage, type: 'company', title: 'Acme' });
  });

  test('link_coverage = 0 when no links exist', async () => {
    const h = await engine.getHealth();
    expect(h.link_coverage).toBe(0);
  });

  test('link_coverage = % of entity pages with >= 1 inbound link', async () => {
    // Acme gets 1 inbound link (from Alice), Alice/Bob get 0 inbound.
    // 1 of 3 entity pages has inbound links -> 33%.
    await engine.addLink('people/alice', 'companies/acme', '', 'works_at');
    const h = await engine.getHealth();
    expect(h.link_coverage).toBeCloseTo(1 / 3, 2);
  });

  test('timeline_coverage = % with >= 1 timeline entry', async () => {
    await engine.addTimelineEntry('people/alice', { date: '2026-01-15', summary: 'Joined' });
    const h = await engine.getHealth();
    expect(h.timeline_coverage).toBeCloseTo(1 / 3, 2);
  });

  test('most_connected lists top entities by link count', async () => {
    await engine.addLink('people/alice', 'companies/acme', '', 'works_at');
    await engine.addLink('people/bob', 'companies/acme', '', 'invested_in');
    const h = await engine.getHealth();
    expect(h.most_connected.length).toBeGreaterThan(0);
    expect(h.most_connected[0].slug).toBe('companies/acme');
    expect(h.most_connected[0].link_count).toBe(2);
  });

  test('orphan_pages: pages with neither inbound nor outbound links', async () => {
    // All 3 pages start with no links. Expect 3 orphans.
    const h = await engine.getHealth();
    expect(h.orphan_pages).toBe(3);

    // Add alice -> acme. Alice has outbound, acme has inbound, only Bob is orphan.
    await engine.addLink('people/alice', 'companies/acme', '', 'works_at');
    const h2 = await engine.getHealth();
    expect(h2.orphan_pages).toBe(1);
  });
});

// ─────────────────────────────────────────────────────────────────
// v0.13.1 — PGlite.create() error-wrap (structural guard for #223)
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: v0.13.1 error-wrap on connect() (#223)', () => {
  test('pglite-engine.ts source contains the wrap with #223 hint and nested original error', async () => {
    const { readFileSync } = await import('fs');
    const src = readFileSync('src/core/pglite-engine.ts', 'utf-8');
    // Structural: the try/catch block must wrap PGlite.create() (the actual
    // abort site, NOT engine-factory.ts). The error message must name the
    // issue and suggest gbrain doctor. Must NOT suggest "missing migrations"
    // as a cause (that was conflating #218 and #223 — migrations run AFTER
    // create()).
    expect(src).toContain('this._db = await PGlite.create');
    expect(src).toContain('https://github.com/garrytan/gbrain/issues/223');
    expect(src).toContain('gbrain doctor');
    expect(src).toContain('Original error:');
    // Regression guard: the user-visible error MESSAGE must not re-introduce
    // the misleading "missing migrations" hint. (A source comment explaining
    // *why* we removed it is fine — match only inside the wrapped Error body.)
    const wrapStart = src.indexOf('const wrapped = new Error(');
    expect(wrapStart).toBeGreaterThan(-1);
    const wrapEnd = src.indexOf(');', wrapStart);
    const errBody = src.slice(wrapStart, wrapEnd);
    expect(errBody).not.toContain('missing migrations');
    expect(errBody).not.toContain('apply-migrations');
  });
});

// ─────────────────────────────────────────────────────────────────
// v0.13.1 — Engine kind discriminator
// ─────────────────────────────────────────────────────────────────
describe('PGLiteEngine: v0.13.1 kind discriminator', () => {
  test('exposes readonly kind = pglite', () => {
    expect(engine.kind).toBe('pglite');
  });
});