* feat(v0.17.0 step 1/9): sources primitive — additive-only multi-source foundation
Lane A of the multi-repo plan. Installs the sources table and seeds a
'default' row that inherits sync.repo_path/last_commit from existing
config. This is the bisectable foundation every later step builds on;
the breaking schema changes (composite UNIQUE, files FK rewrite,
resolution_type, ingest_log.source_id) land with their paired code
rewrites in Steps 2/4/5/7 so no single commit breaks the engine.
- migration v16 (sources_table_additive) + v0_17_0 orchestrator skeleton
- sort-by-version guard in runMigrations (array insertion order can
never cause a later migration to skip a lower one again)
- default source seeded with config '{"federated": true}' so pre-v0.17
brains keep single-namespace search semantics after upgrade
- orchestrator phase B detects absence of file_migration_ledger and
no-ops until Step 7 lands it
- 8 new structural tests in test/migrate.test.ts (shape, idempotency,
scope-guard that nothing else was smuggled into v16)
- apply-migrations tests include v0.17.0 in the registered list
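The sort-by-version guard can be pictured with a minimal sketch. The `Migration` shape and the `pendingMigrations` helper are hypothetical names for illustration only; the real logic lives in runMigrations and may differ.

```typescript
// Hypothetical shape; the real migration type lives in src/core/migrate.ts.
interface Migration {
  version: number;
  name: string;
}

// Return the migrations that still need to run, in ascending version order,
// regardless of the order in which they were registered in the array.
function pendingMigrations(all: Migration[], currentVersion: number): Migration[] {
  return [...all]
    .sort((a, b) => a.version - b.version)
    .filter((m) => m.version > currentVersion);
}

const registered: Migration[] = [
  { version: 17, name: 'pages_source_id_composite_unique' },
  { version: 16, name: 'sources_table_additive' }, // registered out of order
];

const plan = pendingMigrations(registered, 15).map((m) => m.version);
console.log(plan.join(',')); // 16,17
```

Sorting a copy keeps the registered array untouched, so the guard has no side effects on callers that iterate it elsewhere.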
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 2/9): pages.source_id + composite UNIQUE (Lane B)
Migration v17 adds pages.source_id with DEFAULT 'default' and swaps the
global UNIQUE(slug) for composite UNIQUE(source_id, slug). Ships atomically
with the engine's ON CONFLICT rewrite so the constraint swap and the code
that writes under it land in the same commit — no window where the engine
sees one shape and the schema has another.
Minimum-surface engine change: only putPage's ON CONFLICT target needs
re-targeting. Other slug-based queries work unchanged because single-
source brains (the only brain shape pre-Step-5) have exactly one source
'default', so slug remains effectively unique within it. Step 5+ will
surface an explicit sourceId param on putPage for cross-source sync.
- migration v17 (pages_source_id_composite_unique) in src/core/migrate.ts
- pages.source_id + composite UNIQUE added to schema.sql + pglite-schema.ts
for fresh installs
- ON CONFLICT (slug) → ON CONFLICT (source_id, slug) in both pglite-engine
and postgres-engine putPage
- DEFAULT 'default' closes the Codex-flagged race where an INSERT between
ADD COLUMN and SET NOT NULL could leave source_id NULL
- 5 new v17 structural tests (29 pass / 0 fail in migrate.test.ts)
- Full suite: 1979 pass / 3 fail (same as baseline — no regressions)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 6/9): sources CLI + source-resolver (Lane C)
Adds the CLI surface for multi-source management. Users can now register,
list, rename, federate/unfederate, and attach-to-directory a source. The
source-resolver is the shared 6-priority helper that Steps 4/5 will use
when they start surfacing an explicit --source flag on sync/extract/query.
Commands:
gbrain sources add <id> --path <p> [--name <n>] [--federated|--no-federated]
gbrain sources list [--json]
gbrain sources remove <id> [--yes] [--dry-run] [--keep-storage]
gbrain sources rename <id> <new-name>
gbrain sources default <id>
gbrain sources attach <id> — writes .gbrain-source in CWD
gbrain sources detach
gbrain sources federate <id> / unfederate <id>
Resolution priority (source-resolver.ts) — highest first:
1. --source flag 2. GBRAIN_SOURCE env 3. .gbrain-source dotfile walk-up
4. longest-prefix match on registered local_path (Codex #2 fix)
5. sources.default config 6. fallback 'default'
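As a rough illustration of the priority order above, the sketch below takes every input as a plain value. The `ResolveInput` shape, the `isInside` helper, and this signature are assumptions; the real resolveSourceId in source-resolver.ts reads the flag, environment, dotfile, and sources table itself.

```typescript
// Hypothetical inputs, passed in explicitly so the sketch stays self-contained.
interface ResolveInput {
  flag?: string;          // value of --source
  env?: string;           // value of GBRAIN_SOURCE
  dotfile?: string;       // contents of the nearest .gbrain-source, if any
  cwd: string;
  sources: { id: string; localPath: string }[];
  configDefault?: string; // sources.default config
}

// True when dir is root itself or lies underneath it.
function isInside(dir: string, root: string): boolean {
  const prefix = root.endsWith('/') ? root : root + '/';
  return dir === root || dir.startsWith(prefix);
}

function resolveSourceId(input: ResolveInput): string {
  if (input.flag) return input.flag;                 // 1. --source flag
  if (input.env) return input.env;                   // 2. GBRAIN_SOURCE env
  const pinned = input.dotfile?.trim();
  if (pinned) return pinned;                         // 3. .gbrain-source dotfile
  const match = input.sources
    .filter((s) => isInside(input.cwd, s.localPath))
    .sort((a, b) => b.localPath.length - a.localPath.length)[0];
  if (match) return match.id;                        // 4. longest-prefix match
  if (input.configDefault) return input.configDefault; // 5. sources.default
  return 'default';                                  // 6. fallback
}

const registered = [
  { id: 'wiki', localPath: '/repos/wiki' },
  { id: 'gstack', localPath: '/repos/gstack' },
];
console.log(resolveSourceId({ cwd: '/repos/gstack/src', sources: registered })); // gstack
console.log(resolveSourceId({ cwd: '/tmp', sources: registered }));              // default
```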
- add: validates id format (kebab-case alnum, 1-32), rejects overlapping
paths (eng review §4 finding 4.1), supports federated default opt-in
- remove: guards against --yes omission + refuses to remove 'default',
supports --dry-run, reports cascade page count
- attach/detach: matches kubectl/terraform context-pinning semantics
- Throws on overlap rather than process.exit() so the CLI error wrapper
reports it consistently (also makes unit testing clean)
28 new tests across sources.test.ts (dispatcher + validation + overlap
guard) and source-resolver.test.ts (full 6-priority coverage including
longest-prefix). Full suite: 2012 pass / 3 fail (pre-existing PGLite
infra timeouts).
NOT in scope for Step 6 (deferred):
- import-from-github (SSRF + clone integration)
- prune (retention/TTL, lands v0.18)
- MCP tool-defs regen for source-scoping on read ops (Step 5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(v0.17.0 step 8/9): getting-started guide + migration skill + citation rule
Step 8 (Lane F) documents what Steps 1+2+6 have shipped and sets up
the agent-facing rules for multi-source.
New files:
- skills/migrations/v0.17.0.md — migration skill read by host agents
after `gbrain apply-migrations`. Covers the v16+v17 chain, what's
in v0.17.0 vs what lands later (v0.17.1 ACL, v0.18 sessions), and
the new sources CLI surface. Cites docs/guides/multi-source-brains.md
as the recipe.
- docs/guides/multi-source-brains.md — getting-started for end users.
Three canonical scenarios (unified wiki+gstack / purpose-separated
yc-media+garrys-list / mixed), full resolution priority, federation
flag semantics, command reference, and citation format.
skills/brain-ops/SKILL.md — new "Cross-source citation format"
section mandating `[source-id:slug]` when the brain has multiple
sources. Matches the contract the /plan-devex-review DX review
pinned down (DX Finding 5: surface source_id in every page payload
+ citation contract). Key must be sources.id (immutable), never
sources.name.
No behavior change — this is pure documentation for what already
exists in the binary. 144 skills conformance tests still pass.
NOT in this commit (deferred to later steps):
- docs/guides/repo-architecture.md rewrite (lands with the full
v0.17.0 PR description + release notes)
- skills/_brain-filing-rules.md "which source to file into"
guidance (lands with Step 5 when sync surfaces --source)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 5/9): sync --source <id> routes through sources table (Lane D)
Adds the --source flag to `gbrain sync`. When set, sync reads local_path
+ last_commit from the matching sources(id) row instead of the global
sync.repo_path / sync.last_commit config keys, and writes last_commit +
last_sync_at back to the same row. Backward compat: --source omitted =
pre-v0.17 behavior exactly, global config path unchanged.
- SyncOpts.sourceId threaded through performSync + performFullSync
- readSyncAnchor/writeSyncAnchor helpers centralize the sources-vs-config
branch so every read/write goes through one decision point. Makes
Step 5's later per-source sync-failures tracking a one-file change.
- --source resolved via src/core/source-resolver.ts (Step 6), so any
command that exposes resolveSourceId on the command line gets env var +
dotfile walk-up + longest-prefix resolution for free.
- Error message for missing source local_path is actionable:
Source "gstack" has no local_path. Run: gbrain sources add gstack --path <path>
- last_sync_at auto-updates on every last_commit advance so `gbrain
sources list` shows real recency.
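The sources-vs-config branch could look roughly like the sketch below. The in-memory `Map` inputs, the `SourceRow` and `SyncAnchor` shapes, and this signature are assumptions for illustration; the real readSyncAnchor queries the engine. Only the "has no local_path" message is taken from this commit; the other two error strings are hypothetical.

```typescript
interface SourceRow {
  id: string;
  local_path: string | null;
  last_commit: string | null;
}

interface SyncAnchor {
  repoPath: string;
  lastCommit: string | null;
}

// Single decision point: the per-source row when sourceId is given,
// the global config keys otherwise.
function readSyncAnchor(
  sourceId: string | undefined,
  sources: Map<string, SourceRow>,
  config: Map<string, string>,
): SyncAnchor {
  if (sourceId === undefined) {
    const repoPath = config.get('sync.repo_path');
    // Hypothetical wording; this case is not described in the commit.
    if (!repoPath) throw new Error('sync.repo_path is not configured');
    return { repoPath, lastCommit: config.get('sync.last_commit') ?? null };
  }
  const row = sources.get(sourceId);
  // Hypothetical wording for an unregistered source.
  if (!row) throw new Error(`Source "${sourceId}" is not registered`);
  if (!row.local_path) {
    throw new Error(
      `Source "${sourceId}" has no local_path. Run: gbrain sources add ${sourceId} --path <path>`,
    );
  }
  return { repoPath: row.local_path, lastCommit: row.last_commit };
}

const sourceRows = new Map<string, SourceRow>([
  ['wiki', { id: 'wiki', local_path: '/repos/wiki', last_commit: 'abc123' }],
  ['gstack', { id: 'gstack', local_path: null, last_commit: null }],
]);
const globalConfig = new Map([['sync.repo_path', '/repos/legacy']]);

console.log(readSyncAnchor('wiki', sourceRows, globalConfig).repoPath);    // /repos/wiki
console.log(readSyncAnchor(undefined, sourceRows, globalConfig).repoPath); // /repos/legacy
```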
No regression: 2012 pass / 3 fail (same as baseline).
NOT in this commit (deferred per plan):
- Per-source failure tracking (~/.gbrain/sources/<id>/sync-failures.jsonl)
- runImport source-awareness (import.ts path — Step 5 continuation)
- Partial-success semantics when walking N sources — single-source flow
today, multi-walk lands when the top-level `gbrain sync` without
--source starts iterating all sources.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 4/9): qualified [[source:slug]] + links.resolution_type (Lane B)
Adds source-pinned wikilink syntax and records the resolution kind on
each edge so `gbrain extract --refresh-unqualified` (future) can
re-resolve bare references when the source topology changes.
Wikilink syntax extension:
[[concepts/ai]] — unqualified; resolves via local-first fallback
[[wiki:concepts/ai]] — qualified; target pinned to sources.id='wiki'
[[gstack:projects/foo|Display]] — qualified + display name
The qualified regex runs first and masks matched spans so the
unqualified pass can't double-emit. Source id format enforced to match
the sources CLI validation: [a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?
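A minimal sketch of the two-pass extraction, assuming a simplified `WikiRef` shape and a hypothetical `extractWikilinks` helper (the real extractor returns EntityRef values and may use different patterns). The source-id pattern is the one quoted above.

```typescript
interface WikiRef {
  slug: string;
  sourceId?: string; // only set for qualified links
  display?: string;
}

// Source id format quoted in the commit message.
const SOURCE_ID = '[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?';
const QUALIFIED = new RegExp(
  `\\[\\[(${SOURCE_ID}):([^\\]|]+)(?:\\|([^\\]]+))?\\]\\]`,
  'g',
);
// Deliberately permissive: on unmasked text it would also match qualified links.
const UNQUALIFIED = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;

function extractWikilinks(text: string): WikiRef[] {
  const refs: WikiRef[] = [];
  // Pass 1: qualified links. Each match is replaced by spaces of equal
  // length, so offsets are preserved and pass 2 cannot see the span again.
  const masked = text.replace(
    QUALIFIED,
    (whole: string, sourceId: string, slug: string, display?: string) => {
      refs.push({ sourceId, slug, ...(display ? { display } : {}) });
      return ' '.repeat(whole.length);
    },
  );
  // Pass 2: unqualified links, run on the masked text only.
  for (const m of masked.matchAll(UNQUALIFIED)) {
    refs.push({ slug: m[1], ...(m[2] ? { display: m[2] } : {}) });
  }
  return refs;
}

const refs = extractWikilinks(
  'See [[concepts/ai]], [[wiki:concepts/ai]] and [[gstack:projects/foo|Display]].',
);
console.log(refs.length); // 3
```

Without the masking step, the permissive unqualified pattern would emit `wiki:concepts/ai` a second time as a bare slug, which is the double-emit the commit guards against.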
Schema:
- migration v18 adds links.resolution_type TEXT with CHECK constraint
('qualified'|'unqualified' or NULL for legacy/manual/frontmatter edges)
- schema.sql + pglite-schema.ts updated for fresh installs
EntityRef type:
- sourceId is OPTIONAL (only set on qualified wikilinks). Markdown
[Name](path) and unqualified wikilinks omit it so pre-v0.17 strict
toEqual tests keep working (69 existing tests still pass).
Tests:
- 5 new qualified-wikilink extraction tests + 1 migration v18 structural
assertion. 75 tests in test/link-extraction.test.ts (up from 69).
- Full suite: 2018 pass / 3 fail (pre-existing PGLite infra timeouts).
NOT in this commit (deferred to Step 3 / Step 5 continuation):
- Writing resolution_type to the DB (addLink / addLinksBatch don't
carry the field yet — that's the plumb-through that lands with
Step 3 when search/dedup also needs source-aware result keys).
- `gbrain extract --refresh-unqualified` re-resolver.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 3/9): source-aware search dedup composite keys (Lane B)
Search dedup now keys on (source_id, slug) instead of slug alone. Pre-
v0.17 would collapse two same-slug pages in different sources into
one, destroying cross-source recall. Codex outside-voice review flagged
this as regression-critical — this commit ships the fix plus tests
that lock the invariant in.
Dedup pipeline (src/core/search/dedup.ts):
- pageKey(r) helper — one canonical composite-key derivation. Falls
back to source_id='default' for pre-v0.17 rows so single-source
brains behave identically to before.
- Layer 1 (dedupBySource): group-by composite key.
- Layer 4 (capPerPage): count-by composite key.
- guaranteeCompiledTruth: swap scoped to matching (source_id, slug),
so wiki:topics/ai can't accidentally pull gstack:topics/ai's
compiled_truth chunk.
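The composite-key idea can be sketched as follows. The trimmed-down `SearchResult` shape and the `dedupByPage` helper are hypothetical; only the pageKey fallback to 'default' is taken from the description above, and the key encoding in dedup.ts may differ.

```typescript
interface SearchResult {
  slug: string;
  source_id?: string; // optional: pre-v0.17 rows do not carry it
  score: number;
}

// One canonical composite key. A missing source_id falls back to 'default'.
// Source ids never contain ':', so the separator is unambiguous.
function pageKey(r: SearchResult): string {
  return `${r.source_id ?? 'default'}:${r.slug}`;
}

// Keep the best-scoring result per (source_id, slug).
function dedupByPage(results: SearchResult[]): SearchResult[] {
  const best = new Map<string, SearchResult>();
  for (const r of results) {
    const key = pageKey(r);
    const prev = best.get(key);
    if (!prev || r.score > prev.score) best.set(key, r);
  }
  return Array.from(best.values());
}

const deduped = dedupByPage([
  { slug: 'topics/ai', source_id: 'wiki', score: 0.9 },
  { slug: 'topics/ai', source_id: 'gstack', score: 0.8 }, // same slug, other source: kept
  { slug: 'topics/ai', source_id: 'wiki', score: 0.5 },   // same page: collapsed
]);
console.log(deduped.length); // 2
```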
SearchResult type gains optional source_id — populated by SQL JOINs
in both engines, falls through as 'default' for legacy callers.
Engine SQL:
- pglite-engine.ts + postgres-engine.ts: search SELECTs add p.source_id
- rowToSearchResult (utils.ts): maps row.source_id → result.source_id
when present. Shape stays backward compatible (field optional).
Tests — 4 new in test/dedup.test.ts:
- same-slug-different-source does NOT collapse (the critical regression
guard Codex called out)
- same-slug-same-source DOES still collapse (no over-correction)
- missing source_id falls back to 'default' for pre-v0.17 compat
- compiled_truth guarantee scopes to composite key (Codex second pass
caught that this specific path would otherwise leak)
Full suite: 2022 pass / 3 fail (3 pre-existing PGLite infra timeouts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(v0.17.0 step 7/9): file_migration_ledger + phase-B storage backfill (Lane E)
Adds files.source_id + files.page_id + the file_migration_ledger
state machine that drives storage object rewrites. Each per-file
transition is its own transaction so crash-point recovery is a
ledger read, not a filesystem inspection. Codex second-pass review
flagged that "skip if already has source prefix" was an unsafe
heuristic — the ledger replaces it with explicit state tracking.
Schema:
- migration v19 (files_source_id_page_id_ledger): handler-only
(PGLite has no files table; Postgres-only gate). ADDs
source_id + page_id to files, backfills page_id from page_slug
scoped to source_id='default', creates file_migration_ledger
with PK on file_id (Codex: not storage_path_old — two sources
can share an old path during migration).
- schema.sql updated for fresh Postgres installs; file_migration_ledger
gets RLS alongside other tables.
Runtime:
- src/commands/migrations/v0_17_0-storage-backfill.ts: drives the
ledger state machine pending → copy_done → db_updated → complete.
Idempotent per row: re-running resumes from whichever state
crashed. Old objects preserved (no delete) so operators can
verify the soak window before a future cleanup release.
- phase B in v0_17_0.ts orchestrator: wires the storage backend
(Supabase/S3/local) through createStorage, runs runStorageBackfill,
reports per-state counts + first-three error details.
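The resume-from-recorded-state behaviour can be sketched with an in-memory row. The `LedgerRow` and `Steps` shapes and the synchronous hooks are simplifying assumptions; the real backfill performs storage and database I/O, with each transition in its own transaction.

```typescript
type LedgerState = 'pending' | 'copy_done' | 'db_updated' | 'complete';

interface LedgerRow {
  fileId: string;
  state: LedgerState;
}

// Hypothetical side-effect hooks standing in for storage and DB calls.
interface Steps {
  copyObject(fileId: string): void;
  updateDb(fileId: string): void;
}

// Advance one row to 'complete'. The state is recorded after each step,
// so a rerun after a crash resumes from whatever state was last written.
function advance(row: LedgerRow, steps: Steps): void {
  if (row.state === 'pending') {
    steps.copyObject(row.fileId);
    row.state = 'copy_done';
  }
  if (row.state === 'copy_done') {
    steps.updateDb(row.fileId);
    row.state = 'db_updated';
  }
  if (row.state === 'db_updated') {
    row.state = 'complete';
  }
}

const calls: string[] = [];
const steps: Steps = {
  copyObject: (id) => { calls.push(`copy:${id}`); },
  updateDb: (id) => { calls.push(`db:${id}`); },
};

// A row that crashed after the copy resumes without copying again.
const resumed: LedgerRow = { fileId: 'f1', state: 'copy_done' };
advance(resumed, steps);
console.log(resumed.state, calls.join(',')); // complete db:f1
```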
Tests — 13 new in test/storage-backfill.test.ts:
- pending → copy_done → db_updated → complete happy path
- 3 crash-point recovery tests (resume from copy_done, resume from
db_updated, failed rows don't auto-retry)
- already-complete rows are skipped with zero side effects
- idempotent re-upload (exists-check skips redundant upload)
- dry-run mode (no storage, reports counts without mutating)
Plus 5 new migrate.test.ts assertions for v19 structure (handler-
only, PGLite gate, source_id + page_id + ledger DDL, default-source
backfill scope, state machine values).
Full suite: 2035 pass / 3 fail (3 pre-existing PGLite infra
timeouts).
NOT in this commit (explicitly deferred):
- DROP old page_slug column — kept for backward compat until
operators have time to verify page_id everywhere.
- DROP old UNIQUE(storage_path) in favor of UNIQUE(source_id,
storage_path) — same reason, deferred to later cleanup.
- Actual cleanup phase that deletes old objects post-soak.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(v0.17.0 step 9/9): full multi-source PGLite integration suite (Lane G)
End-to-end exercise of every v0.17.0 surface against real PGLite
(in-memory, fast — no DATABASE_URL needed). The migration chain
v2→v19 runs start-to-finish and the test asserts each Step's
invariants hold together.
16 new integration tests across 7 describes:
1. Migration-installed state:
- sources('default') exists with federated=true config
- pages.source_id column has DEFAULT 'default'
- composite UNIQUE (source_id, slug) is installed
2. Default-source write path:
- putPage without explicit source → source_id='default' via schema
default clause (no engine API change needed for single-source brains)
3. Composite UNIQUE regression guards (Codex-flagged):
- Same slug in two different sources coexists
- Third insert with same (source_id, slug) hits the UNIQUE constraint
4. sources CLI round-trip:
- federate / unfederate flips config.federated
- rename changes display, id stays immutable
5. Source resolution priority (integration):
- Explicit flag > env var > fallback to default
- Unregistered explicit source errors with actionable message
6. Cascade semantics:
- sources remove cascades to pages; default source untouched
7. links.resolution_type (Step 4):
- Qualified/unqualified values accepted
- CHECK constraint rejects invalid values
All 16 tests pass. Full suite: 2042 pass / 4 fail (4 pre-existing
PGLite beforeEach timeouts in test/wait-for-completion,
test/extract-fs, test/e2e/search-quality, test/e2e/graph-quality
— count fluctuated 3-5 on baseline from variance alone).
Total new tests across Steps 1-9: ~85 unit + integration tests
(sources, source-resolver, migrate v16/v17/v18/v19 structural,
link-extraction qualified wikilinks, dedup regression-critical,
storage-backfill state machine + crash recovery, full
multi-source PGLite integration).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: bump to v0.18.0 + CHANGELOG entry (multi-source brains)
One-viewport release summary + itemized changes covering all 9 steps
of the multi-source primitive. Notes the v0.17 → v0.18 version bump
rationale (master shipped gbrain dream as v0.17 while this branch was
in flight).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): v0_18_0 orchestrator TS narrow + mechanical test ON CONFLICT
Two CI failures on PR #337:
1. tsc TS2367 at src/commands/migrations/v0_18_0.ts:190 —
after the early-return on `a.status === 'failed'` (line 179),
TypeScript narrows `a.status` to `'skipped' | 'complete'`, so the
subsequent `a.status === 'failed' ? 'failed' :` branch was dead
code and refused to compile. Dropped the redundant check.
2. E2E `file_list LIMIT enforcement` at test/e2e/mechanical.test.ts:636 —
the test pre-seeded a pages row with `ON CONFLICT (slug) DO NOTHING`
but v21 swapped the global UNIQUE for `UNIQUE (source_id, slug)`, so
Postgres rejects with "no unique or exclusion constraint matching".
Updated the conflict target to the composite key.
Tier-1 E2E had only this one failing test; everything else passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(e2e): v0.18.0 multi-source against real Postgres (v20-v23 schema + cascade + sync)
Closes the three biggest confidence gaps the author flagged in the
self-audit of PR #337:
1. No real Postgres E2E — PGLite has no files table, so v23's
files.source_id + files.page_id rewrite + file_migration_ledger
seed was NEVER executed against the real DB. This file covers it.
2. `gbrain sync --source <id>` had zero direct tests. Now has two:
one that asserts performSync({sourceId}) reads local_path from the
sources row (not the global config), one that asserts no-sourceId
falls back to the global sync.repo_path.
3. Cascade delete coverage — previously verified only pages count
after source removal. Now verifies pages + content_chunks +
timeline_entries + links + files ALL cascade-delete when a source
is removed.
6 describes, 16 tests total:
- Schema shape (fresh install): 6 tests confirming sources('default'),
pages.source_id NOT NULL with DEFAULT, composite UNIQUE pages
(source_id, slug) replaces global UNIQUE(slug), links.resolution_type
column + CHECK, files.source_id + page_id columns, file_migration_ledger
table + status CHECK.
- Composite UNIQUE semantics: 3 tests confirming same-slug in two
sources coexists (Codex-critical regression guard), duplicate
(source_id, slug) hits the UNIQUE, putPage targets default source
by schema DEFAULT.
- Cascade delete: 1 test building a fully populated source (2 pages,
chunks, timeline, links, files) then removing it + asserting every
dependent row is gone.
- Sync routing: 2 tests confirming performSync({sourceId}) reads
per-source local_path vs global config.
- Sources surface: 3 tests for federate/unfederate flipping + rename
preserving id.
- Storage backfill: 1 end-to-end test seeding ledger + running
runStorageBackfill against a stub StorageBackend, asserting
pending → complete transition and files.storage_path rewrite.
Gated by DATABASE_URL per CLAUDE.md E2E lifecycle. Each describe's
beforeAll defensively DELETEs non-default sources + file_migration_ledger
rows so reruns are hermetic (sources isn't in helpers.ALL_TABLES).
Verified: 16/16 pass on first run AND second run (residual-state fix
holds). Full E2E suite still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): TS2352 in multi-source E2E — cast postgres.js RowList via unknown
tsc rejects the direct
`(rows as { column_name: string }[]).map(...)`
cast because postgres.js RowList rows have an iterable-row shape that
doesn't overlap with the plain-object target. Standard fix: cast via
`unknown` first so the narrowing is explicit.
Verified: `bunx tsc --noEmit` clean (ignoring the pre-existing baseUrl
deprecation warning).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(v0.18.0): addLinksBatch + addTimelineEntriesBatch source-aware JOINs
Batch APIs JOINed on pages.slug globally, so two pages sharing the same
slug across sources would silently fan out — addLinksBatch(['a->b']) in
a brain with 'a' in both 'default' and 'alt' wrote 2 edges instead of 1.
Same bug on addTimelineEntriesBatch.
Fix:
- LinkBatchInput + TimelineBatchInput gain optional source_id fields
(from_source_id, to_source_id, origin_source_id for links; source_id
for timeline). All default to 'default' so existing callers are
backward-compatible on single-source brains.
- pglite-engine + postgres-engine batch JOINs now composite-key on
(slug, source_id). Postgres adds 3 more unnest arrays for links + 1
for timeline — still one bind per column, no 65535-param cap risk.
- LEFT JOIN for origin pages also source-qualified so frontmatter-
provenance edges don't cross-pollinate across sources.
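The fan-out and its fix can be reproduced with an in-memory join. The `PageRow` and `LinkInput` shapes and the `resolveEdges` helper are hypothetical stand-ins for the SQL JOIN; only the field names from_source_id / to_source_id and the 'default' fallback come from the description above.

```typescript
interface PageRow {
  id: number;
  slug: string;
  source_id: string;
}

interface LinkInput {
  from_slug: string;
  to_slug: string;
  from_source_id?: string; // defaults to 'default'
  to_source_id?: string;   // defaults to 'default'
}

// Resolve each link to (from_id, to_id) pairs by matching on the composite
// key (slug, source_id) rather than on slug alone.
function resolveEdges(pages: PageRow[], links: LinkInput[]): [number, number][] {
  const edges: [number, number][] = [];
  for (const link of links) {
    const fromSource = link.from_source_id ?? 'default';
    const toSource = link.to_source_id ?? 'default';
    for (const from of pages) {
      if (from.slug !== link.from_slug || from.source_id !== fromSource) continue;
      for (const to of pages) {
        if (to.slug === link.to_slug && to.source_id === toSource) {
          edges.push([from.id, to.id]);
        }
      }
    }
  }
  return edges;
}

const pages: PageRow[] = [
  { id: 1, slug: 'a', source_id: 'default' },
  { id: 2, slug: 'a', source_id: 'alt' }, // same slug in another source
  { id: 3, slug: 'b', source_id: 'default' },
];

// A slug-only join would match pages 1 and 2 for 'a' and write two edges.
const edges = resolveEdges(pages, [{ from_slug: 'a', to_slug: 'b' }]);
console.log(edges.length); // 1
```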
Regression coverage:
- test/pglite-engine.test.ts: 5 new tests covering default-path isolation,
explicit alt-source writes, and cross-source edges.
- test/e2e/multi-source.test.ts: 4 new tests against real Postgres so
postgres-js's unnest() bind path is exercised (structurally different
from PGLite's).
Gap #4 from the PR self-audit — latent bug, not previously reachable
because every existing caller wrote to the default source only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
677 lines
27 KiB
TypeScript
import { describe, test, expect } from 'bun:test';
import {
  extractEntityRefs,
  extractPageLinks,
  extractFrontmatterLinks,
  inferLinkType,
  makeResolver,
  parseTimelineEntries,
  isAutoLinkEnabled,
  FRONTMATTER_LINK_MAP,
  type SlugResolver,
} from '../src/core/link-extraction.ts';
import type { BrainEngine } from '../src/core/engine.ts';

// ─── extractEntityRefs ─────────────────────────────────────────

describe('extractEntityRefs', () => {
  test('extracts filesystem-relative refs ([Name](../people/slug.md))', () => {
    const refs = extractEntityRefs('Met with [Alice Chen](../people/alice-chen.md) at the office.');
    expect(refs.length).toBe(1);
    expect(refs[0]).toEqual({ name: 'Alice Chen', slug: 'people/alice-chen', dir: 'people' });
  });

  test('extracts engine-style slug refs ([Name](people/slug))', () => {
    const refs = extractEntityRefs('See [Alice Chen](people/alice-chen) for context.');
    expect(refs.length).toBe(1);
    expect(refs[0]).toEqual({ name: 'Alice Chen', slug: 'people/alice-chen', dir: 'people' });
  });

  test('extracts company refs', () => {
    const refs = extractEntityRefs('We invested in [Acme AI](companies/acme-ai).');
    expect(refs.length).toBe(1);
    expect(refs[0].dir).toBe('companies');
    expect(refs[0].slug).toBe('companies/acme-ai');
  });

  test('extracts multiple refs in same content', () => {
    const refs = extractEntityRefs('[Alice](people/alice) and [Bob](people/bob) met at [Acme](companies/acme).');
    expect(refs.length).toBe(3);
    expect(refs.map(r => r.slug)).toEqual(['people/alice', 'people/bob', 'companies/acme']);
  });

  test('handles ../../ deep paths', () => {
    const refs = extractEntityRefs('[Alice](../../people/alice.md)');
    expect(refs.length).toBe(1);
    expect(refs[0].slug).toBe('people/alice');
  });

  test('handles unicode names', () => {
    const refs = extractEntityRefs('Met [Héctor García](people/hector-garcia)');
    expect(refs.length).toBe(1);
    expect(refs[0].name).toBe('Héctor García');
  });

  test('returns empty array on no matches', () => {
    expect(extractEntityRefs('No links here.')).toEqual([]);
  });

  test('skips malformed markdown (unclosed bracket)', () => {
    expect(extractEntityRefs('[Alice(people/alice)')).toEqual([]);
  });

  test('skips non-entity dirs (notes/, ideas/ stay if added later but are accepted now)', () => {
    // Current regex targets entity dirs explicitly. Notes/ shouldn't match.
    const refs = extractEntityRefs('See [random](notes/random).');
    expect(refs).toEqual([]);
  });

  test('extracts meeting refs', () => {
    const refs = extractEntityRefs('See [Standup](meetings/2026-01-15-standup).');
    expect(refs.length).toBe(1);
    expect(refs[0].dir).toBe('meetings');
  });
});

// ─── extractPageLinks ──────────────────────────────────────────

// Resolver that always returns whatever the caller asks for (pretend every
// page exists). Used by tests that only want to exercise the non-resolver
// paths (markdown + bare-slug + frontmatter.source).
const allowAllResolver = {
  resolve: async (name: string) => {
    if (/^[a-z][a-z0-9-]*\/[a-z0-9][a-z0-9-]*$/.test(name)) return name;
    return null;
  },
};

// Resolver that never resolves. Used to test that the non-frontmatter
// paths still produce candidates even when no fuzzy matching is possible.
const nullResolver = { resolve: async () => null };

describe('extractPageLinks', () => {
  test('returns LinkCandidate[] with inferred types', async () => {
    const { candidates } = await extractPageLinks(
      'docs/x',
      '[Alice](people/alice) is the CEO of Acme.',
      {},
      'concept',
      allowAllResolver,
    );
    expect(candidates.length).toBeGreaterThan(0);
    const aliceLink = candidates.find(c => c.targetSlug === 'people/alice');
    expect(aliceLink).toBeDefined();
    expect(aliceLink!.linkType).toBe('works_at');
  });

  test('dedups multiple mentions of same entity (within-page dedup)', async () => {
    const content = '[Alice](people/alice) said this. Later, [Alice](people/alice) said that.';
    const { candidates } = await extractPageLinks('docs/x', content, {}, 'concept', allowAllResolver);
    const aliceLinks = candidates.filter(c => c.targetSlug === 'people/alice');
    expect(aliceLinks.length).toBe(1);
  });

  test('extracts frontmatter source as source-type link', async () => {
    const { candidates } = await extractPageLinks(
      'docs/x', 'Some content.', { source: 'meetings/2026-01-15' }, 'person', allowAllResolver,
    );
    const sourceLink = candidates.find(c => c.linkType === 'source');
    expect(sourceLink).toBeDefined();
    expect(sourceLink!.targetSlug).toBe('meetings/2026-01-15');
  });

  test('extracts bare slug references in text', async () => {
    const { candidates } = await extractPageLinks(
      'docs/x', 'See companies/acme for details.', {}, 'concept', nullResolver,
    );
    const acme = candidates.find(c => c.targetSlug === 'companies/acme');
    expect(acme).toBeDefined();
  });

  test('returns empty when no refs found', async () => {
    const { candidates } = await extractPageLinks(
      'docs/x', 'Plain text with no links.', {}, 'concept', nullResolver,
    );
    expect(candidates).toEqual([]);
  });

  test('meeting page references default to attended type', async () => {
    const { candidates } = await extractPageLinks(
      'meetings/x', 'Attendees: [Alice](people/alice), [Bob](people/bob).',
      {}, 'meeting' as never, nullResolver,
    );
    const aliceLink = candidates.find(c => c.targetSlug === 'people/alice');
    expect(aliceLink!.linkType).toBe('attended');
  });
});

// ─── inferLinkType ─────────────────────────────────────────────

describe('inferLinkType', () => {
  test('meeting + person ref -> attended', () => {
    expect(inferLinkType('meeting', 'Attendees: Alice')).toBe('attended');
  });

  test('CEO of -> works_at', () => {
    expect(inferLinkType('person', 'Alice is CEO of Acme.')).toBe('works_at');
  });

  test('VP at -> works_at', () => {
    expect(inferLinkType('person', 'Bob, VP at Stripe, said.')).toBe('works_at');
  });

  test('invested in -> invested_in', () => {
    expect(inferLinkType('person', 'YC invested in Acme.')).toBe('invested_in');
  });

  test('founded -> founded', () => {
    expect(inferLinkType('person', 'Alice founded NovaPay.')).toBe('founded');
  });

  test('co-founded -> founded', () => {
    expect(inferLinkType('person', 'Bob co-founded Beta Health.')).toBe('founded');
  });

  test('advises -> advises', () => {
    expect(inferLinkType('person', 'Emily advises Acme on go-to-market.')).toBe('advises');
  });

  test('"board member" alone is too ambiguous (investors also hold board seats) -> mentions', () => {
    // Tightened in v0.10.4 after BrainBench rich-prose surfaced that partner
    // bios ("She sits on the boards of [portfolio company]") were classified
    // as advises. Generic board language now requires explicit advisor/advise
    // rooting to count.
    expect(inferLinkType('person', 'Jane is a board member at Beta Health.')).toBe('mentions');
  });

  test('explicit advisor language -> advises', () => {
    expect(inferLinkType('person', 'Jane is an advisor to Beta Health.')).toBe('advises');
    expect(inferLinkType('person', 'Joined the advisory board at Beta Health.')).toBe('advises');
  });

  test('investment narrative variants -> invested_in', () => {
    expect(inferLinkType('person', 'Wendy led the Series A for Cipher Labs.')).toBe('invested_in');
    expect(inferLinkType('person', 'Bob is an early investor in Acme.')).toBe('invested_in');
    expect(inferLinkType('person', 'She invests in fintech startups.')).toBe('invested_in');
    expect(inferLinkType('person', 'Acme is a portfolio company of Founders Fund.')).toBe('invested_in');
    expect(inferLinkType('person', 'Sequoia led the seed round for Vox.')).toBe('invested_in');
  });

  test('default -> mentions', () => {
    expect(inferLinkType('person', 'Random context with no relationship verbs.')).toBe('mentions');
  });

  test('precedence: founded beats works_at', () => {
    // "founded" appears first in regex precedence
    expect(inferLinkType('person', 'Alice founded Acme and is the CEO of it.')).toBe('founded');
  });

  test('media page -> mentions (not attended)', () => {
    expect(inferLinkType('media', 'Alice attended the workshop.')).toBe('mentions');
  });
});

// ─── parseTimelineEntries ──────────────────────────────────────
|
|
|
|
describe('parseTimelineEntries', () => {
|
|
test('parses standard format: - **YYYY-MM-DD** | summary', () => {
|
|
const entries = parseTimelineEntries('- **2026-01-15** | Met with Alice');
|
|
expect(entries.length).toBe(1);
|
|
expect(entries[0]).toEqual({ date: '2026-01-15', summary: 'Met with Alice', detail: '' });
|
|
});
|
|
|
|
test('parses dash variant: - **YYYY-MM-DD** -- summary', () => {
|
|
const entries = parseTimelineEntries('- **2026-01-15** -- Met with Bob');
|
|
expect(entries.length).toBe(1);
|
|
expect(entries[0].summary).toBe('Met with Bob');
|
|
});
|
|
|
|
test('parses single dash: - **YYYY-MM-DD** - summary', () => {
|
|
const entries = parseTimelineEntries('- **2026-01-15** - Met with Carol');
|
|
expect(entries.length).toBe(1);
|
|
expect(entries[0].summary).toBe('Met with Carol');
|
|
});
|
|
|
|
test('parses without leading dash: **YYYY-MM-DD** | summary', () => {
|
|
const entries = parseTimelineEntries('**2026-01-15** | Standalone entry');
|
|
expect(entries.length).toBe(1);
|
|
});
|
|
|
|
test('parses multiple entries', () => {
|
|
const content = `## Timeline
|
|
- **2026-01-15** | First event
|
|
- **2026-02-20** | Second event
|
|
- **2026-03-10** | Third event`;
|
|
const entries = parseTimelineEntries(content);
|
|
expect(entries.length).toBe(3);
|
|
expect(entries.map(e => e.date)).toEqual(['2026-01-15', '2026-02-20', '2026-03-10']);
|
|
});
|
|
|
|
test('skips invalid dates (2026-13-45)', () => {
|
|
const entries = parseTimelineEntries('- **2026-13-45** | Bad date');
|
|
expect(entries.length).toBe(0);
|
|
});
|
|
|
|
test('skips invalid dates (2026-02-30)', () => {
|
|
const entries = parseTimelineEntries('- **2026-02-30** | Feb 30 doesnt exist');
|
|
expect(entries.length).toBe(0);
|
|
});
|
|
|
|
  test('returns empty when no timeline lines found', () => {
    expect(parseTimelineEntries('Just some plain text.')).toEqual([]);
  });

  test('handles mixed content (timeline lines interspersed with prose)', () => {
    const content = `Some intro paragraph.

- **2026-01-15** | An event happened

More prose here.

- **2026-02-20** | Another event`;
    const entries = parseTimelineEntries(content);
    expect(entries.length).toBe(2);
  });
});
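
// Illustrative sketch only (not the project's implementation): one way a
// parser could reject impossible dates such as 2026-13-45 or 2026-02-30, as
// the invalid-date tests above expect. `sketchIsRealCalendarDate` is a
// hypothetical name introduced here. It round-trips the parts through a UTC
// Date and checks that nothing rolled over into another month or year.
function sketchIsRealCalendarDate(iso: string): boolean {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso);
  if (!m) return false;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  const d = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC normalizes overflow (Feb 30 becomes Mar 2), so any mismatch
  // between the input parts and the normalized parts means an invalid date.
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}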

// ─── isAutoLinkEnabled ─────────────────────────────────────────

function makeFakeEngine(configMap: Map<string, string | null>): BrainEngine {
  return {
    getConfig: async (key: string) => configMap.get(key) ?? null,
  } as unknown as BrainEngine;
}

describe('isAutoLinkEnabled', () => {
  test('null/undefined -> true (default on)', async () => {
    const engine = makeFakeEngine(new Map());
    expect(await isAutoLinkEnabled(engine)).toBe(true);
  });

  test('"false" -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'false']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('"FALSE" (case-insensitive) -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'FALSE']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('"0" -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', '0']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('"no" -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'no']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('"off" -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'off']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('"true" -> true', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'true']]));
    expect(await isAutoLinkEnabled(engine)).toBe(true);
  });

  test('"1" -> true', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', '1']]));
    expect(await isAutoLinkEnabled(engine)).toBe(true);
  });

  test('whitespace and case: " False " -> false', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', ' False ']]));
    expect(await isAutoLinkEnabled(engine)).toBe(false);
  });

  test('garbage value -> true (fail-safe to default)', async () => {
    const engine = makeFakeEngine(new Map([['auto_link', 'garbage']]));
    expect(await isAutoLinkEnabled(engine)).toBe(true);
  });
});
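
// Illustrative sketch only: the cases above are consistent with a rule of
// "trim, lowercase, and treat only a small set of falsy words as off".
// `sketchParseAutoLinkFlag` and `SKETCH_FALSY_VALUES` are hypothetical names
// introduced here; the real isAutoLinkEnabled internals may differ.
const SKETCH_FALSY_VALUES = new Set(['false', '0', 'no', 'off']);

function sketchParseAutoLinkFlag(raw: string | null | undefined): boolean {
  // A missing config value falls back to the default (enabled).
  if (raw == null) return true;
  // Anything that is not an explicit falsy word, including garbage, stays on.
  return !SKETCH_FALSY_VALUES.has(raw.trim().toLowerCase());
}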

// ─── Frontmatter link extraction (v0.13) ────────────────────────

/**
 * In-memory resolver for frontmatter tests. Maps names to slugs via an
 * explicit fixture map; returns null for anything missing. Mirrors what
 * the real resolver does on a production brain but with deterministic
 * inputs (no pg_trgm, no searchPages).
 */
function makeFixtureResolver(pages: Record<string, string>): SlugResolver {
  return {
    async resolve(name: string, dirHint?: string | string[]) {
      const hints = Array.isArray(dirHint) ? dirHint : (dirHint ? [dirHint] : []);
      // Already a slug — check if present.
      if (/^[a-z][a-z0-9-]*\/[a-z0-9][a-z0-9-]*$/.test(name)) {
        return pages[name] ?? null;
      }
      const slugified = name.toLowerCase().replace(/\s+/g, '-');
      for (const hint of hints) {
        if (!hint) continue;
        const candidate = `${hint}/${slugified}`;
        if (pages[candidate]) return candidate;
      }
      return null;
    },
  };
}

describe('extractFrontmatterLinks — field-map coverage', () => {
  const pages = {
    'people/pedro': 'people/pedro',
    'people/garry': 'people/garry',
    'people/diana-hu': 'people/diana-hu',
    'companies/stripe': 'companies/stripe',
    'companies/brex': 'companies/brex',
    'companies/sequoia': 'companies/sequoia',
    'companies/benchmark': 'companies/benchmark',
    'meetings/2026-04-03': 'meetings/2026-04-03',
    'deal/riveter-seed': 'deal/riveter-seed',
  };
  const resolver = makeFixtureResolver(pages);

  test('person.company → outgoing works_at', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'people/pedro', 'person' as never, { company: 'Stripe' }, resolver,
    );
    expect(candidates).toHaveLength(1);
    expect(candidates[0]).toMatchObject({
      fromSlug: 'people/pedro',
      targetSlug: 'companies/stripe',
      linkType: 'works_at',
      linkSource: 'frontmatter',
      originSlug: 'people/pedro',
      originField: 'company',
    });
  });

  test('person.companies (array alias) → multiple works_at edges', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'people/pedro', 'person' as never, { companies: ['Stripe', 'Brex'] }, resolver,
    );
    expect(candidates).toHaveLength(2);
    for (const c of candidates) {
      expect(c.fromSlug).toBe('people/pedro');
      expect(c.linkType).toBe('works_at');
      expect(c.targetSlug).toMatch(/^companies\/(stripe|brex)$/);
    }
  });

  test('company.key_people → INCOMING works_at (person → company)', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'companies/stripe', 'company' as never, { key_people: ['Pedro', 'Garry'] }, resolver,
    );
    expect(candidates).toHaveLength(2);
    for (const c of candidates) {
      // Incoming: from = resolved person, to = the page being written.
      expect(c.targetSlug).toBe('companies/stripe');
      expect(c.fromSlug).toMatch(/^people\/(pedro|garry)$/);
      expect(c.linkType).toBe('works_at');
      expect(c.originSlug).toBe('companies/stripe');
      expect(c.originField).toBe('key_people');
    }
  });

  test('meeting.attendees → INCOMING attended (person → meeting)', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'meetings/2026-04-03', 'meeting' as never, { attendees: ['Pedro', 'Garry'] }, resolver,
    );
    expect(candidates).toHaveLength(2);
    for (const c of candidates) {
      expect(c.targetSlug).toBe('meetings/2026-04-03');
      expect(c.linkType).toBe('attended');
      expect(c.fromSlug).toMatch(/^people\/(pedro|garry)$/);
    }
  });

  test('deal.investors (multi-dir hint) → INCOMING invested_in', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'deal/riveter-seed', 'deal' as never,
      { investors: ['Sequoia', 'Benchmark'] }, resolver,
    );
    expect(candidates).toHaveLength(2);
    for (const c of candidates) {
      expect(c.targetSlug).toBe('deal/riveter-seed');
      expect(c.linkType).toBe('invested_in');
      expect(c.fromSlug).toMatch(/^companies\/(sequoia|benchmark)$/);
    }
  });

  test('source field → outgoing source edge', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'people/pedro', 'person' as never, { source: 'meetings/2026-04-03' }, resolver,
    );
    const src = candidates.find(c => c.linkType === 'source');
    expect(src).toBeDefined();
    expect(src!.fromSlug).toBe('people/pedro');
    expect(src!.targetSlug).toBe('meetings/2026-04-03');
  });

  test('unresolvable name goes to unresolved list, not candidates', async () => {
    const { candidates, unresolved } = await extractFrontmatterLinks(
      'meetings/x', 'meeting' as never,
      { attendees: ['Pedro', 'Unknown Person'] }, resolver,
    );
    expect(candidates).toHaveLength(1);
    expect(unresolved).toHaveLength(1);
    expect(unresolved[0]).toEqual({ field: 'attendees', name: 'Unknown Person' });
  });

  test('bad types (number, null, empty) skipped silently', async () => {
    const { candidates, unresolved } = await extractFrontmatterLinks(
      'meetings/x', 'meeting' as never,
      { attendees: [42, null, '', 'Pedro', { nothing: true }] }, resolver,
    );
    // Only 'Pedro' produces a candidate. 42/null/'' silently skipped.
    // Object without name/slug/title is skipped. No unresolved entry for skipped.
    expect(candidates).toHaveLength(1);
    expect(candidates[0].fromSlug).toBe('people/pedro');
    expect(unresolved).toHaveLength(0);
  });

  test('array of objects: uses .name, carries role into context', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'deal/riveter-seed', 'deal' as never,
      { investors: [{ name: 'Sequoia', role: 'lead' }] }, resolver,
    );
    expect(candidates).toHaveLength(1);
    expect(candidates[0].context).toContain('Sequoia');
    expect(candidates[0].context).toContain('lead');
  });

  test('context enrichment — not bare field name', async () => {
    const { candidates } = await extractFrontmatterLinks(
      'companies/stripe', 'company' as never, { key_people: ['Pedro'] }, resolver,
    );
    // Per plan Finding 7: context must include field + value, not bare 'frontmatter.key_people'.
    expect(candidates[0].context).toBe('frontmatter.key_people: Pedro');
  });

  test('pageType filter — field ignored on non-matching page', async () => {
    // `company` field only fires on person pages. On a concept page it's ignored.
    const { candidates } = await extractFrontmatterLinks(
      'concepts/x', 'concept' as never, { company: 'Stripe' }, resolver,
    );
    expect(candidates).toHaveLength(0);
  });
});

describe('makeResolver — fallback chain', () => {
  // Minimal engine fake with controlled pages + findByTitleFuzzy.
  function makeFakeEngine(
    slugs: string[],
    fuzzyMap: Map<string, { slug: string; similarity: number }> = new Map(),
  ): BrainEngine {
    const lookup = new Set(slugs);
    let getPageCalls = 0;
    let fuzzyCalls = 0;
    let searchCalls = 0;
    const engine = {
      async getPage(slug: string) {
        getPageCalls++;
        return lookup.has(slug) ? { slug } as any : null;
      },
      async findByTitleFuzzy(name: string) {
        fuzzyCalls++;
        return fuzzyMap.get(name) ?? null;
      },
      async searchKeyword() {
        searchCalls++;
        return [];
      },
    } as unknown as BrainEngine;
    (engine as any)._counts = () => ({ getPageCalls, fuzzyCalls, searchCalls });
    return engine;
  }

  test('step 1: slug passthrough', async () => {
    const engine = makeFakeEngine(['people/pedro']);
    const r = makeResolver(engine);
    expect(await r.resolve('people/pedro')).toBe('people/pedro');
  });

  test('step 2: dir-hint construction', async () => {
    const engine = makeFakeEngine(['companies/stripe']);
    const r = makeResolver(engine);
    expect(await r.resolve('Stripe', 'companies')).toBe('companies/stripe');
  });

  test('step 3: pg_trgm fuzzy hit', async () => {
    const engine = makeFakeEngine(
      ['companies/brex'],
      new Map([['Brex Inc', { slug: 'companies/brex', similarity: 0.8 }]]),
    );
    const r = makeResolver(engine);
    expect(await r.resolve('Brex Inc', 'companies')).toBe('companies/brex');
  });

  test('batch mode NEVER calls searchKeyword (deterministic migration)', async () => {
    const engine = makeFakeEngine([]);
    const r = makeResolver(engine, { mode: 'batch' });
    const result = await r.resolve('Unknown Name', 'companies');
    expect(result).toBeNull();
    const counts = (engine as any)._counts();
    expect(counts.searchCalls).toBe(0);
  });

  test('cache: same name → single getPage call', async () => {
    const engine = makeFakeEngine(['people/pedro']);
    const r = makeResolver(engine);
    await r.resolve('people/pedro');
    await r.resolve('people/pedro');
    await r.resolve('people/pedro');
    const counts = (engine as any)._counts();
    expect(counts.getPageCalls).toBe(1);
  });

  test('unresolvable → null (no dead link written)', async () => {
    const engine = makeFakeEngine([]);
    const r = makeResolver(engine, { mode: 'batch' });
    expect(await r.resolve('Nonexistent Person', 'people')).toBeNull();
  });
});

describe('FRONTMATTER_LINK_MAP integrity', () => {
  test('every mapping has fields + type + direction + dirHint', () => {
    for (const m of FRONTMATTER_LINK_MAP) {
      expect(m.fields.length).toBeGreaterThan(0);
      expect(m.type).toBeTruthy();
      expect(['outgoing', 'incoming']).toContain(m.direction);
      expect(m.dirHint !== undefined).toBe(true);
    }
  });

  test('key_people maps to INCOMING works_at on company page', () => {
    const m = FRONTMATTER_LINK_MAP.find(m => m.fields.includes('key_people'));
    expect(m).toBeDefined();
    expect(m!.direction).toBe('incoming');
    expect(m!.pageType).toBe('company');
    expect(m!.type).toBe('works_at');
  });

  test('attendees maps to INCOMING attended on meeting page', () => {
    const m = FRONTMATTER_LINK_MAP.find(m => m.fields.includes('attendees'));
    expect(m!.direction).toBe('incoming');
    expect(m!.pageType).toBe('meeting');
    expect(m!.type).toBe('attended');
  });

  test('investors uses multi-dir hint (companies/funds/people)', () => {
    const m = FRONTMATTER_LINK_MAP.find(m => m.fields.includes('investors'));
    expect(Array.isArray(m!.dirHint)).toBe(true);
    expect(m!.dirHint).toContain('companies');
    expect(m!.dirHint).toContain('funds');
    expect(m!.dirHint).toContain('people');
  });
});


// ─────────────────────────────────────────────────────────────────
// v0.18.0 Step 4 — qualified wikilink syntax [[source-id:dir/slug]]
// ─────────────────────────────────────────────────────────────────
describe("extractEntityRefs — v0.18.0 qualified wikilinks", () => {
  test("[[wiki:concepts/ai]] extracts with sourceId=wiki", () => {
    const refs = extractEntityRefs("See [[concepts/ai]] vs [[wiki:concepts/ai]] for wiki-specific take.");
    // One unqualified + one qualified.
    expect(refs.length).toBe(2);
    const qual = refs.find(r => r.sourceId === "wiki");
    expect(qual).toBeDefined();
    expect(qual!.slug).toBe("concepts/ai");
    expect(qual!.name).toBe("concepts/ai");
    const unqual = refs.find(r => r.sourceId === undefined);
    expect(unqual).toBeDefined();
    expect(unqual!.slug).toBe("concepts/ai");
  });

  test("[[gstack:projects/foo|Display Name]] preserves display + sourceId", () => {
    const refs = extractEntityRefs("See [[gstack:projects/foo|The Foo Project]] for details.");
    expect(refs.length).toBe(1);
    expect(refs[0]).toEqual({ name: "The Foo Project", slug: "projects/foo", dir: "projects", sourceId: "gstack" });
  });

  test("qualified source-id format is validated (must match [a-z0-9-]+ kebab rules)", () => {
    // Uppercase source IDs are not qualified — fall through to unqualified wikilink or no match.
    const refs = extractEntityRefs("Legit: [[yc-media:concepts/seed]] Not legit: [[NotValid:concepts/x]]");
    const qualified = refs.filter(r => r.sourceId);
    expect(qualified.length).toBe(1);
    expect(qualified[0].sourceId).toBe("yc-media");
  });

  test("masking prevents unqualified regex from matching inside a qualified link", () => {
    // Without the mask, [[wiki:concepts/ai]] could also match as
    // unqualified with slug "wiki:concepts/ai" (invalid dir) — the
    // DIR_PATTERN whitelist normally blocks it, but masking is
    // defense-in-depth.
    const refs = extractEntityRefs("Ref: [[wiki:concepts/ai]]");
    expect(refs.length).toBe(1);
    expect(refs[0].sourceId).toBe("wiki");
  });

  test("markdown [Name](path) links always have no sourceId (unqualified by shape)", () => {
    const refs = extractEntityRefs("[Alice](people/alice-chen) met [[wiki:people/bob]]");
    const mdLink = refs.find(r => r.slug === "people/alice-chen");
    expect(mdLink!.sourceId).toBeUndefined();
    const wiki = refs.find(r => r.slug === "people/bob");
    expect(wiki!.sourceId).toBe("wiki");
  });
});
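
// Illustrative sketch only: a regex that accepts the qualified shapes
// exercised above, [[source-id:dir/slug]] and [[source-id:dir/slug|Display]],
// with a lowercase kebab source id. `sketchParseQualifiedWikilink` and
// `SKETCH_QUALIFIED_LINK` are hypothetical names introduced here; the real
// extractEntityRefs may differ (for example, it also applies a directory
// whitelist and masking, which this sketch omits).
const SKETCH_QUALIFIED_LINK =
  /^\[\[([a-z0-9][a-z0-9-]*):([a-z][a-z0-9-]*\/[a-z0-9][a-z0-9-]*)(?:\|([^\]]+))?\]\]$/;

function sketchParseQualifiedWikilink(
  text: string,
): { sourceId: string; slug: string; dir: string; name: string } | null {
  const m = SKETCH_QUALIFIED_LINK.exec(text);
  if (!m) return null;
  const slug = m[2];
  return {
    sourceId: m[1],
    slug,
    dir: slug.split('/')[0],
    // Display text wins when present; otherwise fall back to the slug.
    name: m[3] ?? slug,
  };
}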

describe("v0.18.0 migration v22 — links_resolution_type", () => {
  test("migration v22 exists with CHECK constraint", async () => {
    const { MIGRATIONS } = await import("../src/core/migrate.ts");
    const v22 = MIGRATIONS.find(m => m.version === 22);
    expect(v22).toBeDefined();
    expect(v22!.name).toBe("links_resolution_type");
    expect(v22!.sql).toContain("ADD COLUMN IF NOT EXISTS resolution_type");
    expect(v22!.sql).toContain("links_resolution_type_check");
    expect(v22!.sql).toContain("qualified");
    expect(v22!.sql).toContain("unqualified");
  });
});