* feat(schema): links provenance + engine plumbing (v0.13)
Adds link_source, origin_page_id, origin_field columns with
UNIQUE NULLS NOT DISTINCT constraint + CHECK constraint. New indexes
on link_source + origin_page_id.
migrate.ts v11 handles idempotent upgrade path for existing brains.
Both engines: addLink/addLinksBatch threads new columns (4→7 col
unnest). removeLink gains linkSource filter. getLinks/getBacklinks
return new columns.
New engine method findByTitleFuzzy(name, dirPrefix?, minSim?) uses
pg_trgm % operator + similarity(). Drives the v0.13 resolver's
fuzzy-match step with zero LLM/embedding cost.
* feat(graph): frontmatter edge extraction + slug resolver (v0.13)
Canonical FRONTMATTER_LINK_MAP: field → type + direction + dir-hint
for 10 frontmatter patterns (company/companies, key_people, investors,
attendees, partner, lead, founded, sources, source, related/see_also).
Direction semantics: "incoming" means resolved value is the FROM side
so subject-of-verb reads naturally (pedro → meeting, not backwards).
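The direction rule can be sketched roughly as below. This is a minimal illustration, not the real FRONTMATTER_LINK_MAP: `FIELD_RULES`, `buildEdge`, the `attended` link type, and the `people` dir-hint are hypothetical names chosen for the example. The `works_at` and `invested_in` types follow the test expectations in this change.

```typescript
type Direction = 'outgoing' | 'incoming';

interface FieldRule {
  linkType: string;
  direction: Direction;
  dirHint: string; // directory to try first when resolving the value
}

// Illustrative subset only (assumption): three of the ten patterns.
const FIELD_RULES: Record<string, FieldRule> = {
  company: { linkType: 'works_at', direction: 'outgoing', dirHint: 'companies' },
  investors: { linkType: 'invested_in', direction: 'incoming', dirHint: 'companies' },
  attendees: { linkType: 'attended', direction: 'incoming', dirHint: 'people' },
};

interface Edge {
  from: string;
  to: string;
  linkType: string;
  originField: string;
}

// "incoming" puts the resolved value on the FROM side, so the edge reads
// subject -> object (people/pedro -> meetings/standup).
function buildEdge(pageSlug: string, field: string, resolvedSlug: string): Edge | null {
  const rule = FIELD_RULES[field];
  if (!rule) return null; // unknown field: no edge
  const base = { linkType: rule.linkType, originField: field };
  return rule.direction === 'incoming'
    ? { from: resolvedSlug, to: pageSlug, ...base }
    : { from: pageSlug, to: resolvedSlug, ...base };
}
```

With these assumed rules, `buildEdge('meetings/standup', 'attendees', 'people/pedro')` yields an edge from `people/pedro` to `meetings/standup`, while `company` on a person page keeps the page itself on the FROM side.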
makeResolver(engine, {mode}) is a two-mode resolver:
- batch (migration): slug → dir-hint → pg_trgm. NEVER hits search.
- live (put_page): the same chain plus an optional search fallback
  with expand=false (dodges hidden Haiku per operations-query learning).
Per-run cache: same name → single DB lookup.
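A simplified sketch of the batch-mode chain with its per-run cache follows. It is synchronous and in-memory for readability. The `Lookup` interface, `slugify`, and `makeBatchResolver` are hypothetical stand-ins, and the real resolver's signature and slug rules may differ.

```typescript
// Hypothetical lookup surface; the real engine is async and DB-backed.
interface Lookup {
  hasSlug(slug: string): boolean;
  fuzzyTitle(name: string, dirPrefix?: string): string | null;
}

// Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(name: string): string {
  return name.trim().toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

function makeBatchResolver(lookup: Lookup) {
  const cache = new Map<string, string | null>();

  return function resolve(name: string, dirHint?: string): string | null {
    const key = `${dirHint ?? ''}\u0000${name}`;
    if (cache.has(key)) return cache.get(key) ?? null; // same name: no new lookup

    const raw = name.trim();
    const hinted = dirHint ? `${dirHint}/${slugify(name)}` : null;
    let result: string | null;
    if (lookup.hasSlug(raw)) {
      result = raw; // step 1: value is already a slug
    } else if (hinted !== null && lookup.hasSlug(hinted)) {
      result = hinted; // step 2: dir-hint + slugified name
    } else {
      result = lookup.fuzzyTitle(name, dirHint); // step 3: fuzzy title match
    }
    cache.set(key, result); // misses are cached too
    return result;
  };
}
```

Caching misses as well as hits keeps a name that never resolves from triggering a fresh fuzzy query for every page that mentions it.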
extractFrontmatterLinks handles arrays-of-objects (investors:
[{name: 'Sequoia', role: 'lead'}]), skips bad types silently,
tracks unresolved names for the summary report.
extractPageLinks is now async. LinkCandidate gains fromSlug,
linkSource, originSlug, originField. Returns {candidates, unresolved}.
22 new tests: field-map coverage, direction semantics, source vs
sources, resolver fallback chain (batch + live), cache hit, bad
types skipped, context enrichment, FRONTMATTER_LINK_MAP integrity.
* feat(auto-link): bidirectional reconciliation + unresolved response
put_page auto-link post-hook now handles incoming-direction frontmatter
edges. Reconciliation splits candidates into out (fromSlug === slug)
and in (fromSlug !== slug — frontmatter fields like key_people on a
company page emit person → company edges).
Safe reconciliation via origin_page_id scoping: we only touch
link_source='frontmatter' edges where origin_slug = the page being
written. Markdown + manual edges survive untouched. Edges created
by OTHER pages' frontmatter also survive.
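The scoping rule above might look like the following sketch. `StoredEdge`, `edgeKey`, and `reconcileFrontmatter` are hypothetical names, and the `'manual'` source value is an assumption. The point is only that the diff runs over edges this page owns.

```typescript
interface StoredEdge {
  from: string;
  to: string;
  linkType: string;
  linkSource: 'markdown' | 'frontmatter' | 'manual'; // 'manual' value assumed
  originSlug: string | null;
}

function edgeKey(e: StoredEdge): string {
  return `${e.from}\u0000${e.to}\u0000${e.linkType}`;
}

// Diff desired frontmatter edges against ONLY the frontmatter edges that
// originated from pageSlug. Everything else is invisible to the diff.
function reconcileFrontmatter(
  pageSlug: string,
  existing: StoredEdge[],
  desired: StoredEdge[],
): { toAdd: StoredEdge[]; toRemove: StoredEdge[] } {
  const owned = existing.filter(
    (e) => e.linkSource === 'frontmatter' && e.originSlug === pageSlug,
  );
  const ownedKeys = new Set(owned.map(edgeKey));
  const desiredKeys = new Set(desired.map(edgeKey));
  return {
    toAdd: desired.filter((e) => !ownedKeys.has(edgeKey(e))),
    toRemove: owned.filter((e) => !desiredKeys.has(edgeKey(e))),
  };
}
```

Because `toRemove` is drawn from `owned` only, a markdown edge or an edge written by another page's frontmatter can never be deleted by this pass, even when it points at the same pair of pages.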
put_page response extends auto_links with unresolved: Array<{field,
name}>. Agents writing attendees: [Pedro, Alex] where Alex doesn't
resolve see it in the response and can queue for enrichment.
Additive — existing agents unaffected.
extract.ts: delete the local 5-field extractFrontmatterLinks + local
inferLinkType. FS-source now calls canonical link-extraction.ts via
a synthetic resolver backed by the allSlugs Set. --include-frontmatter
flag (default OFF in v0.13 for back-compat; migration explicitly
enables for the one-time backfill). Top-20 unresolved names summary
when active.
* feat(migration): v0.13.0 orchestrator
3-phase orchestrator (schema → backfill → verify, then record) follows
the v0_12_2.ts pattern. Phase A triggers migrate.ts v11 via
gbrain init --migrate-only. Phase B runs:
gbrain extract links --source db --include-frontmatter
to backfill frontmatter edges for every existing page. Uses the
batch-mode resolver (pg_trgm only, no LLM calls, zero API cost).
Ignores the auto_link=false config: the migration is canonical, and
the auto_link flag controls the per-write post-hook, not one-time
schema work.
Idempotent + resumable via ON CONFLICT DO NOTHING + origin_page_id
scoping. Wall-clock budget: 2-5 min on 46K-page brains.
Registered in migrations/index.ts. apply-migrations test updated
to include v0.13.0 in skippedFuture for older installed versions.
* feat(release): upgrade-errors.jsonl trail + doctor surfacing
upgrade.ts catches post-upgrade subprocess failures as best-effort
today (line 65 comment: "post-upgrade is best-effort, don't fail
the upgrade"). When that chain silently fails, users end up with
half-upgraded brains and no signal.
v0.13: on post-upgrade failure, append a structured record to
~/.gbrain/upgrade-errors.jsonl with ts, phase, versions, error
message, and a paste-ready recovery hint.
doctor.ts reads the jsonl and surfaces the latest entry with a
warn-status check. User runs gbrain doctor, sees exactly what
failed, pastes the recovery command, files an issue if needed.
Applies to every future release — doctor grows with the codebase
without per-release edits. The CHANGELOG pattern ("To take advantage
of v[version]" block) mirrors this in user-facing form.
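As a rough sketch of the trail format, the helpers below serialize one record per line and pick the latest readable entry. The field names (`fromVersion`, `toVersion`, `hint`) and the function names are assumptions for illustration. File I/O is left out so the logic stays pure.

```typescript
// Assumed record shape; the message above lists ts, phase, versions,
// error message, and a recovery hint without fixing exact field names.
interface UpgradeErrorRecord {
  ts: string;
  phase: string;
  fromVersion: string;
  toVersion: string;
  error: string;
  hint: string;
}

// One JSON object per line, newline-terminated, suitable for appending.
function toJsonlLine(rec: UpgradeErrorRecord): string {
  return JSON.stringify(rec) + '\n';
}

// Return the most recent parseable entry, skipping blank or torn lines
// (for example, a partial write from an interrupted upgrade).
function latestEntry(fileText: string): UpgradeErrorRecord | null {
  const lines = fileText.split('\n').filter((l) => l.trim() !== '');
  for (const line of lines.reverse()) {
    try {
      return JSON.parse(line) as UpgradeErrorRecord;
    } catch {
      // ignore the unparseable line and keep walking backwards
    }
  }
  return null;
}
```

Append-only JSONL means a failed write can damage only the final line, which is why the reader walks backwards until it finds one that parses.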
* chore: bump version and changelog (v0.13.0)
v0.13.0 — Frontmatter Relationship Indexing.
Adds the "To take advantage of v[version]" block pattern to
CHANGELOG format (CLAUDE.md documents the requirement going
forward). Pairs with the upgrade-errors.jsonl + doctor surfacing
to close the "half-upgraded brain, no signal" loop.
UPGRADING_DOWNSTREAM_AGENTS.md gets a v0.13 section: no-action-
required verdict for most skills, optional diffs for meeting-
ingestion / enrich / idea-ingest if they want to consume
auto_links.unresolved.
skills/migrations/v0.13.0.md is the user-facing upgrade skill.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(v0.13): adversarial review P0s
Codex + Claude adversarial review caught 4 critical issues in the
v0.13 implementation. Fixing before ship.
1. findByTitleFuzzy SET LOCAL was a no-op. postgres.js auto-commits
each sql`` so SET LOCAL pg_trgm.similarity_threshold committed
before the `%` operator ran against it. Resolver used server
default (0.3, not 0.55) → way too many fuzzy matches, wrong
links on a 46K-page brain. Switched to inline
`similarity(title, $1) >= $N` which has no transaction scoping.
Added `ORDER BY sim DESC, slug ASC` for deterministic
tie-breaking (prevents reconciliation churn on re-runs).
2. v11 migration now checks Postgres ≥ 15 before applying
UNIQUE NULLS NOT DISTINCT. Old Supabase projects on PG14 would
have dropped the old unique constraint and failed to add the
new one, corrupting the uniqueness invariant. The check raises
a clear error with the actual PG version, leaving the old
constraint in place.
3. v11 migration now backfills NULL link_source → 'markdown' for
pre-v0.13 legacy rows. Without this, reconciliation's existKey
comparison treats NULL and 'markdown' as equivalent but the
unique constraint sees them as distinct (NULLS NOT DISTINCT
only collapses NULL with NULL, not NULL with 'markdown'). Result
was duplicate edges accumulating forever. Treating legacy as
markdown is the accurate best-guess — pre-v0.13 auto-link only
emitted markdown edges.
4. v0_13_0.ts orchestrator now uses process.execPath, not a bare
`gbrain` on PATH. After `gbrain upgrade` rewrites the binary,
alias shadowing / PATH caching / multiple installs could
resolve a stale `gbrain` binary. process.execPath is always
the binary that loaded this migration module.
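Item 1's inline threshold plus deterministic ordering can be illustrated in memory. The `similarity` function below only approximates pg_trgm (lowercase, split on non-alphanumerics, pad each word with two leading spaces and one trailing space, shared trigrams over total distinct trigrams). `bestFuzzyMatch` and `Page` are hypothetical names.

```typescript
// Approximation of pg_trgm trigram extraction (assumption, ASCII only).
function trigrams(text: string): Set<string> {
  const out = new Set<string>();
  for (const word of text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)) {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) out.add(padded.slice(i, i + 3));
  }
  return out;
}

function similarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => { if (tb.has(t)) shared++; });
  return shared / (ta.size + tb.size - shared);
}

interface Page {
  slug: string;
  title: string;
}

// Mirrors `similarity(title, $1) >= $N ... ORDER BY sim DESC, slug ASC`:
// the threshold is an explicit argument, and ties break on slug so
// repeated runs pick the same page.
function bestFuzzyMatch(name: string, pages: Page[], minSim = 0.55): Page | null {
  const ranked = pages
    .map((page) => ({ page, sim: similarity(page.title, name) }))
    .filter((r) => r.sim >= minSim)
    .sort((x, y) =>
      y.sim - x.sim ||
      (x.page.slug < y.page.slug ? -1 : x.page.slug > y.page.slug ? 1 : 0));
  return ranked[0]?.page ?? null;
}
```

Passing the threshold as a plain argument sidesteps any session or transaction state, which is the property the fix relies on.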
Phase C verify clarified: reports page + link counts and points to
Phase B's own stdout as the authoritative signal for backfill
results (extract.ts already prints `Links: created N from M pages`).
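The version guard from item 2 could be sketched as follows, assuming the migration reads `SHOW server_version_num` (a string such as "150004" for 15.4) before touching any constraint. Both helper names are hypothetical.

```typescript
// For Postgres 10 and later, server_version_num is major * 10000 + minor.
function pgMajorVersion(serverVersionNum: string): number {
  const n = Number.parseInt(serverVersionNum, 10);
  if (!Number.isFinite(n)) {
    throw new Error(`Unparseable server_version_num: ${serverVersionNum}`);
  }
  return Math.floor(n / 10000);
}

// Call this BEFORE dropping the old unique constraint, so a failure
// leaves the existing uniqueness invariant in place.
function assertNullsNotDistinctSupported(serverVersionNum: string): void {
  const major = pgMajorVersion(serverVersionNum);
  if (major < 15) {
    throw new Error(
      `UNIQUE NULLS NOT DISTINCT requires Postgres 15+, found major version ${major} ` +
        `(server_version_num=${serverVersionNum}); existing constraint left in place.`,
    );
  }
}
```

Running the guard before any DDL is what prevents the drop-then-fail sequence described in item 2.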
* docs: scrub real names from public docs + add privacy rule to CLAUDE.md
Public artifacts (CHANGELOG, skills, docs) should never reveal real
contacts, companies, funds, or private agent-fork names from any
user's brain. When a doc copies a query like `gbrain graph diana-hu`
or names a fork like `Wintermute`, that real name gets indexed,
cross-referenced, and distributed with every release.
CLAUDE.md gains a "Privacy rule: scrub real names from public docs"
section with:
- What counts as public (CHANGELOG, README, docs/, skills/, PR bodies,
commit messages, code comments)
- Name mapping table (agent forks → your agent fork; example person →
alice-example; example fund → fund-a; etc.)
- Distinction between illustrative API examples with household brands
(Stripe, Brex) and queries that reveal real relationships
Applied the rule to v0.13 scope:
- CHANGELOG v0.13 entry: Pedro/Diana/Wintermute/Sequoia/Benchmark/a16z
all replaced with alice/charlie/fund-a/acme/agent-fork placeholders
- skills/migrations/v0.13.0.md: same
- docs/UPGRADING_DOWNSTREAM_AGENTS.md: Wintermute references scrubbed
throughout (pre-v0.13 and v0.13 sections)
- CLAUDE.md: "Brain skills (from Wintermute)" → "(ported from an
upstream agent fork)", internal Wintermute provenance notes
genericized, "Garry finds fragile upgrade paths" → "the gbrain
maintainers find fragile upgrade paths" in the template
Pre-v0.13 historical CHANGELOG entries (v0.10-v0.12) left alone —
those are shipped releases; rewriting changes public history.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
145 lines
6.3 KiB
TypeScript
import { describe, it, expect } from 'bun:test';
import {
  extractMarkdownLinks,
  extractLinksFromFile,
  extractTimelineFromContent,
  walkMarkdownFiles,
} from '../src/commands/extract.ts';

describe('extractMarkdownLinks', () => {
  it('extracts relative markdown links', () => {
    const content = 'Check [Pedro](../people/pedro-franceschi.md) and [Brex](../../companies/brex.md).';
    const links = extractMarkdownLinks(content);
    expect(links).toHaveLength(2);
    expect(links[0].name).toBe('Pedro');
    expect(links[0].relTarget).toBe('../people/pedro-franceschi.md');
  });

  it('skips external URLs ending in .md', () => {
    const content = 'See [readme](https://example.com/readme.md) for details.';
    const links = extractMarkdownLinks(content);
    expect(links).toHaveLength(0);
  });

  it('handles links with no matches', () => {
    const content = 'No links here.';
    expect(extractMarkdownLinks(content)).toHaveLength(0);
  });

  it('extracts multiple links from same line', () => {
    const content = '[A](a.md) and [B](b.md)';
    expect(extractMarkdownLinks(content)).toHaveLength(2);
  });
});

describe('extractLinksFromFile', () => {
  it('resolves relative paths to slugs', async () => {
    const content = '---\ntitle: Test\n---\nSee [Pedro](../people/pedro.md).';
    const allSlugs = new Set(['people/pedro', 'deals/test-deal']);
    const links = await extractLinksFromFile(content, 'deals/test-deal.md', allSlugs);
    expect(links.length).toBeGreaterThanOrEqual(1);
    expect(links[0].from_slug).toBe('deals/test-deal');
    expect(links[0].to_slug).toBe('people/pedro');
  });

  it('skips links to non-existent pages', async () => {
    const content = 'See [Ghost](../people/ghost.md).';
    const allSlugs = new Set(['deals/test']);
    const links = await extractLinksFromFile(content, 'deals/test.md', allSlugs);
    expect(links).toHaveLength(0);
  });

  it('extracts frontmatter company links (v0.13, includeFrontmatter opt-in)', async () => {
    const content = '---\ncompany: brex\ntype: person\n---\nContent.';
    // v0.13 canonical: person page with company: X → person → company works_at (outgoing).
    // Resolver needs companies/brex to exist in allSlugs to emit the edge.
    const allSlugs = new Set(['people/test', 'companies/brex']);
    const links = await extractLinksFromFile(content, 'people/test.md', allSlugs, { includeFrontmatter: true });
    const companyLinks = links.filter(l => l.link_type === 'works_at');
    expect(companyLinks.length).toBeGreaterThanOrEqual(1);
    expect(companyLinks[0].from_slug).toBe('people/test');
    expect(companyLinks[0].to_slug).toBe('companies/brex');
  });

  it('extracts frontmatter investors array (v0.13: incoming direction)', async () => {
    // v0.13: deal page with investors:[yc, threshold] emits INCOMING edges:
    // companies/yc → deals/seed invested_in and same for threshold.
    const content = '---\ninvestors: [yc, threshold]\ntype: deal\n---\nContent.';
    const allSlugs = new Set(['deals/seed', 'companies/yc', 'companies/threshold']);
    const links = await extractLinksFromFile(content, 'deals/seed.md', allSlugs, { includeFrontmatter: true });
    const investorLinks = links.filter(l => l.link_type === 'invested_in');
    expect(investorLinks).toHaveLength(2);
    // Incoming: from = resolved investor, to = deal page.
    for (const l of investorLinks) {
      expect(l.to_slug).toBe('deals/seed');
      expect(l.from_slug).toMatch(/^companies\/(yc|threshold)$/);
    }
  });

  it('frontmatter extraction is default OFF (back-compat)', async () => {
    // Without includeFrontmatter, fs-source no longer auto-extracts frontmatter.
    // Matches db-source behavior. User opts in with --include-frontmatter flag.
    const content = '---\ncompany: brex\ntype: person\n---\nContent.';
    const allSlugs = new Set(['people/test', 'companies/brex']);
    const links = await extractLinksFromFile(content, 'people/test.md', allSlugs);
    expect(links).toEqual([]);
  });

  it('infers link type from directory structure', async () => {
    const content = 'See [Brex](../companies/brex.md).';
    const allSlugs = new Set(['people/pedro', 'companies/brex']);
    const links = await extractLinksFromFile(content, 'people/pedro.md', allSlugs);
    expect(links[0].link_type).toBe('works_at');
  });

  it('infers deal_for type for deals -> companies', async () => {
    const content = 'See [Brex](../companies/brex.md).';
    const allSlugs = new Set(['deals/seed', 'companies/brex']);
    const links = await extractLinksFromFile(content, 'deals/seed.md', allSlugs);
    expect(links[0].link_type).toBe('deal_for');
  });
});

describe('extractTimelineFromContent', () => {
  it('extracts bullet format entries', () => {
    const content = `## Timeline\n- **2025-03-18** | Meeting — Discussed partnership`;
    const entries = extractTimelineFromContent(content, 'people/test');
    expect(entries).toHaveLength(1);
    expect(entries[0].date).toBe('2025-03-18');
    expect(entries[0].source).toBe('Meeting');
    expect(entries[0].summary).toBe('Discussed partnership');
  });

  it('extracts header format entries', () => {
    const content = `### 2025-03-28 — Round Closed\n\nAll docs signed. Marcus joins the board.`;
    const entries = extractTimelineFromContent(content, 'deals/seed');
    expect(entries).toHaveLength(1);
    expect(entries[0].date).toBe('2025-03-28');
    expect(entries[0].summary).toBe('Round Closed');
    expect(entries[0].detail).toContain('Marcus joins the board');
  });

  it('returns empty for no timeline content', () => {
    const content = 'Just plain text without dates.';
    expect(extractTimelineFromContent(content, 'test')).toHaveLength(0);
  });

  it('extracts multiple bullet entries', () => {
    const content = `- **2025-01-01** | Source1 — Summary1\n- **2025-02-01** | Source2 — Summary2`;
    const entries = extractTimelineFromContent(content, 'test');
    expect(entries).toHaveLength(2);
  });

  it('handles em dash and en dash in bullet format', () => {
    const content = `- **2025-03-18** | Meeting – Discussed partnership`;
    const entries = extractTimelineFromContent(content, 'test');
    expect(entries).toHaveLength(1);
  });
});

describe('walkMarkdownFiles', () => {
  it('is a function', () => {
    expect(typeof walkMarkdownFiles).toBe('function');
  });
});