* refactor: extract importFile from import.ts + add tag reconciliation

  Shared single-file import function used by both import and sync. Adds tag
  reconciliation (removes stale tags on reimport), >1MB file skip, and
  import->sync checkpoint continuity (writes git HEAD to config table after
  import so sync picks up seamlessly).

* feat: add sync pure functions, updateSlug engine method, and sync tests

  - buildSyncManifest: parses git diff --name-status -M output
  - isSyncable: filters to .md pages, excludes hidden/ops/.raw/skip-list
  - pathToSlug: converts file paths to page slugs with optional prefix
  - updateSlug: renames page slug in-place (preserves page_id, chunks, embeddings)
  - rewriteLinks: stub for v0.2 (FKs use page_id, already correct)
  - 20 new tests, all passing (39 total across 3 files)

* feat: add gbrain sync command with CLI, MCP, and watch mode

  18-step sync protocol: read config, git pull, ancestry validation,
  git diff --name-status -M for net changes, isSyncable filter, process
  deletes/renames/adds/modifies via importFile, batch optimization, sync
  state checkpoint in Postgres config table. Watch mode with polling and
  consecutive error counter. MCP sync_brain tool returns structured
  SyncResult. Stale page deletion for un-syncable files.

* feat: add files table, gbrain files commands, and config show redaction

  - files table: page_slug FK with ON DELETE SET NULL + ON UPDATE CASCADE,
    storage_path, storage_url, mime_type, content_hash for dedup
  - gbrain files list/upload/sync/verify commands for Supabase Storage
  - gbrain config show redacts postgresql:// passwords and secret keys
  - CLI help updated with FILES section

* feat: add install skill for GBrain onboarding

  6-phase install workflow: environment discovery, Supabase setup (magic
  path via CLI OAuth or fallback 2-copy-paste), init + import, ongoing sync
  cron, optional file migration with mandatory verification, and agent
  teaching (AGENTS.md rules). Every error gets what + why + fix.

* docs: update project documentation for v0.2.0

* docs: add v0.2 features to README (sync, files, install skill)

  README.md: added sync command to IMPORT/EXPORT section, added FILES
  section with 4 commands, added files table to schema diagram, added
  install skill to skills table, updated MCP tools count from 20 to 21
  (sync_brain added).

* fix: OpenClaw DX improvements (skill count, upgrade docs, config show help)

* refactor: consolidate version to single source of truth

  Create src/version.ts that reads from package.json via static import
  (safe for bun compiled binaries). Update mcp/server.ts from hardcoded
  '0.1.0' to use shared VERSION. Bump skills/manifest.json to 0.2.0.

* fix: upgrade detection order, npm→bun naming, clawhub false positives

  Reorder detection: node_modules first, binary second, clawhub last.
  Rename 'npm' install method to 'bun'. Use 'clawhub --version' instead of
  'which clawhub' to avoid false positives from dangling symlinks. Add 120s
  timeout to execSync calls to prevent hanging. Add --help flag.

* feat: per-command --help, unknown command check before DB connection

  Add COMMAND_HELP map covering all 28 commands. Check --help before
  init/upgrade dispatch and before connectEngine() so help works without a
  database. Use COMMAND_HELP keys as known-command set to catch unknown
  commands before wasting a DB round-trip.

* docs: standardize npm references to bun, add Upgrade section to README

  Fix init.ts: npx→bunx, npm→bun for supabase CLI guidance. Fix README:
  npm install→bun add for standalone CLI install. Add ## Upgrade section to
  README with all three install methods. Update install skill Upgrading
  section to list bun, ClawHub, and binary.

* test: full coverage audit (CLI dispatch, upgrade detection, config, edge cases)

  New test files:
  - test/cli.test.ts: COMMAND_HELP ↔ switch consistency, version from
    package.json, per-command --help, unknown command handling, global help
  - test/upgrade.test.ts: detection order verification, npm→bun naming,
    clawhub --version (not which), timeout presence
  - test/config.test.ts: redactUrl for postgresql URLs, edge cases

  Extended existing tests:
  - test/sync.test.ts: empty string pathToSlug, uppercase .MD rejection,
    deeply nested files, multiple renames, unknown status codes
  - test/markdown.test.ts: multiple --- separators, missing frontmatter,
    no frontmatter at all, empty string, type inference from paths

  Tests: 39 → 83 (+44 new). All pass.

* test: 100% coverage (import-file mock engine, files utils, chunker edge cases)

  New test files:
  - test/import-file.test.ts (9 tests): mock BrainEngine to test importFile
    without DB: MAX_FILE_SIZE skip, content_hash dedup, tag reconciliation
    (remove stale + add new), compiled_truth/timeline chunking, noEmbed
    flag, sequential chunk_index
  - test/files.test.ts (22 tests): getMimeType for all extensions +
    uppercase + unknown + no-extension, fileHash consistency + different
    content + empty, collectFiles pattern (skip .md, skip hidden dirs,
    recurse, sorted output)

  Extended:
  - test/chunkers/recursive.test.ts (+6 tests): single newline splits,
    word-only text, clause delimiters, lossless preservation, default
    options, mixed delimiter hierarchy

  Tests: 83 → 118 (+35 new). All pass.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
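The two path helpers named in the commit message (isSyncable and pathToSlug) can be illustrated with a rough sketch. This is a hypothetical illustration, not the project's code: the `Sketch` suffixes, the contents of `SKIP_LIST_SKETCH`, the reading of "ops" as a top-level directory, and the prefix-joining rule are all assumptions inferred only from the one-line descriptions above.

```typescript
// Hypothetical sketch; the real isSyncable/pathToSlug may apply different rules.
const SKIP_LIST_SKETCH = new Set<string>(['README.md']); // assumed contents

function isSyncableSketch(path: string): boolean {
  // Case-sensitive extension check, so an uppercase '.MD' file is rejected.
  if (!path.endsWith('.md')) return false;
  const segments = path.split('/');
  // Any dot-prefixed segment counts as hidden; this also covers '.raw'.
  if (segments.some((segment) => segment.startsWith('.'))) return false;
  // Assumption: "ops" refers to a top-level directory that is never synced.
  if (segments[0] === 'ops') return false;
  const fileName = path.slice(path.lastIndexOf('/') + 1);
  return !SKIP_LIST_SKETCH.has(fileName);
}

function pathToSlugSketch(path: string, prefix: string = ''): string {
  // Drop the trailing '.md' and optionally prepend a prefix segment.
  const slug = path.replace(/\.md$/, '');
  return prefix ? `${prefix}/${slug}` : slug;
}
```

Under these assumptions, `pathToSlugSketch('concepts/thing.md', 'brain')` yields `brain/concepts/thing`, and `isSyncableSketch('concepts/thing.MD')` is `false`.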
203 lines
5.5 KiB
TypeScript
import { describe, test, expect } from 'bun:test';
import { parseMarkdown, serializeMarkdown, splitBody } from '../src/core/markdown.ts';

describe('Markdown Parser', () => {
  test('parses frontmatter + compiled_truth + timeline', () => {
    const md = `---
type: concept
title: Do Things That Don't Scale
tags: [startups, growth]
---

Paul Graham argues that startups should do unscalable things early on.

---

- 2013-07-01: Published on paulgraham.com
- 2024-11-15: Referenced in batch kickoff talk
`;
    const parsed = parseMarkdown(md);
    expect(parsed.type).toBe('concept');
    expect(parsed.title).toBe("Do Things That Don't Scale");
    expect(parsed.tags).toEqual(['startups', 'growth']);
    expect(parsed.compiled_truth).toContain('unscalable things');
    expect(parsed.timeline).toContain('Published on paulgraham.com');
    expect(parsed.timeline).toContain('batch kickoff talk');
  });

  test('handles no timeline separator', () => {
    const md = `---
type: concept
title: Superlinear Returns
---

Returns in many fields are superlinear.
Performance compounds over time.
`;
    const parsed = parseMarkdown(md);
    expect(parsed.compiled_truth).toContain('superlinear');
    expect(parsed.timeline).toBe('');
  });

  test('handles empty body', () => {
    const md = `---
type: concept
title: Empty Page
---
`;
    const parsed = parseMarkdown(md);
    expect(parsed.compiled_truth).toBe('');
    expect(parsed.timeline).toBe('');
  });

  test('removes type, title, tags from frontmatter object', () => {
    const md = `---
type: concept
title: Test
tags: [a, b]
custom_field: hello
---

Content
`;
    const parsed = parseMarkdown(md);
    expect(parsed.frontmatter).not.toHaveProperty('type');
    expect(parsed.frontmatter).not.toHaveProperty('title');
    expect(parsed.frontmatter).not.toHaveProperty('tags');
    expect(parsed.frontmatter).toHaveProperty('custom_field', 'hello');
  });

  test('infers type from file path', () => {
    const md = `---
title: Someone
---
Content
`;
    const parsed = parseMarkdown(md, 'people/someone.md');
    expect(parsed.type).toBe('person');
  });

  test('infers slug from file path', () => {
    const md = `---
type: concept
title: Test
---
Content
`;
    const parsed = parseMarkdown(md, 'concepts/do-things-that-dont-scale.md');
    expect(parsed.slug).toBe('concepts/do-things-that-dont-scale');
  });
});

describe('splitBody', () => {
  test('splits at first standalone ---', () => {
    const body = 'Above the line\n\n---\n\nBelow the line';
    const { compiled_truth, timeline } = splitBody(body);
    expect(compiled_truth).toContain('Above the line');
    expect(timeline).toContain('Below the line');
  });

  test('returns all as compiled_truth if no separator', () => {
    const body = 'Just some content\nWith multiple lines';
    const { compiled_truth, timeline } = splitBody(body);
    expect(compiled_truth).toBe(body);
    expect(timeline).toBe('');
  });

  test('handles --- at end of content', () => {
    const body = 'Content here\n\n---\n';
    const { compiled_truth, timeline } = splitBody(body);
    expect(compiled_truth).toContain('Content here');
    expect(timeline.trim()).toBe('');
  });
});

describe('serializeMarkdown', () => {
  test('round-trips through parse and serialize', () => {
    const original = `---
type: concept
title: Do Things That Don't Scale
tags:
  - startups
  - growth
custom: value
---

Paul Graham argues that startups should do unscalable things early on.

---

- 2013-07-01: Published on paulgraham.com
`;
    const parsed = parseMarkdown(original);
    const serialized = serializeMarkdown(
      parsed.frontmatter,
      parsed.compiled_truth,
      parsed.timeline,
      { type: parsed.type, title: parsed.title, tags: parsed.tags },
    );

    // Re-parse the serialized version
    const reparsed = parseMarkdown(serialized);
    expect(reparsed.type).toBe(parsed.type);
    expect(reparsed.title).toBe(parsed.title);
    expect(reparsed.compiled_truth).toBe(parsed.compiled_truth);
    expect(reparsed.timeline).toBe(parsed.timeline);
    expect(reparsed.frontmatter.custom).toBe('value');
  });
});

describe('parseMarkdown edge cases', () => {
  test('handles content with multiple --- separators', () => {
    const md = `---
type: concept
title: Test
---

First section.

---

Timeline part 1.

---

More timeline.`;
    const parsed = parseMarkdown(md);
    // Only splits at the FIRST standalone ---
    expect(parsed.compiled_truth.trim()).toBe('First section.');
    expect(parsed.timeline).toContain('Timeline part 1.');
    expect(parsed.timeline).toContain('More timeline.');
  });

  test('handles frontmatter without type or title', () => {
    const md = `---
custom_field: hello
---

Some content.`;
    const parsed = parseMarkdown(md);
    expect(parsed.type).toBeTruthy(); // should have a default
    expect(parsed.compiled_truth.trim()).toBe('Some content.');
    expect(parsed.frontmatter.custom_field).toBe('hello');
  });

  test('handles content with no frontmatter at all', () => {
    const md = `Just plain text with no YAML.`;
    const parsed = parseMarkdown(md);
    expect(parsed.compiled_truth).toContain('Just plain text');
  });

  test('handles empty string', () => {
    const parsed = parseMarkdown('');
    expect(parsed.compiled_truth).toBe('');
    expect(parsed.timeline).toBe('');
  });

  test('infers type from various directory paths', () => {
    expect(parseMarkdown('', 'people/someone.md').type).toBe('person');
    expect(parseMarkdown('', 'concepts/thing.md').type).toBe('concept');
    expect(parseMarkdown('', 'companies/acme.md').type).toBe('company');
  });
});
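
The `splitBody` tests above pin down three behaviours: the body is split at the first standalone `---` line, a body without a separator is returned unchanged as `compiled_truth` with an empty `timeline`, and a trailing `---` leaves the `timeline` empty after trimming. A minimal sketch consistent with those assertions might look like the following. It assumes a simple line-based scan; `splitBodySketch` and `SplitResultSketch` are hypothetical names, and the actual implementation in `src/core/markdown.ts` may differ (for example in how it trims whitespace).

```typescript
// Hypothetical sketch of the behaviour the splitBody tests describe.
interface SplitResultSketch {
  compiled_truth: string;
  timeline: string;
}

function splitBodySketch(body: string): SplitResultSketch {
  const lines = body.split('\n');
  // A "standalone" separator is assumed to be a line containing only '---'.
  const separatorIndex = lines.findIndex((line) => line.trim() === '---');
  if (separatorIndex === -1) {
    // No separator: the whole body is compiled truth, returned unchanged.
    return { compiled_truth: body, timeline: '' };
  }
  // Split at the FIRST separator only; later '---' lines stay in the timeline.
  return {
    compiled_truth: lines.slice(0, separatorIndex).join('\n'),
    timeline: lines.slice(separatorIndex + 1).join('\n'),
  };
}
```

Because only the first separator is consumed, a body with several `---` lines keeps the later ones inside `timeline`, which matches the "multiple --- separators" edge case tested above.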