Files
gbrain/skills/migrate/SKILL.md
Garry Tan a86f995883 feat: GBrain v0.3.0 — contract-first architecture + ClawHub plugin (#7)
* feat: contract-first operations.ts with OperationError, dry_run, importFromContent

30 shared operations as single source of truth for CLI and MCP.
- OperationError with typed error codes (page_not_found, invalid_params, etc.)
- dry_run support on all mutating operations
- importFromContent split from importFile with transaction wrapping
- Idempotency hash now includes ALL fields (title, type, frontmatter, tags)
- Config env var fallback: GBRAIN_DATABASE_URL > DATABASE_URL > config file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rewrite MCP server + CLI + tools-json from operations

server.ts: 233 -> ~80 lines. Tool definitions and dispatch generated from operations[].
cli.ts: shared operations auto-registered, CLI-only commands kept as manual dispatch.
tools-json: generated FROM operations[], eliminating the third contract surface.
Parity test verifies structural contract between operations, CLI, and MCP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: delete 12 command files migrated to operations.ts

Handler logic for get, put, delete, list, search, query, health, stats,
tags, link, timeline, and version now lives in operations.ts.
Kept: init, upgrade, import, export, files, embed, sync, serve, call, config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: init --non-interactive, upgrade verification, schema migration

- gbrain init --non-interactive --url <url> for plugin mode (no TTY required)
- Post-upgrade version verification in gbrain upgrade
- Drop storage_url from files table (storage_path is the only identifier)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: tool-agnostic skills + new setup skill

All 7 skills rewritten with intent-based language instead of CLI commands.
Works with both CLI and MCP plugin contexts.
New setup skill replaces install: auto-provision Supabase via CLI,
AGENTS.md injection, target TTHW < 2 min.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ClawHub bundle plugin, CI workflows, v0.3.0

- openclaw.plugin.json with configSchema, MCP server config, skill listing
- GitHub Actions: test on push/PR, multi-platform release (macOS arm64 + Linux x64)
- Version bump 0.3.0, CHANGELOG, README ClawHub section, CLAUDE.md updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: idempotency hash mismatch + MCP dry_run passthrough

importFromContent now passes its all-fields hash through putPage via
content_hash on PageInput, so the stored hash matches the computed hash.
Previously the skip-if-unchanged check never fired because the hash
formulas differed.

MCP server now passes dry_run from tool params to OperationContext.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.3.0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: schema loader handles PL/pgSQL $$ blocks

Delete the semicolon-based SQL splitter in db.ts which broke on
PL/pgSQL trigger functions containing semicolons inside $$ delimiter
blocks. Use single conn.unsafe(schemaSql) call instead — the postgres
driver handles multi-statement SQL natively. schema.sql already uses
IF NOT EXISTS / CREATE OR REPLACE for idempotency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test infrastructure + realistic brain fixtures

Add test infrastructure for running E2E tests against real
Postgres+pgvector. Includes:
- test/e2e/helpers.ts: DB lifecycle, fixture import, timing, diagnostics
- 13 fixture files as a miniature realistic brain (people, companies,
  deals, meetings, concepts, projects, sources) following the
  compiled truth + timeline format from GBRAIN_RECOMMENDED_SCHEMA.md
- docker-compose.test.yml: local pgvector convenience (port 5433)
- .env.testing.example: template for test credentials
- package.json: add test:e2e script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test suites + CI workflow

Tier 1 (mechanical.test.ts): 14 test suites covering all operations
against real Postgres — page CRUD, search with quality scoring, links,
tags, timeline, versions, admin, chunks, resolution, ingest log, raw
data, files, idempotency stress, setup journey (full CLI flow), init
edge cases, schema idempotency, schema diff guard, performance baselines.

Tier 1 (mcp.test.ts): MCP protocol test — spawns server, sends JSON-RPC,
verifies tools/list matches operations count.

Tier 2 (skills.test.ts): OpenClaw skill tests — ingest, query, health.
Skips gracefully when dependencies missing.

CI (.github/workflows/e2e.yml): Tier 1 on every PR (pgvector service),
Tier 2 nightly/manual with API key secrets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: E2E test fixes + traverseGraph jsonb cast

- Fix traverseGraph query: cast json_agg to jsonb_agg so SELECT DISTINCT works
- Fix put_page tests to use importFromContent with noEmbed (no OpenAI key in Tier 1)
- Fix get_health assertion (page_count not total_pages)
- Fix raw_data test to handle JSONB string/object return
- Simplify MCP test to verify tool generation directly
- Add timeouts to CLI subprocess tests
- Use port 5434 for docker-compose (5433 often in use)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update all project docs for E2E test suite

- CLAUDE.md: updated test count (9 unit + 3 E2E), added E2E test
  instructions, fixed skill count to 8
- CONTRIBUTING.md: updated project structure with test/e2e/, added E2E
  test instructions, rewrote "Adding a new command" to reflect
  contract-first architecture (add to operations.ts, done)
- README.md: fixed table count (10 not 9), added recommended schema doc
  to Docs section, added E2E instructions to Contributing section
- CHANGELOG.md: added E2E test suite, docker-compose, schema loader fix,
  and traverseGraph jsonb fix to v0.3.0 entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:26:11 -10:00

2.8 KiB

Migrate Skill

Universal migration from any wiki, note tool, or brain system into GBrain.

Supported Sources

Source Format Strategy
Obsidian Markdown + [[wikilinks]] Direct import, convert wikilinks to gbrain links
Notion Exported markdown or CSV Parse Notion's export structure
Logseq Markdown with ((block refs)) Convert block refs to page links
Plain markdown Any .md directory Import directory into gbrain directly
CSV Tabular data Map columns to frontmatter fields
JSON Structured data Map keys to page fields
Roam JSON export Convert block structure to pages

General Workflow

  1. Assess the source. What format? How many files? What structure?
  2. Plan the mapping. How do source fields map to gbrain fields (type, title, tags, compiled_truth, timeline)?
  3. Test with a sample. Import 5-10 files, verify by reading them back from gbrain and exporting.
  4. Bulk import. Import the full directory into gbrain.
  5. Verify. Check gbrain health and statistics, spot-check pages.
  6. Build links. Extract cross-references from content and create typed links in gbrain.

Obsidian Migration

  1. Import the vault directory into gbrain (Obsidian vaults are markdown directories)
  2. Convert [[wikilinks]] to gbrain links:
    • Read each page from gbrain
    • For each [[Name]] found, resolve to a slug and create a link in gbrain
    • [[Name|alias]] uses the alias for context

Obsidian-specific:

  • Tags (#tag) become gbrain tags
  • Frontmatter properties map to gbrain frontmatter
  • Attachments (images, PDFs) are noted but handled separately via file storage

Notion Migration

  1. Export from Notion: Settings > Export > Markdown & CSV
  2. Notion exports nested directories with UUIDs in filenames
  3. Strip UUIDs from filenames for clean slugs
  4. Map Notion's database properties to frontmatter
  5. Import the cleaned directory into gbrain

CSV Migration

For tabular data (e.g., CRM exports, contact lists):

  1. For each row in the CSV, create a page with column values as frontmatter
  2. Use a designated column as the slug (e.g., name)
  3. Use another column as compiled_truth (e.g., notes)
  4. Store each page in gbrain

Verification

After any migration:

  1. Check gbrain statistics to verify page count matches source
  2. Check gbrain health for orphans and missing embeddings
  3. Export pages from gbrain for round-trip verification
  4. Spot-check 5-10 pages by reading them from gbrain
  5. Test search: search gbrain for "someone you know is in the data"

Tools Used

  • Store/update pages in gbrain (put_page)
  • Read pages from gbrain (get_page)
  • Link entities in gbrain (add_link)
  • Tag pages in gbrain (add_tag)
  • Get gbrain statistics (get_stats)
  • Check gbrain health (get_health)
  • Search gbrain (query)