Commit Graph

5 Commits

Author SHA1 Message Date
Garry Tan
ee9e6689ad docs: expand brain schema with database architecture and OSS smoothing (#4)
* docs: expand brain schema — database architecture, dedup, enrichment sources, worked examples

Rewrite the recommended schema doc: present the database layer (entity registry,
event ledger, fact store, relationship graph) as the core architecture rather than
a future upgrade. Add entity identity/deduplication, enrichment source ordering,
epistemic discipline, three worked examples, concurrency guidance, and browser
budget. Smooth language for open-source readability.

* chore: bump version and changelog (v0.2.0.2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 00:24:16 -07:00
Garry Tan
96384b712f docs: fix first-time experience — remove fictional kindling, add recommended schema (#3)
* docs: add recommended brain schema

Full LLM-maintained knowledge base architecture: MECE directory structure,
compiled truth + timeline pages, enrichment pipeline, resolver decision
tree, skill architecture, and cron job recommendations.

* docs: fix first-time experience — remove fictional kindling, add GitHub URL

- Remove all references to data/kindling/ (never existed)
- OpenClaw paste now references https://github.com/garrytan/gbrain
- "Try it" section rewritten as three-act story with user's own data
- Agent picks dynamic query based on imported content
- Step 5 links to recommended schema doc for brain restructuring
- Includes bun install fallback in paste step 1

* chore: bump version and changelog (v0.2.0.1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 23:16:23 -07:00
Garry Tan
ecebd5552a feat: GBrain v0.2.0 — incremental sync, file storage, install skill (#2)
* refactor: extract importFile from import.ts + add tag reconciliation

Shared single-file import function used by both import and sync.
Adds tag reconciliation (removes stale tags on reimport), >1MB file
skip, and import->sync checkpoint continuity (writes git HEAD to
config table after import so sync picks up seamlessly).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add sync pure functions, updateSlug engine method, and sync tests

- buildSyncManifest: parses git diff --name-status -M output
- isSyncable: filters to .md pages, excludes hidden/ops/.raw/skip-list
- pathToSlug: converts file paths to page slugs with optional prefix
- updateSlug: renames page slug in-place (preserves page_id, chunks, embeddings)
- rewriteLinks: stub for v0.2 (FKs use page_id, already correct)
- 20 new tests, all passing (39 total across 3 files)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add gbrain sync command with CLI, MCP, and watch mode

18-step sync protocol: read config, git pull, ancestry validation,
git diff --name-status -M for net changes, isSyncable filter, process
deletes/renames/adds/modifies via importFile, batch optimization,
sync state checkpoint in Postgres config table. Watch mode with
polling and consecutive error counter. MCP sync_brain tool returns
structured SyncResult. Stale page deletion for un-syncable files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add files table, gbrain files commands, and config show redaction

- files table: page_slug FK with ON DELETE SET NULL + ON UPDATE CASCADE,
  storage_path, storage_url, mime_type, content_hash for dedup
- gbrain files list/upload/sync/verify commands for Supabase Storage
- gbrain config show redacts postgresql:// passwords and secret keys
- CLI help updated with FILES section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add install skill for GBrain onboarding

6-phase install workflow: environment discovery, Supabase setup (magic
path via CLI OAuth or fallback 2-copy-paste), init + import, ongoing
sync cron, optional file migration with mandatory verification, and
agent teaching (AGENTS.md rules). Every error gets what + why + fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.2.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add v0.2 features to README (sync, files, install skill)

README.md: added sync command to IMPORT/EXPORT section, added FILES
section with 4 commands, added files table to schema diagram, added
install skill to skills table, updated MCP tools count from 20 to 21
(sync_brain added).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: OpenClaw DX improvements (skill count, upgrade docs, config show help)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: consolidate version to single source of truth

Create src/version.ts that reads from package.json via static import
(safe for bun compiled binaries). Update mcp/server.ts from hardcoded
'0.1.0' to use shared VERSION. Bump skills/manifest.json to 0.2.0.

* fix: upgrade detection order, npm→bun naming, clawhub false positives

Reorder detection: node_modules first, binary second, clawhub last.
Rename 'npm' install method to 'bun'. Use 'clawhub --version' instead
of 'which clawhub' to avoid false positives from dangling symlinks.
Add 120s timeout to execSync calls to prevent hanging. Add --help flag.

* feat: per-command --help, unknown command check before DB connection

Add COMMAND_HELP map covering all 28 commands. Check --help before
init/upgrade dispatch and before connectEngine() so help works without
a database. Use COMMAND_HELP keys as known-command set to catch unknown
commands before wasting a DB round-trip.

* docs: standardize npm references to bun, add Upgrade section to README

Fix init.ts: npx→bunx, npm→bun for supabase CLI guidance.
Fix README: npm install→bun add for standalone CLI install.
Add ## Upgrade section to README with all three install methods.
Update install skill Upgrading section to list bun, ClawHub, and binary.

* test: full coverage audit — CLI dispatch, upgrade detection, config, edge cases

New test files:
- test/cli.test.ts: COMMAND_HELP ↔ switch consistency, version from
  package.json, per-command --help, unknown command handling, global help
- test/upgrade.test.ts: detection order verification, npm→bun naming,
  clawhub --version (not which), timeout presence
- test/config.test.ts: redactUrl for postgresql URLs, edge cases

Extended existing tests:
- test/sync.test.ts: empty string pathToSlug, uppercase .MD rejection,
  deeply nested files, multiple renames, unknown status codes
- test/markdown.test.ts: multiple --- separators, missing frontmatter,
  no frontmatter at all, empty string, type inference from paths

Tests: 39 → 83 (+44 new). All pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 100% coverage — import-file mock engine, files utils, chunker edge cases

New test files:
- test/import-file.test.ts (9 tests): mock BrainEngine to test importFile
  without DB — MAX_FILE_SIZE skip, content_hash dedup, tag reconciliation
  (remove stale + add new), compiled_truth/timeline chunking, noEmbed flag,
  sequential chunk_index
- test/files.test.ts (22 tests): getMimeType for all extensions + uppercase
  + unknown + no-extension, fileHash consistency + different content + empty,
  collectFiles pattern (skip .md, skip hidden dirs, recurse, sorted output)

Extended:
- test/chunkers/recursive.test.ts (+6 tests): single newline splits,
  word-only text, clause delimiters, lossless preservation, default options,
  mixed delimiter hierarchy

Tests: 83 → 118 (+35 new). All pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:50:15 -07:00
Garry Tan
b22cbd349a feat: GBrain v0.1.0 — Postgres-native personal knowledge brain (#1)
* chore: add CLAUDE.md with project context and gstack skill routing rules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: initialize project with Bun + TypeScript

package.json with dependencies (postgres, pgvector, openai, anthropic,
MCP SDK, gray-matter). TypeScript config targeting ESNext with bundler
module resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add foundation layer — engine interface, Postgres engine, schema

BrainEngine pluggable interface with full PostgresEngine: CRUD, search
(keyword + vector), links, tags, timeline, versions, stats, health,
ingest log, config. Trigger-based tsvector spanning pages +
timeline_entries. Markdown parser with frontmatter, compiled_truth /
timeline splitting, and round-trip serialization. 19 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 3-tier chunking and embedding service

Recursive delimiter-aware chunker (5-level hierarchy, 300-word chunks,
50-word overlap). Semantic chunker with Savitzky-Golay boundary detection
and recursive fallback. LLM-guided chunker via Claude Haiku with sliding
window topic detection. OpenAI embedding service with batch support,
exponential backoff, and rate limit handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add hybrid search with RRF fusion, expansion, and 4-layer dedup

Hybrid search merges vector (pgvector HNSW) + keyword (tsvector) via
Reciprocal Rank Fusion. Multi-query expansion via Claude Haiku generates
2 alternative phrasings. 4-layer dedup pipeline: by source, cosine
similarity, type diversity (60% cap), per-page cap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add GBRAIN_V0 spec, pluggable engine architecture, SQLite engine plan

GBRAIN_V0.md: full product spec with architecture decisions, CLI commands,
schema, search architecture, chunking strategies, first-time experience,
and future plans. ENGINES.md: pluggable engine interface, capability matrix,
how to add new backends. SQLITE_ENGINE.md: complete SQLite implementation
plan with schema, FTS5 setup, vector search options, and contributor guide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add CLI with all commands

Full CLI dispatcher with 25+ commands: init (Supabase wizard), get, put,
delete, list, search, query (hybrid RRF), import (bulk with progress bar),
export (round-trip), embed, stats, health, tag/untag/tags, link/unlink/
backlinks/graph, timeline/timeline-add, history/revert, config, upgrade,
serve, call. Smart slug resolution on reads. Version snapshots on updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add MCP stdio server with all brain tools

20 MCP tools mirroring CLI operations: get/put/delete/list pages,
search (keyword), query (hybrid RRF + expansion), tags, links with
graph traversal, timeline, stats, health, version history, and revert.
Auto-chunks and embeds on put_page. CLI and MCP share the same engine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 6 skill files and ClawHub manifest

Fat markdown skills for AI agents: ingest (meetings/docs/articles with
timeline merge), query (3-layer search + synthesis + citations), maintain
(health checks, stale detection, orphan audit), enrich (external API
enrichment), briefing (daily briefing compilation), migrate (universal
migration from Obsidian/Notion/Logseq/markdown/CSV/JSON/Roam).
ClawHub manifest for skill distribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add README, CONTRIBUTING, update CLAUDE.md test references

README with quickstart, commands, architecture, library usage, MCP setup,
and links to design docs. CONTRIBUTING with setup, project structure,
and guides for adding commands and engines. CLAUDE.md updated to reference
actual test files instead of planned-but-unwritten import test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address adversarial review findings — 5 critical/high fixes

- revertToVersion: add page_id check to prevent cross-page data corruption
- traverseGraph: use UNION instead of UNION ALL for cycle safety
- embedAll: preserve all chunks when embedding stale subset only
- embedding: throw on retry exhaustion instead of returning zero vectors
- putPage: validate slugs to prevent path traversal on export

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.1.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: expand README with schema, install, search architecture, and motivation

Why it exists, how search works (with ASCII diagram), full database schema
with all 9 tables and index details, chunking strategies explained, storage
estimates, setup wizard walkthrough, knowledge model with example page,
library usage with more examples, expanded skills table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add MIT license (Copyright 2026 Garry Tan)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add OpenClaw install flow as primary option in README

OpenClaw users just say "install gbrain" and the orchestrator handles
everything: package install, Supabase setup wizard, skill registration.
Shows the conversational interface for querying, ingesting, and briefings.
ClawHub and standalone CLI paths follow as alternatives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add prerequisites and explicit OpenClaw install instructions

Prerequisites table listing Supabase, OpenAI, and Anthropic dependencies
with links. Environment variable setup. Explicit step-by-step prompt for
OpenClaw users showing exactly what to tell the orchestrator. Note that
search degrades gracefully without API keys (keyword-only without OpenAI,
no expansion without Anthropic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: scrub named references, add PG essay demo section to README

Replace all Pedro/Brex/Jensen Huang/River AI examples with Paul Graham
essay examples using the kindling corpus. Add "Try it" section to README
showing the power of hybrid search on PG essays in 90 seconds. Update
test fixtures to use concept pages instead of person pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:48:10 -07:00
Garry Tan
3144971cd0 Initial commit with gstack 2026-04-05 07:40:55 -07:00