* fix: widen validateSlug to accept any filename characters Git is the system of record. Slugs are lowercased repo-relative paths. The restrictive regex rejected spaces, parens, and special chars, blocking 5,861 Apple Notes files from importing. Now only rejects empty slugs, path traversal (..), and leading slash. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: enable RLS on all tables with BYPASSRLS safety check Without RLS, the Supabase anon key gives full read access to the DB. Enable RLS on all 10 tables with no policies — the postgres role (used by gbrain via pooler) has BYPASSRLS and is unaffected. Only enables if the current role actually has BYPASSRLS privilege to avoid locking ourselves out on non-Supabase setups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: import resilience — 5MB limit, error suppression, structured progress Raise MAX_FILE_SIZE from 1MB to 5MB for Apple Notes with attachments. Track error patterns and suppress after 5 identical errors to prevent 5,861 identical warnings from killing the agent process. Replace \r progress bar with structured log lines (rate, ETA) for agent parsing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: init detects IPv6-only Supabase URLs, adds pgvector check Detect db.*.supabase.co direct URLs and warn about IPv6 failure. On ECONNREFUSED/ETIMEDOUT to Supabase, suggest the Session pooler connection string with exact dashboard click path. Check for pgvector extension after connecting and fail with clear instructions if missing. Update wizard hints to show pooler URL format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add pre-ship requirement for E2E tests E2E tests against real Postgres+pgvector must pass before /ship or /review. Adds the requirement to CLAUDE.md so all agents enforce it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: parallel import with per-worker engine instances Refactor PostgresEngine to support instance-level DB connections instead of only the module-global singleton. Each worker gets its own connection with poolSize:2 (vs 10 for the main engine), so 8 workers = 16 connections. Add --workers N flag to gbrain import. Workers pull from a shared queue and use independent engine instances — no transaction context corruption. The bottleneck is network round-trips to Supabase (one per page upsert). Parallel workers cut import time proportionally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: automatic schema migration runner Migrations are embedded as string constants in migrate.ts (survives Bun --compile). Each migration runs in a transaction for clean rollback on failure. Runs automatically on initSchema() — no manual step needed when a user updates the gbrain binary against an older DB. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: pluggable storage backend (S3 + Supabase Storage + local) Add StorageBackend interface with three implementations: - S3Storage: works with AWS S3, Cloudflare R2, MinIO (any S3-compatible) - SupabaseStorage: uses Supabase Storage REST API with service role key - LocalStorage: filesystem-based, for testing Add file-resolver.ts with fallback chain: local file → .redirect breadcrumb → .supabase marker → storage backend. Supports the three-stage migration (mirror → redirect → clean). Add yaml-lite.ts for parsing marker and breadcrumb files without adding a YAML dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: gbrain doctor command — health checks with --json output Checks: connection, pgvector extension, RLS on all tables, schema version, embedding coverage. Outputs structured JSON with --json flag for agent parsing. Exit code 0 if healthy, 1 if issues found. Agents should run gbrain doctor --json when any command fails. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite setup skill + README for agent-first DX Setup skill: add Why Supabase, step-by-step project creation, explicit agent instructions (nohup for large imports, doctor on failure, don't ask for anon key), available init flags, file migration offer after first import. Remove ClawHub references. README: simplify to single OpenClaw install path, remove ClawHub, fix squatted npm name to github:garrytan/gbrain, add Supabase settings note about Session pooler. Add Apple Notes test fixtures with spaces and parens in filenames for E2E testing of the slug fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add RLS verification, schema health, and nohup hints to maintain skill Maintenance skill now checks RLS status and schema version as part of periodic health checks. Adds nohup pattern for large embedding refreshes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: import resume checkpoint + Supabase smart URL parsing Import resume: saves checkpoint every 100 files to ~/.gbrain/import-checkpoint.json. On restart with same directory and file count, skips already-processed files. Use --fresh to ignore checkpoint and start over. Cleared on successful completion. Supabase admin: extractProjectRef() parses any Supabase URL format (dashboard, direct, pooler, project URL) to extract the project ref. discoverPoolerUrl() uses the Management API to find the correct pooler connection string (including the exact region prefix). checkRls() verifies RLS status via the API. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add 56 unit tests for all new code 8 new test files covering every feature added in this branch: - slug-validation.test.ts: spaces, parens, unicode, path traversal (10 tests) - yaml-lite.test.ts: parse + stringify, marker/redirect formats (9 tests) - supabase-admin.test.ts: extractProjectRef for 4 URL formats (7 tests) - migrate.test.ts: version export, runMigrations callable (2 tests) - storage.test.ts: LocalStorage CRUD + createStorage factory (14 tests) - file-resolver.test.ts: fallback chain, redirect, marker parsing (6 tests) - import-resume.test.ts: checkpoint save/load/resume/fresh (6 tests) - doctor.test.ts: module export, CLI registration (3 tests) Total: 184 pass, 0 fail (up from 128). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: bulk chunk INSERT + E2E tests for all new features Bulk INSERT: upsertChunks now builds a multi-row VALUES query instead of inserting chunks one-by-one. Reduces DB round-trips by ~50x per page. E2E tests added to mechanical.test.ts: - Slug with special chars: import Apple Notes fixtures with spaces/parens, verify search finds them, verify idempotency - RLS verification: check pg_tables.rowsecurity on all tables, verify current user has BYPASSRLS - Doctor command: verify exit 0 on healthy DB, --json produces valid JSON with check structure - Parallel import: --workers 2 produces same page count as sequential Unit tests added: - setup-branching.test.ts: IPv6 detection, defaultWorkers auto-tuning, smart URL parsing across all Supabase URL formats Fixtures added: - large/big-file.md (2.1MB) for testing raised file size limit - apple-notes/ fixtures already existed Total: 200 pass, 0 fail (up from 184). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: --json on init/import, file migration CLI, lifecycle tests --json flag: init and import now support --json for structured output. Agents get parseable JSON instead of human-readable text. File migration CLI: implement mirror, unmirror, redirect, restore, clean, and status subcommands for the three-stage file migration lifecycle (local → mirrored → redirected → cloud-only). File migration tests: full lifecycle test covering every transition in the state machine (LOCAL → MIRROR → UNMIRROR → REDIRECT → RESTORE → CLEAN), including edge cases and file resolver at each stage. Bulk chunk INSERT: upsertChunks now builds multi-row parameterized VALUES query, reducing round-trips per page from ~50 to 1. Total: 207 pass, 0 fail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: thorough E2E tests for parallel import concurrency Replace the weak single-comparison parallel import test with 7 tests: - Sequential baseline: capture page count, chunk count, and all slugs - --workers 2: verify page count matches sequential - Chunk count matches (no duplicates from concurrent writes) - Page slugs match exactly - No duplicate pages (SQL GROUP BY HAVING count > 1) - No duplicate chunks (SQL GROUP BY page_id, chunk_index) - --workers 4: also works correctly - Re-import with workers is idempotent These tests catch the exact bug Codex found (db.ts singleton causing concurrent transaction corruption) by verifying data integrity after parallel writes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add batch embedding queue as P1 TODO Deferred during eng review (per-worker embedding is good enough for now). Revisit after profiling real imports to confirm embedding is the bottleneck. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: E2E test failures — fixture counts, arg parsing, doctor exit code Fix fixture count assertions: 13 → 16 pages (added apple-notes + large file), companies 2 → 3 (ohmygreen), concepts 3 → 5 (notes, big-file). Fix --workers arg parsing: the worker count value (e.g. "2") was being picked up as the directory arg. Skip flag values when finding the dir. Fix doctor exit code: warnings (like missing embeddings) should exit 0, only actual failures exit 1. E2E tests import with --no-embed, so embeddings are always WARN. Fix E2E CLI tests: add initCli() before doctor and parallel import tests so ~/.gbrain/config.json exists for the subprocess. All E2E tests pass: 63 pass, 0 fail. All unit tests pass: 207 pass, 0 fail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.4.0 New CHANGELOG entry for all post-0.3.0 features (doctor, storage backends, parallel import, resume checkpoints, RLS, schema migrations, --json output). Version bumped 0.3.0 → 0.4.0 across all manifests. CLAUDE.md: test count 9→19, skill count 8→7, added key files. CONTRIBUTING.md: fixture count 13→16, added missing source files. README.md: added gbrain doctor to commands, fixed stale welcome PRs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add GBRAIN_SKILLPACK.md reference architecture Production agent patterns from a real deployment with 14,700+ brain files. Covers: entity detection on every message, brain-first lookup protocol, 7-step enrichment pipeline with tiered API spend, compiled truth + timeline, source attribution with mandatory citations, meeting ingestion with entity propagation, cron schedule with quiet hours and travel-aware timezone, YouTube/media ingestion via Diarize.io, integration guides for ClawVisor, Circleback webhooks, and Quo/OpenPhone SMS. Opens with the Vannevar Bush memex framing and the originals folder for capturing intellectual capital. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: rewrite README opener with memex pitch and production architecture Replace code-first opener with mimetic-desire pitch: Vannevar Bush memex tagline, production brain numbers (10K+ files, 3K+ people, 13 years of calendar), "ask it anything" examples, compounding thesis. New sections: The Compounding Thesis (read-write loop), Architecture (three-column diagram), What a Production Agent Looks Like (SKILLPACK reference), How gbrain fits with OpenClaw (three-layer complement). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update skills with brain-first lookup, entity detection, heartbeat setup: Phase D rewritten with brain-first lookup protocol (gbrain search → query → get → grep fallback), sync-after-write rule, memory_search complement table. query: token-budget awareness (chunks not full pages), source precedence hierarchy (user > compiled truth > timeline > external). ingest: entity detection on every message (scan, check brain, create or enrich, commit and sync). maintain: heartbeat integration (doctor, embed --stale, sync verification, stale compiled truth detection). briefing: gbrain-native context loading (search attendees before meetings, search sender before email, daily deal/meeting/commitment queries). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add OpenClaw positioning to README opener Make it clear up top that GBrain is built for OpenClaw agents and works with any OpenClaw deployment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: credit Karpathy's Knowledge LLM vision, add origin story GBrain started as Karpathy's LLM wiki idea built for real. Worked great until the brain hit thousands of files and grep fell apart. GBrain is the search layer that had to exist once the brain outgrew grep. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
37 lines
1.3 KiB
TypeScript
37 lines
1.3 KiB
TypeScript
import { describe, test, expect } from 'bun:test';
|
|
import { extractProjectRef } from '../src/core/supabase-admin.ts';
|
|
|
|
describe('extractProjectRef', () => {
|
|
test('extracts from dashboard URL', () => {
|
|
expect(extractProjectRef('https://supabase.com/dashboard/project/rqfedtbsqoxrobdwfrsk/settings/database'))
|
|
.toBe('rqfedtbsqoxrobdwfrsk');
|
|
});
|
|
|
|
test('extracts from direct connection URL', () => {
|
|
expect(extractProjectRef('postgresql://postgres:password@db.rqfedtbsqoxrobdwfrsk.supabase.co:5432/postgres'))
|
|
.toBe('rqfedtbsqoxrobdwfrsk');
|
|
});
|
|
|
|
test('extracts from pooler URL', () => {
|
|
expect(extractProjectRef('postgresql://postgres.rqfedtbsqoxrobdwfrsk:password@aws-0-us-east-1.pooler.supabase.com:6543/postgres'))
|
|
.toBe('rqfedtbsqoxrobdwfrsk');
|
|
});
|
|
|
|
test('extracts from project URL', () => {
|
|
expect(extractProjectRef('https://rqfedtbsqoxrobdwfrsk.supabase.co'))
|
|
.toBe('rqfedtbsqoxrobdwfrsk');
|
|
});
|
|
|
|
test('returns null for non-supabase URL', () => {
|
|
expect(extractProjectRef('postgresql://user:pass@localhost:5432/mydb')).toBeNull();
|
|
});
|
|
|
|
test('returns null for empty string', () => {
|
|
expect(extractProjectRef('')).toBeNull();
|
|
});
|
|
|
|
test('returns null for random text', () => {
|
|
expect(extractProjectRef('hello world')).toBeNull();
|
|
});
|
|
});
|