* feat: migrate 8 existing skills to conformance format Add YAML frontmatter (name, version, description, triggers, tools, mutating), Contract, Anti-Patterns, and Output Format sections to all existing skills. Rename Workflow to Phases. Ingest becomes thin router delegating to specialized ingestion skills (Phase 2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add RESOLVER.md, conventions directory, and output rules RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md. Categorized routing table: Always-on, Brain ops, Ingestion, Thinking, Operational, Setup, Identity. Conventions directory extracts cross-cutting rules (quality, brain-first lookup, model routing, test-before-bulk). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add skills conformance and resolver validation tests skills-conformance.test.ts validates every skill has YAML frontmatter with required fields, Contract, Anti-Patterns, and Output Format sections, and manifest.json coverage. resolver.test.ts validates routing table categories, skill path existence, and manifest-to-resolver coverage. 50 new tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add 9 brain skills from Wintermute (Phase 2) Generalized from Wintermute's battle-tested skills: - signal-detector: always-on idea+entity capture on every message - brain-ops: brain-first lookup, read-enrich-write loop, source attribution - idea-ingest: links/articles/tweets with author people page mandatory - media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book) - meeting-ingestion: transcripts with attendee enrichment chaining - citation-fixer: audit and fix citation formatting - repo-architecture: filing rules by primary subject - skill-creator: create skills with conformance standard + MECE check - daily-task-manager: task lifecycle with priority levels All Garry-specific references generalized. 
Core workflows preserved. Updated RESOLVER.md and manifest.json. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add operational infrastructure + identity layer (Phase 3) Operational skills: - daily-task-prep: morning prep with calendar context and open threads - cross-modal-review: quality gate via second model with refusal routing - cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency - reports: timestamped reports with keyword routing - testing: skill validation framework (conformance checks) - soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md - webhook-transforms: external events to brain signals with dead-letter queue Identity layer: - SOUL.md template (agent identity, generated by soul-audit) - USER.md template (user profile, generated by soul-audit) - ACCESS_POLICY.md template (4-tier access control) - HEARTBEAT.md template (operational cadence) - cross-modal.yaml convention (review pairs, refusal routing chain) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates GBrain is now a GStack mod for agent platforms. Updated architecture description, key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills section (24 skills organized by resolver categories), and testing section (new conformance and resolver tests). 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add GStack detection + mod status to gbrain init (Phase 4) After brain initialization, gbrain init now reports: - Number of skills loaded (from manifest.json) - GStack detection (checks known host paths, uses gstack-global-discover if available) - GStack install instructions if not found - Resolver and soul-audit pointers Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md deployment, and detectGStack() using gstack-global-discover with fallback to known paths (DRY: doesn't reimplement GStack's host detection logic). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: v0.10.0 release documentation - CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control, conventions, conformance standard, GStack detection in init - README: updated skill section with 24 skills, resolver, conventions - TODOS: added runtime MCP access control (P1) - VERSION: 0.9.2 → 0.10.0 - package.json + manifest.json version bumped Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add skill table to CHANGELOG v0.10.0 16-row table detailing every new skill, what it does, and why it matters. Written to sell the upgrade, not document the implementation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: restore package.json version after merge conflict resolution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: zero-based README rewrite for GStackBrain v0.10.0 Lead with GStack mod identity. 24 skills table organized by category. Install block references RESOLVER.md and soul-audit. GBrain+GStack relationship explained. Removed redundancy (733 -> 406 lines). All essential content preserved: install, recipes, architecture, search, commands, engines, voice, knowledge model. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README The 30-line copy-paste install block becomes one line: "Retrieve and follow INSTALL_FOR_AGENTS.md" Benefits: agent always gets latest instructions (no stale copy-paste), README stays clean, install details live where agents read them. README now leads with what GBrain does ("gives your agent a brain") instead of GStack relationship. Removed "requires frontier model" note. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: 3 bugs in init.ts from merge conflict resolution 1. llstatSync typo (merge corruption) → lstatSync 2. __dirname undefined in ESM module → fileURLToPath polyfill 3. require('fs') in ESM → use imported readFileSync All three would crash gbrain init at runtime. Caught by /review. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add checkResolvable shared core function for resolver validation Shared function at src/core/check-resolvable.ts validates that all skills are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for always-on/router skills), finds gaps in frontmatter triggers, and scans for DRY violations. Returns structured ResolvableIssue objects with machine-parseable fix objects alongside human-readable action strings. Three call sites: bun test, gbrain doctor, skill-creator skill. Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports from production check-resolvable.ts instead of reimplementing parsing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: expand doctor with resolver validation, filesystem-first architecture Doctor now runs filesystem checks (resolver health, skill conformance) before connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only when DB is unavailable. 
Adds schema_version: 2 to JSON output, composite health score (0-100), and structured issues array with action strings for agent parsing. Resolver health check calls checkResolvable() and surfaces actionable fix instructions. Link integrity check uses engine.getHealth() dead_links count. CLI routing split: doctor dispatched before connectEngine() so filesystem checks always run. Fixes Codex-identified blocker where doctor required DB. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add adaptive load-aware throttling and fail-improve loop backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem), exponential backoff with 20-attempt max guard, active hours multiplier (2x slower during waking hours), concurrent process limit (max 2). Windows-safe: defaults to "proceed" when os.loadavg returns zeros. fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure logging. Cascade failure handling: when both paths fail, throws LLM error and logs both. Log rotation at 1000 entries. Call count tracking for deterministic hit rate metrics. Auto-generates test cases from successful LLM fallbacks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add transcription service and enrichment-as-a-service transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB segmented via ffmpeg. Provider auto-detection from env vars. Clear error messages for missing API keys and unsupported formats. enrichment-service.ts: Global enrichment service callable from any ingest pathway. Entity slug generation (people/jane-doe, companies/acme-corp), mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based on mention frequency and source diversity), batch enrichment with backoff throttling, regex-based entity extraction from text. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add data-research skill with recipe system, extraction, dedup, tracker New skill: data-research — one parameterized pipeline for any email-to- structured-data workflow (investor updates, donations, company metrics). 7-phase pipeline: define recipe, search, classify, extract (with extraction integrity rule), archive, deduplicate, update tracker. data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex extraction (battle-tested patterns), dedup with configurable tolerance, markdown tracker parsing/appending, quarterly/monthly date windowing, 6-phase HTML email stripping with 500KB ReDoS cap. Registers data-research in manifest.json (25th skill) and RESOLVER.md. Fixes backoff test robustness for high-load systems. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.10.0 infrastructure additions CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve, transcription, enrichment-service, data-research), 6 new test files, updated skill count to 25, test file count to 34. README.md: updated skill count to 25, added data-research to skills table. CHANGELOG.md: added Infrastructure section documenting resolver validation, doctor expansion, adaptive throttling, fail-improve loop, voice transcription, enrichment service, and data-research skill. TODOS.md: anonymized personal references. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: doctor.ts use ES module imports, harden backoff test Replace require('fs') with ES module import in doctor.ts for consistency with the rest of the file. Backoff test made resilient to parallel test execution leaking module-level state. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: README rewrite with production brain stats, sample output, new infrastructure Lead with the flex: 17,888 pages, 4,383 people, 723 companies, 526 meeting transcripts built in 12 days. Show sample query output so readers see what they'll get. Document self-improving infrastructure (tier auto-escalation, fail-improve loop, doctor trajectory). Add data-research recipes to Getting Data In. Update commands section with doctor --fix, transcribe, research init/list. Fix stale "24" references to "25". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: README lead with YC President origin and production agent deployments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: README lead with skill philosophy and link to Thin Harness Fat Skills Skills section now explains: skill files are code, they encode entire workflows, they call deterministic TypeScript for the parts that shouldn't be LLM judgment. Links to the tweet and the architecture essay. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: link GStack repo, add 70K stars and 30K daily users Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: remove meeting transcript count from README (sensitive) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: README lead with YC President origin and production agent deployments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: rename political-donations recipe to expense-tracker (sensitivity) Renamed the built-in data-research recipe from political-donations to expense-tracker across README, CHANGELOG, SKILL.md, and reports routing. Same extraction patterns (amounts, dates, recipients), neutral framing. Also renamed social-radar keyword route to social-mentions. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
282 lines
9.4 KiB
TypeScript
import { describe, test, expect } from 'bun:test';
import {
  validateRecipe,
  extractFields,
  verifyExtraction,
  isDuplicate,
  parseTrackerPage,
  appendToTracker,
  computeTotals,
  buildDateWindows,
  stripEmailHtml,
} from '../src/core/data-research.ts';

describe('data-research', () => {
  describe('validateRecipe', () => {
    test('valid recipe passes', () => {
      const result = validateRecipe({
        name: 'test',
        source_queries: { gmail: ['subject:test'] },
        extraction_schema: { amount: 'currency' },
        tracker_page: 'trackers/test',
        tracker_format: { group_by: 'year', columns: ['date', 'amount'] },
      });
      expect(result.valid).toBe(true);
      expect(result.errors.length).toBe(0);
    });

    test('missing name fails', () => {
      const result = validateRecipe({
        source_queries: { gmail: ['test'] },
        extraction_schema: { a: 'string' },
        tracker_page: 't',
        tracker_format: { group_by: 'y', columns: ['a'] },
      });
      expect(result.valid).toBe(false);
      expect(result.errors).toContain('Missing required field: name');
    });

    test('empty source_queries fails', () => {
      const result = validateRecipe({
        name: 'test',
        source_queries: {},
        extraction_schema: { a: 'string' },
        tracker_page: 't',
        tracker_format: { group_by: 'y', columns: ['a'] },
      });
      expect(result.valid).toBe(false);
    });

    test('missing tracker_format columns fails', () => {
      const result = validateRecipe({
        name: 'test',
        source_queries: { gmail: ['test'] },
        extraction_schema: { a: 'string' },
        tracker_page: 't',
        tracker_format: { group_by: 'y', columns: [] },
      });
      expect(result.valid).toBe(false);
    });
  });

  describe('extractFields', () => {
    test('extracts MRR from text', () => {
      const result = extractFields('Our MRR hit $188K this month', { mrr: 'currency' });
      expect(result.mrr).toBe('188K');
    });

    test('extracts ARR from text', () => {
      const result = extractFields('ARR: $2.3M', { arr: 'currency' });
      expect(result.arr).toBe('2.3M');
    });

    test('extracts growth rate', () => {
      const result = extractFields('We grew +14.7% MoM', { growth_mom: 'percentage' });
      expect(result.growth_mom).toBe('+14.7%');
    });

    test('extracts runway months', () => {
      const result = extractFields('We have 16 months of runway', { runway_months: 'number' });
      expect(result.runway_months).toBe('16');
    });

    test('extracts headcount', () => {
      const result = extractFields('Team of 23 employees', { headcount: 'number' });
      expect(result.headcount).toBe('23');
    });

    test('extracts dollar amounts', () => {
      const result = extractFields('Total Charged\n$5,900.00', { amount: 'currency' });
      expect(result.amount).toBe('5,900.00');
    });

    test('returns null for unmatched fields', () => {
      const result = extractFields('no metrics here', { mrr: 'currency', arr: 'currency' });
      expect(result.mrr).toBeNull();
      expect(result.arr).toBeNull();
    });

    test('extracts dates', () => {
      const result = extractFields('Updated on 2026-04-15', { date: 'date' });
      expect(result.date).toBe('2026-04-15');
    });
  });

  describe('verifyExtraction', () => {
    test('matching fields verify OK', () => {
      const result = verifyExtraction(
        { mrr: '188K', arr: '2.3M' },
        { mrr: '188K', arr: '2.3M' },
      );
      expect(result.verified).toBe(true);
      expect(result.mismatches.length).toBe(0);
    });

    test('mismatched fields are flagged', () => {
      const result = verifyExtraction(
        { mrr: '188K', arr: '2.3M' },
        { mrr: '200K', arr: '2.3M' },
      );
      expect(result.verified).toBe(false);
      expect(result.mismatches.length).toBe(1);
      expect(result.mismatches[0]).toContain('mrr');
    });
  });

  describe('isDuplicate', () => {
    const existing = [
      { date: '2026-04-01', recipient: 'Alice', amount: '$100.00' },
      { date: '2026-04-01', recipient: 'Bob', amount: '$200.00' },
    ];

    test('exact match is duplicate', () => {
      const result = isDuplicate(existing, { date: '2026-04-01', recipient: 'Alice', amount: '$100.00' }, ['date', 'recipient', 'amount']);
      expect(result.isDuplicate).toBe(true);
      expect(result.type).toBe('exact');
    });

    test('new entry is not duplicate', () => {
      const result = isDuplicate(existing, { date: '2026-04-02', recipient: 'Charlie', amount: '$300.00' }, ['date', 'recipient', 'amount']);
      expect(result.isDuplicate).toBe(false);
      expect(result.type).toBe('new');
    });

    test('different amount same entity+date flagged', () => {
      const result = isDuplicate(
        existing,
        { date: '2026-04-01', recipient: 'Alice', amount: '$150.00' },
        ['date', 'recipient', 'amount'],
      );
      expect(result.type).toBe('different_amount');
    });

    test('fuzzy entity matching', () => {
      const result = isDuplicate(
        existing,
        { date: '2026-04-01', recipient: 'Alice Smith', amount: '$100.00' },
        ['date', 'recipient', 'amount'],
        { entityFuzzy: true },
      );
      // Fuzzy matching compares the first 15 characters of the entity name.
      // "Alice" is only 5 characters, so it does not fuzzy-match "Alice Smith"
      // and the entry is treated as new.
      expect(result.type).toBe('new');
    });
  });

  describe('parseTrackerPage', () => {
    test('parses markdown table into entries', () => {
      const md = `| Date | Amount | Status |
|------|--------|--------|
| 2026-04-01 | $100 | Done |
| 2026-04-02 | $200 | Pending |`;
      const entries = parseTrackerPage(md, ['Date', 'Amount', 'Status']);
      expect(entries.length).toBe(2);
      expect(entries[0]['Date']).toBe('2026-04-01');
      expect(entries[1]['Amount']).toBe('$200');
    });

    test('handles empty table', () => {
      const entries = parseTrackerPage('No table here', ['a', 'b']);
      expect(entries.length).toBe(0);
    });
  });

  describe('appendToTracker', () => {
    test('appends rows to markdown', () => {
      const md = '### 2026\n\n| Date | Amount |\n|------|--------|\n| 2026-01-01 | $50 |\n';
      const result = appendToTracker(md, [{ Date: '2026-04-01', Amount: '$100' }], ['Date', 'Amount']);
      expect(result).toContain('2026-04-01');
      expect(result).toContain('$100');
    });
  });

  describe('computeTotals', () => {
    test('sums numeric columns', () => {
      const entries = [
        { amount: '$100.00', count: '5' },
        { amount: '$200.50', count: '3' },
      ];
      const totals = computeTotals(entries, ['amount', 'count']);
      expect(totals.amount).toBeCloseTo(300.50, 2);
      expect(totals.count).toBe(8);
    });

    test('handles non-numeric values', () => {
      const entries = [{ amount: 'N/A' }];
      const totals = computeTotals(entries, ['amount']);
      expect(totals.amount).toBe(0);
    });
  });

  describe('buildDateWindows', () => {
    test('quarterly windows for one year', () => {
      const windows = buildDateWindows(2026, 2026, 'quarterly');
      expect(windows.length).toBe(4);
      expect(windows[0].label).toBe('Q1 2026');
      expect(windows[3].label).toBe('Q4 2026');
    });

    test('monthly windows for one year', () => {
      const windows = buildDateWindows(2026, 2026, 'monthly');
      expect(windows.length).toBe(12);
      expect(windows[0].label).toBe('2026-01');
      expect(windows[11].label).toBe('2026-12');
    });

    test('multi-year quarterly windows', () => {
      const windows = buildDateWindows(2024, 2026, 'quarterly');
      expect(windows.length).toBe(12); // 3 years * 4 quarters
    });

    test('endYear < startYear throws', () => {
      expect(() => buildDateWindows(2026, 2024)).toThrow('endYear');
    });
  });

  describe('stripEmailHtml', () => {
    test('strips HTML tags', () => {
      const result = stripEmailHtml('<p>Hello <b>World</b></p>');
      expect(result).toContain('Hello');
      expect(result).toContain('World');
      expect(result).not.toContain('<p>');
      expect(result).not.toContain('<b>');
    });

    test('removes style blocks', () => {
      const result = stripEmailHtml('<style>.foo { color: red; }</style><p>Content</p>');
      expect(result).toContain('Content');
      expect(result).not.toContain('color');
    });

    test('removes script blocks', () => {
      const result = stripEmailHtml('<script>alert("xss")</script><p>Safe</p>');
      expect(result).toContain('Safe');
      expect(result).not.toContain('alert');
    });

    test('decodes HTML entities', () => {
      const result = stripEmailHtml('&amp; &lt; &gt; &nbsp;');
      expect(result).toContain('&');
      expect(result).toContain('<');
      expect(result).toContain('>');
    });

    test('truncates >500KB input (ReDoS prevention)', () => {
      // Use a string just over 500KB to trigger truncation
      const huge = '<p>' + 'x'.repeat(510 * 1024) + '</p>';
      const result = stripEmailHtml(huge);
      // After truncation, the output should end with a "[truncated]" marker
      expect(result).toContain('[truncated]');
    });

    test('completes quickly on large nested HTML', () => {
      // Generate deeply nested HTML, the kind of input that could trigger ReDoS without the size cap
      const nested = '<div>'.repeat(100) + 'content' + '</div>'.repeat(100);
      const start = performance.now();
      stripEmailHtml(nested);
      const elapsed = performance.now() - start;
      expect(elapsed).toBeLessThan(100); // should be well under 100ms
    });
  });
});
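The `buildDateWindows` tests above pin down only a few things: window counts, the `label` formats (`Q1 2026`, `2026-01`), and an error mentioning `endYear` when the range is inverted. The sketch below shows one way a function could satisfy those expectations. It is not the implementation in `src/core/data-research.ts`, which is not shown here. The name `sketchDateWindows`, the `start`/`end` fields, and the `'quarterly'` default are assumptions made for illustration.

```typescript
// Hypothetical sketch; not the actual code in src/core/data-research.ts.
type Granularity = 'quarterly' | 'monthly';

interface DateWindow {
  label: string; // "Q1 2026" or "2026-01", as asserted in the tests
  start: string; // assumed field: inclusive first day, YYYY-MM-DD
  end: string;   // assumed field: inclusive last day, YYYY-MM-DD
}

const pad = (n: number): string => String(n).padStart(2, '0');

// Last calendar day of a 1-based month: day 0 of the following month.
const lastDay = (year: number, month: number): number =>
  new Date(Date.UTC(year, month, 0)).getUTCDate();

function sketchDateWindows(
  startYear: number,
  endYear: number,
  granularity: Granularity = 'quarterly', // assumed default
): DateWindow[] {
  if (endYear < startYear) {
    throw new Error(`endYear (${endYear}) must be >= startYear (${startYear})`);
  }
  const span = granularity === 'quarterly' ? 3 : 1; // months per window
  const windows: DateWindow[] = [];
  for (let year = startYear; year <= endYear; year++) {
    for (let first = 1; first <= 12; first += span) {
      const last = first + span - 1;
      const label =
        granularity === 'quarterly'
          ? `Q${Math.ceil(first / 3)} ${year}`
          : `${year}-${pad(first)}`;
      windows.push({
        label,
        start: `${year}-${pad(first)}-01`,
        end: `${year}-${pad(last)}-${pad(lastDay(year, last))}`,
      });
    }
  }
  return windows;
}
```

Under these assumptions, `sketchDateWindows(2024, 2026, 'quarterly')` yields 12 windows, matching the multi-year test, and each window carries explicit start and end dates that a search phase could turn into a bounded query.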