feat: v0.9.0 -- smart file storage, publish, production-grade skills (#62)

* feat: battle-tested skill patterns from production deployment

  Backport production-learned brain-operations patterns:
  - Iron Law of Back-Linking (mandatory bidirectional linking)
  - Brain filing rules (file by primary subject, not format)
  - Enrichment protocol (7-step pipeline, 3-tier system, person/company templates)
  - Media ingest workflows (articles, videos, podcasts, PDFs, screenshots)
  - Citation requirements (mandatory [Source: ...] on every fact)
  - Test Before Bulk operating principle
  - Voice recipe: unicode crash fix, PII scrub, identity-first prompt, DIY STT+LLM+TTS
  - X-to-Brain recipe: image OCR, Filtered Stream, tweet rating rubric, cron stagger

* chore: bump version and changelog (v0.8.1)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add _brain-filing-rules.md to CLAUDE.md key files

* feat: smart file upload with TUS resumable and .redirect.yaml pointers

  - Supabase Storage auto-selects upload method by file size:
    < 100 MB standard POST, >= 100 MB TUS resumable (6 MB chunks + retry)
  - Signed URL generation for private bucket access (1-hour expiry)
  - New `upload-raw` command with size routing: small text stays in git,
    large/media files go to cloud with .redirect.yaml pointer
  - New `signed-url` command for generating access links
  - File resolver supports both .redirect.yaml (v0.9+) and .redirect (legacy)
  - Redirect format upgraded: 10 fields with full metadata
  - All migration commands (mirror, redirect, restore, clean) handle both formats

* feat: skills reference actual gbrain file commands

  - Filing rules document upload-raw, signed-url, and .redirect.yaml format
  - Ingest skill uses gbrain files upload-raw for raw source preservation
  - Maintain skill adds file storage health checks
  - Setup skill adds storage configuration phase with migration guidance
  - Voice recipe uses upload-raw for call audio storage
  - Migration v0.9.0 with complete storage setup instructions

* chore: bump version and changelog (v0.9.0)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: gbrain publish -- shareable HTML with password protection

  First code+skill pair: deterministic code does the work (strip private
  data, encrypt with AES-256-GCM, generate self-contained HTML), the skill
  tells the agent when and how to use it. 34 new tests.

  See: https://x.com/garrytan/status/2042925773300908103

* feat: backlinks check/fix, page lint, and report commands

  Three new deterministic tools (zero LLM calls):
  - gbrain backlinks check/fix -- scans brain for entity mentions without
    back-links, creates them. Enforces the Iron Law from the skills.
  - gbrain lint [--fix] -- catches LLM preambles, code fence wrapping,
    placeholder dates, missing frontmatter, broken citations, empty sections.
    --fix auto-strips fixable artifacts.
  - gbrain report --type <name> -- saves timestamped reports to
    brain/reports/{type}/YYYY-MM-DD-HHMM.md for audit trails.

  33 new tests (409 total, 0 fail).

* feat: v0.9.0 migration tells agents to swap scripts for built-in commands

  Migration file now:
  - Lists all 5 new deterministic commands with usage examples
  - Includes a script-to-command replacement table (old -> new)
  - Tells the agent to find custom script references in AGENTS.md, skills,
    and cron jobs and replace with gbrain commands
  - Adds recommended cron jobs for daily backlink fix + weekly lint
  - References the Thin Harness, Fat Skills thread

* fix: CLI routing bugs found during DX review

  - Fixed subArgs reference error in handleCliOnly (used wrong variable name)
  - Renamed gbrain backlinks check/fix to gbrain check-backlinks to avoid
    conflict with existing backlinks operation (per-page incoming links)
  - Added TOOLS section to --help output showing publish, check-backlinks,
    lint, report
  - Added upload-raw and signed-url to FILES section in --help
  - Updated all docs/migration references to use check-backlinks

* fix: security hardening from adversarial review

  - XSS: sanitize marked.parse() output (strip script/iframe/on* attrs)
  - Path traversal: validate report --type against [a-z0-9-] pattern
  - TUS: HEAD request before retry to get server's actual offset (TUS spec)
  - Pointer: upload-raw now includes pointer content in JSON output
  - Symlinks: use lstatSync in all walkers to prevent directory escape

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

@@ -1,8 +1,8 @@
---
id: x-to-brain
name: X-to-Brain
-version: 0.7.0
-description: Twitter timeline, mentions, and keyword monitoring flow into brain pages. Tracks deletions and engagement velocity.
+version: 0.8.1
+description: Twitter timeline, mentions, and keyword monitoring flow into brain pages. Tracks deletions, engagement velocity, OCR on images, and real-time alerts.
category: sense
requires: []
secrets:
@@ -201,7 +201,99 @@ The agent should review collected data 2-3x daily and run enrichment.

```bash
mkdir -p ~/.gbrain/integrations/x-to-brain
-echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","event":"setup_complete","source_version":"0.7.0","status":"ok","details":{"user_id":"X_USER_ID"}}' >> ~/.gbrain/integrations/x-to-brain/heartbeat.jsonl
+echo '{"ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","event":"setup_complete","source_version":"0.8.1","status":"ok","details":{"user_id":"X_USER_ID"}}' >> ~/.gbrain/integrations/x-to-brain/heartbeat.jsonl
```

## Production Patterns (v0.8.1)

These patterns come from a production deployment tracking 19+ accounts with
real-time monitoring.

### Image OCR (NEW)

**Problem:** Text-only collection misses visual context in tweet images --
screenshots, charts, memes with text overlay, quote screenshots.

**Fix:** Run OCR on tweet images via a vision model (Claude Sonnet or equivalent):
- For every tweet with images, extract full text content via vision API
- Store OCR output alongside the tweet data
- Include extracted text in entity detection and brain page updates
- Charts/data visualizations: extract data points, describe findings

This catches signal that text-only collectors miss entirely.

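The OCR steps above can be sketched as follows. This is a minimal illustration, not the recipe's implementation: it assumes tweets were fetched from the X API v2 with the `attachments.media_keys` expansion and `media.fields=url`, and it takes the vision call as an injected `ocr` function (hypothetical) so that any vision model can be plugged in.

```python
from typing import Callable


def image_urls_for_tweet(tweet: dict, includes: dict) -> list[str]:
    """Return photo URLs attached to a tweet, joined via its media keys."""
    keys = tweet.get("attachments", {}).get("media_keys", [])
    media_by_key = {m["media_key"]: m for m in includes.get("media", [])}
    urls = []
    for key in keys:
        media = media_by_key.get(key)
        # Only photos with a direct `url` are handled; videos and GIFs are skipped.
        if media and media.get("type") == "photo" and media.get("url"):
            urls.append(media["url"])
    return urls


def enrich_with_ocr(tweet: dict, includes: dict, ocr: Callable[[str], str]) -> dict:
    """Store OCR output alongside the tweet and build the text used for entity detection."""
    ocr_results = [{"url": u, "text": ocr(u)} for u in image_urls_for_tweet(tweet, includes)]
    parts = [tweet.get("text", "")] + [r["text"] for r in ocr_results if r["text"]]
    return {**tweet, "ocr": ocr_results, "text_for_entities": "\n".join(parts)}
```

Keeping `ocr` as a parameter means the collector can be tested with a stub and the vision provider can be swapped without touching the collection code.
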
### Real-Time Monitoring via Filtered Stream (NEW)

**Problem:** 30-minute polling means you find out about things up to 30 minutes
late. For time-sensitive content (engagement spikes, deletions, breaking
threads), that's too slow.

**Fix:** Use Twitter's Filtered Stream API (`GET /2/tweets/search/stream`) for
near-real-time monitoring. Catches outbound tweets within seconds.

**Setup:**
1. Add filter rules: `POST /2/tweets/search/stream/rules` with your tracking terms
2. Open persistent connection: `GET /2/tweets/search/stream`
3. Process tweets as they arrive (no polling delay)

**Requirements:** Basic tier ($200/mo) minimum for Filtered Stream access at the
time of writing. X has changed API tiers and pricing repeatedly, so confirm that
your tier includes Filtered Stream before building on it.

**Use alongside polling:** Stream for real-time alerts, polling for completeness
(stream can drop tweets during disconnects).

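The setup steps above can be sketched without committing to an HTTP client. The helpers below are illustrative only: `build_rules_payload`, `parse_stream_line`, and `backoff_seconds` are hypothetical names, and the backoff numbers are assumptions. The `{"add": [{"value", "tag"}]}` body shape and the blank keep-alive lines follow the v2 Filtered Stream API.

```python
import json


def build_rules_payload(terms: list[str], tag: str) -> dict:
    """Body for POST /2/tweets/search/stream/rules: one rule OR-ing the tracking terms."""
    # Quote each term so multi-word phrases match as phrases.
    value = " OR ".join(f'"{t}"' for t in terms)
    return {"add": [{"value": value, "tag": tag}]}


def parse_stream_line(line: bytes):
    """Decode one line read from the stream; blank keep-alive lines return None."""
    stripped = line.strip()
    if not stripped:
        return None
    return json.loads(stripped)


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 320.0) -> float:
    """Exponential delay before reconnect attempt `attempt` (0-based), capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

After a reconnect, the polling collector is what fills in tweets missed while the stream was down, which is why the two run side by side.
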
### Tweet Rating Rubric (NEW)

**Problem:** Not all tweets deserve the same attention. Without scoring, every
tweet gets equal weight.

**Fix:** Rate tweets on a 6-dimension rubric:
1. **Reach** -- follower count, engagement rate
2. **Relevance** -- connection to your interests/work
3. **Sentiment** -- positive/negative/neutral toward you
4. **Novelty** -- new information vs rehash
5. **Actionability** -- does this require a response?
6. **Virality potential** -- engagement velocity, quote-tweet ratio

Re-rate after 60 minutes to track engagement trajectory. A tweet at 50 likes
that hits 500 in an hour is a different signal than one that stays at 50.

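One way to represent the rubric is a small record with one score per dimension. The recipe does not specify a scale or weighting, so the 0-5 scale and the unweighted sum below are assumptions, as are the names `TweetRating` and `likes_per_minute`.

```python
from dataclasses import dataclass, fields


@dataclass
class TweetRating:
    # Each dimension is scored 0-5 (assumed scale). Sentiment is scored by how
    # much attention the tone warrants, not by whether it is positive.
    reach: int
    relevance: int
    sentiment: int
    novelty: int
    actionability: int
    virality: int

    def total(self) -> int:
        """Unweighted sum across the six dimensions."""
        return sum(getattr(self, f.name) for f in fields(self))


def likes_per_minute(likes_before: int, likes_after: int, minutes: float) -> float:
    """Engagement velocity between the first rating and the re-rating."""
    return (likes_after - likes_before) / minutes
```

For the example in the text, going from 50 to 500 likes in 60 minutes gives `likes_per_minute(50, 500, 60) == 7.5`, while a tweet that stays at 50 gives `0.0`. That difference is what the re-rating is meant to surface.
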
### Outbound Tweet Monitoring (NEW)

**Problem:** You tweet something and don't notice engagement patterns until
hours later.

**Fix:** 60-second monitoring window after every outbound tweet:
- Check engagement velocity (likes, replies, quotes)
- Flag unusual reply-to-like ratios (high reply ratios signal controversy)
- Flag if quote tweets outnumber retweets (commentary, not sharing)
- Cross-reference mentioned accounts against brain for context

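The two ratio checks above can be expressed as a small function over a tweet's `public_metrics` object from the X API v2. The recipe gives no numeric cutoff for an "unusual" reply-to-like ratio, so the `0.5` default below is an assumption to tune, and `engagement_flags` is a hypothetical name.

```python
def engagement_flags(metrics: dict, reply_ratio_threshold: float = 0.5) -> list[str]:
    """Return warning flags for one snapshot of a tweet's public metrics."""
    likes = metrics.get("like_count", 0)
    replies = metrics.get("reply_count", 0)
    retweets = metrics.get("retweet_count", 0)
    quotes = metrics.get("quote_count", 0)

    flags = []
    # Replies with no likes at all count as a high ratio; avoids dividing by zero.
    if replies > 0 and (likes == 0 or replies / likes > reply_ratio_threshold):
        flags.append("high_reply_ratio")
    # More quotes than retweets suggests people are commenting rather than sharing.
    if quotes > retweets:
        flags.append("quotes_exceed_retweets")
    return flags
```

Running this on each snapshot taken during the monitoring window gives a list of flags that can be attached to the tweet's record.
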
### X-to-Brain Pipeline (NEW)

Every tweet interaction can automatically create/update brain pages:
- Mentioned person has a brain page? Append to their timeline
- New person mentioned? Check notability gate, create page if notable
- Article URL in tweet? Fetch and ingest via article workflow
- Video URL in tweet? Queue for transcription pipeline
- Images? OCR and extract text content

Follow `skills/_brain-filing-rules.md` for filing decisions.

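The routing rules above amount to a dispatch table. The sketch below turns one tweet into a list of planned actions under several assumptions: the tweet has already been normalized into a dict with `mentions`, `urls` (each pre-classified with a `kind`), and `image_urls`. The action names and the `is_notable` callback are hypothetical stand-ins for the real workflows and the notability gate.

```python
from typing import Callable


def plan_actions(
    tweet: dict,
    known_pages: set[str],
    is_notable: Callable[[str], bool],
) -> list[tuple[str, str]]:
    """Map one normalized tweet to (action, target) pairs, in rule order."""
    actions = []
    for handle in tweet.get("mentions", []):
        if handle in known_pages:
            actions.append(("append_timeline", handle))
        elif is_notable(handle):
            actions.append(("create_page", handle))
        # Mentions that fail the notability gate are dropped.
    for link in tweet.get("urls", []):
        if link["kind"] == "article":
            actions.append(("ingest_article", link["url"]))
        elif link["kind"] == "video":
            actions.append(("queue_transcription", link["url"]))
    for image_url in tweet.get("image_urls", []):
        actions.append(("ocr_image", image_url))
    return actions
```

Planning actions as data before running them makes it easy to log what would happen and to test a handful of tweets before processing in bulk.
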
### Cron Staggering (IMPORTANT)

**Problem:** Multiple cron jobs firing simultaneously causes resource contention
and timeouts.

**Fix:** Stagger all collection schedules so that at most one job starts in any
given minute:
```
# Good: staggered
*/30 * * * * x-collector       # :00, :30
5,35 * * * * x-bundle-ingest   # :05, :35
10 */3 * * * social-monitor    # :10 every 3h

# Bad: overlapping
*/30 * * * * x-collector
*/30 * * * * x-bundle-ingest   # fires at same time!
```

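A schedule set can be checked for collisions before it is installed. The sketch below is a simplified checker, not a full cron parser: it reads only the minute and hour fields, supports `*`, `*/n`, single values, and comma lists, and ignores the day-of-month, month, and weekday fields. All function names are hypothetical.

```python
from itertools import combinations


def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field into the set of values it matches."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(lo, hi + 1))
        elif part.startswith("*/"):
            values.update(range(lo, hi + 1, int(part[2:])))
        else:
            values.add(int(part))
    return values


def start_times(schedule: str) -> set[tuple[int, int]]:
    """All (hour, minute) pairs in a day at which the schedule fires."""
    minute, hour = schedule.split()[:2]
    return {(h, m) for h in expand_field(hour, 0, 23) for m in expand_field(minute, 0, 59)}


def collisions(jobs: dict[str, str]) -> list[tuple[str, str]]:
    """Pairs of job names that share at least one start minute."""
    times = {name: start_times(spec) for name, spec in jobs.items()}
    return [(a, b) for a, b in combinations(jobs, 2) if times[a] & times[b]]
```

Passing the three staggered schedules from the example returns an empty list, while the two overlapping `*/30` schedules return the pair `("x-collector", "x-bundle-ingest")`.
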
## Implementation Guide