docs: add Hermes alternatives in SKILLPACK, remove duplicate Section 16

- Section 13: agent memory table shows both OpenClaw memory_search and Hermes memory()/session_search()
- Section 14a: credential gateway covers both ClawVisor (OpenClaw) and Hermes built-in gateway
- Removed duplicate Section 16 (Deterministic Collectors was copy-pasted twice)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -663,7 +663,7 @@ serve different purposes.
 | Layer | What It Stores | Examples | How to Access |
 |-------|---------------|----------|---------------|
 | **GBrain** | World knowledge -- facts about people, companies, deals, meetings, concepts, ideas | Pedro's company page, meeting transcripts, original theses, deal terms | `gbrain search`, `gbrain query`, `gbrain get` |
-| **Agent memory** (`memory_search`) | Operational state -- preferences, architecture decisions, tool config, session continuity | "User prefers concise formatting", "Deploy to staging before prod", "ClawVisor task IDs" | `memory_search`, file reads |
+| **Agent memory** | Operational state -- preferences, architecture decisions, tool config, session continuity | "User prefers concise formatting", "Deploy to staging before prod" | OpenClaw: `memory_search`. Hermes: `memory(action="read")` + `session_search()` |
 | **Session context** | Current conversation window -- what was just said, what the user just asked | The last 20 messages, current task, immediate context | Already in context |
 
 ### When to Use Each
@@ -685,11 +685,14 @@ store people dossiers in agent memory.
 Three integrations that make the agent real. Without these, the brain is a static
 database. With them, it's alive.
 
-### 14a. ClawVisor -- Secure Gateway to Google and iMessage
+### 14a. Credential Gateway (ClawVisor / Hermes Gateway)
 
-[ClawVisor](https://clawvisor.com) is a credential vaulting and authorization gateway.
-The agent never holds API keys directly. ClawVisor enforces policies, manages
-task-scoped authorization, and injects credentials at request time.
+The EA workflow needs Gmail, Calendar, Contacts, and messaging access. The agent
+should never hold API keys directly. Use a credential gateway that enforces policies
+and injects credentials at request time.
+
+**OpenClaw: ClawVisor.** [ClawVisor](https://clawvisor.com) is a credential vaulting
+and authorization gateway with task-scoped authorization.
 
 **Services:** Gmail (list, read, send, draft), Google Calendar (CRUD), Google Drive
 (list, search, read), Google Contacts (list, search), Apple iMessage (list, read,
@@ -718,6 +721,13 @@ tracking threads" works. "Email triage" gets rejected. The intent verification m
 uses the purpose to judge whether each request is consistent -- if your purpose is
 narrow, legitimate requests fail verification.
 
+**Hermes Agent: Built-in gateway.** Hermes has multi-platform messaging (Telegram,
+Discord, Slack, WhatsApp, Signal, Email) and tool access built into its gateway. Use
+`config.yaml` to configure API credentials. The gateway daemon manages connections
+and routes webhooks to agent sessions. For Google services, configure OAuth credentials
+in the gateway config. Hermes's scheduled automations can run the same EA workflows
+(email triage, calendar prep, contact enrichment) through the gateway's tool system.
+
 ### 14b. Circleback -- Meeting Ingestion via Webhooks
 
 [Circleback](https://circleback.ai) records meetings, generates transcripts with
@@ -998,158 +1008,6 @@ concatenation but 10x better at understanding what an email means. Use both.
 
 ---
 
-## 16. Deterministic Collectors -- Code for Data, LLMs for Judgment
-
-When your agent keeps failing at a mechanical task despite repeated prompt fixes, stop
-fighting the LLM. Move the mechanical work to code.
-
-### The Pattern That Broke
-
-We built an email triage system. The agent swept Gmail, classified emails by urgency,
-and posted a digest to the user. One rule: every email item must include a clickable
-`[Open in Gmail]` link so the user can act on it with one tap.
-
-We put the rule in the skill file. We put it in MEMORY.md. We put it in the cron
-prompt. We wrote "NO EXCEPTIONS" in all caps. We wrote "ZERO TOLERANCE" after the
-fourth failure. The agent still dropped links -- on carry-forward reminders, on FYI
-items, on "still awaiting" sections. The user asked five times. Each time we added
-stronger language to the prompt.
-
-The failure mode is probabilistic. The LLM understands the rule. It follows it for the
-first 10 items. Then it gets sloppy on item 11, especially on items that are
-re-surfaced from state rather than freshly pulled from the API. No amount of prompt
-engineering fixes a 90%-reliable formatting task, because 90% reliability over 20 items
-per sweep means you fail visibly about twice per day. That's enough to destroy trust.
-
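The reliability argument in the removed paragraph above can be checked with a short calculation. This is only a sketch: it assumes the 90% figure is a per-item success rate and that items fail independently, which is a reading of the text, not a measured property of any model. The function names are illustrative.

```python
def expected_failures(items: int, per_item_reliability: float) -> float:
    """Expected number of items with a missing link in one sweep."""
    return items * (1.0 - per_item_reliability)


def p_at_least_one_failure(items: int, per_item_reliability: float) -> float:
    """Probability that a sweep contains at least one missing link."""
    return 1.0 - per_item_reliability ** items


if __name__ == "__main__":
    # 20 items per sweep at 90% per-item reliability (figures from the text).
    print(round(expected_failures(20, 0.9), 2))       # 2.0
    print(round(p_at_least_one_failure(20, 0.9), 3))  # 0.878
```

Under these assumptions a 20-item sweep drops about two links on average, and nearly nine sweeps in ten contain at least one visible miss.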
-### The Fix: Separate Deterministic from Analytical
-
-```
-┌─────────────────────────────┐     ┌──────────────────────────────┐
-│  Deterministic Collector    │────▶│  LLM Agent                   │
-│  (Node.js / Python script)  │     │                              │
-│                             │     │  • Read the pre-formatted    │
-│  • Pull data from API       │     │    digest                    │
-│  • Store structured JSON    │     │  • Classify items            │
-│  • Generate links/URLs      │     │  • Add commentary            │
-│  • Detect patterns (regex)  │     │  • Run brain enrichment      │
-│  • Track state (seen/new)   │     │  • Draft replies             │
-│  • Output markdown digest   │     │  • Surface to user           │
-│                             │     │                              │
-│  CODE — deterministic,      │     │  AI — judgment, context,     │
-│  never forgets              │     │  creativity                  │
-└─────────────────────────────┘     └──────────────────────────────┘
-```
-
-The collector handles everything mechanical:
-
-- Pulling emails from Gmail (via credential gateway)
-- Generating `[Open in Gmail](URL)` from message IDs -- **by code, not by LLM**
-- Detecting signature requests (DocuSign/Dropbox Sign regex patterns)
-- Tracking which messages are new vs. already seen (state file)
-- Storing structured JSON with full metadata
-- Generating a pre-formatted markdown digest with every link already embedded
-
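The "new vs. already seen" bullet in the list above can be sketched as a small state-file helper. The real collector is a Node.js script whose internals are not shown in this diff, so this Python version is an illustration only: the `known_ids` key and all function names are hypothetical.

```python
import json
from pathlib import Path


def load_known_ids(state_path: Path) -> set[str]:
    """Read previously seen message IDs; a missing file means nothing was seen."""
    if not state_path.exists():
        return set()
    state = json.loads(state_path.read_text(encoding="utf-8"))
    return set(state.get("known_ids", []))  # "known_ids" is a hypothetical key


def split_new_and_seen(
    messages: list[dict], known_ids: set[str]
) -> tuple[list[dict], list[dict]]:
    """Partition messages by whether their ID was recorded in an earlier sweep."""
    new = [m for m in messages if m["id"] not in known_ids]
    seen = [m for m in messages if m["id"] in known_ids]
    return new, seen


def save_known_ids(state_path: Path, known_ids: set[str]) -> None:
    """Persist the ID set, sorted so the file is stable between runs."""
    payload = {"known_ids": sorted(known_ids)}
    state_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```

Because membership is a set lookup on the message ID, a re-surfaced item is classified the same way on every run, which is the property the prompt-only approach could not guarantee.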
-The LLM reads the pre-formatted digest and does what LLMs are good at:
-
-- Classifying urgency (requires understanding relationships, deadlines, context)
-- Writing commentary ("this is the $110M acquisition thread, 7 days dropped")
-- Running brain enrichment on notable entities (`gbrain search` + page updates)
-- Drafting replies
-- Deciding what to surface vs. filter
-
-**The links are in the source data. The LLM can't forget them because it doesn't
-generate them.**
-
-### Implementation
-
-The email collector follows the same architecture as the X/Twitter collector (a
-deterministic data pipeline for social media monitoring):
-
-```
-scripts/email-collector/
-├── email-collector.mjs       # No LLM calls, no external deps
-├── data/
-│   ├── state.json            # Last pull timestamp, known IDs, pending signatures
-│   ├── messages/             # Structured JSON per day
-│   │   └── 2026-04-09.json
-│   └── digests/              # Pre-formatted markdown
-│       └── 2026-04-09.md
-```
-
-Every stored message includes:
-
-```json
-{
-  "id": "19d74109a811b9e7",
-  "account": "work",
-  "authuser": "user@example.com",
-  "from": "Alex Smith",
-  "subject": "Re: Next Steps",
-  "snippet": "Hey, wanted to follow up on...",
-  "timestamp": "2026-04-09T08:56:09Z",
-  "is_unread": true,
-  "is_noise": false,
-  "is_signature": false,
-  "gmail_link": "https://mail.google.com/mail/u/?authuser=user@example.com#inbox/19d74109a811b9e7",
-  "gmail_markdown": "[Open in Gmail](https://mail.google.com/mail/u/?authuser=user@example.com#inbox/19d74109a811b9e7)"
-}
-```
-
-The `gmail_link` and `gmail_markdown` fields are computed from `id` + `authuser` at
-collection time. Three lines of code. Never wrong.
-
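A minimal sketch of that computation, written in Python for illustration (the collector itself is `email-collector.mjs`, whose source is not part of this diff). The URL shape is copied from the example record above; the function names are hypothetical, and the values are interpolated without URL-encoding to match the stored example.

```python
def gmail_link(message_id: str, authuser: str) -> str:
    """Build the web link in the same shape as the stored `gmail_link` field."""
    return f"https://mail.google.com/mail/u/?authuser={authuser}#inbox/{message_id}"


def gmail_markdown(message_id: str, authuser: str) -> str:
    """Wrap the link in the fixed `[Open in Gmail](...)` markdown form."""
    return f"[Open in Gmail]({gmail_link(message_id, authuser)})"
```

Called with the `id` and `authuser` from the example record, these reproduce its `gmail_link` and `gmail_markdown` values exactly, so the link cannot vary between the first item and the twentieth.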
-### Cron Integration
-
-The email monitoring cron runs the collector first, then invokes the LLM:
-
-```
-1. node email-collector.mjs collect     → deterministic API pull, store JSON
-2. node email-collector.mjs digest      → generate markdown with links baked in
-3. node email-collector.mjs signatures  → list pending e-signature items
-4. LLM reads digest + signatures        → classifies, enriches, posts to user
-```
-
-The collector runs in under 10 seconds. The LLM analysis takes 30-60 seconds. Total:
-under 90 seconds for a full inbox sweep with brain enrichment.
-
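The ordering in the cron steps above could be wired up roughly as follows. The three `node email-collector.mjs` commands come from the text; everything else is an assumption, including the `analyze` hook standing in for whatever invokes the agent and the choice to stop the sweep when a collector step fails.

```python
import subprocess
from typing import Callable, Sequence

# Deterministic steps, in the order listed in the text.
COLLECTOR_STEPS: list[list[str]] = [
    ["node", "email-collector.mjs", "collect"],
    ["node", "email-collector.mjs", "digest"],
    ["node", "email-collector.mjs", "signatures"],
]


def _run_checked(cmd: Sequence[str]) -> None:
    """Run one collector step; raises CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def run_sweep(
    analyze: Callable[[], None],
    run_step: Callable[[Sequence[str]], None] = _run_checked,
) -> None:
    """Run every deterministic step in order, then hand off to the LLM stage."""
    for cmd in COLLECTOR_STEPS:
        run_step(cmd)  # an exception here stops the sweep before the LLM runs
    analyze()  # hypothetical hook that invokes the agent on the digest
```

Passing `run_step` as a parameter keeps the ordering testable without Node.js installed; a real cron entry would use the default.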
-### Where Else This Pattern Applies
-
-The deterministic-collector pattern works for any recurring data pull where the LLM
-was previously responsible for both fetching AND formatting:
-
-| Signal Source | Collector Generates | LLM Adds |
-|--------------|-------------------|----------|
-| **Email** | Gmail links, sender metadata, signature detection | Urgency classification, enrichment, reply drafts |
-| **X/Twitter** | Tweet links, engagement metrics, deletion detection | Sentiment analysis, narrative detection, content ideas |
-| **Calendar** | Event links, attendee lists, conflict detection | Prep briefings, meeting context from brain |
-| **Slack** | Channel links, thread links, mention detection | Priority classification, action item extraction |
-| **GitHub** | PR/issue links, diff stats, CI status | Code review context, priority assessment |
-
-The principle: if a piece of output MUST be present and MUST be formatted correctly
-every time, generate it in code. If a piece of output requires judgment, context, or
-creativity, generate it with the LLM. Don't ask the LLM to do both in the same pass.
-
-### The Lesson
-
-When an LLM keeps failing at a mechanical task despite repeated prompt fixes:
-
-1. **Stop adding more prompt language.** You've already written "NO EXCEPTIONS" and
-   "ZERO TOLERANCE." The LLM read it. The failure is probabilistic, not comprehension.
-2. **Identify what's mechanical vs. analytical.** URL generation is mechanical.
-   Classification is analytical. State tracking is mechanical. Commentary is analytical.
-3. **Move the mechanical work to a script.** Node.js, Python, bash -- anything
-   deterministic. No LLM calls, no external dependencies if possible.
-4. **Feed the LLM pre-formatted data.** The script's output becomes the LLM's input.
-   Links are already there. Metadata is already structured. The LLM just adds judgment.
-5. **Wire it into your cron.** Script runs first (fast, cheap, reliable), then LLM
-   reads the output (slower, expensive, creative).
-
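Step 4 in the list above, feeding the LLM pre-formatted data, might look like the sketch below. It reads the `from`, `subject`, and `gmail_markdown` fields of the stored message record; the bullet layout and both function names are assumptions, since the actual digest format is not shown in this diff.

```python
def render_digest(messages: list[dict]) -> str:
    """Render one markdown bullet per message, copying the stored link verbatim."""
    lines = []
    for m in messages:
        # `gmail_markdown` is read from the record, never regenerated here.
        lines.append(f"- **{m['from']}**: {m['subject']} {m['gmail_markdown']}")
    return "\n".join(lines)


def every_item_has_link(digest: str) -> bool:
    """Cheap guard a collector could run before handing the digest to the LLM."""
    items = [line for line in digest.splitlines() if line.startswith("- ")]
    return all("[Open in Gmail](" in line for line in items)
```

The guard is optional, but it turns the "every item must include a link" rule into a check that fails loudly in code and does not rely on prompt wording.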
-This is not about the LLM being bad. It's about using the right tool for the right
-job. Code is 100% reliable at string concatenation. LLMs are 90% reliable at string
-concatenation but 10x better at understanding what an email means. Use both.
-
----
-
 ## 17. Auto-Update Notifications
 
 GBrain ships updates frequently. The auto-update cron keeps users current by