gbrain/test/e2e/jsonb-roundtrip.test.ts
triton6564685 7221dad83b fix(v0.18.2.fork.1): two pre-existing bugs surfaced by PW 1 part 2 prod deploy
Item 1 — sources.ts triple INSERT/UPDATE postgres-js double-encoding (root cause):
  Sites: src/commands/sources.ts:211 (runAdd), :471 (runUpdate), :407 (runFederate)
  Pattern: `JSON.stringify(config)` + `$N::jsonb` cast via `engine.executeRaw`
  → postgres-js's `unsafe()` API sees the `::jsonb` cast and JSON-encodes the
  already-stringified param a second time, so the value lands in the DB as a
  JSON string scalar (jsonb_typeof = 'string', not 'object'). Subsequent
  `jsonb_set()` migrations then throw SQLSTATE 22023 'cannot set path in scalar'.
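
  The double encoding described above can be reproduced without a database.
  The sketch below is a simulation in plain TypeScript, not postgres-js
  internals: `jsonbTypeof` is a hypothetical helper that mimics Postgres
  `jsonb_typeof()` on JSON text, and the second `JSON.stringify` merely stands
  in for the driver's re-encoding.

```typescript
// Simulation only: models what ends up stored, not how postgres-js works inside.
type JsonbType = 'object' | 'array' | 'string' | 'number' | 'boolean' | 'null';

// Hypothetical helper mirroring jsonb_typeof() for a value parsed from JSON text.
function jsonbTypeof(jsonText: string): JsonbType {
  const value: unknown = JSON.parse(jsonText);
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value as JsonbType;
}

const config = { federated: true, slug_prefix_rules: ['test-prefix/'] };

// Encoded once: the JSON text Postgres is meant to receive.
const encodedOnce = JSON.stringify(config);
// Encoded twice: the effective payload when an already-stringified param
// is stringified again on its way to the jsonb column.
const encodedTwice = JSON.stringify(encodedOnce);

console.log(jsonbTypeof(encodedOnce));  // object
console.log(jsonbTypeof(encodedTwice)); // string
```

  A string scalar has no keys, which is why path operations such as
  `jsonb_set()` reject the second form.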

  Empirical verification (D-LXC fixture 189, 2026-05-07):
    Variant 1: `JSON.stringify(o)` + `$N::jsonb`           → string ✗ (current)
    Variant 2: object `o`           + `$N::jsonb`           → object ✓
    Variant 3: `JSON.stringify(o)` + no cast               → string ✗
    Variant 4: `JSON.stringify(o)` + `($N::text)::jsonb`   → object ✓ (this fix)

  Fix: the `($N::text)::jsonb` double cast forces postgres-js to send the param
  verbatim as TEXT (not jsonb-typed); SQL then parses it into an object at the
  column boundary. Variant 4 was chosen over Variant 2 because it stays correct
  regardless of how a given postgres-js version, or the `unsafe()` API contract,
  serializes object params.
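
  A minimal sketch of what a Variant 4 call site could look like. The helper
  name `buildConfigUpdate` and the `RawStatement` shape are hypothetical and
  exist only for illustration; the table and column names (`sources`,
  `config`, `id`) are the ones the e2e test queries.

```typescript
// Hypothetical helper: the name and return shape are illustrative only.
interface RawStatement {
  sql: string;
  params: unknown[];
}

// Builds an UPDATE that uses the double cast from Variant 4.
function buildConfigUpdate(id: string, config: Record<string, unknown>): RawStatement {
  return {
    // ($1::text)::jsonb: the param travels as TEXT and Postgres parses it into jsonb.
    sql: 'UPDATE sources SET config = ($1::text)::jsonb WHERE id = $2',
    // Stringify exactly once; the SQL cast does the parsing.
    params: [JSON.stringify(config), id],
  };
}

const stmt = buildConfigUpdate('demo-source', { federated: true });
console.log(stmt.sql);
console.log(stmt.params[0]); // {"federated":true}
```

  The statement would then be handed to something like
  `engine.executeRaw(stmt.sql, stmt.params)`; that two-argument call shape is
  an assumption, since the commit only names the method.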

  Pairs with v26 step 0 healing (fork commit 71aaf22) which recovers
  pre-existing string-encoded prod data. After this commit, NEW sources
  written by `gbrain sources add` / `sources update` / `sources federate`
  land as objects directly, no heal needed for newly created rows.

  Test: e2e jsonb-roundtrip extended with sources INSERT/UPDATE coverage +
  source-grep tripwire that flags any future regressions.
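
  To illustrate the tripwire, here is a small sketch that applies the same two
  regular expressions as the test to sample strings. The wrapper function
  `findDirectJsonbCast` is a hypothetical name; the sample SQL fragments are
  made up for the demonstration.

```typescript
// Hypothetical wrapper around the two regexes used by the source-grep test.
function findDirectJsonbCast(source: string): string | null {
  // Replace the allowed double cast with a placeholder first.
  const stripped = source.replace(/\(\$\d+::text\)::jsonb/g, '<SAFE_DOUBLE_CAST>');
  // Anything still casting a param straight to jsonb is a regression.
  return stripped.match(/\$\d+::jsonb/)?.[0] ?? null;
}

console.log(findDirectJsonbCast('SET config = $1::jsonb WHERE id = $2'));         // $1::jsonb
console.log(findDirectJsonbCast('SET config = ($1::text)::jsonb WHERE id = $2')); // null
```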

Item 2 — sync.ts up_to_date path fails to advance last_sync_at:
  Site: src/commands/sync.ts:211-221. The performSync `lastCommit === headCommit`
  branch returns immediately without updating sources.last_sync_at. Quiet
  sources (read-mostly repos) keep a stale last_sync_at indefinitely, so the
  drift monitor (gbrain-projects-drift.sh) flags them as stale even though the
  sync cron fires every tick.

  Fix: advance last_sync_at on up_to_date branch via direct UPDATE (only
  last_sync_at, not last_commit since the commit anchor is genuinely
  unchanged). Preserves drift contract: "is the sync cron alive?" not
  "did the remote add commits?".

  Surfaced 2026-05-07 PW 1 part 2 prod deploy on LXC 107 — first drift
  tick post-deploy reported stock-dashboard 'stale 6197min ago' 30 seconds
  after a successful sync tick.

  Test: tests/sync-up-to-date-stamping.test.ts (3 cases): a quiet repo
  bumps last_sync_at, the last_commit anchor stays stable, and the legacy
  non-sourceId path does not throw and records sync.last_run.
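
  The up_to_date behaviour described in Item 2 can be modelled in a few lines.
  This is a simplified in-memory sketch, not the real performSync: `SourceRow`
  and `syncTick` are hypothetical names, and the actual fix issues a direct
  UPDATE against the sources table.

```typescript
// Hypothetical in-memory stand-in for one row of the sources table.
interface SourceRow {
  id: string;
  last_commit: string | null;
  last_sync_at: Date | null;
}

type SyncStatus = 'up_to_date' | 'synced';

// Simplified model of the two branches; the real performSync does far more.
function syncTick(row: SourceRow, headCommit: string, now: Date): SyncStatus {
  if (row.last_commit === headCommit) {
    // Nothing new to import, but the cron did run: stamp last_sync_at only.
    row.last_sync_at = now;
    return 'up_to_date';
  }
  // Import work elided; both the commit anchor and the timestamp advance.
  row.last_commit = headCommit;
  row.last_sync_at = now;
  return 'synced';
}

const quiet: SourceRow = { id: 'quiet-repo', last_commit: 'abc123', last_sync_at: null };
const status = syncTick(quiet, 'abc123', new Date('2026-05-07T14:00:00Z'));
console.log(status, quiet.last_commit, quiet.last_sync_at?.toISOString());
// up_to_date abc123 2026-05-07T14:00:00.000Z
```

  Keeping last_commit untouched on the quiet branch is what lets the drift
  monitor answer "is the sync cron alive?" without implying new commits.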

Both bugs were pre-existing (not introduced by PW 1 part 2 fork patches).
Both surfaced during prod deploy because v26 was the first migration to
hit jsonb_set on long-existing string-encoded configs, and PW 1 part 2's
new drift monitor read sources.last_sync_at directly (vs sync.sh's own
audit log in the prior implementation).

88/88 tests pass across allowlist / migration-v26 / sync-walk-dispatch /
sync-up-to-date-stamping / manifest-routing / manifest-edge-cases /
source-resolver / brain-allowlist / ingest-log-source-id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 22:22:13 +08:00


/**
 * E2E JSONB Roundtrip Tests: v0.12.1 Reliability Wave, extended in v0.18.2.fork.1
 *
 * Guards the JSONB write sites against double-encoding regressions:
 *  1. PostgresEngine.putPage    → pages.frontmatter
 *  2. PostgresEngine.putRawData → raw_data.data
 *  3. PostgresEngine.logIngest  → ingest_log.pages_updated
 *  4. commands/files.ts:254     → files.metadata
 *  5. commands/sources.ts       → sources.config (added in v0.18.2.fork.1)
 *
 * The v0.12.0 bug: `${JSON.stringify(x)}::jsonb` sends a JSON-encoded string
 * to postgres.js, which stores it as a JSONB *string literal* instead of an
 * object. `col ->> 'key'` returns NULL; GIN indexes are ineffective.
 * PGLite masks this because its driver parses the string. Real Postgres does not.
 *
 * The fix for sites 1-4: `sql.json(x)` uses postgres.js v3's native JSONB
 * serialization. Site 5 goes through the unsafe() path and uses the
 * `($N::text)::jsonb` double cast instead (see the sources tests below).
 */
import { describe, test, expect, beforeAll, afterAll } from 'bun:test';
import { hasDatabase, setupDB, teardownDB, getEngine, getConn } from './helpers.ts';

const skip = !hasDatabase();
const describeE2E = skip ? describe.skip : describe;

describeE2E('E2E: JSONB roundtrip — v0.12.1 reliability wave', () => {
  beforeAll(async () => { await setupDB(); });
  afterAll(async () => { await teardownDB(); });

  test('putPage writes frontmatter as object, not double-encoded string', async () => {
    const engine = getEngine();
    await engine.putPage('test/jsonb-putpage', {
      type: 'concept',
      title: 'JSONB putPage test',
      compiled_truth: 'body',
      timeline: '',
      frontmatter: { marker: 'putpage-value', tags: ['a', 'b'] },
    });
    const sql = getConn();
    const [row] = await sql`
      SELECT jsonb_typeof(frontmatter) AS t, frontmatter ->> 'marker' AS marker
      FROM pages WHERE slug = 'test/jsonb-putpage'
    `;
    expect(row.t).toBe('object');
    expect(row.marker).toBe('putpage-value');
  });

  test('putRawData writes raw_data.data as object, not double-encoded string', async () => {
    const engine = getEngine();
    await engine.putPage('test/jsonb-rawdata', {
      type: 'concept',
      title: 'RawData test',
      compiled_truth: 'body',
      timeline: '',
      frontmatter: {},
    });
    await engine.putRawData('test/jsonb-rawdata', 'unit-test', {
      marker: 'rawdata-value',
      nested: { k: 'v' },
    });
    const sql = getConn();
    const [row] = await sql`
      SELECT jsonb_typeof(rd.data) AS t, rd.data ->> 'marker' AS marker
      FROM raw_data rd
      JOIN pages p ON p.id = rd.page_id
      WHERE p.slug = 'test/jsonb-rawdata'
    `;
    expect(row.t).toBe('object');
    expect(row.marker).toBe('rawdata-value');
  });

  test('logIngest writes pages_updated as array, not double-encoded string', async () => {
    const engine = getEngine();
    await engine.logIngest({
      source_type: 'unit-test',
      source_ref: 'jsonb-roundtrip',
      pages_updated: ['test/a', 'test/b', 'test/c'],
      summary: 'jsonb logingest check',
    });
    const sql = getConn();
    const [row] = await sql`
      SELECT jsonb_typeof(pages_updated) AS t,
             jsonb_array_length(pages_updated) AS n,
             pages_updated ->> 0 AS first
      FROM ingest_log
      WHERE source_ref = 'jsonb-roundtrip'
      ORDER BY id DESC LIMIT 1
    `;
    expect(row.t).toBe('array');
    expect(Number(row.n)).toBe(3);
    expect(row.first).toBe('test/a');
  });

  // files.ts:254 (uploadRaw's cloud-upload branch) was changed from
  // `${JSON.stringify({...})}::jsonb` to `${sql.json({...})}` in v0.12.1.
  // The function reads config and touches cloud storage, so we exercise the
  // driver-level pattern directly against the same table/column.
  test('files.metadata writes as object via sql.json(), not double-encoded string', async () => {
    const sql = getConn();
    const payload = { type: 'pdf', upload_method: 'TUS resumable' };
    await sql`
      INSERT INTO files (page_slug, filename, storage_path, mime_type, size_bytes, content_hash, metadata)
      VALUES (NULL, 'jsonb-check.bin', 'unsorted/jsonb-check.bin', 'application/octet-stream', 1, 'sha256:deadbeef', ${sql.json(payload)})
      ON CONFLICT (storage_path) DO UPDATE SET metadata = EXCLUDED.metadata
    `;
    const [row] = await sql`
      SELECT jsonb_typeof(metadata) AS t,
             metadata ->> 'type' AS type,
             metadata ->> 'upload_method' AS method
      FROM files WHERE storage_path = 'unsorted/jsonb-check.bin'
    `;
    expect(row.t).toBe('object');
    expect(row.type).toBe('pdf');
    expect(row.method).toBe('TUS resumable');
  });

  // Source-level tripwire: if anyone re-introduces the old `${JSON.stringify(x)}::jsonb`
  // pattern for the fixed sites, fail loudly. Greps actual source files per the
  // files-test-reimplements-production tripwire (CLAUDE.md).
  test('no ${JSON.stringify(x)}::jsonb pattern remains in fixed sites', async () => {
    const files = [
      '../../src/core/postgres-engine.ts',
      '../../src/commands/files.ts',
    ];
    const bad = /\$\{[^}]*JSON\.stringify\([^}]*\)[^}]*\}::jsonb/;
    for (const rel of files) {
      const source = await Bun.file(new URL(rel, import.meta.url)).text();
      expect(source.match(bad)?.[0] ?? null).toBeNull();
    }
  });

  // v0.18.2.fork.1: sources.ts triple INSERT/UPDATE missed in v0.12.1 wave.
  // Different fix variant: the unsafe()-API path uses a `$N::jsonb` cast on a
  // JSON.stringify'd param (not template-tag `${..}::jsonb`). postgres-js's
  // unsafe() detects the cast and re-stringifies the param, landing as a
  // JSON STRING scalar (jsonb_typeof = 'string'). v26 migration's jsonb_set
  // then throws SQLSTATE 22023 "cannot set path in scalar".
  // Fix: `($N::text)::jsonb` double cast forces postgres-js to send param
  // verbatim as TEXT, then SQL re-parses to object at column boundary.
  // Verified empirically on D-LXC fixture 189 (2026-05-07).
  test('sources INSERT writes config as object, not double-encoded string', async () => {
    const sql = getConn();
    const { runAdd } = await import('../../src/commands/sources.ts') as any;
    const engine = getEngine();
    const testId = 'jsonb-sources-add-' + Math.floor(Math.random() * 1e6);
    try {
      await runAdd(engine, [testId, '--federated', '--slug-prefix', 'test-prefix/']);
      const [row] = await sql`
        SELECT jsonb_typeof(config) AS t,
               config -> 'federated' AS federated,
               config -> 'slug_prefix_rules' AS rules
        FROM sources WHERE id = ${testId}
      `;
      expect(row.t).toBe('object');
      expect(row.federated).toBe(true);
      expect(row.rules).toEqual(['test-prefix/']);
    } finally {
      // Clean up even when an assertion fails, so the test row never leaks.
      await sql`DELETE FROM sources WHERE id = ${testId}`;
    }
  });

  test('sources UPDATE (federate/unfederate) preserves config as object', async () => {
    const sql = getConn();
    // One dynamic import is enough for both commands.
    const { runAdd, runFederate } = await import('../../src/commands/sources.ts') as any;
    const engine = getEngine();
    const testId = 'jsonb-sources-update-' + Math.floor(Math.random() * 1e6);
    try {
      await runAdd(engine, [testId, '--federated']);
      // Toggle to isolated; this exercises the runFederate UPDATE path.
      // The guard means the assertions are skipped if runFederate is not exported.
      if (runFederate) {
        await runFederate(engine, [testId], false);
        const [row] = await sql`
          SELECT jsonb_typeof(config) AS t, config -> 'federated' AS federated
          FROM sources WHERE id = ${testId}
        `;
        expect(row.t).toBe('object');
        expect(row.federated).toBe(false);
      }
    } finally {
      // Clean up even when an assertion fails, so the test row never leaks.
      await sql`DELETE FROM sources WHERE id = ${testId}`;
    }
  });

  test('no $N::jsonb pattern (without ::text intermediate) remains in sources.ts', async () => {
    const source = await Bun.file(new URL('../../src/commands/sources.ts', import.meta.url)).text();
    // Bad pattern: a param cast straight to jsonb, i.e. `$<digits>::jsonb`.
    // The fix `($N::text)::jsonb` is allowed: replace it with a placeholder first,
    // then check that no direct cast is left.
    const safePattern = /\(\$\d+::text\)::jsonb/g;
    const stripped = source.replace(safePattern, '<SAFE_DOUBLE_CAST>');
    const bad = /\$\d+::jsonb/;
    expect(stripped.match(bad)?.[0] ?? null).toBeNull();
  });
});