mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-05-25 00:26:26 +00:00
fb744d895f
Fixes #1143. ## Summary Replaces the structurally unsound `decoded_json LIKE '%pubkey%'` (and `OR LIKE '%name%'`) attribution path with an exact-match lookup on a dedicated, indexed `transmissions.from_pubkey` column. This closes both holes documented in #1143: - **Hole 1** — same-name false positives via `OR LIKE '%name%'` - **Hole 2a** — adversarial spoofing: a malicious node names itself with another node's pubkey and gets attributed to the victim - **Hole 2b** — accidental false positive when any free-text field (path elements, channel names, message bodies) contains a 64-char hex substring matching a real pubkey - **Perf** — query now uses an index instead of a full-table scan against `LIKE '%substring%'` ## TDD Two-commit history shows red-then-green: | Commit | Status | Purpose | |---|---|---| | `7f0f08e` | RED — tests assertion-fail on master behaviour | Adversarial fixtures + spec | | `59327db` | GREEN — schema + ingestor + server + migration | Implementation | The red commit's test schema includes the new column so the file compiles, but the production code still uses LIKE — the assertions fail because the malicious / same-name / free-text rows are returned. The green commit changes the query plus adds the migration/ingest path. ## Changes ### Schema - new column `transmissions.from_pubkey TEXT` - new index `idx_transmissions_from_pubkey` ### Ingestor (`cmd/ingestor/`) - `PacketData.FromPubkey` populated from decoded ADVERT `pubKey` at write time. Cheap — already parsing `decoded_json`. Non-ADVERTs stay NULL. - `stmtInsertTransmission` writes the column. - Migration `from_pubkey_v1` ALTERs legacy DBs to add the column + index. - Bonus: rewrote the recipe in the gated one-shot `advert_count_unique_v1` migration to use `from_pubkey` (already marked done on existing DBs; kept correct for fresh installs). ### Server (`cmd/server/`) - `ensureFromPubkeyColumn` mirrors the ingestor migration so the server can boot against a DB the ingestor has never touched (e2e fixture, fresh installs). - `backfillFromPubkeyAsync` runs **after** HTTP starts. Scans `WHERE from_pubkey IS NULL AND payload_type = 4` in 5000-row chunks with a 100ms yield between chunks. Cannot block boot even on prod-sized DBs (100K+ transmissions). Queries handle NULL gracefully (return empty for that pubkey, same as today's unknown-pubkey path). - All in-scope LIKE call sites switched to exact match: | Site | Before | After | |---|---|---| | `buildPacketWhere` (was db.go:582) | `decoded_json LIKE '%pubkey%'` | `from_pubkey = ?` | | `buildTransmissionWhere` (was db.go:626) | `t.decoded_json LIKE '%pubkey%'` | `t.from_pubkey = ?` | | `GetRecentTransmissionsForNode` (was db.go:910) | `LIKE '%pubkey%' OR LIKE '%name%'` | `t.from_pubkey = ?` | | `QueryMultiNodePackets` (was db.go:1785) | `decoded_json LIKE '%pubkey%' OR ...` | `t.from_pubkey IN (?, ?, ...)` | | `advert_count_unique_v1` (was ingestor/db.go:257) | `decoded_json LIKE '%' \|\| nodes.public_key \|\| '%'` | `t.from_pubkey = nodes.public_key` | `GetRecentTransmissionsForNode` signature simplifies: the `name` parameter is gone (it was only ever used for the legacy `OR LIKE '%name%'` fallback). Sole caller in `routes.go:1243` updated. ### Tests - `cmd/server/from_pubkey_attribution_test.go` — adversarial fixtures + Hole 1/2a/2b/QueryMultiNodePackets exact-match assertions, EXPLAIN QUERY PLAN index check, migration backfill correctness. - `cmd/ingestor/from_pubkey_test.go` — write-time correctness (BuildPacketData populates FromPubkey for ADVERT only; InsertTransmission persists it; non-ADVERTs stay NULL). - Existing test schemas (server v2, server v3, coverage) get the new column **plus a SQLite trigger** that auto-populates `from_pubkey` from `decoded_json` on ADVERT inserts. This means existing fixtures (which only seed `decoded_json`) keep attributing correctly without per-test edits. - `seedTestData`'s ADVERTs explicitly set `from_pubkey`. ## Performance — index is used ``` $ EXPLAIN QUERY PLAN SELECT id FROM transmissions WHERE from_pubkey = ? SEARCH transmissions USING INDEX idx_transmissions_from_pubkey (from_pubkey=?) ``` Asserted in `TestFromPubkeyIndexUsed`. ## Migration approach - **Sync at boot**: `ALTER TABLE transmissions ADD COLUMN from_pubkey TEXT` is a metadata-only operation in SQLite — microseconds regardless of table size. `CREATE INDEX IF NOT EXISTS idx_transmissions_from_pubkey` is **not** metadata-only: it scans the table once. Empirically a few hundred ms on a 100K-row table; expect a few seconds on a 10M-row table (one-time cost, blocking boot during that window). Subsequent boots no-op via `IF NOT EXISTS`. If this boot delay becomes an operational concern at prod scale we can defer the `CREATE INDEX` to a goroutine — for now a few-second one-time delay is acceptable. - **Async**: row-level backfill of legacy NULL ADVERTs (chunked 5000 / 100ms yield). On a 100K-ADVERT prod DB, this completes in seconds in the background; HTTP is fully available throughout. - **Safety**: queries handle NULL gracefully — a node whose ADVERTs haven't backfilled yet returns empty, identical to today's behaviour for unknown pubkeys. No half-state regression. ## Out of scope (intentionally) The free-text `LIKE` paths the issue explicitly leaves alone (e.g. user-typed packet search) are untouched. Only the pubkey-attribution sites get the column treatment. ## Cycle-3 review fixes | Finding | Status | Commit | |---|---|---| | **M1c** — async-contract test was tautological (test's own `go`, not production's) | Fixed | `23ace71` (red) → `a05b50c` (green) | | **m1c** — package-global atomic resets unsafe under `t.Parallel()` | Fixed (`// DO NOT t.Parallel` comment + `Reset()` helper) | rolled into `23ace71` / `241ec69` | | **m2c** — `/api/healthz` read 3 atomics non-atomically (torn snapshot) | Fixed (single RWMutex-guarded snapshot + race test) | `241ec69` | | **n3c.m1** — vestigial OR-scaffolding in `QueryMultiNodePackets` | Fixed (cleanup) | `5a53ceb` | | **n3c.m2** — verify PR body language about `ALTER` vs `CREATE INDEX` | Verified accurate (already corrected in cycle 2) | (no change) | | **n3c.m3** — `json.Unmarshal` per row in backfill → could use SQL `json_extract` | **Deferred as known followup** — pure perf optimization (current per-row Unmarshal is correct, just slower); SQL rewrite would unwind the chunked-yield architecture and is non-trivial. Acceptable for one-time backfill at boot on legacy DBs. | ### M1c implementation detail `startFromPubkeyBackfill(dbPath, chunkSize, yieldDuration)` is now the single production entry point used by `main.go`. It internally does `go backfillFromPubkeyAsync(...)`. The test calls `startFromPubkeyBackfill` (no `go` prefix) and asserts the dispatch returns within 50ms — so if anyone removes the `go` keyword inside the wrapper, the test fails. **Manually verified**: removing the `go` keyword causes `TestBackfillFromPubkey_DoesNotBlockBoot` to fail with "backfill dispatch took ~1s (>50ms): not async — would block boot." ### m2c implementation detail `fromPubkeyBackfillTotal/Processed/Done` are now plain `int64`/`bool` package globals guarded by a single `sync.RWMutex`. `fromPubkeyBackfillSnapshot()` returns all three under one RLock. `TestHealthzFromPubkeyBackfillConsistentSnapshot` races a writer (lock-step total/processed updates with periodic done flips) against 8 readers hammering `/api/healthz`, asserting `processed<=total` and `(done => processed==total)` on every response. Verified the test catches torn reads (manually injected a 3-RLock implementation; test failed within milliseconds with "processed>total" and "done=true but processed!=total" errors). --------- Co-authored-by: openclaw-bot <bot@openclaw.local> Co-authored-by: openclaw-bot <bot@openclaw.dev>
435 lines
14 KiB
Go
435 lines
14 KiB
Go
package main
|
|
|
|
// Tests for issue #1143: pubkey attribution must use exact-match on a
|
|
// dedicated `from_pubkey` column, not `decoded_json LIKE '%pubkey%'`.
|
|
//
|
|
// These tests demonstrate the structural holes documented in #1143:
|
|
// Hole 1: name-LIKE fallback surfaces same-name nodes
|
|
// Hole 2a: an attacker can name themselves with someone else's pubkey
|
|
// and get their transmissions attributed to the victim
|
|
// Hole 2b: any 64-char hex substring inside decoded_json (path elements,
|
|
// channel names, message bodies) produces false positives
|
|
|
|
import (
|
|
"database/sql"
|
|
"fmt"
|
|
"strings"
|
|
"testing"
|
|
"time"
|
|
|
|
_ "modernc.org/sqlite"
|
|
)
|
|
|
|
const (
|
|
pkVictim = "f7181c468dfe7c55aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
|
|
pkAttacker = "deadbeefdeadbeefcccccccccccccccccccccccccccccccccccccccccccccccc"
|
|
pkOther = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
|
|
)
|
|
|
|
// seedAttribution inserts the standard adversarial fixture used by the
|
|
// issue #1143 tests. It returns the victim pubkey for convenience.
|
|
func seedAttribution(t *testing.T, db *DB) string {
|
|
t.Helper()
|
|
now := time.Now().UTC().Format(time.RFC3339)
|
|
|
|
// (1) Legitimate ADVERT from the victim.
|
|
mustExec(t, db, `INSERT INTO transmissions
|
|
(raw_hex, hash, first_seen, route_type, payload_type, decoded_json, from_pubkey)
|
|
VALUES ('AA','h_victim_advert',?,1,4,
|
|
'{"type":"ADVERT","pubKey":"`+pkVictim+`","name":"VictimNode"}',
|
|
?)`, now, pkVictim)
|
|
|
|
// (2) Hole 1: a different node sharing the *display name* "VictimNode".
|
|
mustExec(t, db, `INSERT INTO transmissions
|
|
(raw_hex, hash, first_seen, route_type, payload_type, decoded_json, from_pubkey)
|
|
VALUES ('BB','h_namespoof_advert',?,1,4,
|
|
'{"type":"ADVERT","pubKey":"`+pkOther+`","name":"VictimNode"}',
|
|
?)`, now, pkOther)
|
|
|
|
// (3) Hole 2a: malicious node whose *name* is the victim's pubkey.
|
|
// decoded_json contains pkVictim as a substring (in the name field),
|
|
// but the actual originator is pkAttacker.
|
|
mustExec(t, db, `INSERT INTO transmissions
|
|
(raw_hex, hash, first_seen, route_type, payload_type, decoded_json, from_pubkey)
|
|
VALUES ('CC','h_spoof_advert',?,1,4,
|
|
'{"type":"ADVERT","pubKey":"`+pkAttacker+`","name":"`+pkVictim+`"}',
|
|
?)`, now, pkAttacker)
|
|
|
|
// (4) Hole 2b: free-text packet (e.g. channel message) whose body
|
|
// coincidentally contains the victim's pubkey as a substring.
|
|
// Real originator is pkAttacker; from_pubkey reflects that.
|
|
mustExec(t, db, `INSERT INTO transmissions
|
|
(raw_hex, hash, first_seen, route_type, payload_type, decoded_json, from_pubkey)
|
|
VALUES ('DD','h_freetext_msg',?,1,5,
|
|
'{"type":"GRP_TXT","text":"hello `+pkVictim+` how are you"}',
|
|
?)`, now, pkAttacker)
|
|
|
|
return pkVictim
|
|
}
|
|
|
|
func mustExec(t *testing.T, db *DB, q string, args ...interface{}) {
|
|
t.Helper()
|
|
if _, err := db.conn.Exec(q, args...); err != nil {
|
|
t.Fatalf("exec failed: %v\nquery: %s", err, q)
|
|
}
|
|
}
|
|
|
|
func hashesOf(rows []map[string]interface{}) []string {
|
|
out := make([]string, 0, len(rows))
|
|
for _, r := range rows {
|
|
if h, ok := r["hash"].(string); ok {
|
|
out = append(out, h)
|
|
}
|
|
}
|
|
return out
|
|
}
|
|
|
|
func TestRecentTransmissions_Hole1_SameNameDifferentPubkey(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
victim := seedAttribution(t, db)
|
|
|
|
got, err := db.GetRecentTransmissionsForNode(victim, 20)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
|
|
hashes := hashesOf(got)
|
|
for _, h := range hashes {
|
|
if h == "h_namespoof_advert" {
|
|
t.Fatalf("Hole 1: same-name node was attributed to the victim. got hashes=%v", hashes)
|
|
}
|
|
}
|
|
}
|
|
|
|
func TestRecentTransmissions_Hole2a_PubkeyAsNameSpoof(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
victim := seedAttribution(t, db)
|
|
|
|
got, err := db.GetRecentTransmissionsForNode(victim, 20)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
|
|
hashes := hashesOf(got)
|
|
for _, h := range hashes {
|
|
if h == "h_spoof_advert" {
|
|
t.Fatalf("Hole 2a: attacker who named themselves with victim's pubkey "+
|
|
"was attributed to the victim. got hashes=%v", hashes)
|
|
}
|
|
}
|
|
}
|
|
|
|
func TestRecentTransmissions_Hole2b_FreeTextHexFalsePositive(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
victim := seedAttribution(t, db)
|
|
|
|
got, err := db.GetRecentTransmissionsForNode(victim, 20)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
|
|
hashes := hashesOf(got)
|
|
for _, h := range hashes {
|
|
if h == "h_freetext_msg" {
|
|
t.Fatalf("Hole 2b: free-text containing the victim's pubkey as a "+
|
|
"substring produced a false positive. got hashes=%v", hashes)
|
|
}
|
|
}
|
|
}
|
|
|
|
func TestRecentTransmissions_LegitimateAdvertReturned(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
victim := seedAttribution(t, db)
|
|
|
|
got, err := db.GetRecentTransmissionsForNode(victim, 20)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
|
|
hashes := hashesOf(got)
|
|
found := false
|
|
for _, h := range hashes {
|
|
if h == "h_victim_advert" {
|
|
found = true
|
|
break
|
|
}
|
|
}
|
|
if !found {
|
|
t.Fatalf("expected legitimate victim advert (h_victim_advert) in result, got %v", hashes)
|
|
}
|
|
}
|
|
|
|
// --- Multi-pubkey OR query (#1143 — db.go:1785) ---
|
|
|
|
func TestQueryMultiNodePackets_ExactMatchOnly(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
seedAttribution(t, db)
|
|
|
|
// Query the victim's pubkey via the multi-node API. The malicious
|
|
// "name = victim pubkey" row and the free-text row must NOT show up.
|
|
res, err := db.QueryMultiNodePackets([]string{pkVictim}, 50, 0, "DESC", "", "")
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
hashes := hashesOf(res.Packets)
|
|
for _, bad := range []string{"h_spoof_advert", "h_freetext_msg", "h_namespoof_advert"} {
|
|
for _, h := range hashes {
|
|
if h == bad {
|
|
t.Fatalf("QueryMultiNodePackets returned spurious match %q (pubkey %s as substring); hashes=%v",
|
|
bad, pkVictim, hashes)
|
|
}
|
|
}
|
|
}
|
|
// The legitimate one must still be present.
|
|
if !contains(hashes, "h_victim_advert") {
|
|
t.Fatalf("expected h_victim_advert in QueryMultiNodePackets result, got %v", hashes)
|
|
}
|
|
}
|
|
|
|
func contains(haystack []string, needle string) bool {
|
|
for _, s := range haystack {
|
|
if s == needle {
|
|
return true
|
|
}
|
|
}
|
|
return false
|
|
}
|
|
|
|
// --- Index sanity check (#1143 perf): verify EXPLAIN QUERY PLAN uses the
|
|
// new index, not a SCAN. ---
|
|
|
|
func TestFromPubkeyIndexUsed(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
mustExec(t, db, `CREATE INDEX IF NOT EXISTS idx_transmissions_from_pubkey ON transmissions(from_pubkey)`)
|
|
|
|
rows, err := db.conn.Query(
|
|
`EXPLAIN QUERY PLAN SELECT id FROM transmissions WHERE from_pubkey = ?`,
|
|
pkVictim)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
defer rows.Close()
|
|
plan := ""
|
|
for rows.Next() {
|
|
var id, parent, notused int
|
|
var detail string
|
|
if err := rows.Scan(&id, &parent, ¬used, &detail); err == nil {
|
|
plan += detail + "\n"
|
|
}
|
|
}
|
|
if !strings.Contains(plan, "idx_transmissions_from_pubkey") {
|
|
t.Fatalf("expected EXPLAIN QUERY PLAN to use idx_transmissions_from_pubkey, got:\n%s", plan)
|
|
}
|
|
}
|
|
|
|
// TestFromPubkeyIndexUsedForInClause verifies the index is used for the
|
|
// IN (?, ?, ...) query path used by QueryMultiNodePackets (db.go ~1787).
|
|
// Coverage extension — the equality path is covered above; this asserts
|
|
// the multi-node path doesn't silently regress to a full scan when the
|
|
// planner can't use the index for set membership.
|
|
func TestFromPubkeyIndexUsedForInClause(t *testing.T) {
|
|
db := setupTestDB(t)
|
|
defer db.Close()
|
|
mustExec(t, db, `CREATE INDEX IF NOT EXISTS idx_transmissions_from_pubkey ON transmissions(from_pubkey)`)
|
|
|
|
rows, err := db.conn.Query(
|
|
`EXPLAIN QUERY PLAN SELECT id FROM transmissions WHERE from_pubkey IN (?, ?)`,
|
|
pkVictim, pkOther)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
defer rows.Close()
|
|
plan := ""
|
|
for rows.Next() {
|
|
var id, parent, notused int
|
|
var detail string
|
|
if err := rows.Scan(&id, &parent, ¬used, &detail); err == nil {
|
|
plan += detail + "\n"
|
|
}
|
|
}
|
|
if !strings.Contains(plan, "idx_transmissions_from_pubkey") {
|
|
t.Fatalf("expected EXPLAIN QUERY PLAN for IN(...) to use idx_transmissions_from_pubkey, got:\n%s", plan)
|
|
}
|
|
}
|
|
|
|
// --- Migration / backfill ---
|
|
|
|
func TestBackfillFromPubkey_AdvertRowsPopulated(t *testing.T) {
|
|
dir := t.TempDir()
|
|
dbPath := dir + "/test.db"
|
|
|
|
// Create a legacy-style DB: transmissions table WITHOUT from_pubkey,
|
|
// then run ensureFromPubkeyColumn to ALTER it in.
|
|
rw, err := sql.Open("sqlite", dbPath)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
if _, err := rw.Exec(`CREATE TABLE transmissions (
|
|
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
raw_hex TEXT, hash TEXT UNIQUE, first_seen TEXT,
|
|
route_type INTEGER, payload_type INTEGER, payload_version INTEGER,
|
|
decoded_json TEXT, created_at TEXT
|
|
)`); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
// Two ADVERTs (different pubkeys) and a non-ADVERT.
|
|
if _, err := rw.Exec(`INSERT INTO transmissions (raw_hex, hash, first_seen, payload_type, decoded_json) VALUES
|
|
('AA','m1','2026-01-01T00:00:00Z',4,'{"type":"ADVERT","pubKey":"`+pkVictim+`","name":"V"}'),
|
|
('BB','m2','2026-01-01T00:00:00Z',4,'{"type":"ADVERT","pubKey":"`+pkOther+`","name":"O"}'),
|
|
('CC','m3','2026-01-01T00:00:00Z',5,'{"type":"GRP_TXT","text":"hi"}')`); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
rw.Close()
|
|
|
|
if err := ensureFromPubkeyColumn(dbPath); err != nil {
|
|
t.Fatalf("ensureFromPubkeyColumn: %v", err)
|
|
}
|
|
|
|
// Run synchronously by calling the function directly.
|
|
backfillFromPubkeyAsync(dbPath, 100, 0)
|
|
|
|
// Verify backfill populated the ADVERT rows.
|
|
rw2, err := sql.Open("sqlite", dbPath)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
defer rw2.Close()
|
|
rows, err := rw2.Query("SELECT hash, from_pubkey FROM transmissions ORDER BY hash")
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
defer rows.Close()
|
|
got := map[string]string{}
|
|
for rows.Next() {
|
|
var h string
|
|
var pk sql.NullString
|
|
if err := rows.Scan(&h, &pk); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
got[h] = pk.String
|
|
}
|
|
if got["m1"] != pkVictim {
|
|
t.Errorf("m1 from_pubkey = %q, want %q", got["m1"], pkVictim)
|
|
}
|
|
if got["m2"] != pkOther {
|
|
t.Errorf("m2 from_pubkey = %q, want %q", got["m2"], pkOther)
|
|
}
|
|
// Non-ADVERT row was not in the backfill scope; from_pubkey stays NULL.
|
|
if got["m3"] != "" {
|
|
t.Errorf("m3 from_pubkey = %q, want empty (NULL)", got["m3"])
|
|
}
|
|
}
|
|
|
|
// TestBackfillFromPubkey_DoesNotBlockBoot exercises the async contract:
|
|
// main.go (cmd/server/main.go) calls startFromPubkeyBackfill, which is the
|
|
// SAME entry point used at production startup. The wrapper must dispatch
|
|
// the backfill in a goroutine; if anyone removes the `go` keyword inside
|
|
// startFromPubkeyBackfill, this test fails because the call no longer
|
|
// returns within the 50ms boot dispatch budget. The test does NOT use `go`
|
|
// itself — that would test only the test's own scheduler, not the
|
|
// production code path (cycle-3 M1c).
|
|
//
|
|
// DO NOT t.Parallel — uses package-global atomics
|
|
// (fromPubkeyBackfillTotal/Processed/Done). Concurrent tests would clobber
|
|
// the resets (cycle-3 m1c).
|
|
func TestBackfillFromPubkey_DoesNotBlockBoot(t *testing.T) {
|
|
dir := t.TempDir()
|
|
dbPath := dir + "/async_boot.db"
|
|
|
|
rw, err := sql.Open("sqlite", dbPath)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
if _, err := rw.Exec(`CREATE TABLE transmissions (
|
|
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
raw_hex TEXT, hash TEXT UNIQUE, first_seen TEXT,
|
|
route_type INTEGER, payload_type INTEGER, payload_version INTEGER,
|
|
decoded_json TEXT, created_at TEXT
|
|
)`); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
// Insert N=1000 legacy ADVERT rows. With chunkSize=100 + yield=100ms
|
|
// between chunks, sync would be ~900ms; we assert dispatch is <50ms.
|
|
tx, err := rw.Begin()
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
stmt, err := tx.Prepare(`INSERT INTO transmissions
|
|
(raw_hex, hash, first_seen, payload_type, decoded_json) VALUES (?, ?, ?, 4, ?)`)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
const N = 1000
|
|
for i := 0; i < N; i++ {
|
|
hash := fmt.Sprintf("h_async_boot_%d", i)
|
|
dj := fmt.Sprintf(`{"type":"ADVERT","pubKey":"%s","name":"N%d"}`, pkVictim, i)
|
|
if _, err := stmt.Exec("AA", hash, "2026-01-01T00:00:00Z", dj); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
}
|
|
stmt.Close()
|
|
if err := tx.Commit(); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
rw.Close()
|
|
|
|
if err := ensureFromPubkeyColumn(dbPath); err != nil {
|
|
t.Fatalf("ensureFromPubkeyColumn: %v", err)
|
|
}
|
|
|
|
// Reset all backfill state — other tests may have set it.
|
|
fromPubkeyBackfillReset()
|
|
defer fromPubkeyBackfillReset()
|
|
|
|
// Dispatch via the production wrapper. startFromPubkeyBackfill is the
|
|
// same entry point main.go calls at boot; it must launch the backfill
|
|
// in a goroutine internally. We deliberately do NOT prefix `go` here —
|
|
// if the wrapper is ever made synchronous, the dispatch budget below
|
|
// fires first.
|
|
t0 := time.Now()
|
|
startFromPubkeyBackfill(dbPath, 100, 100*time.Millisecond)
|
|
dispatchElapsed := time.Since(t0)
|
|
|
|
// (a) Boot-time dispatch budget: must return ~immediately.
|
|
if dispatchElapsed > 50*time.Millisecond {
|
|
t.Fatalf("backfill dispatch took %v (>50ms): not async — would block boot", dispatchElapsed)
|
|
}
|
|
|
|
// (b) Eventual completion via the fromPubkeyBackfill snapshot.
|
|
deadline := time.Now().Add(30 * time.Second)
|
|
for time.Now().Before(deadline) {
|
|
if _, _, done := fromPubkeyBackfillSnapshot(); done {
|
|
break
|
|
}
|
|
time.Sleep(50 * time.Millisecond)
|
|
}
|
|
if _, _, done := fromPubkeyBackfillSnapshot(); !done {
|
|
t.Fatalf("backfill never flipped Done within 30s; dispatched=%v", dispatchElapsed)
|
|
}
|
|
|
|
// (c) Backfill actually populated rows.
|
|
rw2, err := sql.Open("sqlite", dbPath)
|
|
if err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
defer rw2.Close()
|
|
var nullCount int
|
|
if err := rw2.QueryRow(
|
|
`SELECT COUNT(*) FROM transmissions WHERE payload_type = 4 AND from_pubkey IS NULL`,
|
|
).Scan(&nullCount); err != nil {
|
|
t.Fatal(err)
|
|
}
|
|
if nullCount > 0 {
|
|
t.Errorf("backfill left %d ADVERT rows with NULL from_pubkey", nullCount)
|
|
}
|
|
if _, processed, _ := fromPubkeyBackfillSnapshot(); processed != int64(N) {
|
|
t.Errorf("fromPubkeyBackfillProcessed = %d, want %d", processed, N)
|
|
}
|
|
}
|