mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-07-02 12:51:44 +00:00
3feb97f16f
# fix(ingestor): write resolved_path on new observations (full restore — closes #1547 + #1560) Fixes #1547. Closes #1560. ## Root cause PR #1289 (the "ingestor owns the neighbor graph; server is read-only" refactor, ~2026-05-21) moved the neighbor graph + schema writes to the ingestor, and as a side-effect removed the server-side writer that populated `observations.resolved_path` AND the context-aware `pm.resolveWithContext` that disambiguated 1-byte prefix collisions. Result: every observation inserted after the deploy has `resolved_path = NULL` (3.1M/6.3M NULL on staging; 100% NULL on fresh deploys; symptom on Cascadia: hops fail to resolve because the small-mesh client-side fallback breaks on prefix collisions). ## Full restore This PR resolves both single-byte and multi-byte prefix paths. Single-byte disambiguation uses NeighborGraph adjacency and ADVERT `from_pubkey` anchoring, ported from pre-#1289 `pm.resolveWithContext` logic (last good at cmd/server/store.go @ commit 450236d5) and the #1144 / #1352 fixes. New file `cmd/ingestor/path_resolver.go`: - `NeighborGraph` + `neighborGraphHolder` — in-memory adjacency snapshot, atomic-published. - `loadNeighborGraph(db)` — one-shot SELECT from `neighbor_edges`. - `resolveHopWithContext(hop, anchor, graph, idx, exclude) *string` — single-hop, tier-1 disambiguator. - `resolvePathWithContext(hops, fromPubkey, graph, idx) []*string` — walks the path, anchoring hop 0 on `from_pubkey` (ADVERTs) and each subsequent hop on the previous resolved hop, excluding already-resolved pubkeys. - `Store.RefreshNeighborGraph()` — called on warm-up and every 60s tick in the neighbor-edges builder alongside `RefreshPrefixIndex`. Existing file `cmd/ingestor/resolved_path.go` (PR #1547 base) is untouched: `resolvePath` + `marshalResolvedPath` + the all-nil → empty-string clobber-guard contract are preserved verbatim. `cmd/ingestor/db.go` — `InsertTransmission` now calls `resolvePathWithContext` instead of the naive `resolvePath`. ## Algorithm (per hop) 1. Look up candidate pubkeys by prefix-match (existing `prefixIndex`). 2. `len==0 → nil`; `len==1 → that pubkey`. 3. `len>1` → filter by `NeighborGraph` adjacency to the anchor. Anchor is `from_pubkey` for hop 0 on ADVERTs, the previous resolved hop otherwise. Exactly 1 surviving candidate → use it; else nil. 4. Previously resolved hops (and the originator) are excluded from downstream candidate pools — a packet does not revisit a node. Tier-2/3/4 from pre-#1289 (geo proximity, GPS preference, observation-count fallback) are intentionally NOT ported — those were noisy in practice and belong in a separate enhancement, not in this regression restore. ## Out of scope - The ~3.1M existing NULL rows from the regression window. Filed as a follow-up backfill task — too risky to bundle here (touches a 6M-row table). - The dead-flag bug #1546 — separate concern. ## TDD red → green - Red commit `80b0f476` — adds five new context-resolver tests; stub `resolvePathWithContext` falls back to naive `resolvePath`. CI run 26946935615 → **failure** with assertion errors on the three collision tests (`TestResolveHopWithContext_OneByteCollision_AdjacencyResolves`, `TestResolvePathWithContext_TwoHopChainAnchoredOnFromNode`, `TestResolvePathWithContext_AdvertAnchoring`); the two regression tests (multi-byte still works + all-nil contract) stayed green. - Green commit `7b4950ce` — real algorithm + InsertTransmission wiring + RefreshNeighborGraph in the builder tick. All five new tests pass; original four `resolved_path` tests stay green. ## Verification - `go test -race ./cmd/ingestor/...` for the 11 affected tests — pass. - `bash ~/.openclaw/skills/pr-preflight/scripts/run-all.sh origin/master` — exit 0 (all gates clean). - PII grep on body + diff: clean. Tested with: existing `TestInsertTransmissionWritesResolvedPath` + `TestInsertTransmissionDoesNotClobberResolvedPathOnAllNil` (PR #1547 base) plus the new collision-resolution suite: - `TestResolveHopWithContext_OneByteCollision_AdjacencyResolves` — 3-of-5 nodes share `0x5c`, chain A↔B↔C↔D↔E; anchored on A, hop `5c` → B. - `TestResolvePathWithContext_TwoHopChainAnchoredOnFromNode` — path `[5c, 5c]` from_node A → `[B, C]`. - `TestResolveHopWithContext_NoAdjacencyContext_ReturnsNil` — 3 ambiguous candidates, no anchor / non-adjacent anchor → nil. - `TestResolvePathWithContext_AdvertAnchoring` — ADVERT, `from_pubkey=A`, path `[5c]` → only-adjacent neighbor B. - `TestResolvePathWithContext_RegressionMultiByteStillWorks` — unique-prefix path with no graph context still resolves. - `TestResolvePathWithContext_AllNilContractPreserved` — unresolvable path → `marshalResolvedPath==""` (clobber-guard from PR #1548 untouched). ## Browser-validated N/A — backend-only change. Frontend already handles populated `resolved_path` via `getResolvedPath` in `cmd/server/db.go` and `public/packets.js`. ## Round-1 fixes addressed - **MUST-FIX #1 (data-loss clobber on all-nil resolution):** when every hop fails to resolve, `marshalResolvedPath` returns `""` instead of `"[null,null,...]"`, so `nilIfEmpty` → SQL NULL and the `COALESCE(excluded.resolved_path, resolved_path)` UPSERT preserves any previously stored good value on re-ingest. Regression test asserts: insert a transmission, observe `resolved_path` populated, wipe the prefix index, re-ingest the same packet, assert the existing `resolved_path` is unchanged. --------- Co-authored-by: corescope-bot <bot@corescope> Co-authored-by: openclaw-bot <bot@openclaw> Co-authored-by: openclaw-bot <bot@openclaw.local>