Files
meshcore-analyzer/cmd
Kpa-clawbot 3feb97f16f fix(ingestor): write resolved_path on new observations (regression from #1289) (#1548)
# fix(ingestor): write resolved_path on new observations (full restore —
closes #1547 + #1560)

Fixes #1547. Closes #1560.

## Root cause
PR #1289 (the "ingestor owns the neighbor graph; server is read-only"
refactor, ~2026-05-21) moved the neighbor graph + schema writes to the
ingestor, and as a side-effect removed the server-side writer that
populated `observations.resolved_path` AND the context-aware
`pm.resolveWithContext` that disambiguated 1-byte prefix collisions.
Result: every observation inserted after the deploy has `resolved_path =
NULL` (3.1M/6.3M NULL on staging; 100% NULL on fresh deploys; symptom on
Cascadia: hops fail to resolve because the small-mesh client-side
fallback breaks on prefix collisions).

## Full restore
This PR resolves both single-byte and multi-byte prefix paths.
Single-byte disambiguation uses NeighborGraph adjacency and ADVERT
`from_pubkey` anchoring, ported from pre-#1289 `pm.resolveWithContext`
logic (last good at cmd/server/store.go @ commit 450236d5) and the #1144
/ #1352 fixes.

New file `cmd/ingestor/path_resolver.go`:
- `NeighborGraph` + `neighborGraphHolder` — in-memory adjacency
snapshot, atomic-published.
- `loadNeighborGraph(db)` — one-shot SELECT from `neighbor_edges`.
- `resolveHopWithContext(hop, anchor, graph, idx, exclude) *string` —
single-hop, tier-1 disambiguator.
- `resolvePathWithContext(hops, fromPubkey, graph, idx) []*string` —
walks the path, anchoring hop 0 on `from_pubkey` (ADVERTs) and each
subsequent hop on the previous resolved hop, excluding already-resolved
pubkeys.
- `Store.RefreshNeighborGraph()` — called on warm-up and every 60s tick
in the neighbor-edges builder alongside `RefreshPrefixIndex`.

Existing file `cmd/ingestor/resolved_path.go` (PR #1547 base) is
untouched: `resolvePath` + `marshalResolvedPath` + the all-nil →
empty-string clobber-guard contract are preserved verbatim.

`cmd/ingestor/db.go` — `InsertTransmission` now calls
`resolvePathWithContext` instead of the naive `resolvePath`.

## Algorithm (per hop)
1. Look up candidate pubkeys by prefix-match (existing `prefixIndex`).
2. `len==0 → nil`; `len==1 → that pubkey`.
3. `len>1` → filter by `NeighborGraph` adjacency to the anchor. Anchor
is `from_pubkey` for hop 0 on ADVERTs, the previous resolved hop
otherwise. Exactly 1 surviving candidate → use it; else nil.
4. Previously resolved hops (and the originator) are excluded from
downstream candidate pools — a packet does not revisit a node.

Tier-2/3/4 from pre-#1289 (geo proximity, GPS preference,
observation-count fallback) are intentionally NOT ported — those were
noisy in practice and belong in a separate enhancement, not in this
regression restore.

## Out of scope
- The ~3.1M existing NULL rows from the regression window. Filed as a
follow-up backfill task — too risky to bundle here (touches a 6M-row
table).
- The dead-flag bug #1546 — separate concern.

## TDD red → green
- Red commit `80b0f476` — adds five new context-resolver tests; stub
`resolvePathWithContext` falls back to naive `resolvePath`. CI run
26946935615 → **failure** with assertion errors on the three collision
tests (`TestResolveHopWithContext_OneByteCollision_AdjacencyResolves`,
`TestResolvePathWithContext_TwoHopChainAnchoredOnFromNode`,
`TestResolvePathWithContext_AdvertAnchoring`); the two regression tests
(multi-byte still works + all-nil contract) stayed green.
- Green commit `7b4950ce` — real algorithm + InsertTransmission wiring +
RefreshNeighborGraph in the builder tick. All five new tests pass;
original four `resolved_path` tests stay green.

## Verification
- `go test -race ./cmd/ingestor/...` for the 11 affected tests — pass.
- `bash ~/.openclaw/skills/pr-preflight/scripts/run-all.sh
origin/master` — exit 0 (all gates clean).
- PII grep on body + diff: clean.

Tested with: existing `TestInsertTransmissionWritesResolvedPath` +
`TestInsertTransmissionDoesNotClobberResolvedPathOnAllNil` (PR #1547
base) plus the new collision-resolution suite:
- `TestResolveHopWithContext_OneByteCollision_AdjacencyResolves` —
3-of-5 nodes share `0x5c`, chain A↔B↔C↔D↔E; anchored on A, hop `5c` → B.
- `TestResolvePathWithContext_TwoHopChainAnchoredOnFromNode` — path
`[5c, 5c]` from_node A → `[B, C]`.
- `TestResolveHopWithContext_NoAdjacencyContext_ReturnsNil` — 3
ambiguous candidates, no anchor / non-adjacent anchor → nil.
- `TestResolvePathWithContext_AdvertAnchoring` — ADVERT,
`from_pubkey=A`, path `[5c]` → only-adjacent neighbor B.
- `TestResolvePathWithContext_RegressionMultiByteStillWorks` —
unique-prefix path with no graph context still resolves.
- `TestResolvePathWithContext_AllNilContractPreserved` — unresolvable
path → `marshalResolvedPath==""` (clobber-guard from PR #1548
untouched).

## Browser-validated
N/A — backend-only change. Frontend already handles populated
`resolved_path` via `getResolvedPath` in `cmd/server/db.go` and
`public/packets.js`.

## Round-1 fixes addressed
- **MUST-FIX #1 (data-loss clobber on all-nil resolution):** when every
hop fails to resolve, `marshalResolvedPath` returns `""` instead of
`"[null,null,...]"`, so `nilIfEmpty` → SQL NULL and the
`COALESCE(excluded.resolved_path, resolved_path)` UPSERT preserves any
previously stored good value on re-ingest. Regression test asserts:
insert a transmission, observe `resolved_path` populated, wipe the
prefix index, re-ingest the same packet, assert the existing
`resolved_path` is unchanged.

---------

Co-authored-by: corescope-bot <bot@corescope>
Co-authored-by: openclaw-bot <bot@openclaw>
Co-authored-by: openclaw-bot <bot@openclaw.local>
2026-06-04 07:35:13 -07:00
..