diff --git a/docs/specs/resolved-path.md b/docs/specs/resolved-path.md index 61e7286..6d316c0 100644 --- a/docs/specs/resolved-path.md +++ b/docs/specs/resolved-path.md @@ -17,21 +17,22 @@ Currently, hop paths are stored as short uppercase hex prefixes in `path_json` ( ## Solution -Resolve hop prefixes to full pubkeys **once at ingest time** on the server, using `resolveWithContext()` with 4-tier priority (affinity → geo → GPS → first match) and a **persisted neighbor graph**. Store the result as a new `resolved_path` field alongside `path_json`. +Resolve hop prefixes to full pubkeys **once at ingest time** on the server, using `resolveWithContext()` with 4-tier priority (affinity → geo → GPS → first match) and a **persisted neighbor graph**. Store the result as a new `resolved_path` column on observations alongside `path_json`. ## Design decisions (locked) 1. **`path_json` stays unchanged** — raw firmware prefixes, uppercase hex. Ground truth. -2. **`resolved_path` is NEW** — `[]*string` on observations. Full 64-char lowercase hex pubkeys, `null` for unresolved. +2. **`resolved_path` is a column on observations** — full 64-char lowercase hex pubkeys, `null` for unresolved. 3. **Resolved at ingest** using `resolveWithContext(hop, context, graph)` — 4-tier priority: affinity → geo → GPS → first match. -4. **Stored alongside `path_json`** — raw firmware prefixes stay unchanged. -5. **`null` = unresolved** — ambiguous prefixes store `null`. Frontend falls back to prefix display. -6. **Both fields coexist** — not interchangeable. Different consumers use different fields. +4. **`null` = unresolved** — ambiguous prefixes store `null`. Frontend falls back to prefix display. +5. **Both fields coexist** — not interchangeable. Different consumers use different fields. ## Persisted neighbor graph ### SQLite table: `neighbor_edges` +Thin and normalized. Stores ONLY the relationship. SNR, observer names, GPS, roles — all join from existing tables when needed. No duplication. + ```sql CREATE TABLE IF NOT EXISTS neighbor_edges ( node_a TEXT NOT NULL, @@ -42,24 +43,42 @@ CREATE TABLE IF NOT EXISTS neighbor_edges ( ); ``` -### Order of operations +### Edge extraction rules (ADVERT vs non-ADVERT) -1. **First run:** Built on startup from existing packet data, then persisted to SQLite. -2. **Subsequent runs:** Loaded from SQLite — instant startup, no 7-second rebuild. -3. **Available at ingest** — lives on `PacketStore`, not `Server`. -4. **Updated incrementally** — each new packet upserts edges. +At ingest, for each packet: -This replaces the previous "in-memory only for v1" approach. The graph is always persisted, always fast to load. +- **ADVERT packets** (payload_type 4): originator pubkey is known from `decoded_json.pubKey`. Extract edge: `originator ↔ path[0]` (the first hop is a direct neighbor of the originator). +- **ALL packets**: observer pubkey is known. Extract edge: `observer ↔ path[last]` (the last hop is a direct neighbor of the observer). +- **Non-ADVERT packets**: originator is unknown (encrypted). ONLY extract `observer ↔ path[last]`. +- Each packet produces **1 or 2 edge upserts** depending on type. + +Edge upsert: `INSERT OR REPLACE INTO neighbor_edges` with `count = count + 1` and `last_seen = now`. + +### Cold startup and backfill + +On startup: + +1. **Load `neighbor_edges` from SQLite** → build in-memory graph. +2. **If table empty (first run):** `BuildFromStore(packets)` — scan all existing packets, extract edges per the rules above, INSERT into `neighbor_edges`. +3. **Load observations from SQLite.** +4. **For observations without `resolved_path`:** resolve using the graph, UPDATE `resolved_path` in SQLite. +5. **Ready to serve.** + +On subsequent runs, step 2 is skipped (table already populated). Step 4 only processes observations with NULL `resolved_path` (new or previously unresolved). ## Data model ### Where does `resolved_path` live? -**On observations**, not transmissions. +**On observations**, as a column: + +```sql +ALTER TABLE observations ADD COLUMN resolved_path TEXT; +``` Rationale: Each observer sees the packet from a different vantage point. The same 2-char prefix may resolve to different full pubkeys depending on which observer's neighborhood is considered. The observer's own pubkey provides critical context for `resolveWithContext` (tier 2: neighbor affinity). Storing on observations preserves this per-observer resolution. -`path_json` already lives on observations in the DB schema. `resolved_path` follows the same pattern. +`resolved_path` is written in the same INSERT that creates the observation — one write, no double-write problem. ### Field shape @@ -72,18 +91,37 @@ resolved_path TEXT -- JSON array: ["aabb...64chars", null, "ccdd...64chars"] - Stored as a JSON text column (same approach as `path_json`) - Uses `omitempty` — absent from JSON when not set +## Every path resolution uses the graph — no exceptions + +All existing `pm.resolve()` call sites MUST be migrated to `resolveWithContext` with the persisted graph. No "we'll get to it later." + +### Call sites to migrate (exhaustive) + +Found via `grep -n "pm.resolve" cmd/server/store.go`: + +| Line | Function | Current | After | +|------|----------|---------|-------| +| 1192 | `IngestNewFromDB()` | `pm.resolve(hop)` | `resolveWithContext(hop, ctx, graph)` — resolve at ingest, store as `resolved_path` | +| 1876 | `buildDistanceIndex()` | `pm.resolve(hop)` | Read `resolved_path` from observation — already resolved at ingest | +| 3537 | `computeAnalyticsTopology()` | `pm.resolve(hop)` | Read `resolved_path` from observation | +| 5528 | `computeAnalyticsSubpaths()` | `pm.resolve(hop)` | Read `resolved_path` from observation | +| 5665 | `GetSubpathDetail()` | `pm.resolve(hop)` | `resolveWithContext(hop, ctx, graph)` — ad-hoc resolution for user-provided hops | +| 5744 | `GetSubpathDetail()` | `pm.resolve(h)` | `resolveWithContext(h, ctx, graph)` — same function, second usage | + +**After migration:** `pm.resolve()` (naive prefix-only lookup) is dead code. Remove it. All resolution goes through `resolveWithContext` which uses the persisted neighbor graph for affinity-based disambiguation. + ## Ingest pipeline changes ### Where resolution happens -In `PacketStore.IngestNewFromDB()` and `PacketStore.IngestNewObservations()` in `cmd/server/store.go`. Resolution is added **after** the observation is read from DB and **before** it's stored in memory and broadcast. +In `PacketStore.IngestNewFromDB()` in `cmd/server/store.go`. Resolution is added **during** the observation INSERT — same write, same transaction. Resolution flow per observation: 1. Parse `path_json` into hop prefixes 2. Build context pubkeys from the observation (observer pubkey, source/dest from decoded packet) 3. Call `resolveWithContext(hop, contextPubkeys, neighborGraph)` for each hop -4. Store result as `resolved_path` on the observation -5. Upsert neighbor edges into the graph (incremental update) +4. Store result as `resolved_path` column on the observation (same INSERT) +5. Upsert neighbor edges into `neighbor_edges` table (incremental update) ### Performance @@ -102,8 +140,10 @@ Per-hop cost: ~1–5μs. A typical packet has 0–5 hops. At 100 packets/second | Map Show Route | Client HopResolver (naive) | Read `resolved_path` | | Live map animated paths | Client HopResolver (naive) | Read `resolved_path` | | Node detail paths | Client HopResolver (naive) | Read `resolved_path` | -| Analytics topology/subpaths | Server `prefixMap.resolve()` (naive) | Read `resolved_path` from observations | -| Analytics hop distances | Server `resolveWithContext` | Already correct | +| Analytics topology | Server `pm.resolve()` (naive) | Read `resolved_path` from observations | +| Analytics subpaths | Server `pm.resolve()` (naive) | Read `resolved_path` from observations | +| Analytics hop distances | Server `pm.resolve()` (naive) | Read `resolved_path` from observations | +| Subpath detail | Server `pm.resolve()` (naive) | `resolveWithContext` with graph | | Show Neighbors | Server neighbors API | Already correct | | `/api/resolve-hops` | Server `resolveWithContext` | Already correct | | Hex breakdown display | `path_json` raw | Unchanged — shows raw bytes | @@ -133,11 +173,10 @@ All endpoints that currently return `path_json` also return `resolved_path`: - **`path_json` display prefixes:** uppercase (raw firmware) - **`resolved_path`:** lowercase full pubkeys - **Comparison code:** normalizes to lowercase -- **No case standardization crusade** — document the convention, don't change existing behavior ## Backward compatibility -- Old observations without `resolved_path`: frontend falls back to client-side HopResolver (same as today). +- Old observations without `resolved_path`: resolved during cold startup backfill (step 4). If still `null` after backfill, frontend falls back to client-side HopResolver. - `resolved_path` field uses `omitempty` — absent from JSON when not set. ### Fallback pattern (frontend) @@ -153,19 +192,21 @@ function getResolvedHops(packet) { ## Implementation milestones ### M1: Persist graph to SQLite + load on startup + incremental updates at ingest -- Create `neighbor_edges` table in SQLite -- Build graph from existing packet data on first run -- Load from SQLite on subsequent runs (instant startup) +- Create `neighbor_edges` table in SQLite (schema above) +- On first run: `BuildFromStore(packets)` — scan all packets, extract edges per ADVERT/non-ADVERT rules, INSERT into table +- On subsequent runs: load from SQLite → build in-memory graph (instant startup) - Upsert edges incrementally during packet ingest - Graph lives on `PacketStore`, not `Server` -- Tests: graph persistence, load, incremental update +- Tests: graph persistence, load, incremental update, ADVERT vs non-ADVERT edge extraction -### M2: Add `resolved_path` to observations at ingest time +### M2: Add `resolved_path` column to observations + resolve at ingest +- `ALTER TABLE observations ADD COLUMN resolved_path TEXT` - Add `ResolvedPath []*string` to `Observation` struct -- Resolve during `IngestNewFromDB`/`IngestNewObservations` -- Write `resolved_path` back to SQLite observations table -- Schema migration to add `resolved_path` column -- Tests: unit test resolution at ingest, verify stored values +- Resolve during `IngestNewFromDB` — same INSERT, one write +- Cold startup backfill: resolve observations with NULL `resolved_path`, UPDATE in SQLite +- Migrate ALL 6 `pm.resolve()` call sites to `resolveWithContext` or read from `resolved_path` +- Remove dead `pm.resolve()` code +- Tests: unit test resolution at ingest, verify stored values, verify all call sites use graph ### M3: Update all API responses to include `resolved_path` - Include `resolved_path` in all packet/observation API responses @@ -173,7 +214,7 @@ function getResolvedHops(packet) { - Tests: verify API response shape, WS broadcast shape ### M4: Update frontend consumers to prefer `resolved_path` -- Update `packets.js`, `map.js`, `live.js`, `analytics.js`, `nodes.js`, `traces.js` +- Update `packets.js`, `map.js`, `live.js`, `analytics.js`, `nodes.js` - Add fallback to `path_json` + `HopResolver` for old packets - `hop-resolver.js` becomes fallback only - Tests: Playwright tests for path display