# Hash Prefix Disambiguation in MeshCore Analyzer

## Section 1: Executive Summary

### What Are Hash Prefixes?

MeshCore is a LoRa mesh network where every packet records the nodes it passed through (its "path"). To save bandwidth on a constrained radio link, each hop in the path is stored as a **truncated hash** of the node's public key: typically just **1 byte** (2 hex characters), though the firmware supports 1–3 bytes per hop.

With 1-byte hashes, there are only 256 possible values. Once a mesh has more than ~20 nodes, **a collision is more likely than not** (the birthday bound for 256 values), and beyond 256 nodes collisions are guaranteed: multiple nodes share the same prefix.

### How Disambiguation Works

When displaying a packet's path (e.g., `A3 → 7F → B1`), the system must figure out *which* node each prefix refers to. The core steps are shared across implementations:

1. **Prefix lookup**: Find all known nodes whose public key starts with the hop's hex prefix
2. **Trivial case**: If exactly one match, use it
3. **Regional filtering** (server `/api/resolve-hops`; the client `HopResolver` applies only the geographic layer): If the packet came from a known geographic region (via observer IATA code), filter candidates to nodes near that region
4. **Forward pass**: Walk the path left-to-right; for each ambiguous hop, pick the candidate closest to the previous resolved hop
5. **Backward pass**: Walk right-to-left for any still-unresolved hops, using the next hop as anchor
6. **Sanity check**: Flag hops that are geographically implausible (>~200 km from both neighbors) as `unreliable`

### When It Matters

- **Packet path display** (packets page, node detail, live feed): every path shown to users goes through disambiguation
- **Topology analysis** (analytics subpaths): route patterns rely on correctly identifying repeaters
- **Map route overlay**: drawing lines between hops on a map requires resolved coordinates
- **Auto-learning**: the server creates stub node records for unknown 2+ byte hop prefixes

### Known Limitations

- **1-byte prefixes are inherently lossy**: 256 possible values for potentially thousands of nodes. Regional filtering helps but can't solve all collisions.
- **Nodes without GPS**: If no candidates have coordinates, geographic disambiguation can't help; the first candidate wins arbitrarily.
- **Observer-based regional filtering is server-only**: The `/api/resolve-hops` endpoint has observer-based fallback filtering (for GPS-less nodes seen by regional observers). The client-side `HopResolver` only does geographic regional filtering.
- **Stale prefix index**: The server caches the prefix index on the `allNodes` array object. It's cleared on node upsert but could theoretically serve stale data briefly.

### How the Two-Pass Algorithm Works

A packet path arrives as truncated hex prefixes. Some resolve to one node (unique), some match multiple (ambiguous). Two passes ensure that every hop with at least one candidate gets resolved:

```mermaid
flowchart LR
    subgraph raw["① Candidate Lookup"]
        direction LR
        r1(("A3<br/>3 matches")):::ambig
        r2(("7F<br/>1 match")):::known
        r3(("B1<br/>2 matches")):::ambig
        r4(("E4<br/>1 match")):::known
        r5(("A3<br/>4 matches")):::ambig
    end
    r1---r2---r3---r4---r5
    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
```

```mermaid
flowchart LR
    subgraph fwd["② Forward Pass → pick nearest to previous resolved hop"]
        direction LR
        f1(("A3<br/>skip ❌")):::ambig
        f2(("7F<br/>anchor")):::known
        f3(("B1→✅<br/>nearest 7F")):::resolved
        f4(("E4<br/>anchor")):::known
        f5(("A3→✅<br/>nearest E4")):::resolved
    end
    f1-- "→" ---f2-- "→" ---f3-- "→" ---f4-- "→" ---f5
    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
    classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6
```

```mermaid
flowchart RL
    subgraph bwd["③ Backward Pass ← catch hops the forward pass missed"]
        direction RL
        b5(("A3 ✅")):::known
        b4(("E4 ✅")):::known
        b3(("B1 ✅")):::known
        b2(("7F<br/>anchor")):::known
        b1(("A3→✅<br/>nearest 7F")):::resolved
    end
    b5-- "←" ---b4-- "←" ---b3-- "←" ---b2-- "←" ---b1
    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6
```

**Forward** resolves hops that have a known node to their left. **Backward** catches the ones at the start of the path that had no left anchor. After both passes, every hop is either resolved to a specific node or has no candidates at all.

---

## Section 2: Technical Details

### 2.1 Firmware: How Hops Are Encoded

From `firmware/src/MeshCore.h`:

```c
#define PATH_HASH_SIZE 1    // Default: 1 byte per hop
#define MAX_HASH_SIZE  8    // Maximum hash size for dedup tables
```

The **path_length byte** in each packet encodes both hop count and hash size:

- **Bits 0–5**: hop count (0–63)
- **Bits 6–7**: hash size minus 1 (`0b00` = 1 byte, `0b01` = 2 bytes, `0b10` = 3 bytes)

From `firmware/docs/packet_format.md`: the path section is `hop_count × hash_size` bytes, with a maximum of 64 bytes (`MAX_PATH_SIZE`).

Each hop hash is the first N bytes of the node's public key hash.

The `sendFlood()` function accepts a `path_hash_size` parameter (default 1), allowing nodes to use larger hashes when configured.

### 2.2 Decoder: Extracting Hops

`decoder.js`, `decodePath(pathByte, buf, offset)`:

```javascript
const hashSize = (pathByte >> 6) + 1;  // 1-4 bytes per hash
const hashCount = pathByte & 0x3F;     // 0-63 hops
```

Each hop is extracted as `hashSize` bytes of hex. The decoder does no disambiguation; it outputs raw hex prefixes. Note that the two-bit field can yield a size of 4, although the encodings listed in 2.1 cover only 1–3 bytes.

### 2.3 Server-Side Disambiguation

There are **three** disambiguation implementations on the server:

#### 2.3.1 `disambiguateHops()` in both `server.js` (line 498) and `server-helpers.js` (line 149)

The primary workhorse. Used by most API endpoints.

Algorithm:

1. **Build prefix index** (cached on the `allNodes` array):
   - For each node, index its public key at 1-byte (2 hex), 2-byte (4 hex), and 3-byte (6 hex) prefix lengths
   - `_prefixIdx[prefix]` → array of matching nodes
   - `_prefixIdxName[prefix]` → first matching node (name fallback)
2. **First pass (candidate matching)**:
   - Look up `prefixIdx[hop]`; filter to nodes with valid coordinates
   - 1 match with coords → resolved
   - Multiple matches with coords → ambiguous (keep candidate list)
   - 0 matches with coords → fall back to `prefixIdxName` for name only
3. **Forward pass**: Walk left→right, sort ambiguous candidates by distance to last known position, pick closest.
4. **Backward pass**: Walk right→left, same logic with next known position.
5. **Sanity check**: Mark hops as `unreliable` if they're more than `MAX_HOP_DIST` (default 1.8° ≈ 200 km) from both neighbors. Clear their lat/lon.

**Callsites** (all in `server.js`):

| Line | Context |
|------|---------|
| 2480 | `/api/paths`: group and resolve path display |
| 2659 | `/api/analytics/topology`: topology graph |
| 2720 | `/api/analytics/subpaths`: route pattern analysis |
| 2788 | `/api/node/:key`: node detail page, packet paths |
| 2822 | `/api/node/:key`: parent path resolution |

#### 2.3.2 `/api/resolve-hops` endpoint (server.js line 1944)

The most sophisticated version, used by the client-side packets page as a fallback (though `HopResolver` handles most cases now).
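As a baseline for the features this endpoint adds, the forward and backward passes described in 2.3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `server.js`: the names `resolvePath`, `nearest`, and the `{ prefix, candidates }` hop shape are hypothetical, distance is plain Euclidean distance in degrees, and the sanity check and regional filtering are left out.

```javascript
// Hypothetical sketch of the two-pass idea; not the server.js implementation.
// Each hop is { prefix, candidates: [{ name, lat, lon }, ...] } (assumed shape).

// Euclidean distance in degrees, the approximation the document describes.
function dist(a, b) {
  return Math.hypot(a.lat - b.lat, a.lon - b.lon);
}

// Pick the candidate closest to the anchor position.
function nearest(candidates, anchor) {
  return candidates.reduce((best, c) => (dist(c, anchor) < dist(best, anchor) ? c : best));
}

function resolvePath(hops) {
  // Candidate lookup result: a hop with exactly one candidate is already resolved.
  const resolved = hops.map(h => (h.candidates.length === 1 ? h.candidates[0] : null));

  // One directional sweep; the anchor is the most recently seen resolved hop.
  const sweep = (indices) => {
    let anchor = null;
    for (const i of indices) {
      if (!resolved[i] && anchor && hops[i].candidates.length > 1) {
        resolved[i] = nearest(hops[i].candidates, anchor);
      }
      if (resolved[i]) anchor = resolved[i];
    }
  };

  const order = hops.map((_, i) => i);
  sweep(order);                    // forward pass: anchor comes from the left
  sweep([...order].reverse());     // backward pass: anchor comes from the right

  // With no anchor anywhere in the path, the first candidate wins (see the edge cases).
  return resolved.map((r, i) => r || hops[i].candidates[0] || null);
}

const examplePath = [
  { prefix: 'A3', candidates: [{ name: 'A3-far', lat: 50, lon: 10 }, { name: 'A3-near', lat: 37.1, lon: -122.1 }] },
  { prefix: '7F', candidates: [{ name: '7F', lat: 37.0, lon: -122.0 }] },
  { prefix: 'B1', candidates: [{ name: 'B1-near', lat: 37.2, lon: -121.9 }, { name: 'B1-far', lat: 10, lon: 10 }] },
];
console.log(resolvePath(examplePath).map(n => n && n.name)); // A3-near, 7F, B1-near
```

In the example, `B1` is settled by the forward pass (anchored on `7F`), while the leading `A3` has no left anchor and is only settled by the backward pass.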
Additional features beyond `disambiguateHops()`:

- **Regional filtering**: Uses observer IATA codes to determine packet region
  - **Layer 1 (Geographic)**: If candidate has GPS, check distance to IATA region center (≤300 km)
  - **Layer 2 (Observer-based)**: If candidate has no GPS, check if its adverts were seen by regional observers
- **Origin/observer anchoring**: Accepts `originLat/originLon` (sender position) as forward anchor and derives observer position as backward anchor
- **Linear scan for candidates**: Uses `allNodes.filter(startsWith)` instead of the prefix index (slower but always fresh)

#### 2.3.3 Inline `resolveHop()` in analytics endpoints

Two analytics endpoints (`/api/analytics/topology` line 1432 and `/api/analytics/hash-issues` line 1699) define local `resolveHop()` closures that do simple prefix matching without the full forward/backward pass. They resolve hops individually, without path context.

### 2.4 `autoLearnHopNodes()` (server.js line 569)

When packets arrive, this function checks each hop:

- Skips 1-byte hops (too ambiguous to learn from)
- For 2+ byte hops not already in the DB, creates a stub node record with `role: 'repeater'`
- Uses an in-memory `hopNodeCache` Set to avoid redundant DB queries

Called during:

- MQTT packet ingestion (line 686)
- HTTP packet submission (line 1019)

### 2.5 Client-Side: `HopResolver` (public/hop-resolver.js)

A client-side IIFE (`window.HopResolver`) that mirrors the server's algorithm to avoid HTTP round-trips.
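Both the server's `disambiguateHops()` and the client resolver find candidates through a prefix index keyed at 1-, 2-, and 3-byte prefix lengths. A minimal sketch of such an index is shown here; `buildPrefixIndex`, `candidatesFor`, and the `public_key` field name are assumptions made for illustration and are not taken from `hop-resolver.js`.

```javascript
// Hypothetical sketch: index each node under its 1-, 2-, and 3-byte key prefixes.
// Assumes each node carries a hex-encoded `public_key` string (field name is an assumption).
function buildPrefixIndex(nodes) {
  const index = new Map();
  for (const node of nodes) {
    const key = (node.public_key || '').toLowerCase();
    for (const hexLen of [2, 4, 6]) {   // 1, 2, and 3 bytes as hex characters
      if (key.length < hexLen) break;   // key too short for this prefix length
      const prefix = key.slice(0, hexLen);
      if (!index.has(prefix)) index.set(prefix, []);
      index.get(prefix).push(node);
    }
  }
  return index;
}

// Candidate lookup for one hop: zero, one, or several nodes.
function candidatesFor(index, hopHex) {
  return index.get(hopHex.toLowerCase()) || [];
}

const demoIndex = buildPrefixIndex([
  { name: 'Ridge', public_key: 'a3f1c0de' },
  { name: 'Harbor', public_key: 'A37b9922' },
  { name: 'Summit', public_key: '7f00aa11' },
]);
console.log(candidatesFor(demoIndex, 'A3').map(n => n.name));   // two nodes share the 1-byte prefix
console.log(candidatesFor(demoIndex, 'a3f1').map(n => n.name)); // the 2-byte prefix narrows it to one
```

Building the index once trades a little memory for constant-time lookups per hop, which is the same trade-off the comparison with the linear `filter` scan points at.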
Key differences from server:

| Aspect | Server `disambiguateHops()` | Server `/api/resolve-hops` | Client `HopResolver` |
|--------|---------------------------|---------------------------|---------------------|
| Prefix index | Cached on allNodes array | Linear filter | Built in `init()` |
| Regional filtering | None | IATA geo + observer-based | IATA geo only |
| Origin anchor | None | Yes (from query params) | Yes (from params) |
| Observer anchor | None | Yes (derived from DB) | Yes (from params) |
| Sanity check | unreliable + clear coords | unreliable only | unreliable flag |
| Distance function | `geoDist()` (Euclidean) | `dist()` (Euclidean) | `dist()` (Euclidean) |

**Initialization**: `HopResolver.init(nodes, { observers, iataCoords })` builds the prefix index for 1–3 byte prefixes.

**Resolution**: `HopResolver.resolve(hops, originLat, originLon, observerLat, observerLon, observerId)` runs the same three-phase algorithm (candidates → forward → backward), followed by the sanity check.

### 2.6 Client-Side: `resolveHopPositions()` in live.js (line 1562)

The live feed page has its **own independent implementation** that doesn't use `HopResolver`. It:

- Filters from `nodeData` (live feed's own node cache)
- Uses the same forward/backward/sanity algorithm
- Also prepends the sender as position anchor
- Includes "ghost hop" rendering for unresolved hops

### 2.7 Where Disambiguation Is Applied (All Callsites)

**Server-side:**

| File | Function/Line | What |
|------|--------------|------|
| server.js:498 | `disambiguateHops()` | Core algorithm |
| server.js:569 | `autoLearnHopNodes()` | Stub node creation for 2+ byte hops |
| server.js:1432 | inline `resolveHop()` | Analytics topology tab |
| server.js:1699 | inline `resolveHop()` | Analytics hash-issues tab |
| server.js:1944 | `/api/resolve-hops` | Full resolution with regional filtering |
| server.js:2480 | paths endpoint | Path grouping |
| server.js:2659 | topology endpoint | Topology graph |
| server.js:2720 | subpaths endpoint | Route patterns |
| server.js:2788 | node detail | Packet paths for a node |
| server.js:2822 | node detail | Parent paths |
| server-helpers.js:149 | `disambiguateHops()` | Extracted copy (used by tests) |

**Client-side:**

| File | Function | What |
|------|---------|------|
| hop-resolver.js | `HopResolver.resolve()` | Main client resolver |
| packets.js:121+ | `resolveHops()` wrapper | Packets page path display |
| packets.js:1388 | Direct `HopResolver.resolve()` | Packet detail pane with sender context |
| live.js:1562 | `resolveHopPositions()` | Live feed path lines (independent impl) |
| map.js:273+ | Route overlay | Map path drawing (uses server-resolved data) |
| analytics.js | subpaths display | Renders server-resolved names |

### 2.8 Consistency Analysis

**Core algorithm**: All path-aware implementations use the same approach (candidate lookup → forward pass → backward pass, followed by the sanity check). The logic is consistent.

**Discrepancies found:**

1. **`server.js disambiguateHops()` vs `server-helpers.js disambiguateHops()`**: These are near-identical copies. The server-helpers version is extracted for testing. Both use `geoDist()` (Euclidean approximation). No functional discrepancy.

2. **`/api/resolve-hops` vs `disambiguateHops()`**: The API endpoint is significantly more capable: it has regional filtering (IATA geo + observer-based) and origin/observer anchoring. Endpoints that use `disambiguateHops()` directly (paths, topology, subpaths, node detail) **do not benefit from regional filtering**, which may produce different results for the same hops.

3. **`live.js resolveHopPositions()` vs `HopResolver`**: The live feed reimplements disambiguation independently. It lacks:
   - Regional/IATA filtering
   - Origin/observer GPS anchoring (it does use sender position as anchor, but differently)
   - The prefix index optimization (uses linear `Array.filter()`)

4. **Inline `resolveHop()` in analytics**: These resolve hops individually without path context (no forward/backward pass). A hop ambiguous between two nodes will always get the first match rather than the geographically consistent one.

5. **`disambiguateHops()` only considers nodes with coordinates** for the candidate list. Nodes without GPS are filtered out in the first pass. The `/api/resolve-hops` endpoint also returns no-GPS nodes in its candidate list and uses observer-based region filtering as fallback.

### 2.9 Edge Cases

| Edge Case | Behavior |
|-----------|----------|
| **No candidates** | Hop displayed as raw hex prefix |
| **All candidates lack GPS** | `disambiguateHops()`: name from `prefixIdxName` (first indexed), no position. `HopResolver`: first candidate wins |
| **Ambiguous after both passes** | First candidate in list wins (effectively random without position data) |
| **Mixed hash sizes in same path** | Each hop is whatever length the decoder extracted. Prefix index handles variable lengths (indexed at 1, 2, 3 byte prefixes) |
| **Self-loops in subpaths** | Same prefix appearing twice likely means a collision, not an actual loop. Analytics UI flags these with 🔄 and offers a "hide collisions" checkbox |
| **Unknown observers** | Regional filtering falls back to no filtering; all candidates considered |
| **0,0 coordinates** | Explicitly excluded everywhere (`!(lat === 0 && lon === 0)`) |

### 2.10 Visual Decollision on the Map (Different System)

The **map label deconfliction** in `map.js` (`deconflictLabels()`, line 367+) is a completely different system. It handles **visual overlap** of hop labels on the Leaflet map: when two resolved hops are at nearby coordinates, their text labels would overlap. The function offsets labels to prevent visual collision using bounding-box checks.

This is **not related** to hash prefix disambiguation. It operates on already-resolved, positioned hops and only affects label rendering, not which node a prefix maps to.

Similarly, the map's "cluster" mode (`L.markerClusterGroup`) groups nearby node markers visually and is unrelated to hash disambiguation.

### 2.11 Data Flow Diagram

```
                  FIRMWARE (LoRa)
                         │  Packet with 1-3 byte hop hashes
                         │
                         ▼
                ┌─────────────────┐
                │  MQTT Broker(s) │
                └────────┬────────┘
                         │
                         ▼
              ┌──────────────────────┐
              │  server.js           │
              │  ┌────────────────┐  │
              │  │ decoder.js     │  │  Extract raw hop hex prefixes
              │  └───────┬────────┘  │
              │          │           │
              │          ▼           │
              │  autoLearnHopNodes() │  Create stub nodes for 2+ byte hops
              │          │           │
              │  packet-store.js     │  Store packet with raw hops
              └──────────┬───────────┘
                         │
       ┌─────────────────┼───────────────────┐
       │                 │                   │
       ▼                 ▼                   ▼
REST API calls       WebSocket       /api/resolve-hops
       │             broadcast               │
       │                 │                   │
       ▼                 │                   ▼
disambiguateHops()       │         Regional filtering +
(no regional filter)     │          geo disambiguation
       │                 │                   │
       ▼                 ▼                   ▼
┌──────────────────────────────────────────────────────┐
│  BROWSER                                             │
│                                                      │
│  packets.js ──► HopResolver.resolve()                │
│    (geo + IATA regional filtering)                   │
│                                                      │
│  live.js ──► resolveHopPositions()                   │
│    (geo only, independent impl)                      │
│                                                      │
│  map.js ──► deconflictLabels()                       │
│    (visual label offsets only)                       │
│                                                      │
│  analytics.js ──► server-resolved                    │
└──────────────────────────────────────────────────────┘
```

### 2.12 Hash Size Detection

Separate from disambiguation but closely related: the system tracks which hash size each node uses. `server-helpers.js` has `updateHashSizeForPacket()` and `rebuildHashSizeMap()`, which extract the hash_size from the path_length byte. This feeds the analytics "hash issues" tab, which detects nodes that flip-flop between hash sizes (a firmware behavior that complicates analysis).
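The bit layout from section 2.1 is enough to recover the hash size from a path_length byte and to notice when a node changes size. The sketch below is illustrative only: `decodePathLength`, `trackHashSize`, and the `Map` of `Set`s are hypothetical names and structures, not the helpers in `server-helpers.js`, and keying by the sender's public key is an assumption.

```javascript
// Hypothetical sketch; not the updateHashSizeForPacket()/rebuildHashSizeMap() code.

// Split the path_length byte: bits 0-5 are the hop count, bits 6-7 store hash size minus 1.
function decodePathLength(pathByte) {
  return { hopCount: pathByte & 0x3f, hashSize: (pathByte >> 6) + 1 };
}

// Record every hash size seen for a node key (assumed to identify the sender).
// Returns true once the node has been seen with more than one size, i.e. it flip-flops.
function trackHashSize(sizesByNode, nodeKey, pathByte) {
  const { hashSize } = decodePathLength(pathByte);
  if (!sizesByNode.has(nodeKey)) sizesByNode.set(nodeKey, new Set());
  const seen = sizesByNode.get(nodeKey);
  seen.add(hashSize);
  return seen.size > 1;
}

const sizesByNode = new Map();
const firstSeen = trackHashSize(sizesByNode, 'a3f1c0de', 0x03);  // 3 hops, 1-byte hashes
const flipFlops = trackHashSize(sizesByNode, 'a3f1c0de', 0x42);  // 2 hops, 2-byte hashes
console.log(decodePathLength(0x42), firstSeen, flipFlops);
```

Keeping the full set of observed sizes, rather than only the latest one, is what makes a flip-flopping node distinguishable from one that simply switched once.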