Files
meshcore-analyzer/docs/HASH-PREFIX-DISAMBIGUATION.md
you ae7010ae0c docs: move diagrams to exec summary, compact horizontal layout
Smaller circle nodes, horizontal flowcharts, clear color coding
(red=ambiguous, green=known, blue=just resolved). Removed duplicate
diagrams from section 2.
2026-03-24 20:52:06 +00:00

18 KiB
Raw Permalink Blame History

Hash Prefix Disambiguation in MeshCore Analyzer

Section 1: Executive Summary

What Are Hash Prefixes?

MeshCore is a LoRa mesh network where every packet records the nodes it passed through (its "path"). To save bandwidth on a constrained radio link, each hop in the path is stored as a truncated hash of the node's public key — typically just 1 byte (2 hex characters), though the firmware supports 13 bytes per hop.

With 1-byte hashes, there are only 256 possible values. In any mesh with more than ~20 nodes, collisions are inevitable — multiple nodes share the same prefix.

How Disambiguation Works

When displaying a packet's path (e.g., A3 → 7F → B1), the system must figure out which node each prefix refers to. The algorithm is the same everywhere:

  1. Prefix lookup — Find all known nodes whose public key starts with the hop's hex prefix
  2. Trivial case — If exactly one match, use it
  3. Regional filtering (server /api/resolve-hops only) — If the packet came from a known geographic region (via observer IATA code), filter candidates to nodes near that region
  4. Forward pass — Walk the path left-to-right; for each ambiguous hop, pick the candidate closest to the previous resolved hop
  5. Backward pass — Walk right-to-left for any still-unresolved hops, using the next hop as anchor
  6. Sanity check — Flag hops that are geographically implausible (>~200 km from both neighbors) as unreliable

When It Matters

  • Packet path display (packets page, node detail, live feed) — every path shown to users goes through disambiguation
  • Topology analysis (analytics subpaths) — route patterns rely on correctly identifying repeaters
  • Map route overlay — drawing lines between hops on a map requires resolved coordinates
  • Auto-learning — the server creates stub node records for unknown 2+ byte hop prefixes

Known Limitations

  • 1-byte prefixes are inherently lossy — 256 possible values for potentially thousands of nodes. Regional filtering helps but can't solve all collisions.
  • Nodes without GPS — If no candidates have coordinates, geographic disambiguation can't help; the first candidate wins arbitrarily.
  • Regional filtering is server-only — The /api/resolve-hops endpoint has observer-based fallback filtering (for GPS-less nodes seen by regional observers). The client-side HopResolver only does geographic regional filtering.
  • Stale prefix index — The server caches the prefix index on the allNodes array object. It's cleared on node upsert but could theoretically serve stale data briefly.

How the Two-Pass Algorithm Works

A packet path arrives as truncated hex prefixes. Some resolve to one node (unique), some match multiple (ambiguous). Two passes guarantee every hop gets resolved:

flowchart LR
    subgraph raw["① Candidate Lookup"]
        direction LR
        r1(("A3<br>3 matches")):::ambig
        r2(("7F<br>1 match")):::known
        r3(("B1<br>2 matches")):::ambig
        r4(("E4<br>1 match")):::known
        r5(("A3<br>4 matches")):::ambig
    end
    r1---r2---r3---r4---r5

    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
flowchart LR
    subgraph fwd["② Forward Pass →  pick nearest to previous resolved hop"]
        direction LR
        f1(("A3<br>skip ❌")):::ambig
        f2(("7F<br>anchor")):::known
        f3(("B1→✅<br>nearest 7F")):::resolved
        f4(("E4<br>anchor")):::known
        f5(("A3→✅<br>nearest E4")):::resolved
    end
    f1-- "→" ---f2-- "→" ---f3-- "→" ---f4-- "→" ---f5

    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
    classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6
flowchart RL
    subgraph bwd["③ Backward Pass ←  catch hops the forward pass missed"]
        direction RL
        b5(("A3 ✅")):::known
        b4(("E4 ✅")):::known
        b3(("B1 ✅")):::known
        b2(("7F<br>anchor")):::known
        b1(("A3→✅<br>nearest 7F")):::resolved
    end
    b5-- "←" ---b4-- "←" ---b3-- "←" ---b2-- "←" ---b1

    classDef known fill:#166534,color:#fff,stroke:#22c55e
    classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6

Forward resolves hops that have a known node to their left. Backward catches the ones at the start of the path that had no left anchor. After both passes, every hop either resolved to a specific node or has no candidates at all.


Section 2: Technical Details

2.1 Firmware: How Hops Are Encoded

From firmware/src/MeshCore.h:

#define PATH_HASH_SIZE  1    // Default: 1 byte per hop
#define MAX_HASH_SIZE   8    // Maximum hash size for dedup tables

The path_length byte in each packet encodes both hop count and hash size:

  • Bits 05: hop count (063)
  • Bits 67: hash size minus 1 (0b00 = 1 byte, 0b01 = 2 bytes, 0b10 = 3 bytes)

From firmware/docs/packet_format.md: the path section is hop_count × hash_size bytes, with a maximum of 64 bytes (MAX_PATH_SIZE). Each hop hash is the first N bytes of the node's public key hash.

The sendFlood() function accepts path_hash_size parameter (default 1), allowing nodes to use larger hashes when configured.

2.2 Decoder: Extracting Hops

decoder.jsdecodePath(pathByte, buf, offset):

const hashSize = (pathByte >> 6) + 1;   // 1-4 bytes per hash
const hashCount = pathByte & 0x3F;       // 0-63 hops

Each hop is extracted as hashSize bytes of hex. The decoder is straightforward and doesn't do any disambiguation — it outputs raw hex prefixes.

2.3 Server-Side Disambiguation

There are three disambiguation implementations on the server:

2.3.1 disambiguateHops() — in both server.js (line 498) and server-helpers.js (line 149)

The primary workhorse. Used by most API endpoints. Algorithm:

  1. Build prefix index (cached on the allNodes array):

    • For each node, index its public key at 1-byte (2 hex), 2-byte (4 hex), and 3-byte (6 hex) prefix lengths
    • _prefixIdx[prefix] → array of matching nodes
    • _prefixIdxName[prefix] → first matching node (name fallback)
  2. First pass — candidate matching:

    • Look up prefixIdx[hop]; filter to nodes with valid coordinates
    • 1 match with coords → resolved
    • Multiple matches with coords → ambiguous (keep candidate list)
    • 0 matches with coords → fall back to prefixIdxName for name only
  3. Forward pass: Walk left→right, sort ambiguous candidates by distance to last known position, pick closest.

  4. Backward pass: Walk right→left, same logic with next known position.

  5. Sanity check: Mark hops as unreliable if they're >MAX_HOP_DIST (default 1.8° ≈ 200 km) from both neighbors. Clear their lat/lon.

Callsites (all in server.js):

Line Context
2480 /api/paths — group and resolve path display
2659 /api/analytics/topology — topology graph
2720 /api/analytics/subpaths — route pattern analysis
2788 /api/node/:key — node detail page, packet paths
2822 /api/node/:key — parent path resolution

2.3.2 /api/resolve-hops endpoint (server.js line 1944)

The most sophisticated version — used by the client-side packets page as fallback (though HopResolver handles most cases now). Additional features beyond disambiguateHops():

  • Regional filtering: Uses observer IATA codes to determine packet region
    • Layer 1 (Geographic): If candidate has GPS, check distance to IATA region center (≤300 km)
    • Layer 2 (Observer-based): If candidate has no GPS, check if its adverts were seen by regional observers
  • Origin/observer anchoring: Accepts originLat/originLon (sender position) as forward anchor and derives observer position as backward anchor
  • Linear scan for candidates: Uses allNodes.filter(startsWith) instead of prefix index (slower but always fresh)

2.3.3 Inline resolveHop() in analytics endpoints

Two analytics endpoints (/api/analytics/topology line 1432 and /api/analytics/hash-issues line 1699) define local resolveHop() closures that do simple prefix matching without the full forward/backward pass — they resolve hops individually without path context.

2.4 autoLearnHopNodes() (server.js line 569)

When packets arrive, this function checks each hop:

  • Skips 1-byte hops (too ambiguous to learn from)
  • For 2+ byte hops not already in the DB, creates a stub node record with role: 'repeater'
  • Uses an in-memory hopNodeCache Set to avoid redundant DB queries

Called during:

  • MQTT packet ingestion (line 686)
  • HTTP packet submission (line 1019)

2.5 Client-Side: HopResolver (public/hop-resolver.js)

A client-side IIFE (window.HopResolver) that mirrors the server's algorithm to avoid HTTP round-trips. Key differences from server:

Aspect Server disambiguateHops() Server /api/resolve-hops Client HopResolver
Prefix index Cached on allNodes array Linear filter Built in init()
Regional filtering None IATA geo + observer-based IATA geo only
Origin anchor None Yes (from query params) Yes (from params)
Observer anchor None Yes (derived from DB) Yes (from params)
Sanity check unreliable + clear coords unreliable only unreliable flag
Distance function geoDist() (Euclidean) dist() (Euclidean) dist() (Euclidean)

Initialization: HopResolver.init(nodes, { observers, iataCoords }) — builds prefix index for 13 byte prefixes.

Resolution: HopResolver.resolve(hops, originLat, originLon, observerLat, observerLon, observerId) — runs the same 3-phase algorithm (candidates → forward → backward → sanity check).

2.6 Client-Side: resolveHopPositions() in live.js (line 1562)

The live feed page has its own independent implementation that doesn't use HopResolver. It:

  • Filters from nodeData (live feed's own node cache)
  • Uses the same forward/backward/sanity algorithm
  • Also prepends the sender as position anchor
  • Includes "ghost hop" rendering for unresolved hops

2.7 Where Disambiguation Is Applied (All Callsites)

Server-side:

File Function/Line What
server.js:498 disambiguateHops() Core algorithm
server.js:569 autoLearnHopNodes() Stub node creation for 2+ byte hops
server.js:1432 inline resolveHop() Analytics topology tab
server.js:1699 inline resolveHop() Analytics hash-issues tab
server.js:1944 /api/resolve-hops Full resolution with regional filtering
server.js:2480 paths endpoint Path grouping
server.js:2659 topology endpoint Topology graph
server.js:2720 subpaths endpoint Route patterns
server.js:2788 node detail Packet paths for a node
server.js:2822 node detail Parent paths
server-helpers.js:149 disambiguateHops() Extracted copy (used by tests)

Client-side:

File Function What
hop-resolver.js HopResolver.resolve() Main client resolver
packets.js:121+ resolveHops() wrapper Packets page path display
packets.js:1388 Direct HopResolver.resolve() Packet detail pane with sender context
live.js:1562 resolveHopPositions() Live feed path lines (independent impl)
map.js:273+ Route overlay Map path drawing (uses server-resolved data)
analytics.js subpaths display Renders server-resolved names

2.8 Consistency Analysis

Core algorithm: All implementations use the same 3-phase approach (candidate lookup → forward pass → backward pass → sanity check). The logic is consistent.

Discrepancies found:

  1. server.js disambiguateHops() vs server-helpers.js disambiguateHops(): These are near-identical copies. The server-helpers version is extracted for testing. Both use geoDist() (Euclidean approximation). No functional discrepancy.

  2. /api/resolve-hops vs disambiguateHops(): The API endpoint is significantly more capable — it has regional filtering (IATA geo + observer-based) and origin/observer anchoring. Endpoints that use disambiguateHops() directly (paths, topology, subpaths, node detail) do not benefit from regional filtering, which may produce different results for the same hops.

  3. live.js resolveHopPositions() vs HopResolver: The live feed reimplements disambiguation independently. It lacks:

    • Regional/IATA filtering
    • Origin/observer GPS anchoring (it does use sender position as anchor, but differently)
    • The prefix index optimization (uses linear Array.filter())
  4. Inline resolveHop() in analytics: These resolve hops individually without path context (no forward/backward pass). A hop ambiguous between two nodes will always get the first match rather than the geographically consistent one.

  5. disambiguateHops() only considers nodes with coordinates for the candidate list. Nodes without GPS are filtered out in the first pass. The /api/resolve-hops endpoint also returns no-GPS nodes in its candidate list and uses observer-based region filtering as fallback.

2.9 Edge Cases

Edge Case Behavior
No candidates Hop displayed as raw hex prefix
All candidates lack GPS disambiguateHops(): name from prefixIdxName (first indexed), no position. HopResolver: first candidate wins
Ambiguous after both passes First candidate in list wins (effectively random without position data)
Mixed hash sizes in same path Each hop is whatever length the decoder extracted. Prefix index handles variable lengths (indexed at 1, 2, 3 byte prefixes)
Self-loops in subpaths Same prefix appearing twice likely means a collision, not an actual loop. Analytics UI flags these with 🔄 and offers "hide collisions" checkbox
Unknown observers Regional filtering falls back to no filtering; all candidates considered
0,0 coordinates Explicitly excluded everywhere (!(lat === 0 && lon === 0))

2.10 Visual Decollision on the Map (Different System)

The map label deconfliction in map.js (deconflictLabels(), line 367+) is a completely different system. It handles visual overlap of hop labels on the Leaflet map — when two resolved hops are at nearby coordinates, their text labels would overlap. The function offsets labels to prevent visual collision using bounding-box checks.

This is not related to hash prefix disambiguation — it operates on already-resolved, positioned hops and only affects label rendering, not which node a prefix maps to.

Similarly, the map's "cluster" mode (L.markerClusterGroup) groups nearby node markers visually and is unrelated to hash disambiguation.

2.11 Data Flow Diagram

                    FIRMWARE (LoRa)
                         │
              Packet with 1-3 byte hop hashes
                         │
                         ▼
                ┌─────────────────┐
                │  MQTT Broker(s) │
                └────────┬────────┘
                         │
                         ▼
              ┌──────────────────────┐
              │     server.js        │
              │  ┌────────────────┐  │
              │  │   decoder.js   │  │  Extract raw hop hex prefixes
              │  └───────┬────────┘  │
              │          │           │
              │          ▼           │
              │  autoLearnHopNodes() │  Create stub nodes for 2+ byte hops
              │          │           │
              │  packet-store.js     │  Store packet with raw hops
              └──────────┬──────────┘
                         │
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
    REST API calls   WebSocket    /api/resolve-hops
            │        broadcast          │
            │            │              │
            ▼            │              ▼
   disambiguateHops()    │     Regional filtering +
   (no regional filter)  │     geo disambiguation
            │            │              │
            ▼            ▼              ▼
        ┌────────────────────────────────────┐
        │           BROWSER                   │
        │                                     │
        │  packets.js ──► HopResolver.resolve()
        │    (geo + IATA regional filtering)  │
        │                                     │
        │  live.js ──► resolveHopPositions()  │
        │    (geo only, independent impl)     │
        │                                     │
        │  map.js ──► deconflictLabels()      │
        │    (visual label offsets only)       │
        │                                     │
        │  analytics.js ──► server-resolved   │
        └─────────────────────────────────────┘

2.12 Hash Size Detection

Separate from disambiguation but closely related: the system tracks which hash size each node uses. server-helpers.js has updateHashSizeForPacket() and rebuildHashSizeMap() which extract the hash_size from the path_length byte. This feeds the analytics "hash issues" tab which detects nodes that flip-flop between hash sizes (a firmware behavior that complicates analysis).