# Hash Prefix Disambiguation in MeshCore Analyzer
## Section 1: Executive Summary
### What Are Hash Prefixes?
MeshCore is a LoRa mesh network where every packet records the nodes it passed through (its "path"). To save bandwidth on a constrained radio link, each hop in the path is stored as a **truncated hash** of the node's public key — typically just **1 byte** (2 hex characters), though the firmware supports 1–3 bytes per hop.
With 1-byte hashes, there are only 256 possible values. By the birthday bound, a mesh of about 20 nodes already has a better-than-even chance that two nodes share a prefix, and in larger meshes **collisions are effectively inevitable**.
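To put a number on this, the birthday-problem estimate below computes the chance that at least two of `n` nodes share a prefix. It assumes prefixes are uniformly distributed, which is an idealization, and `collisionProbability` is an illustrative helper rather than part of the analyzer.

```javascript
// Probability that at least two of n nodes share a prefix,
// assuming prefixes are uniform over `space` values (idealized).
function collisionProbability(n, space = 256) {
  if (n > space) return 1; // pigeonhole: a collision is guaranteed
  let pNoCollision = 1;
  for (let i = 0; i < n; i++) {
    pNoCollision *= (space - i) / space;
  }
  return 1 - pNoCollision;
}

for (const n of [10, 20, 40, 80]) {
  console.log(n, collisionProbability(n).toFixed(2));
}
```

Passing `space = 65536` models 2-byte prefixes, which is why larger hash sizes make collisions far rarer.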
### How Disambiguation Works
When displaying a packet's path (e.g., `A3 → 7F → B1`), the system must figure out *which* node each prefix refers to. Most implementations follow the same core steps (Section 2.8 lists where they diverge):
1. **Prefix lookup** — Find all known nodes whose public key starts with the hop's hex prefix
2. **Trivial case** — If exactly one match, use it
3. **Regional filtering** (server `/api/resolve-hops` and client `HopResolver`; not `disambiguateHops()`) — If the packet came from a known geographic region (via observer IATA code), filter candidates to nodes near that region
4. **Forward pass** — Walk the path left-to-right; for each ambiguous hop, pick the candidate closest to the previous resolved hop
5. **Backward pass** — Walk right-to-left for any still-unresolved hops, using the next hop as anchor
6. **Sanity check** — Flag hops that are geographically implausible (>~200 km from both neighbors) as `unreliable`
### When It Matters
- **Packet path display** (packets page, node detail, live feed) — every path shown to users goes through disambiguation
- **Topology analysis** (analytics subpaths) — route patterns rely on correctly identifying repeaters
- **Map route overlay** — drawing lines between hops on a map requires resolved coordinates
- **Auto-learning** — the server creates stub node records for unknown 2+ byte hop prefixes
### Known Limitations
- **1-byte prefixes are inherently lossy** — 256 possible values for potentially thousands of nodes. Regional filtering helps but can't solve all collisions.
- **Nodes without GPS** — If no candidates have coordinates, geographic disambiguation can't help; the first candidate wins arbitrarily.
- **Observer-based regional filtering is server-only** — The `/api/resolve-hops` endpoint has observer-based fallback filtering (for GPS-less nodes seen by regional observers). The client-side `HopResolver` only does geographic regional filtering, and `disambiguateHops()` does none.
- **Stale prefix index** — The server caches the prefix index on the `allNodes` array object. It's cleared on node upsert but could theoretically serve stale data briefly.
### How the Two-Pass Algorithm Works
A packet path arrives as truncated hex prefixes. Some resolve to one node (unique), some match multiple (ambiguous). Two passes ensure that every ambiguous hop with a resolved neighbor somewhere along the path is assigned a node:
```mermaid
flowchart LR
subgraph raw["① Candidate Lookup"]
direction LR
r1(("A3
3 matches")):::ambig
r2(("7F
1 match")):::known
r3(("B1
2 matches")):::ambig
r4(("E4
1 match")):::known
r5(("A3
4 matches")):::ambig
end
r1---r2---r3---r4---r5
classDef known fill:#166534,color:#fff,stroke:#22c55e
classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
```
```mermaid
flowchart LR
subgraph fwd["② Forward Pass → pick nearest to previous resolved hop"]
direction LR
f1(("A3
skip ❌")):::ambig
f2(("7F
anchor")):::known
f3(("B1→✅
nearest 7F")):::resolved
f4(("E4
anchor")):::known
f5(("A3→✅
nearest E4")):::resolved
end
f1-- "→" ---f2-- "→" ---f3-- "→" ---f4-- "→" ---f5
classDef known fill:#166534,color:#fff,stroke:#22c55e
classDef ambig fill:#991b1b,color:#fff,stroke:#ef4444
classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6
```
```mermaid
flowchart RL
subgraph bwd["③ Backward Pass ← catch hops the forward pass missed"]
direction RL
b5(("A3 ✅")):::known
b4(("E4 ✅")):::known
b3(("B1 ✅")):::known
b2(("7F
anchor")):::known
b1(("A3→✅
nearest 7F")):::resolved
end
b5-- "←" ---b4-- "←" ---b3-- "←" ---b2-- "←" ---b1
classDef known fill:#166534,color:#fff,stroke:#22c55e
classDef resolved fill:#1e40af,color:#fff,stroke:#3b82f6
```
**Forward** resolves hops that have a known node to their left. **Backward** catches the ones at the start of the path that had no left anchor. After both passes, every hop either resolved to a specific node or has no candidates at all.
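The two passes can be sketched as follows. This is a minimal illustration under stated assumptions, not the analyzer's code: the hop shape (`prefix`, `candidates`, `node`), the helper names, and the sample coordinates are all hypothetical, and candidates are assumed to have coordinates already.

```javascript
// Hypothetical hop shape: { prefix, candidates: [{ name, lat, lon }], node }
function dist(a, b) {
  // Euclidean distance in degrees: rough, but adequate for ranking candidates.
  return Math.hypot(a.lat - b.lat, a.lon - b.lon);
}

function nearest(candidates, anchor) {
  return candidates.reduce((best, c) =>
    (dist(c, anchor) < dist(best, anchor) ? c : best));
}

function resolveTwoPass(hops) {
  // Candidate lookup result: a single candidate is resolved immediately.
  for (const hop of hops) {
    hop.node = hop.candidates.length === 1 ? hop.candidates[0] : null;
  }
  // Forward pass: anchor on the most recent resolved hop to the left.
  let anchor = null;
  for (const hop of hops) {
    if (!hop.node && anchor && hop.candidates.length > 1) {
      hop.node = nearest(hop.candidates, anchor);
    }
    if (hop.node) anchor = hop.node;
  }
  // Backward pass: anchor on the closest resolved hop to the right.
  anchor = null;
  for (let i = hops.length - 1; i >= 0; i--) {
    const hop = hops[i];
    if (!hop.node && anchor && hop.candidates.length > 1) {
      hop.node = nearest(hop.candidates, anchor);
    }
    if (hop.node) anchor = hop.node;
  }
  return hops;
}

const path = [
  { prefix: 'A3', candidates: [
    { name: 'A3-far', lat: 40.71, lon: -74.01 },
    { name: 'A3-near', lat: 47.5, lon: -122.4 }] },
  { prefix: '7F', candidates: [{ name: '7F-only', lat: 47.6, lon: -122.33 }] },
  { prefix: 'B1', candidates: [
    { name: 'B1-near', lat: 47.7, lon: -122.2 },
    { name: 'B1-far', lat: 45.52, lon: -122.68 }] },
];
console.log(resolveTwoPass(path).map((h) => h.node && h.node.name));
// [ 'A3-near', '7F-only', 'B1-near' ]
```

In this example the leading `A3` has no left anchor, so the forward pass skips it and the backward pass resolves it against `7F`.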
---
## Section 2: Technical Details
### 2.1 Firmware: How Hops Are Encoded
From `firmware/src/MeshCore.h`:
```c
#define PATH_HASH_SIZE 1 // Default: 1 byte per hop
#define MAX_HASH_SIZE 8 // Maximum hash size for dedup tables
```
The **path_length byte** in each packet encodes both hop count and hash size:
- **Bits 0–5**: hop count (0–63)
- **Bits 6–7**: hash size minus 1 (`0b00` = 1 byte, `0b01` = 2 bytes, `0b10` = 3 bytes)
From `firmware/docs/packet_format.md`: the path section is `hop_count × hash_size` bytes, with a maximum of 64 bytes (`MAX_PATH_SIZE`). Each hop hash is the first N bytes of the node's public key, which is why hops can be matched against public-key prefixes.
The `sendFlood()` function accepts a `path_hash_size` parameter (default 1), allowing nodes to use larger hashes when configured.
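The `path_length` layout described above can be illustrated by packing the two fields into one byte. This is a sketch for analysis tooling: `encodePathLength` is a hypothetical helper, and the size check only mirrors the 64-byte maximum stated above, since how the firmware enforces that limit is not shown here.

```javascript
const MAX_PATH_SIZE = 64; // bytes, per the packet format description

// Pack hop count (bits 0-5) and hash size (bits 6-7, stored as size - 1).
function encodePathLength(hopCount, hashSize) {
  if (!Number.isInteger(hopCount) || hopCount < 0 || hopCount > 63) {
    throw new RangeError('hop count must be 0-63');
  }
  if (![1, 2, 3].includes(hashSize)) {
    throw new RangeError('hash size must be 1, 2, or 3 bytes');
  }
  if (hopCount * hashSize > MAX_PATH_SIZE) {
    throw new RangeError('path section exceeds MAX_PATH_SIZE');
  }
  return ((hashSize - 1) << 6) | hopCount;
}

console.log(encodePathLength(5, 1).toString(2).padStart(8, '0')); // 00000101
console.log(encodePathLength(5, 2).toString(2).padStart(8, '0')); // 01000101
```

Note the trade-off this implies: at 2 bytes per hop, a 64-byte path section holds at most 32 hops.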
### 2.2 Decoder: Extracting Hops
`decoder.js` — `decodePath(pathByte, buf, offset)`:
```javascript
const hashSize = (pathByte >> 6) + 1;   // bits 6-7: 1-3 bytes per hop (0b11 would decode as 4)
const hashCount = pathByte & 0x3F; // 0-63 hops
```
Each hop is extracted as `hashSize` bytes of hex. The decoder is straightforward and doesn't do any disambiguation — it outputs raw hex prefixes.
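A minimal sketch of the extraction step is shown below. It is not the real `decodePath()`: the return shape, the bounds check, and the upper-casing of the hex (chosen to match the display examples in this document) are assumptions.

```javascript
// Sketch only: the real decodePath() in decoder.js may differ in
// return shape and error handling.
function decodePathSketch(pathByte, buf, offset) {
  const hashSize = (pathByte >> 6) + 1; // bytes per hop
  const hashCount = pathByte & 0x3f;    // number of hops
  const end = offset + hashSize * hashCount;
  if (end > buf.length) {
    throw new RangeError('path section runs past end of packet');
  }
  const hops = [];
  for (let i = 0; i < hashCount; i++) {
    const start = offset + i * hashSize;
    hops.push(buf.subarray(start, start + hashSize).toString('hex').toUpperCase());
  }
  return { hashSize, hops, nextOffset: end };
}

// path_length byte 0x03 = three 1-byte hops, followed by the hop bytes.
const packet = Buffer.from([0x03, 0xa3, 0x7f, 0xb1]);
console.log(decodePathSketch(packet[0], packet, 1));
// { hashSize: 1, hops: [ 'A3', '7F', 'B1' ], nextOffset: 4 }
```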
### 2.3 Server-Side Disambiguation
There are **three** disambiguation implementations on the server:
#### 2.3.1 `disambiguateHops()` — in both `server.js` (line 498) and `server-helpers.js` (line 149)
The primary workhorse. Used by most API endpoints. Algorithm:
1. **Build prefix index** (cached on the `allNodes` array):
- For each node, index its public key at 1-byte (2 hex), 2-byte (4 hex), and 3-byte (6 hex) prefix lengths
- `_prefixIdx[prefix]` → array of matching nodes
- `_prefixIdxName[prefix]` → first matching node (name fallback)
2. **First pass — candidate matching**:
- Look up `prefixIdx[hop]`; filter to nodes with valid coordinates
- 1 match with coords → resolved
- Multiple matches with coords → ambiguous (keep candidate list)
- 0 matches with coords → fall back to `prefixIdxName` for name only
3. **Forward pass**: Walk left→right, sort ambiguous candidates by distance to last known position, pick closest.
4. **Backward pass**: Walk right→left, same logic with next known position.
5. **Sanity check**: Mark hops as `unreliable` if they're more than `MAX_HOP_DIST` (default 1.8°, roughly 200 km of latitude) from both neighbors, and clear their lat/lon.
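Steps 1 and 2 can be sketched as below. The node field names (`public_key`, `name`, `lat`, `lon`), the use of `Map` instead of properties cached on the array, and the shortened sample keys are assumptions made for illustration.

```javascript
// Hypothetical node shape: { public_key: hex string, name, lat, lon }
function buildPrefixIndex(nodes) {
  const idx = new Map();           // prefix -> all matching nodes
  const firstByPrefix = new Map(); // prefix -> first node seen (name fallback)
  for (const node of nodes) {
    const key = node.public_key.toLowerCase();
    for (const hexLen of [2, 4, 6]) { // 1-, 2-, 3-byte prefixes
      if (key.length < hexLen) continue;
      const prefix = key.slice(0, hexLen);
      if (!idx.has(prefix)) idx.set(prefix, []);
      idx.get(prefix).push(node);
      if (!firstByPrefix.has(prefix)) firstByPrefix.set(prefix, node);
    }
  }
  return { idx, firstByPrefix };
}

function hasCoords(n) {
  return Number.isFinite(n.lat) && Number.isFinite(n.lon) &&
    !(n.lat === 0 && n.lon === 0);
}

function lookupCandidates(index, hop) {
  const prefix = hop.toLowerCase();
  const withCoords = (index.idx.get(prefix) || []).filter(hasCoords);
  if (withCoords.length === 1) return { status: 'resolved', node: withCoords[0] };
  if (withCoords.length > 1) return { status: 'ambiguous', candidates: withCoords };
  const fallback = index.firstByPrefix.get(prefix);
  return { status: 'name-only', name: fallback ? fallback.name : hop };
}

// Sample keys are shortened for readability.
const nodes = [
  { public_key: 'a35c01', name: 'Ridge', lat: 47.6, lon: -122.3 },
  { public_key: 'a3ff02', name: 'Harbor', lat: 45.5, lon: -122.7 },
  { public_key: '7f0003', name: 'Summit', lat: 47.7, lon: -122.2 },
  { public_key: 'b10004', name: 'Cabin', lat: 0, lon: 0 }, // no usable GPS
];
const index = buildPrefixIndex(nodes);
console.log(lookupCandidates(index, 'A3').status);   // ambiguous
console.log(lookupCandidates(index, 'A35C').status); // resolved
console.log(lookupCandidates(index, 'B1'));          // { status: 'name-only', name: 'Cabin' }
```

The same `A3` prefix that is ambiguous at 1 byte resolves uniquely at 2 bytes, which is the practical benefit of larger hash sizes.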
**Callsites** (all in `server.js`):
| Line | Context |
|------|---------|
| 2480 | `/api/paths` — group and resolve path display |
| 2659 | `/api/analytics/topology` — topology graph |
| 2720 | `/api/analytics/subpaths` — route pattern analysis |
| 2788 | `/api/node/:key` — node detail page, packet paths |
| 2822 | `/api/node/:key` — parent path resolution |
#### 2.3.2 `/api/resolve-hops` endpoint (server.js line 1944)
The most sophisticated version — used by the client-side packets page as fallback (though `HopResolver` handles most cases now). Additional features beyond `disambiguateHops()`:
- **Regional filtering**: Uses observer IATA codes to determine packet region
- **Layer 1 (Geographic)**: If candidate has GPS, check distance to IATA region center (≤300 km)
- **Layer 2 (Observer-based)**: If candidate has no GPS, check if its adverts were seen by regional observers
- **Origin/observer anchoring**: Accepts `originLat/originLon` (sender position) as forward anchor and derives observer position as backward anchor
- **Linear scan for candidates**: Uses `allNodes.filter(startsWith)` instead of prefix index (slower but always fresh)
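Layer 1 of the regional filter can be sketched like this. The haversine formula is swapped in here for clarity, since the endpoint's actual distance calculation is not shown in this document. The `iataCoords` table, the function names, and the choice to keep all candidates when none fall inside the radius are assumptions.

```javascript
const REGION_RADIUS_KM = 300; // threshold stated above

// Great-circle distance in km (haversine formula).
function haversineKm(lat1, lon1, lat2, lon2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(a));
}

// Hypothetical lookup table: IATA code -> approximate region center.
const iataCoords = { SEA: { lat: 47.45, lon: -122.31 } };

function filterByRegion(candidates, iata) {
  const center = iataCoords[iata];
  if (!center) return candidates; // unknown region: no filtering
  const nearby = candidates.filter((c) =>
    haversineKm(c.lat, c.lon, center.lat, center.lon) <= REGION_RADIUS_KM);
  // Assumption: never filter the list down to nothing.
  return nearby.length > 0 ? nearby : candidates;
}

const candidates = [
  { name: 'Ridge', lat: 47.6, lon: -122.3 },
  { name: 'Hudson', lat: 40.7, lon: -74.0 },
];
console.log(filterByRegion(candidates, 'SEA').map((c) => c.name)); // [ 'Ridge' ]
```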
#### 2.3.3 Inline `resolveHop()` in analytics endpoints
Two analytics endpoints (`/api/analytics/topology` line 1432 and `/api/analytics/hash-issues` line 1699) define local `resolveHop()` closures that do simple prefix matching without the full forward/backward pass — they resolve hops individually without path context.
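The context-free behavior described above looks roughly like the following closure. The factory name `makeResolveHop`, the node fields, and the return shape are hypothetical; the point is only that the first prefix match wins regardless of the rest of the path.

```javascript
function makeResolveHop(allNodes) {
  // Context-free: each hop is resolved on its own, first prefix match wins.
  return function resolveHop(hop) {
    const prefix = hop.toLowerCase();
    const match = allNodes.find((n) =>
      n.public_key.toLowerCase().startsWith(prefix));
    return match
      ? { name: match.name, pubkey: match.public_key }
      : { name: hop, pubkey: null };
  };
}

const resolveHop = makeResolveHop([
  { public_key: 'a35c01', name: 'Ridge' },
  { public_key: 'a3ff02', name: 'Harbor' },
]);
console.log(resolveHop('A3').name); // Ridge, whatever the neighboring hops are
console.log(resolveHop('FF').name); // FF (no match, raw prefix shown)
```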
### 2.4 `autoLearnHopNodes()` (server.js line 569)
When packets arrive, this function checks each hop:
- Skips 1-byte hops (too ambiguous to learn from)
- For 2+ byte hops not already in the DB, creates a stub node record with `role: 'repeater'`
- Uses an in-memory `hopNodeCache` Set to avoid redundant DB queries
Called during:
- MQTT packet ingestion (line 686)
- HTTP packet submission (line 1019)
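The learning rule can be sketched as below. The `db` object and its `hasNodeWithPrefix()` and `insertStub()` methods are hypothetical stand-ins for the real database layer, and the function returns the learned prefixes purely to make the example observable.

```javascript
const hopNodeCache = new Set(); // prefixes already checked

// `db` is a hypothetical store exposing hasNodeWithPrefix() and insertStub().
function autoLearnHopNodesSketch(hops, db) {
  const learned = [];
  for (const hop of hops) {
    if (hop.length < 4) continue;        // 1-byte hop (2 hex chars): too ambiguous
    if (hopNodeCache.has(hop)) continue; // already handled, skip the DB query
    hopNodeCache.add(hop);
    if (db.hasNodeWithPrefix(hop)) continue;
    db.insertStub({ public_key: hop, role: 'repeater' });
    learned.push(hop);
  }
  return learned;
}

// In-memory stand-in for the database.
const knownKeys = ['7f01aa'];
const stubs = [];
const db = {
  hasNodeWithPrefix: (p) => knownKeys.some((k) => k.startsWith(p.toLowerCase())),
  insertStub: (node) => stubs.push(node),
};
console.log(autoLearnHopNodesSketch(['A3', 'A35C', '7F01'], db)); // [ 'A35C' ]
console.log(autoLearnHopNodesSketch(['A35C'], db));               // [] (cached)
```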
### 2.5 Client-Side: `HopResolver` (public/hop-resolver.js)
A client-side IIFE (`window.HopResolver`) that mirrors the server's algorithm to avoid HTTP round-trips. The table below compares it with the two main server implementations:
| Aspect | Server `disambiguateHops()` | Server `/api/resolve-hops` | Client `HopResolver` |
|--------|---------------------------|---------------------------|---------------------|
| Prefix index | Cached on allNodes array | Linear filter | Built in `init()` |
| Regional filtering | None | IATA geo + observer-based | IATA geo only |
| Origin anchor | None | Yes (from query params) | Yes (from params) |
| Observer anchor | None | Yes (derived from DB) | Yes (from params) |
| Sanity check | unreliable + clear coords | unreliable only | unreliable flag |
| Distance function | `geoDist()` (Euclidean) | `dist()` (Euclidean) | `dist()` (Euclidean) |
**Initialization**: `HopResolver.init(nodes, { observers, iataCoords })` — builds prefix index for 1–3 byte prefixes.
**Resolution**: `HopResolver.resolve(hops, originLat, originLon, observerLat, observerLon, observerId)` — runs the same four-step algorithm (candidate lookup → forward pass → backward pass → sanity check).
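The distance function and sanity check rows in the table above can be illustrated together. This is a sketch, not any of the three implementations: the hop shape and `flagUnreliable` are hypothetical, and it assumes that hops with fewer than two positioned neighbors (such as path endpoints) are left alone.

```javascript
const MAX_HOP_DIST = 1.8; // degrees, roughly 200 km of latitude

// Euclidean distance in degrees, as in the table above.
function dist(lat1, lon1, lat2, lon2) {
  return Math.sqrt((lat1 - lat2) ** 2 + (lon1 - lon2) ** 2);
}

// Flag a hop when it is too far from BOTH positioned neighbors.
// clearCoords mimics the disambiguateHops() variant that also drops lat/lon.
function flagUnreliable(hops, clearCoords = false) {
  const hasPos = (h) => Boolean(h) && h.lat != null && h.lon != null;
  return hops.map((hop, i) => {
    if (!hasPos(hop)) return { ...hop };
    const neighbors = [hops[i - 1], hops[i + 1]].filter(hasPos);
    if (neighbors.length < 2) return { ...hop }; // assumption: endpoints are not flagged
    const tooFar = neighbors.every((n) =>
      dist(hop.lat, hop.lon, n.lat, n.lon) > MAX_HOP_DIST);
    if (!tooFar) return { ...hop };
    return clearCoords
      ? { ...hop, unreliable: true, lat: null, lon: null }
      : { ...hop, unreliable: true };
  });
}

const checked = flagUnreliable([
  { name: 'Ridge', lat: 47.6, lon: -122.3 },
  { name: 'Outlier', lat: 40.7, lon: -74.0 },
  { name: 'Summit', lat: 47.7, lon: -122.2 },
]);
console.log(checked.map((h) => `${h.name}:${Boolean(h.unreliable)}`));
// [ 'Ridge:false', 'Outlier:true', 'Summit:false' ]
```

Because a degree of longitude shrinks with latitude, a fixed threshold in degrees corresponds to a shorter east-west distance far from the equator; that is the cost of the cheaper Euclidean approximation.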
### 2.6 Client-Side: `resolveHopPositions()` in live.js (line 1562)
The live feed page has its **own independent implementation** that doesn't use `HopResolver`. It:
- Filters from `nodeData` (live feed's own node cache)
- Uses the same forward/backward/sanity algorithm
- Also prepends the sender as position anchor
- Includes "ghost hop" rendering for unresolved hops
### 2.7 Where Disambiguation Is Applied (All Callsites)
**Server-side:**
| File | Function/Line | What |
|------|--------------|------|
| server.js:498 | `disambiguateHops()` | Core algorithm |
| server.js:569 | `autoLearnHopNodes()` | Stub node creation for 2+ byte hops |
| server.js:1432 | inline `resolveHop()` | Analytics topology tab |
| server.js:1699 | inline `resolveHop()` | Analytics hash-issues tab |
| server.js:1944 | `/api/resolve-hops` | Full resolution with regional filtering |
| server.js:2480 | paths endpoint | Path grouping |
| server.js:2659 | topology endpoint | Topology graph |
| server.js:2720 | subpaths endpoint | Route patterns |
| server.js:2788 | node detail | Packet paths for a node |
| server.js:2822 | node detail | Parent paths |
| server-helpers.js:149 | `disambiguateHops()` | Extracted copy (used by tests) |
**Client-side:**
| File | Function | What |
|------|---------|------|
| hop-resolver.js | `HopResolver.resolve()` | Main client resolver |
| packets.js:121+ | `resolveHops()` wrapper | Packets page path display |
| packets.js:1388 | Direct `HopResolver.resolve()` | Packet detail pane with sender context |
| live.js:1562 | `resolveHopPositions()` | Live feed path lines (independent impl) |
| map.js:273+ | Route overlay | Map path drawing (uses server-resolved data) |
| analytics.js | subpaths display | Renders server-resolved names |
### 2.8 Consistency Analysis
**Core algorithm**: All full-path implementations use the same four-step approach (candidate lookup → forward pass → backward pass → sanity check). The logic is consistent.
**Discrepancies found:**
1. **`server.js disambiguateHops()` vs `server-helpers.js disambiguateHops()`**: These are near-identical copies. The server-helpers version is extracted for testing. Both use `geoDist()` (Euclidean approximation). No functional discrepancy.
2. **`/api/resolve-hops` vs `disambiguateHops()`**: The API endpoint is significantly more capable — it has regional filtering (IATA geo + observer-based) and origin/observer anchoring. Endpoints that use `disambiguateHops()` directly (paths, topology, subpaths, node detail) **do not benefit from regional filtering**, which may produce different results for the same hops.
3. **`live.js resolveHopPositions()` vs `HopResolver`**: The live feed reimplements disambiguation independently. It lacks:
- Regional/IATA filtering
- Origin/observer GPS anchoring (it prepends the sender position to the path as an anchor instead of taking separate origin and observer parameters)
- The prefix index optimization (uses linear `Array.filter()`)
4. **Inline `resolveHop()` in analytics**: These resolve hops individually without path context (no forward/backward pass). A hop ambiguous between two nodes will always get the first match rather than the geographically consistent one.
5. **`disambiguateHops()` only considers nodes with coordinates** for the candidate list. Nodes without GPS are filtered out in the first pass. The `/api/resolve-hops` endpoint also returns no-GPS nodes in its candidate list and uses observer-based region filtering as fallback.
### 2.9 Edge Cases
| Edge Case | Behavior |
|-----------|----------|
| **No candidates** | Hop displayed as raw hex prefix |
| **All candidates lack GPS** | `disambiguateHops()`: name from `prefixIdxName` (first indexed), no position. `HopResolver`: first candidate wins |
| **Ambiguous after both passes** | First candidate in list wins (effectively arbitrary without position data) |
| **Mixed hash sizes in same path** | Each hop is whatever length the decoder extracted. Prefix index handles variable lengths (indexed at 1, 2, 3 byte prefixes) |
| **Self-loops in subpaths** | Same prefix appearing twice likely means a collision, not an actual loop. Analytics UI flags these with 🔄 and offers "hide collisions" checkbox |
| **Unknown observers** | Regional filtering falls back to no filtering; all candidates considered |
| **0,0 coordinates** | Explicitly excluded everywhere (`!(lat === 0 && lon === 0)`) |
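
For the self-loop case in the table above, spotting a repeated prefix within one path is straightforward. The sketch below is illustrative only; `findRepeatedPrefixes` is a hypothetical helper and says nothing about how the analytics UI actually detects these.

```javascript
// Return every prefix that occurs more than once in a path,
// along with the positions where it appears.
function findRepeatedPrefixes(hops) {
  const positions = new Map();
  hops.forEach((hop, i) => {
    const key = hop.toLowerCase();
    if (!positions.has(key)) positions.set(key, []);
    positions.get(key).push(i);
  });
  return [...positions]
    .filter(([, indexes]) => indexes.length > 1)
    .map(([prefix, indexes]) => ({ prefix, indexes }));
}

console.log(findRepeatedPrefixes(['A3', '7F', 'B1', 'E4', 'A3']));
// [ { prefix: 'a3', indexes: [ 0, 4 ] } ]
```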
### 2.10 Visual Decollision on the Map (Different System)
The **map label deconfliction** in `map.js` (`deconflictLabels()`, line 367+) is a completely different system. It handles **visual overlap** of hop labels on the Leaflet map — when two resolved hops are at nearby coordinates, their text labels would overlap. The function offsets labels to prevent visual collision using bounding-box checks.
This is **not related** to hash prefix disambiguation — it operates on already-resolved, positioned hops and only affects label rendering, not which node a prefix maps to.
Similarly, the map's "cluster" mode (`L.markerClusterGroup`) groups nearby node markers visually and is unrelated to hash disambiguation.
### 2.11 Data Flow Diagram
```
                 FIRMWARE (LoRa)
                        │
         Packet with 1-3 byte hop hashes
                        │
                        ▼
               ┌─────────────────┐
               │ MQTT Broker(s)  │
               └────────┬────────┘
                        │
                        ▼
             ┌──────────────────────┐
             │      server.js       │
             │  ┌────────────────┐  │
             │  │   decoder.js   │  │  Extract raw hop hex prefixes
             │  └───────┬────────┘  │
             │          │           │
             │          ▼           │
             │  autoLearnHopNodes() │  Create stub nodes for 2+ byte hops
             │          │           │
             │   packet-store.js    │  Store packet with raw hops
             └──────────┬───────────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
         ▼              ▼              ▼
  REST API calls    WebSocket  /api/resolve-hops
         │          broadcast          │
         │              │              │
         ▼              │              ▼
disambiguateHops()      │    Regional filtering +
(no regional filter)    │     geo disambiguation
         │              │              │
         ▼              ▼              ▼
    ┌───────────────────────────────────────┐
    │                BROWSER                │
    │                                       │
    │ packets.js ──► HopResolver.resolve()  │
    │   (geo + IATA regional filtering)     │
    │                                       │
    │ live.js ──► resolveHopPositions()     │
    │   (geo only, independent impl)        │
    │                                       │
    │ map.js ──► deconflictLabels()         │
    │   (visual label offsets only)         │
    │                                       │
    │ analytics.js ──► server-resolved      │
    └───────────────────────────────────────┘
```
### 2.12 Hash Size Detection
Separate from disambiguation but closely related: the system tracks which hash size each node uses. `server-helpers.js` has `updateHashSizeForPacket()` and `rebuildHashSizeMap()` which extract the hash_size from the path_length byte. This feeds the analytics "hash issues" tab which detects nodes that flip-flop between hash sizes (a firmware behavior that complicates analysis).
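Flip-flop detection can be sketched as tracking the set of hash sizes seen per node. This is not the code behind `updateHashSizeForPacket()` or `rebuildHashSizeMap()`: the map layout and function names are hypothetical, and it assumes each packet's hash size is attributed to the node that originated it.

```javascript
// Hypothetical map: node public key -> Set of hash sizes seen in its packets.
const hashSizesByNode = new Map();

function recordHashSize(pubkey, pathByte) {
  const hashSize = (pathByte >> 6) + 1; // same bit layout as the decoder
  if (!hashSizesByNode.has(pubkey)) hashSizesByNode.set(pubkey, new Set());
  hashSizesByNode.get(pubkey).add(hashSize);
  return hashSize;
}

// Nodes that have been seen with more than one hash size.
function nodesWithMixedHashSizes() {
  return [...hashSizesByNode]
    .filter(([, sizes]) => sizes.size > 1)
    .map(([pubkey, sizes]) => ({ pubkey, sizes: [...sizes].sort((a, b) => a - b) }));
}

recordHashSize('a35c01', 0x03); // 1-byte hashes, 3 hops
recordHashSize('a35c01', 0x42); // 2-byte hashes, 2 hops
recordHashSize('7f0003', 0x01); // 1-byte hashes, 1 hop
console.log(nodesWithMixedHashSizes());
// [ { pubkey: 'a35c01', sizes: [ 1, 2 ] } ]
```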