mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-03-29 08:29:55 +00:00
Make ADVERT node names clickable links to node detail page
This commit is contained in:
93
DEDUP-DESIGN.md
Normal file
@@ -0,0 +1,93 @@
# Packet Deduplication Design

## The Problem

A single physical RF transmission gets recorded as N rows in the DB, where N = number of observers that heard it. Each row has the same `hash` but different `path_json` and `observer_id`.

### Example

```
Pkt 1 repeat 1: Path: A→B→C→D→E (observer E)
Pkt 1 repeat 2: Path: A→B→F→G (observer G)
Pkt 1 repeat 3: Path: A→C→H→J→K (observer K)
```

- Repeater A sent 1 packet, not 3
- Repeater B sent 1 packet, not 2 (C and F both heard the same broadcast)
- The hash is identical across all 3 rows

### Why the hash works

`computeContentHash()` = `SHA256(header_byte + payload)`, skipping path hops. Two observations of the same original packet through different paths produce the same hash. This is the dedup key.

## What's inflated (and what's not)

| Context | Current (inflated?) | Correct behavior |
|---------|-------------------|------------------|
| Node "total packets" | COUNT(*) — inflated | COUNT(DISTINCT hash) for transmissions |
| Packets/hour on observer page | Raw count | Correct — each observer DID receive it |
| Node analytics throughput | Inflated | DISTINCT hash |
| Live map animations | N animations per physical packet | 1 animation? Or 1 per path? TBD |
| "Heard By" table | Observations per observer | Correct as-is |
| RF analytics (SNR/RSSI) | Mixes observations | Each observation has its own SNR — all valid |
| Topology/path analysis | All paths shown | All paths are valuable — don't discard |
| Packet list (grouped mode) | Groups by hash already | Probably fine |
| Packet list (ungrouped) | Shows every observation | Maybe show distinct, expand for repeats? |

## Key Principle

**Observations are valuable data — never discard them.** The paths tell you about mesh topology, coverage, and redundancy. But **counts displayed to users should reflect reality** (1 transmission = 1 count).

## Design Decisions Needed

1. **What does "packets" mean in node detail?** Unique transmissions? Total observations? Both?
2. **Live map**: 1 animation with multiple path lines? Or 1 per observation?
3. **Analytics charts**: Should throughput charts show transmissions or observations?
4. **Packet list default view**: Group by hash by default?
5. **New metric: "observation ratio"?** — avg observations per transmission tells you about mesh redundancy/coverage

## Work Items

- [ ] **DB/API: Add distinct counts** — `findPacketsForNode()` and health endpoint should return both `totalTransmissions` (DISTINCT hash) and `totalObservations` (COUNT(*))
- [ ] **Node detail UI** — show "X transmissions seen Y times" or similar
- [ ] **Bulk health / network status** — use distinct hash counts
- [ ] **Node analytics charts** — throughput should use distinct hashes
- [ ] **Packets page default** — consider grouping by hash by default
- [ ] **Live map** — decide on animation strategy for repeated observations
- [ ] **Observer page** — observation count is correct, but could add "unique packets" column
- [ ] **In-memory store** — add hash→[packets] index if not already there (check `pktStore.byHash`)
- [ ] **API: packet siblings** — `/api/packets/:id/siblings` or `?groupByHash=true` (may already exist)
- [ ] **RF analytics** — keep all observations for SNR/RSSI (each is a real measurement) but label counts correctly
- [ ] **"Coverage ratio" metric** — avg(observations per unique hash) per node/observer — measures mesh redundancy
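The distinct-count and coverage-ratio items above reduce to one pure function over the flat rows. This sketch uses our own names and operates on an in-memory array like the one packet-store.js holds; the row shape mirrors the packets table:

```javascript
// Sketch: transmission vs observation counts over flat packet rows.
// countsForRows is our illustrative name, not an existing store method.
function countsForRows(rows) {
  const transmissions = new Set(rows.map(r => r.hash)).size; // distinct hashes
  const observations = rows.length;                          // COUNT(*)
  return {
    transmissions,
    observations,
    // avg observations per unique hash = mesh redundancy/coverage
    coverageRatio: transmissions ? observations / transmissions : 0,
  };
}

const rows = [
  { hash: 'p1', observer_id: 'E' },
  { hash: 'p1', observer_id: 'G' },
  { hash: 'p1', observer_id: 'K' },
  { hash: 'p2', observer_id: 'E' },
];
console.log(countsForRows(rows)); // { transmissions: 2, observations: 4, coverageRatio: 2 }
```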
## Live Map Animation Design

### Current behavior

Every observation triggers a separate animation. Same packet heard by 3 observers = 3 independent route animations. Looks like 3 packets when it was 1.

### Options considered

**Option A: Single animation, all paths simultaneously (PREFERRED)**

When a hash first arrives, buffer briefly (500ms-2s) for sibling observations, then animate all paths at once. One pulse from origin, multiple route lines fanning out simultaneously. Most accurate — this IS what physically happened: one RF burst propagating through the mesh along multiple paths at once.

Timing challenge: observations don't arrive simultaneously (seconds apart). Need to buffer the first observation, wait for siblings, then render all together. Adds slight latency to "live" feel.

**Option B: Single animation, "best" path only** — REJECTED

Pick shortest/highest-SNR path, animate only that. Clean but loses coverage/redundancy info.

**Option C: Single origin pulse, staggered path reveals** — REJECTED

Origin pulses once, paths draw in sequence with delay. Dramatic but busy, and doesn't reflect reality (the propagation is simultaneous).

**Option D: Animate first, suppress siblings** — REJECTED (pragmatic but inaccurate)

First observation gets animation, subsequent same-hash observations silently logged. Simple but you never see alternate paths on the live map.

### Implementation notes (for when we build this)

- Need a client-side hash buffer: `Map<hash, {timer, packets[]}>`
- On first WS packet with new hash: start timer (configurable, ~1-2s)
- On subsequent packets with same hash: add to buffer, reset/extend timer
- On timer expiry: animate all buffered paths for that hash simultaneously
- Feed sidebar could show consolidated entry: "1 packet, 3 paths" with expand
- Buffer window should be configurable (config.json)
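The notes above can be sketched as a small factory. The name `createHashBuffer` and the injectable scheduler are ours; in the browser you would pass `setTimeout`/`clearTimeout` and wire `onFlush` into the animation layer:

```javascript
// Sketch of the client-side hash buffer: batch sibling observations of the
// same hash inside a short window, then flush them all at once.
function createHashBuffer({ windowMs = 1500, schedule = setTimeout, cancel = clearTimeout, onFlush }) {
  const pending = new Map(); // hash -> { timer, packets: [] }
  return {
    add(pkt) {
      let entry = pending.get(pkt.hash);
      if (!entry) {
        entry = { timer: null, packets: [] };
        pending.set(pkt.hash, entry);
      } else {
        cancel(entry.timer); // sibling arrived inside the window: extend it
      }
      entry.packets.push(pkt);
      entry.timer = schedule(() => {
        pending.delete(pkt.hash);
        onFlush(pkt.hash, entry.packets); // animate all buffered paths at once
      }, windowMs);
    },
  };
}
```

Injecting the scheduler keeps the window testable and makes it trivial to read `windowMs` from config.json later.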
## Status

**Discussion phase** — no code changes yet. User wants to finalize design before implementation. Live map changes tabled for later.
236
DEDUP-MIGRATION-PLAN.md
Normal file
@@ -0,0 +1,236 @@
# Packet Deduplication — Normalized Schema Migration Plan

## Overview

Split the monolithic `packets` table into two tables:

- **`transmissions`** — one row per unique physical transmission (keyed by content hash)
- **`observations`** — one row per observer sighting (SNR, RSSI, path, observer, timestamp)

This fixes inflated packet counts across the entire app and enables proper "1 transmission seen N times" semantics.

## Current State

**`packets` table**: 1 row per observation. ~61MB, ~30K+ rows. Same hash appears N times (once per observer). Fields mix transmission data (raw_hex, payload_type, decoded_json, hash) with observation data (observer_id, snr, rssi, path_json).

**`packet-store.js`**: In-memory mirror of the packets table. Indexes: `byId`, `byHash` (hash → [packets]), `byObserver`, `byNode`. All reads served from RAM. SQLite is write-only for packets.

**Touch surface**: ~66 SQL queries across db.js/server.js/packet-store.js. ~12 frontend files consume packet data.

---

## Milestone 1: Schema Migration (Backend Only)

**Goal**: New tables exist, data migrated, old table preserved as backup. No behavioral changes yet.

### Tasks

1. **Create new schema** in `db.js` init (indexes use `IF NOT EXISTS` so init stays idempotent, matching the table statements):

   ```sql
   CREATE TABLE IF NOT EXISTS transmissions (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     raw_hex TEXT NOT NULL,
     hash TEXT NOT NULL UNIQUE,
     first_seen TEXT NOT NULL,
     route_type INTEGER,
     payload_type INTEGER,
     payload_version INTEGER,
     decoded_json TEXT,
     created_at TEXT DEFAULT (datetime('now'))
   );

   CREATE TABLE IF NOT EXISTS observations (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     transmission_id INTEGER NOT NULL REFERENCES transmissions(id),
     hash TEXT NOT NULL,
     observer_id TEXT,
     observer_name TEXT,
     direction TEXT,
     snr REAL,
     rssi REAL,
     score INTEGER,
     path_json TEXT,
     timestamp TEXT NOT NULL,
     created_at TEXT DEFAULT (datetime('now'))
   );

   CREATE INDEX IF NOT EXISTS idx_transmissions_hash ON transmissions(hash);
   CREATE INDEX IF NOT EXISTS idx_transmissions_first_seen ON transmissions(first_seen);
   CREATE INDEX IF NOT EXISTS idx_transmissions_payload_type ON transmissions(payload_type);
   CREATE INDEX IF NOT EXISTS idx_observations_hash ON observations(hash);
   CREATE INDEX IF NOT EXISTS idx_observations_transmission_id ON observations(transmission_id);
   CREATE INDEX IF NOT EXISTS idx_observations_observer_id ON observations(observer_id);
   CREATE INDEX IF NOT EXISTS idx_observations_timestamp ON observations(timestamp);
   ```

2. **Write migration script** (`scripts/migrate-dedup.js`):
   - Read all rows from `packets` ordered by timestamp
   - Group by hash
   - For each unique hash: INSERT into `transmissions` (use the first observation's raw_hex, decoded_json, etc.)
   - For each row: INSERT into `observations` with a foreign key to its transmission
   - Verify counts: `SELECT COUNT(*) FROM observations` = old packets count
   - Verify: `SELECT COUNT(*) FROM transmissions` ≤ observations count
   - **Do NOT drop the old `packets` table** — rename it to `packets_backup`

3. **Print migration stats**: total packets, unique transmissions, dedup ratio, time taken
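The group-by-hash core of the script could look like this pure transform. `splitByHash` is our name, and only a few representative columns are shown; the real script would wrap the resulting INSERTs in a driver transaction:

```javascript
// Sketch of the migration core: split flat per-observation rows into
// one transmission per unique hash plus one observation per row.
function splitByHash(oldRows) {
  const transmissions = new Map(); // hash -> transmission row
  const observations = [];
  for (const row of oldRows) { // oldRows assumed ordered by timestamp
    if (!transmissions.has(row.hash)) {
      // First observation donates the transmission-level fields.
      transmissions.set(row.hash, {
        hash: row.hash,
        raw_hex: row.raw_hex,
        decoded_json: row.decoded_json,
        first_seen: row.timestamp,
      });
    }
    observations.push({
      hash: row.hash,
      observer_id: row.observer_id,
      snr: row.snr,
      path_json: row.path_json,
      timestamp: row.timestamp,
    });
  }
  return { transmissions: [...transmissions.values()], observations };
}
```

The validation invariants fall out directly: `observations.length` equals the old row count, and `transmissions.length` equals the number of distinct hashes.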
### Validation

- `COUNT(*) FROM observations` = `COUNT(*) FROM packets_backup`
- `COUNT(*) FROM transmissions` = `COUNT(DISTINCT hash) FROM packets_backup`
- Spot-check: pick 5 known multi-observer packets, verify transmission + observations match

### Risk: LOW — additive only, old data preserved

---

## Milestone 2: Dual-Write Ingest

**Goal**: New packets written to both old and new tables. Read path unchanged. Zero downtime.

### Tasks

1. **Update `db.js` `insertPacket()`**:
   - On new packet: check if a `transmissions` row exists for the hash
   - If not: INSERT into `transmissions`, get id
   - If yes: UPDATE `first_seen` if this timestamp is earlier
   - INSERT into `observations` with transmission_id
   - **Still also write to old `packets` table** (dual-write for safety)
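A minimal in-memory stand-in for that dual-write logic, assuming the SQL path mirrors it (`insertPacketDual` and the `db` shape are ours, not the real `db.js` API):

```javascript
// Sketch: upsert transmission, append observation, and keep writing the
// old flat row so the legacy read path still works.
function insertPacketDual(db, pkt) {
  let tx = db.transmissions.get(pkt.hash);
  if (!tx) {
    tx = { hash: pkt.hash, first_seen: pkt.timestamp };
    db.transmissions.set(pkt.hash, tx);
  } else if (pkt.timestamp < tx.first_seen) {
    tx.first_seen = pkt.timestamp; // this observation predates what we had
  }
  db.observations.push({ hash: pkt.hash, observer_id: pkt.observer_id, timestamp: pkt.timestamp });
  db.packets.push({ ...pkt }); // dual-write: old table untouched in shape
}
```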
2. **Update `packet-store.js` `insert()`**: mirror the dual-write in the in-memory model
   - Maintain both the old flat array AND a new `byTransmission` Map

### Validation

- Send test packets, verify they appear in both old and new tables
- Verify a multi-observer packet creates 1 transmission + N observations

### Risk: LOW — old read path still works as fallback

---

## Milestone 3: In-Memory Store Restructure

**Goal**: `packet-store.js` switches from flat packet array to transmission-centric model.

### Tasks

1. **New in-memory data model**:

   ```
   transmissions: Map<hash, {id, raw_hex, hash, first_seen, payload_type, decoded_json, observations: []}>
   ```

   Each observation: `{id, observer_id, observer_name, snr, rssi, path_json, timestamp}`

2. **Update indexes**:
   - `byHash`: hash → transmission object (1:1 instead of 1:N)
   - `byObserver`: observer_id → [observation references]
   - `byNode`: pubkey → [transmission references] (deduped!)
   - `byId`: observation.id → observation (for backward compat with packet detail links)

3. **Update `load()`**: Read from `transmissions` JOIN `observations` instead of `packets`

4. **Update query methods**:
   - `findPackets()` — returns transmissions by default, with `.observations` attached
   - `findPacketsForNode()` — returns transmissions where node appears in ANY observation's path/decoded_json
   - `getSiblings()` — becomes `getObservations(hash)` — trivial, just return `transmission.observations`
   - `countForNode()` — returns `{transmissions: N, observations: M}`

### Validation

- All existing API endpoints return valid data
- Packet counts decrease (correctly!) for multi-observer nodes
- `/api/perf` shows no regression

### Risk: MEDIUM — core read path changes. Test thoroughly.

---

## Milestone 4: API Response Changes

**Goal**: APIs return deduped data with observation counts.

### Tasks

1. **`GET /api/packets`**:
   - Default: return transmissions (1 row per unique packet)
   - Each transmission includes `observation_count` and optionally `observations[]`
   - `?expand=observations` to include full observation list
   - `?groupByHash` becomes the default behavior (deprecate param)
   - Preserve `observer` filter: return transmissions where at least one observation matches

2. **`GET /api/nodes/:pubkey/health`**:
   - `stats.totalPackets` → `stats.totalTransmissions` (distinct hashes)
   - Add `stats.totalObservations` (old count, for reference)
   - `recentPackets` → returns transmissions with observation_count

3. **`GET /api/nodes/bulk-health`**: Same changes as health

4. **`GET /api/nodes/network-status`**: Use transmission counts

5. **`GET /api/nodes/:pubkey/analytics`**: All throughput charts use transmission counts

6. **WebSocket broadcast**: Include `observation_count` when sibling observations exist for same hash

### Backward Compatibility

- Add `?legacy=1` param that returns old-style flat observations (for any external consumers)
- Include both `totalTransmissions` and `totalObservations` in health responses during transition
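The `?legacy=1` shape could be produced by flattening each transmission back into old-style rows. `flattenLegacy` is a hypothetical helper, not an existing server.js function:

```javascript
// Sketch: expand each deduped transmission into one old-style flat row
// per observation, for external consumers that expect the legacy shape.
function flattenLegacy(transmissions) {
  return transmissions.flatMap(tx =>
    tx.observations.map(obs => ({
      hash: tx.hash,
      raw_hex: tx.raw_hex,
      payload_type: tx.payload_type,
      ...obs, // observer_id, snr, rssi, path_json, timestamp
    }))
  );
}
```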

### Risk: MEDIUM — frontend expects certain shapes. May need coordinated deploy with Milestone 5.

---

## Milestone 5: Frontend Updates

**Goal**: UI shows correct counts and leverages observation data.

### Tasks

1. **Packets page**:
   - Default view shows transmissions (already has groupByHash mode — make it default)
   - Expand row to see individual observations with their paths/SNR/RSSI
   - Badge: "×3 observers" on grouped rows

2. **Node detail panel** (nodes.js + live.js):
   - Show "X transmissions" not "X packets"
   - Or "X packets (seen Y times)" to show both

3. **Home page**: Network stats use transmission counts

4. **Node analytics**: Throughput charts use transmissions

5. **Observer detail**: Keep observation counts (correct metric for observers)

6. **Analytics page**: Topology/RF analysis uses all observations (SNR per observation is valid data)

### Risk: LOW-MEDIUM — mostly display changes

---

## Milestone 6: Cleanup

**Goal**: Remove dual-write, drop old table, clean up.

### Tasks

1. Remove dual-write from `insertPacket()`
2. Drop `packets_backup` table (after confirming everything works for 1+ week)
3. Remove `?legacy=1` support if unused
4. Update DEDUP-DESIGN.md → mark as complete
5. VACUUM the database
6. Tag release (v2.3.0?)

### Risk: LOW — cleanup only, all functional changes already proven

---

## Estimated Scope

| Milestone | Files Modified | Complexity | Can Deploy Independently? |
|-----------|---------------|------------|--------------------------|
| 1. Schema Migration | db.js, new script | Low | Yes — additive only |
| 2. Dual-Write | db.js, packet-store.js | Low | Yes — old reads unchanged |
| 3. Memory Store | packet-store.js | Medium | No — must deploy with M4 |
| 4. API Changes | server.js, db.js | Medium | No — must deploy with M5 |
| 5. Frontend | 8+ public/*.js files | Medium | No — must deploy with M4 |
| 6. Cleanup | db.js, server.js | Low | Yes — after bake period |

**Milestones 1-2**: Safe to deploy independently, no user-visible changes.
**Milestones 3-5**: Must ship together (API shape changes + frontend expects new shape).
**Milestone 6**: Ship after 1 week bake.

## Open Questions

1. **Table naming**: `transmissions` + `observations`? Or keep `packets` + add `observations`? The word "transmission" is more accurate but "packet" is what the whole UI calls them.
2. **Packet detail URLs**: Currently `#/packet/123` uses the observation ID. Keep observation IDs as the URL key? Or switch to hash?
3. **Path dedup in paths table**: The `paths` table also has per-observation entries. Normalize that too, or leave as-is?
4. **Migration on prod**: Run migration script before deploying new code, or make new code handle both old and new schema?

@@ -82,7 +82,7 @@
     <script src="roles.js?v=1774028201"></script>
     <script src="app.js?v=1774034748"></script>
     <script src="home.js?v=1774042199"></script>
-    <script src="packets.js?v=1774051434"></script>
+    <script src="packets.js?v=1774051770"></script>
     <script src="map.js?v=1774028201" onerror="console.error('Failed to load:', this.src)"></script>
     <script src="channels.js?v=1774050030" onerror="console.error('Failed to load:', this.src)"></script>
     <script src="nodes.js?v=1774050030" onerror="console.error('Failed to load:', this.src)"></script>

@@ -698,7 +698,7 @@
     // Advertisements — show node name and role
     if (decoded.type === 'ADVERT' && decoded.name) {
       const role = decoded.flags?.repeater ? '📡' : decoded.flags?.room ? '🏠' : decoded.flags?.sensor ? '🌡' : '📻';
-      return `${role} ${escapeHtml(decoded.name)}`;
+      return `${role} <a href="#/nodes/${encodeURIComponent(decoded.pubKey)}" class="hop-link hop-named" data-hop-link="true">${escapeHtml(decoded.name)}</a>`;
     }
     // Direct messages
     if (decoded.type === 'TXT_MSG') return `✉️ ${decoded.srcHash?.slice(0,8) || '?'} → ${decoded.destHash?.slice(0,8) || '?'}`;

@@ -944,7 +944,7 @@
           fOff += 8;
         }
         if (decoded.flags.hasName) {
-          rows += fieldRow(fOff, 'Node Name', escapeHtml(decoded.name || ''), '');
+          rows += fieldRow(fOff, 'Node Name', decoded.pubKey ? `<a href="#/nodes/${encodeURIComponent(decoded.pubKey)}" class="hop-link hop-named" data-hop-link="true">${escapeHtml(decoded.name || '')}</a>` : escapeHtml(decoded.name || ''), '');
         }
       }
     } else if (decoded.type === 'GRP_TXT') {