mirror of
https://github.com/Kpa-clawbot/meshcore-analyzer.git
synced 2026-03-29 08:29:55 +00:00
Make ADVERT node names clickable links to node detail page
This commit is contained in:
93
DEDUP-DESIGN.md
Normal file
@@ -0,0 +1,93 @@
# Packet Deduplication Design

## The Problem

A single physical RF transmission gets recorded as N rows in the DB, where N = number of observers that heard it. Each row has the same `hash` but different `path_json` and `observer_id`.

### Example

```
Pkt 1 repeat 1: Path: A→B→C→D→E (observer E)
Pkt 1 repeat 2: Path: A→B→F→G (observer G)
Pkt 1 repeat 3: Path: A→C→H→J→K (observer K)
```

- Repeater A sent 1 packet, not 3
- Repeater B sent 1 packet, not 2 (C and F both heard the same broadcast)
- The hash is identical across all 3 rows

### Why the hash works

`computeContentHash()` = `SHA256(header_byte + payload)`, skipping path hops. Two observations of the same original packet through different paths produce the same hash. This is the dedup key.

## What's inflated (and what's not)

| Context | Current (inflated?) | Correct behavior |
|---------|-------------------|------------------|
| Node "total packets" | COUNT(*) — inflated | COUNT(DISTINCT hash) for transmissions |
| Packets/hour on observer page | Raw count | Correct — each observer DID receive it |
| Node analytics throughput | Inflated | DISTINCT hash |
| Live map animations | N animations per physical packet | 1 animation? Or 1 per path? TBD |
| "Heard By" table | Observations per observer | Correct as-is |
| RF analytics (SNR/RSSI) | Mixes observations | Each observation has its own SNR — all valid |
| Topology/path analysis | All paths shown | All paths are valuable — don't discard |
| Packet list (grouped mode) | Groups by hash already | Probably fine |
| Packet list (ungrouped) | Shows every observation | Maybe show distinct, expand for repeats? |

## Key Principle

**Observations are valuable data — never discard them.** The paths tell you about mesh topology, coverage, and redundancy. But **counts displayed to users should reflect reality** (1 transmission = 1 count).

## Design Decisions Needed

1. **What does "packets" mean in node detail?** Unique transmissions? Total observations? Both?
2. **Live map**: 1 animation with multiple path lines? Or 1 per observation?
3. **Analytics charts**: Should throughput charts show transmissions or observations?
4. **Packet list default view**: Group by hash by default?
5. **New metric: "observation ratio"?** — avg observations per transmission tells you about mesh redundancy/coverage

## Work Items

- [ ] **DB/API: Add distinct counts** — `findPacketsForNode()` and health endpoint should return both `totalTransmissions` (DISTINCT hash) and `totalObservations` (COUNT(*))
- [ ] **Node detail UI** — show "X transmissions seen Y times" or similar
- [ ] **Bulk health / network status** — use distinct hash counts
- [ ] **Node analytics charts** — throughput should use distinct hashes
- [ ] **Packets page default** — consider grouping by hash by default
- [ ] **Live map** — decide on animation strategy for repeated observations
- [ ] **Observer page** — observation count is correct, but could add "unique packets" column
- [ ] **In-memory store** — add hash→[packets] index if not already there (check `pktStore.byHash`)
- [ ] **API: packet siblings** — `/api/packets/:id/siblings` or `?groupByHash=true` (may already exist)
- [ ] **RF analytics** — keep all observations for SNR/RSSI (each is a real measurement) but label counts correctly
- [ ] **"Coverage ratio" metric** — avg(observations per unique hash) per node/observer — measures mesh redundancy
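The distinct-count and coverage-ratio items above reduce to one pure function over the flat rows. This sketch uses our own names and operates on an in-memory array like the one packet-store.js holds; the row shape mirrors the packets table:

```javascript
// Sketch: transmission vs observation counts over flat packet rows.
// countsForRows is our illustrative name, not an existing store method.
function countsForRows(rows) {
  const transmissions = new Set(rows.map(r => r.hash)).size; // distinct hashes
  const observations = rows.length;                          // COUNT(*)
  return {
    transmissions,
    observations,
    // avg observations per unique hash = mesh redundancy/coverage
    coverageRatio: transmissions ? observations / transmissions : 0,
  };
}

const rows = [
  { hash: 'p1', observer_id: 'E' },
  { hash: 'p1', observer_id: 'G' },
  { hash: 'p1', observer_id: 'K' },
  { hash: 'p2', observer_id: 'E' },
];
console.log(countsForRows(rows)); // { transmissions: 2, observations: 4, coverageRatio: 2 }
```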
## Live Map Animation Design

### Current behavior

Every observation triggers a separate animation. Same packet heard by 3 observers = 3 independent route animations. Looks like 3 packets when it was 1.

### Options considered

**Option A: Single animation, all paths simultaneously (PREFERRED)**

When a hash first arrives, buffer briefly (500ms-2s) for sibling observations, then animate all paths at once. One pulse from origin, multiple route lines fanning out simultaneously. Most accurate — this IS what physically happened: one RF burst propagating through the mesh along multiple paths at once.

Timing challenge: observations don't arrive simultaneously (seconds apart). Need to buffer the first observation, wait for siblings, then render all together. Adds slight latency to "live" feel.

**Option B: Single animation, "best" path only** — REJECTED

Pick shortest/highest-SNR path, animate only that. Clean but loses coverage/redundancy info.

**Option C: Single origin pulse, staggered path reveals** — REJECTED

Origin pulses once, paths draw in sequence with delay. Dramatic but busy, and doesn't reflect reality (the propagation is simultaneous).

**Option D: Animate first, suppress siblings** — REJECTED (pragmatic but inaccurate)

First observation gets animation, subsequent same-hash observations silently logged. Simple but you never see alternate paths on the live map.

### Implementation notes (for when we build this)

- Need a client-side hash buffer: `Map<hash, {timer, packets[]}>`
- On first WS packet with new hash: start timer (configurable, ~1-2s)
- On subsequent packets with same hash: add to buffer, reset/extend timer
- On timer expiry: animate all buffered paths for that hash simultaneously
- Feed sidebar could show consolidated entry: "1 packet, 3 paths" with expand
- Buffer window should be configurable (config.json)
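The notes above can be sketched as a small factory. The name `createHashBuffer` and the injectable scheduler are ours; in the browser you would pass `setTimeout`/`clearTimeout` and wire `onFlush` into the animation layer:

```javascript
// Sketch of the client-side hash buffer: batch sibling observations of the
// same hash inside a short window, then flush them all at once.
function createHashBuffer({ windowMs = 1500, schedule = setTimeout, cancel = clearTimeout, onFlush }) {
  const pending = new Map(); // hash -> { timer, packets: [] }
  return {
    add(pkt) {
      let entry = pending.get(pkt.hash);
      if (!entry) {
        entry = { timer: null, packets: [] };
        pending.set(pkt.hash, entry);
      } else {
        cancel(entry.timer); // sibling arrived inside the window: extend it
      }
      entry.packets.push(pkt);
      entry.timer = schedule(() => {
        pending.delete(pkt.hash);
        onFlush(pkt.hash, entry.packets); // animate all buffered paths at once
      }, windowMs);
    },
  };
}
```

Injecting the scheduler keeps the window testable and makes it trivial to read `windowMs` from config.json later.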
## Status

**Discussion phase** — no code changes yet. User wants to finalize design before implementation. Live map changes tabled for later.
236
DEDUP-MIGRATION-PLAN.md
Normal file
@@ -0,0 +1,236 @@
# Packet Deduplication — Normalized Schema Migration Plan

## Overview

Split the monolithic `packets` table into two tables:

- **`transmissions`** — one row per unique physical transmission (keyed by content hash)
- **`observations`** — one row per observer sighting (SNR, RSSI, path, observer, timestamp)

This fixes inflated packet counts across the entire app and enables proper "1 transmission seen N times" semantics.

## Current State

**`packets` table**: 1 row per observation. ~61MB, ~30K+ rows. Same hash appears N times (once per observer). Fields mix transmission data (raw_hex, payload_type, decoded_json, hash) with observation data (observer_id, snr, rssi, path_json).

**`packet-store.js`**: In-memory mirror of the packets table. Indexes: `byId`, `byHash` (hash → [packets]), `byObserver`, `byNode`. All reads served from RAM. SQLite is write-only for packets.

**Touch surface**: ~66 SQL queries across db.js/server.js/packet-store.js. ~12 frontend files consume packet data.

---

## Milestone 1: Schema Migration (Backend Only)

**Goal**: New tables exist, data migrated, old table preserved as backup. No behavioral changes yet.

### Tasks

1. **Create new schema** in `db.js` init (indexes use `IF NOT EXISTS` so init stays idempotent, matching the table statements):

   ```sql
   CREATE TABLE IF NOT EXISTS transmissions (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     raw_hex TEXT NOT NULL,
     hash TEXT NOT NULL UNIQUE,
     first_seen TEXT NOT NULL,
     route_type INTEGER,
     payload_type INTEGER,
     payload_version INTEGER,
     decoded_json TEXT,
     created_at TEXT DEFAULT (datetime('now'))
   );

   CREATE TABLE IF NOT EXISTS observations (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     transmission_id INTEGER NOT NULL REFERENCES transmissions(id),
     hash TEXT NOT NULL,
     observer_id TEXT,
     observer_name TEXT,
     direction TEXT,
     snr REAL,
     rssi REAL,
     score INTEGER,
     path_json TEXT,
     timestamp TEXT NOT NULL,
     created_at TEXT DEFAULT (datetime('now'))
   );

   CREATE INDEX IF NOT EXISTS idx_transmissions_hash ON transmissions(hash);
   CREATE INDEX IF NOT EXISTS idx_transmissions_first_seen ON transmissions(first_seen);
   CREATE INDEX IF NOT EXISTS idx_transmissions_payload_type ON transmissions(payload_type);
   CREATE INDEX IF NOT EXISTS idx_observations_hash ON observations(hash);
   CREATE INDEX IF NOT EXISTS idx_observations_transmission_id ON observations(transmission_id);
   CREATE INDEX IF NOT EXISTS idx_observations_observer_id ON observations(observer_id);
   CREATE INDEX IF NOT EXISTS idx_observations_timestamp ON observations(timestamp);
   ```

2. **Write migration script** (`scripts/migrate-dedup.js`):
   - Read all rows from `packets` ordered by timestamp
   - Group by hash
   - For each unique hash: INSERT into `transmissions` (use the first observation's raw_hex, decoded_json, etc.)
   - For each row: INSERT into `observations` with a foreign key to its transmission
   - Verify counts: `SELECT COUNT(*) FROM observations` = old packets count
   - Verify: `SELECT COUNT(*) FROM transmissions` ≤ observations count
   - **Do NOT drop the old `packets` table** — rename it to `packets_backup`

3. **Print migration stats**: total packets, unique transmissions, dedup ratio, time taken
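The group-by-hash core of the script could look like this pure transform. `splitByHash` is our name, and only a few representative columns are shown; the real script would wrap the resulting INSERTs in a driver transaction:

```javascript
// Sketch of the migration core: split flat per-observation rows into
// one transmission per unique hash plus one observation per row.
function splitByHash(oldRows) {
  const transmissions = new Map(); // hash -> transmission row
  const observations = [];
  for (const row of oldRows) { // oldRows assumed ordered by timestamp
    if (!transmissions.has(row.hash)) {
      // First observation donates the transmission-level fields.
      transmissions.set(row.hash, {
        hash: row.hash,
        raw_hex: row.raw_hex,
        decoded_json: row.decoded_json,
        first_seen: row.timestamp,
      });
    }
    observations.push({
      hash: row.hash,
      observer_id: row.observer_id,
      snr: row.snr,
      path_json: row.path_json,
      timestamp: row.timestamp,
    });
  }
  return { transmissions: [...transmissions.values()], observations };
}
```

The validation invariants fall out directly: `observations.length` equals the old row count, and `transmissions.length` equals the number of distinct hashes.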
### Validation

- `COUNT(*) FROM observations` = `COUNT(*) FROM packets_backup`
- `COUNT(*) FROM transmissions` = `COUNT(DISTINCT hash) FROM packets_backup`
- Spot-check: pick 5 known multi-observer packets, verify transmission + observations match

### Risk: LOW — additive only, old data preserved

---

## Milestone 2: Dual-Write Ingest

**Goal**: New packets written to both old and new tables. Read path unchanged. Zero downtime.

### Tasks

1. **Update `db.js` `insertPacket()`**:
   - On new packet: check if a `transmissions` row exists for the hash
   - If not: INSERT into `transmissions`, get id
   - If yes: UPDATE `first_seen` if this timestamp is earlier
   - INSERT into `observations` with transmission_id
   - **Still also write to old `packets` table** (dual-write for safety)
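A minimal in-memory stand-in for that dual-write logic, assuming the SQL path mirrors it (`insertPacketDual` and the `db` shape are ours, not the real `db.js` API):

```javascript
// Sketch: upsert transmission, append observation, and keep writing the
// old flat row so the legacy read path still works.
function insertPacketDual(db, pkt) {
  let tx = db.transmissions.get(pkt.hash);
  if (!tx) {
    tx = { hash: pkt.hash, first_seen: pkt.timestamp };
    db.transmissions.set(pkt.hash, tx);
  } else if (pkt.timestamp < tx.first_seen) {
    tx.first_seen = pkt.timestamp; // this observation predates what we had
  }
  db.observations.push({ hash: pkt.hash, observer_id: pkt.observer_id, timestamp: pkt.timestamp });
  db.packets.push({ ...pkt }); // dual-write: old table untouched in shape
}
```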
2. **Update `packet-store.js` `insert()`**: mirror the dual-write in the in-memory model
   - Maintain both the old flat array AND a new `byTransmission` Map

### Validation

- Send test packets, verify they appear in both old and new tables
- Verify a multi-observer packet creates 1 transmission + N observations

### Risk: LOW — old read path still works as fallback

---

## Milestone 3: In-Memory Store Restructure

**Goal**: `packet-store.js` switches from flat packet array to transmission-centric model.

### Tasks

1. **New in-memory data model**:

   ```
   transmissions: Map<hash, {id, raw_hex, hash, first_seen, payload_type, decoded_json, observations: []}>
   ```

   Each observation: `{id, observer_id, observer_name, snr, rssi, path_json, timestamp}`

2. **Update indexes**:
   - `byHash`: hash → transmission object (1:1 instead of 1:N)
   - `byObserver`: observer_id → [observation references]
   - `byNode`: pubkey → [transmission references] (deduped!)
   - `byId`: observation.id → observation (for backward compat with packet detail links)

3. **Update `load()`**: Read from `transmissions` JOIN `observations` instead of `packets`

4. **Update query methods**:
   - `findPackets()` — returns transmissions by default, with `.observations` attached
   - `findPacketsForNode()` — returns transmissions where node appears in ANY observation's path/decoded_json
   - `getSiblings()` — becomes `getObservations(hash)` — trivial, just return `transmission.observations`
   - `countForNode()` — returns `{transmissions: N, observations: M}`

### Validation

- All existing API endpoints return valid data
- Packet counts decrease (correctly!) for multi-observer nodes
- `/api/perf` shows no regression

### Risk: MEDIUM — core read path changes. Test thoroughly.

---

## Milestone 4: API Response Changes

**Goal**: APIs return deduped data with observation counts.

### Tasks

1. **`GET /api/packets`**:
   - Default: return transmissions (1 row per unique packet)
   - Each transmission includes `observation_count` and optionally `observations[]`
   - `?expand=observations` to include full observation list
   - `?groupByHash` becomes the default behavior (deprecate param)
   - Preserve `observer` filter: return transmissions where at least one observation matches

2. **`GET /api/nodes/:pubkey/health`**:
   - `stats.totalPackets` → `stats.totalTransmissions` (distinct hashes)
   - Add `stats.totalObservations` (old count, for reference)
   - `recentPackets` → returns transmissions with observation_count

3. **`GET /api/nodes/bulk-health`**: Same changes as health

4. **`GET /api/nodes/network-status`**: Use transmission counts

5. **`GET /api/nodes/:pubkey/analytics`**: All throughput charts use transmission counts

6. **WebSocket broadcast**: Include `observation_count` when sibling observations exist for same hash

### Backward Compatibility

- Add `?legacy=1` param that returns old-style flat observations (for any external consumers)
- Include both `totalTransmissions` and `totalObservations` in health responses during transition
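The `?legacy=1` shape could be produced by flattening each transmission back into old-style rows. `flattenLegacy` is a hypothetical helper, not an existing server.js function:

```javascript
// Sketch: expand each deduped transmission into one old-style flat row
// per observation, for external consumers that expect the legacy shape.
function flattenLegacy(transmissions) {
  return transmissions.flatMap(tx =>
    tx.observations.map(obs => ({
      hash: tx.hash,
      raw_hex: tx.raw_hex,
      payload_type: tx.payload_type,
      ...obs, // observer_id, snr, rssi, path_json, timestamp
    }))
  );
}
```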

### Risk: MEDIUM — frontend expects certain shapes. May need coordinated deploy with Milestone 5.

---

## Milestone 5: Frontend Updates

**Goal**: UI shows correct counts and leverages observation data.

### Tasks

1. **Packets page**:
   - Default view shows transmissions (already has groupByHash mode — make it default)
   - Expand row to see individual observations with their paths/SNR/RSSI
   - Badge: "×3 observers" on grouped rows

2. **Node detail panel** (nodes.js + live.js):
   - Show "X transmissions" not "X packets"
   - Or "X packets (seen Y times)" to show both

3. **Home page**: Network stats use transmission counts

4. **Node analytics**: Throughput charts use transmissions

5. **Observer detail**: Keep observation counts (correct metric for observers)

6. **Analytics page**: Topology/RF analysis uses all observations (SNR per observation is valid data)

### Risk: LOW-MEDIUM — mostly display changes

---

## Milestone 6: Cleanup

**Goal**: Remove dual-write, drop old table, clean up.

### Tasks

1. Remove dual-write from `insertPacket()`
2. Drop `packets_backup` table (after confirming everything works for 1+ week)
3. Remove `?legacy=1` support if unused
4. Update DEDUP-DESIGN.md → mark as complete
5. VACUUM the database
6. Tag release (v2.3.0?)

### Risk: LOW — cleanup only, all functional changes already proven

---

## Estimated Scope

| Milestone | Files Modified | Complexity | Can Deploy Independently? |
|-----------|---------------|------------|--------------------------|
| 1. Schema Migration | db.js, new script | Low | Yes — additive only |
| 2. Dual-Write | db.js, packet-store.js | Low | Yes — old reads unchanged |
| 3. Memory Store | packet-store.js | Medium | No — must deploy with M4 |
| 4. API Changes | server.js, db.js | Medium | No — must deploy with M5 |
| 5. Frontend | 8+ public/*.js files | Medium | No — must deploy with M4 |
| 6. Cleanup | db.js, server.js | Low | Yes — after bake period |

**Milestones 1-2**: Safe to deploy independently, no user-visible changes.
**Milestones 3-5**: Must ship together (API shape changes + frontend expects new shape).
**Milestone 6**: Ship after 1 week bake.

## Open Questions

1. **Table naming**: `transmissions` + `observations`? Or keep `packets` + add `observations`? The word "transmission" is more accurate but "packet" is what the whole UI calls them.
2. **Packet detail URLs**: Currently `#/packet/123` uses the observation ID. Keep observation IDs as the URL key? Or switch to hash?
3. **Path dedup in paths table**: The `paths` table also has per-observation entries. Normalize that too, or leave as-is?
4. **Migration on prod**: Run migration script before deploying new code, or make new code handle both old and new schema?

@@ -82,7 +82,7 @@
     <script src="roles.js?v=1774028201"></script>
     <script src="app.js?v=1774034748"></script>
     <script src="home.js?v=1774042199"></script>
-    <script src="packets.js?v=1774051434"></script>
+    <script src="packets.js?v=1774051770"></script>
     <script src="map.js?v=1774028201" onerror="console.error('Failed to load:', this.src)"></script>
     <script src="channels.js?v=1774050030" onerror="console.error('Failed to load:', this.src)"></script>
     <script src="nodes.js?v=1774050030" onerror="console.error('Failed to load:', this.src)"></script>

@@ -698,7 +698,7 @@
     // Advertisements — show node name and role
     if (decoded.type === 'ADVERT' && decoded.name) {
       const role = decoded.flags?.repeater ? '📡' : decoded.flags?.room ? '🏠' : decoded.flags?.sensor ? '🌡' : '📻';
-      return `${role} ${escapeHtml(decoded.name)}`;
+      return `${role} <a href="#/nodes/${encodeURIComponent(decoded.pubKey)}" class="hop-link hop-named" data-hop-link="true">${escapeHtml(decoded.name)}</a>`;
     }
     // Direct messages
     if (decoded.type === 'TXT_MSG') return `✉️ ${decoded.srcHash?.slice(0,8) || '?'} → ${decoded.destHash?.slice(0,8) || '?'}`;

@@ -944,7 +944,7 @@
           fOff += 8;
         }
         if (decoded.flags.hasName) {
-          rows += fieldRow(fOff, 'Node Name', escapeHtml(decoded.name || ''), '');
+          rows += fieldRow(fOff, 'Node Name', decoded.pubKey ? `<a href="#/nodes/${encodeURIComponent(decoded.pubKey)}" class="hop-link hop-named" data-hop-link="true">${escapeHtml(decoded.name || '')}</a>` : escapeHtml(decoded.name || ''), '');
         }
       }
     } else if (decoded.type === 'GRP_TXT') {