Commit Graph

1289 Commits

Author SHA1 Message Date
Kpa-clawbot dc5b5ce9a0 fix: reject weak/default API keys + startup warning (#532) (#628)
## Summary

Hardens API key security for write endpoints (fixes #532):

1. **Constant-time comparison** — uses
`crypto/subtle.ConstantTimeCompare` to prevent timing attacks on API key
validation
2. **Weak key blocklist** — rejects known default/example keys (`test`,
`password`, `change-me`, `your-secret-api-key-here`, etc.)
3. **Minimum length enforcement** — keys shorter than 16 characters are
rejected
4. **Startup warning** — logs a clear warning if the configured key is
weak or a known default
5. **Generic error messages** — HTTP 403 response uses opaque
"forbidden" message to prevent information leakage about why a key was
rejected

### Security Model
- **Empty key** → all write endpoints disabled (403)
- **Weak/default key** → all write endpoints disabled (403), startup
warning logged
- **Wrong key** → 401 unauthorized
- **Strong correct key** → request proceeds

### Files Changed
- `cmd/server/config.go` — `IsWeakAPIKey()` function + blocklist
- `cmd/server/routes.go` — constant-time comparison via
`constantTimeEqual()`, weak key rejection
- `cmd/server/main.go` — startup warning for weak keys
- `cmd/server/apikey_security_test.go` — comprehensive test coverage
- `cmd/server/routes_test.go` — existing tests updated to use strong
keys

### Reviews
-  Self-review: all security properties verified
-  djb Final Review: timing fix correct, blocklist pragmatic, error
messages opaque, tests comprehensive. **Verdict: Ship it.**

### Test Results
All existing + new tests pass. Coverage includes: weak key detection
(blocklist + length + case-insensitive), empty key handling, strong key
acceptance, wrong key rejection, and constant-time comparison.

---------

Co-authored-by: you <you@example.com>
2026-04-05 14:50:40 -07:00
Kpa-clawbot f59b4629b0 feat: publish Docker images to GHCR + simplified deploy (#610) (#627)
## Summary

Implements M1-M2 of the deployment simplification spec (#610). Adds
pre-built multi-arch Docker images published to GHCR, plus a simplified
deploy experience for operators.

**Spec:**
[docs/specs/deployment-simplification.md](https://github.com/Kpa-clawbot/CoreScope/blob/master/docs/specs/deployment-simplification.md)

## Files Added (no existing files modified)

### 1. `.github/workflows/publish.yml`
Multi-arch Docker publish workflow:
- Triggers on `v*` tags (releases) → produces `vX.Y.Z`, `vX.Y`, `vX`,
`latest`
- Triggers on master push → produces `edge` (unstable)
- `workflow_dispatch` for manual runs
- QEMU + buildx for `linux/amd64` + `linux/arm64`
- GHCR auth via `GITHUB_TOKEN`
- GHA layer caching for fast rebuilds

### 2. `docker-compose.example.yml`
20-line compose file that pulls from GHCR (no local build required):
- Env var overrides: `HTTP_PORT`, `DATA_DIR`, `DISABLE_CADDY`,
`DISABLE_MOSQUITTO`
- Health check included
- Volume mount for data persistence

### 3. `DEPLOY.md`
Operator documentation:
- One-line `docker run` deploy
- Tag reference (pinned vs latest vs edge)
- Environment variables table
- Update path (`docker compose pull && docker compose up -d`)
- TLS options (Caddy auto-TLS vs reverse proxy)
- **Migration guide for existing manage.sh users** — both paths
documented with command equivalency table

## Review Status

-  Self-review: Actions syntax, GHCR auth, multi-arch, tag strategy,
security — all verified
-  Torvalds: Deploy UX is clean, one-liner works, right level of
simplicity
-  BUILD_TIME fixed: uses `date` command instead of fragile
`head_commit.timestamp`
-  Migration guide added for existing manage.sh admins
- ⚠️ `DISABLE_CADDY` env var documented but not implemented in
entrypoint — pre-existing bug, filed as #629

Fixes #610

---------

Co-authored-by: you <you@example.com>
2026-04-05 14:33:57 -07:00
Kpa-clawbot f7000992ca fix(rf-health): auto-scale airtime Y-axis + hover tooltips (#600) (#623)
## Summary

Addresses user feedback on #600 — two improvements to RF Health detail
panel charts:

### 1. Auto-scale airtime Y-axis
Previously fixed 0-100% which made low-activity nodes unreadable (e.g.
0.1% TX barely visible). Now auto-scales to the actual data range with
20% headroom (minimum 1%), matching how the noise floor chart already
works.

### 2. Hover tooltips on all chart data points
Invisible SVG `<circle>` elements with native `<title>` tooltips on
every data point across all 4 charts:
- **Noise floor**: `NF: -112.3 dBm` + UTC timestamp
- **Airtime**: `TX: 2.1%` or `RX: 8.3%` + UTC timestamp  
- **Error rate**: `Err: 0.05%` + UTC timestamp
- **Battery**: `Batt: 3.85V` + UTC timestamp

Uses native browser SVG tooltips — zero dependencies, accessible, no JS
event handlers.

### Design rationale (Tufte)
- Auto-scaling increases data-ink ratio by eliminating wasted vertical
space
- Tooltips provide detail-on-demand without cluttering the chart with
labels on every point

### Spec update
Added M2 feedback improvements section to
`docs/specs/rf-health-dashboard.md`.

---------

Co-authored-by: you <you@example.com>
2026-04-05 13:08:05 -07:00
Kpa-clawbot 30e7e9ae3c docs: document lock ordering for cacheMu and channelsCacheMu (#624)
## Summary

Documents the lock ordering for all five mutexes in `PacketStore`
(`store.go`) to prevent future deadlocks.

## What changed

Added a comment block above the `PacketStore` struct documenting:

- All 5 mutexes (`mu`, `cacheMu`, `channelsCacheMu`, `groupedCacheMu`,
`regionObsMu`)
- What each mutex guards
- The required acquisition order (numbered 1–5)
- The nesting relationships that exist today (`cacheMu →
channelsCacheMu` in `invalidateCachesFor` and `rebuildAnalyticsCaches`)
- Confirmation that no reverse ordering exists (no deadlock risk)

## Verification

- Grepped all lock acquisition sites to confirm no reverse nesting
exists
- `go build ./...` passes — documentation-only change

Fixes #413

---------

Co-authored-by: you <you@example.com>
2026-04-05 13:00:35 -07:00
Kpa-clawbot 3415d3babb fix: measure VSCROLL_ROW_HEIGHT and theadHeight dynamically (#625)
## Summary

Replaces hardcoded `VSCROLL_ROW_HEIGHT = 36` and `theadHeight = 40` in
the virtual scroll logic with dynamic DOM measurement, so the values
stay correct if CSS changes.

## Changes

- `VSCROLL_ROW_HEIGHT`: measured once from the first rendered data row's
`offsetHeight` after the initial full rebuild. Falls back to 36px until
measurement occurs.
- `theadHeight`: measured from the actual `<thead>` element's
`offsetHeight` on every `renderVisibleRows` call. Falls back to 40px if
no thead is found.
- Both variables are now `let` instead of `const` to allow runtime
updates.

## Performance

No performance impact — both measurements are single `offsetHeight`
reads (no reflow triggered since the DOM was just written). Row height
measurement runs only once (guarded by `_vscrollRowHeightMeasured`
flag). Thead measurement is a single property read per scroll event.

Fixes #407

Co-authored-by: you <you@example.com>
2026-04-05 13:00:20 -07:00
Kpa-clawbot 05fbcb09dd fix: wire cacheTTL.analyticsHashSizes config to collision cache (#420) (#622)
## Summary

Fixes #420 — wires `cacheTTL` config values to server-side cache
durations that were previously hardcoded.

## Problem

`collisionCacheTTL` was hardcoded at 60s in `store.go`. The config has
`cacheTTL.analyticsHashSizes: 3600` (1 hour) but it was never read — the
`/api/config/cache` endpoint just passed the raw map to the client
without applying values server-side.

## Changes

- **`store.go`**: Add `cacheTTLSec()` helper to safely extract duration
values from the `cacheTTL` config map. `NewPacketStore` now accepts an
optional `cacheTTL` map (variadic, backward-compatible) and wires:
  - `cacheTTL.analyticsHashSizes` → `collisionCacheTTL`
  - `cacheTTL.analyticsRF` → `rfCacheTTL`
- **Default changed**: `collisionCacheTTL` default raised from 60s →
3600s (1 hour). Hash collision computation is expensive and data changes
rarely — 60s was causing unnecessary recomputation.
- **`main.go`**: Pass `cfg.CacheTTL` to `NewPacketStore`.
- **Tests**: Added `TestCacheTTLFromConfig` and `TestCacheTTLDefaults`
in eviction_test.go. Updated existing `TestHashCollisionsCacheTTL` for
the new default.

## Audit of other cacheTTL values

The remaining `cacheTTL` keys (`stats`, `nodeDetail`, `nodeHealth`,
`nodeList`, `bulkHealth`, `networkStatus`, `observers`, `channels`,
`channelMessages`, `analyticsTopology`, `analyticsChannels`,
`analyticsSubpaths`, `analyticsSubpathDetail`, `nodeAnalytics`,
`nodeSearch`, `invalidationDebounce`) are **client-side only** — served
via `/api/config/cache` and consumed by the frontend. They don't have
corresponding server-side caches to wire to. The only server-side caches
(`rfCache`, `topoCache`, `hashCache`, `chanCache`, `distCache`,
`subpathCache`, `collisionCache`) all use either `rfCacheTTL` or
`collisionCacheTTL`, both now configurable.

## Complexity

O(1) config lookup at store init time. No hot-path impact.

Co-authored-by: you <you@example.com>
2026-04-05 12:49:46 -07:00
efiten b587f20d1c feat: add distance column to neighbor table in node details (#617)
Closes #616

## What

Adds a **Distance** column to the neighbor table on the node detail
page.

When both the viewed node and a neighbor have GPS coordinates recorded,
the table shows the haversine distance between them (e.g. `3.2 km`).
When either node lacks GPS, the cell shows `—`.

## Changes

**Backend** (`cmd/server/neighbor_api.go`):
- Added `distance_km *float64` (omitempty) to `NeighborEntry`
- In `handleNodeNeighbors`: look up source node coords from `nodeMap`,
then for each resolved (non-ambiguous) neighbor with GPS, compute
`haversineKm` and set the field

**Frontend** (`public/nodes.js`):
- Added `Distance` column header between Last Seen and Conf
- Cell renders `X.X km` or `—` (muted) when unavailable

**Tests** (`cmd/server/neighbor_api_test.go`):
- `TestNeighborAPI_DistanceKm_WithGPS`: two nodes with real coords →
`distance_km` is positive
- `TestNeighborAPI_DistanceKm_NoGPS`: two nodes at 0,0 → `distance_km`
is nil

## Verification

Test at **https://staging.on8ar.eu** — navigate to any node detail page
and scroll to the Neighbors section. Nodes with GPS coordinates show a
distance; those without show `—`.
2026-04-05 12:33:23 -07:00
you af9754dbea ci: move staging build+deploy to meshcore-runner-2
Prod VM (meshcore-vm) is now prod-only. Staging builds and
deploys on the secondary runner.
2026-04-05 17:33:15 +00:00
Kpa-clawbot 767c8a5a3e perf: async chunked backfill — HTTP serves within 2 minutes (#612) (#614)
## Summary

Adds two config knobs for controlling backfill scope and neighbor graph
data retention, plus removes the dead synchronous backfill function.

## Changes

### Config knobs

#### `resolvedPath.backfillHours` (default: 24)
Controls how far back (in hours) the async backfill scans for
observations with NULL `resolved_path`. Transmissions with `first_seen`
older than this window are skipped, reducing startup time for instances
with large historical datasets.

#### `neighborGraph.maxAgeDays` (default: 30)
Controls the maximum age of `neighbor_edges` entries. Edges with
`last_seen` older than this are pruned from both SQLite and the
in-memory graph. Pruning runs on startup (after a 4-minute stagger) and
every 24 hours thereafter.

### Dead code removal
- Removed the synchronous `backfillResolvedPaths` function that was
replaced by the async version.

### Implementation details
- `backfillResolvedPathsAsync` now accepts a `backfillHours` parameter
and filters by `tx.FirstSeen`
- `NeighborGraph.PruneOlderThan(cutoff)` removes stale edges from the
in-memory graph
- `PruneNeighborEdges(conn, graph, maxAgeDays)` prunes both DB and
in-memory graph
- Periodic pruning ticker follows the same pattern as metrics pruning
(24h interval, staggered start)
- Graceful shutdown stops the edge prune ticker

### Config example
Both knobs added to `config.example.json` with `_comment` fields.

## Tests
- Config default/override tests for both knobs
- `TestGraphPruneOlderThan` — in-memory edge pruning
- `TestPruneNeighborEdgesDB` — SQLite + in-memory pruning together
- `TestBackfillRespectsHourWindow` — verifies old transmissions are
excluded by backfill window

---------

Co-authored-by: you <you@example.com>
2026-04-05 09:49:39 -07:00
Kpa-clawbot 382b3505dc feat: channel color quick-assign UI (M2, #271) (#611)
## Summary

Implements M2 of channel color highlighting (#271): a right-click
context menu popover for quick-assigning colors to hash channels.

Builds on M1 (PR #607) which provides `ChannelColors.set/get/remove`
storage primitives.

## What's new

### Color picker popover (`channel-color-picker.js`)
- **Right-click** any GRP_TXT/CHAN row in the **live feed** or **packets
table** → opens a color picker popover at the click point
- **Long-press** (500ms) on mobile triggers the same popover
- **10 preset swatches** — maximally distinct, ColorBrewer-inspired
palette
- **Custom hex** — native `<input type="color">` with Apply button
- **Clear button** — removes color assignment (hidden when no color
assigned)
- **Popover positioning** — auto-adjusts to avoid viewport overflow
- **Dismiss** — click outside or Escape key

### Immediate feedback
- Assigning a color instantly re-styles all visible live feed items with
that channel
- Packets table triggers `renderVisibleRows()` via exposed
`window._packetsRenderVisible`

### Wiring
- Feed items store `_ccPkt` packet reference for channel extraction
- Picker installed via `registerPage` init hooks in both `live.js` and
`packets.js`
- Single shared popover DOM element, repositioned on each open

### Styling
- Dark card with border, matching existing CoreScope dropdown patterns
- CSS in `style.css` under `.cc-picker-*` classes
- Uses CSS variables (`--surface-1`, `--border`, `--accent`, etc.) for
theme compatibility

## Files changed

| File | Change |
|------|--------|
| `public/channel-color-picker.js` | New — popover component (IIFE, no
dependencies except `ChannelColors`) |
| `public/index.html` | Script tag for picker |
| `public/live.js` | Store `_ccPkt` on feed items, install picker on
init |
| `public/packets.js` | Install picker on init, expose
`_packetsRenderVisible` |
| `public/style.css` | Popover CSS |
| `test-channel-colors.js` | 2 new tests for picker loading and graceful
degradation |

## Testing

- All 21 channel-colors tests pass (19 M1 + 2 M2)
- All 445 frontend-helpers tests pass
- All 62 packet-filter tests pass

## Performance

No hot-path impact. The popover is a single shared DOM element created
lazily on first use. Context menu handlers use event delegation on the
feed/table containers (one listener each, not per-row). The
`refreshVisibleRows` function only iterates currently-visible DOM
elements.

Closes milestone M2 of #271.

---------

Co-authored-by: you <you@example.com>
2026-04-05 06:45:13 -07:00
you dc635775b5 docs: TUI spec updated with expert feedback + MVP definition 2026-04-05 07:12:11 +00:00
you 8a94c43334 docs: startup performance spec — serve HTTP within 2 minutes on any DB size 2026-04-05 07:09:55 +00:00
you 6aaa5cdc20 docs: add user guide — getting started, pages, config, FAQ 2026-04-05 07:09:54 +00:00
you 788005bff7 docs: clarify Docker tag strategy — pin to vX.Y.Z for production, edge for testing 2026-04-05 07:09:44 +00:00
you af03f9aa57 docs: deployment simplification spec — pre-built Docker images + one-line deploy 2026-04-05 07:06:35 +00:00
Kpa-clawbot 3328ca4354 feat: channel color highlighting M1 — core model + feed row (#271) (#607)
## Summary

Implements M1 of the [channel color highlighting
spec](docs/specs/channel-color-highlighting.md) for issue #271.

Allows users to assign custom highlight colors to specific hash
channels. When a `GRP_TXT` packet arrives with an assigned channel
color, the feed row and packets table row get:
- **4px colored left border** in the assigned color
- **Subtle background tint** (color at 10% opacity)

## What's included

### `public/channel-colors.js` — Storage model
- `ChannelColors.get(channel)` → hex color or null
- `ChannelColors.set(channel, color)` — assign a color
- `ChannelColors.remove(channel)` — clear assignment
- `ChannelColors.getAll()` → all assignments
- `ChannelColors.getRowStyle(typeName, channel)` → inline CSS string for
row highlighting
- Uses `localStorage` key `live-channel-colors`
- Gracefully handles corrupt/missing localStorage data

### Feed row highlighting (`public/live.js`)
- Both `addFeedItem` (live WS) and `addFeedItemDOM` (replay/DB load)
apply channel color styles
- Reads `decoded.payload.channelName` from the packet

### Packets table highlighting (`public/packets.js`)
- `buildFlatRowHtml` and `buildGroupRowHtml` apply channel color styles
to `<tr>` elements
- Reads channel from `getParsedDecoded(p).channel`

### Tests (`test-channel-colors.js`)
- 16 unit tests covering storage CRUD, edge cases (null, empty, corrupt
data), and style generation
- Tests verify only GRP_TXT/CHAN types get coloring, other types are
unaffected

## Design decisions

- **Only GRP_TXT/CHAN packets** — other types retain default
`TYPE_COLORS` styling
- **Channel color takes priority** over default type colors for row
highlighting
- **No UI for assigning colors yet** — that's M2 (right-click context
menu + color picker)
- **Storage key abstracted** behind functions to ease future migration
if customizer rework (#288) lands
- **10% opacity tint** (`#hexcolor` + `1a` suffix) ensures readability
in both dark/light modes

## Performance

- `getRowStyle()` is O(1) — single localStorage read + JSON parse per
call
- No per-packet API calls; all data is client-side
- No impact on hot rendering paths beyond one localStorage read per row
render

Closes #271 (M1 only — further milestones in separate PRs)

---------

Co-authored-by: you <you@example.com>
2026-04-05 00:03:17 -07:00
you 14732135b7 docs: proposal for terminal/TUI interface into CoreScope 2026-04-05 06:56:33 +00:00
Kpa-clawbot e42477b810 feat: collapsible panels + medium breakpoint on live map (#606)
## Summary

Adds collapsible/minimizable UI panels on the live map page so overlay
panels don't block map content on medium-sized screens.

Fixes #279

## Changes

### Collapsible Legend Panel (all screen sizes)
- The legend toggle button (🎨/✕) is now visible at **all** screen sizes,
not just mobile
- Clicking it smoothly collapses/expands the legend with a CSS
transition
- Collapsed state persists in `localStorage` (`live-legend-hidden`)
- Feed panel already had hide/show with localStorage — no changes needed
there

### Medium Breakpoint (768px)
New `@media (max-width: 768px)` rules for tablet/small laptop screens:
- Feed panel: 360px → 280px wide, max-height 340px → 200px
- Node detail panel: 320px → 260px wide
- Legend: smaller font (10px) and tighter padding
- Header: reduced gap and padding
- Stats/toggles: smaller font sizes

### What's NOT changed
- Mobile (≤640px): existing behavior preserved (feed/legend hidden
entirely)
- Desktop (>768px): no changes — panels render at full size as before

## Testing
- `test-packet-filter.js`: 62 passed
- `test-aging.js`: 29 passed  
- `test-frontend-helpers.js`: 445 passed

---------

Co-authored-by: you <you@example.com>
2026-04-04 23:56:07 -07:00
you cbc3e3ce13 docs: movable UI panels spec — draggable panel positioning (#279) 2026-04-05 06:54:45 +00:00
you 1796493ec0 docs: channel color highlighting spec (#271)
Custom color assignment for hash channels in Live tab.
Reviewed by Tufte, Torvalds, and Doshi personas.
2026-04-05 06:45:53 +00:00
you 168866ecb6 fix: View Route on Map button works on packet detail page
The button click handler used document.getElementById() which fails on
/packet/[ID] pages because renderDetail() runs before the container is
appended to the DOM. Changed to panel.querySelector() which searches
within the detached element tree.

Fixes #601
2026-04-05 06:43:59 +00:00
you be9257cd26 chore: switch license to GPL v3
Copyleft ensures all derivative works remain open source.
2026-04-05 06:36:03 +00:00
you b5b6faf90a chore: switch license from MIT to Apache 2.0
Adds patent protection for contributors while maintaining the same
permissive usage rights.
2026-04-05 06:35:38 +00:00
you 592061ec7e chore: add MIT license 2026-04-05 06:32:28 +00:00
you 596ccf2322 fix(rf-health): offset TX/RX airtime labels when overlapping
When TX and RX values are within 12px, TX label shifts up and RX shifts
down to avoid rendering on top of each other.
2026-04-05 06:31:02 +00:00
Kpa-clawbot 232770a858 feat(rf-health): M2 — airtime, error rate, battery charts with delta computation (#605)
## M2: Airtime + Channel Quality + Battery Charts

Implements M2 of #600 — server-side delta computation and three new
charts in the RF Health detail view.

### Backend Changes

**Delta computation** for cumulative counters (`tx_air_secs`,
`rx_air_secs`, `recv_errors`):
- Computes per-interval deltas between consecutive samples
- **Reboot handling:** detects counter reset (current < previous), skips
that delta, records reboot timestamp
- **Gap handling:** if time between samples > 2× interval, inserts null
(no interpolation)
- Returns `tx_airtime_pct` and `rx_airtime_pct` as percentages
(delta_secs / interval_secs × 100)
- Returns `recv_error_rate` as delta_errors / (delta_recv +
delta_errors) × 100

**`resolution` query param** on `/api/observers/{id}/metrics`:
- `5m` (default) — raw samples
- `1h` — hourly aggregates (GROUP BY hour with AVG/MAX)
- `1d` — daily aggregates

**Schema additions:**
- `packets_sent` and `packets_recv` columns added to `observer_metrics`
(migration)
- Ingestor parses these fields from MQTT stats messages

**API response** now includes:
- `tx_airtime_pct`, `rx_airtime_pct`, `recv_error_rate` (computed
deltas)
- `reboots` array with timestamps of detected reboots
- `is_reboot_sample` flag on affected samples

### Frontend Changes

Three new charts in the RF Health detail view, stacked vertically below
noise floor:

1. **Airtime chart** — TX (red) + RX (blue) as separate SVG lines,
Y-axis 0-100%, direct labels at endpoints
2. **Error Rate chart** — `recv_error_rate` line, shown only when data
exists
3. **Battery chart** — voltage line with 3.3V low reference, shown only
when battery_mv > 0

All charts:
- Share X-axis and time range (aligned vertically)
- Reboot markers as vertical hairlines spanning all charts
- Direct labels on data (no legends)
- Resolution auto-selected: `1h` for 7d/30d ranges
- Charts hidden when no data exists

### Tests

- `TestComputeDeltas`: normal deltas, reboot detection, gap detection
- `TestGetObserverMetricsResolution`: 5m/1h/1d downsampling verification
- Updated `TestGetObserverMetrics` for new API signature

---------

Co-authored-by: you <you@example.com>
2026-04-04 23:17:17 -07:00
you 747aea37b7 fix(rf-health): add region filter support to metrics summary
Frontend passes RegionFilter query string to summary API.
Backend filters results by observer IATA region.
Added iata field to MetricsSummaryRow.
2026-04-05 06:00:42 +00:00
you 968c104e14 feat(rf-health): show observer detail in side panel instead of page bottom
- Change RF Health detail view from bottom-of-page to a right-sliding side panel
- Grid stays visible and stable when detail is open (no layout shift)
- Click another observer updates panel in place; close button (×) dismisses
- On mobile (<640px): panel stacks below grid at full width
- Filter out observers with insufficient data (<2 sparkline points) from grid entirely
- Follows the same split-layout pattern used by the nodes page
2026-04-05 05:53:42 +00:00
Kpa-clawbot 6f35d4d417 feat: RF Health Dashboard M1 — observer metrics + small multiples grid (#604)
## RF Health Dashboard — M1: Observer Metrics Storage, API & Small
Multiples Grid

Implements M1 of #600.

### What this does

Adds a complete RF health monitoring pipeline: MQTT stats ingestion →
SQLite storage → REST API → interactive dashboard with small multiples
grid.

### Backend Changes

**Ingestor (`cmd/ingestor/`)**
- New `observer_metrics` table via migration system (`_migrations`
pattern)
- Parse `tx_air_secs`, `rx_air_secs`, `recv_errors` from MQTT status
messages (same pattern as existing `noise_floor` and `battery_mv`)
- `INSERT OR REPLACE` with timestamps rounded to nearest 5-min interval
boundary (using ingestor wall clock, not observer timestamps)
- Missing fields stored as NULLs — partial data is always better than no
data
- Configurable retention pruning: `retention.metricsDays` (default 30),
runs on startup + every 24h

**Server (`cmd/server/`)**
- `GET /api/observers/{id}/metrics?since=...&until=...` — per-observer
time-series data
- `GET /api/observers/metrics/summary?window=24h` — fleet summary with
current NF, avg/max NF, sample count
- `parseWindowDuration()` supports `1h`, `24h`, `3d`, `7d`, `30d` etc.
- Server-side metrics retention pruning (same config, staggered 2min
after packet prune)

### Frontend Changes

**RF Health tab (`public/analytics.js`, `public/style.css`)**
- Small multiples grid showing all observers simultaneously — anomalies
pop out visually
- Per-observer cell: name, current NF value, battery voltage, sparkline,
avg/max stats
- NF status coloring: warning (amber) at ≥-100 dBm, critical (red) at
≥-85 dBm — text color only, no background fills
- Click any cell → expanded detail view with full noise floor line chart
- Reference lines with direct text labels (`-100 warning`, `-85
critical`) — not color bands
- Min/max points labeled directly on the chart
- Time range selector: preset buttons (1h/3h/6h/12h/24h/3d/7d/30d) +
custom from/to datetime picker
- Deep linking: `#/analytics?tab=rf-health&observer=...&range=...`
- All charts use SVG, matching existing analytics.js patterns
- Responsive: 3-4 columns on desktop, 1 on mobile

### Design Decisions (from spec)
- Labels directly on data, not in legends
- Reference lines with text labels, not color bands
- Small multiples grid, not card+accordion (Tufte: instant visual fleet
comparison)
- Ingestor wall clock for all timestamps (observer clocks may drift)

### Tests Added

**Ingestor tests:**
- `TestRoundToInterval` — 5 cases for rounding to 5-min boundaries
- `TestInsertMetrics` — basic insertion with all fields
- `TestInsertMetricsIdempotent` — INSERT OR REPLACE deduplication
- `TestInsertMetricsNullFields` — partial data with NULLs
- `TestPruneOldMetrics` — retention pruning
- `TestExtractObserverMetaNewFields` — parsing tx_air_secs, rx_air_secs,
recv_errors

**Server tests:**
- `TestGetObserverMetrics` — time-series query with since/until filters,
NULL handling
- `TestGetMetricsSummary` — fleet summary aggregation
- `TestObserverMetricsAPIEndpoints` — DB query verification
- `TestMetricsAPIEndpoints` — HTTP endpoint response shape
- `TestParseWindowDuration` — duration parsing for h/d formats

### Test Results
```
cd cmd/ingestor && go test ./... → PASS (26s)
cd cmd/server && go test ./... → PASS (5s)
```

### What's NOT in this PR (deferred to M2+)
- Server-side delta computation for cumulative counters
- Airtime charts (TX/RX percentage lines)
- Channel quality chart (recv_error_rate)
- Battery voltage chart
- Reboot detection and chart annotations
- Resolution downsampling (1h, 1d aggregates)
- Pattern detection / automated diagnosis

---------

Co-authored-by: you <you@example.com>
2026-04-04 22:21:35 -07:00
you aaf00d0616 docs: add M5 Prometheus/Grafana metrics export to RF Health spec 2026-04-05 05:02:36 +00:00
you 41c046c974 docs: RF Health Dashboard spec — observer radio metrics
Per-observer time-series charts for noise floor, TX/RX airtime, CRC errors,
and battery. Small multiples grid design. MVP-first milestones.

Reviewed by Carmack (perf), Munger (failure modes), radio expert (hardware),
Tufte (visualization), and Doshi (product strategy).
2026-04-05 04:42:32 +00:00
efiten 1fbdd1c3d3 feat: Prefix Tool tab on Analytics page (#347) (#599)
## Summary

- Adds a new **Prefix Tool** tab to the Analytics page (alongside Hash
Stats / Hash Issues)
- **Network Overview**: per-tier collision stats (1/2/3-byte) and a
network-size-based recommendation — collapsible, folded by default
- **Prefix Checker**: accepts a 1/2/3-byte hex prefix or full public
key; shows colliding nodes at each tier with severity badges ( / ⚠️ /
🔴); clicking a node navigates to its detail page
- **Prefix Generator**: picks a random collision-free prefix at the
chosen hash size; links to
[meshcore-web-keygen](https://agessaman.github.io/meshcore-web-keygen/)
with the prefix pre-filled
- **Hash Issues tab**: adds a "🔎 Check a prefix →" shortcut in the nav
- **Deep-link support**: `#/analytics?tab=prefix-tool&prefix=A3F1`
pre-fills and runs the checker; `?generate=2` pre-selects and runs the
generator
- **No new API endpoints** — 100% client-side using the existing
`/nodes` list

## Verification

Live on staging:
**https://staging.on8ar.eu/#/analytics?tab=prefix-tool**

## Test plan

- [x] Network Overview card is collapsed by default; expands on click;
stats are correct
- [x] Prefix Checker: 2-char input shows 1-byte results; 4-char shows
2-byte; 6-char shows 3-byte; 64-char pubkey shows all three tiers
- [x] Prefix Checker: invalid hex shows error; odd-length input shows
error
- [x] Prefix Generator: Generate picks an unused prefix; "Try another"
cycles; keygen link opens with prefix pre-filled
- [x] Deep link `?prefix=A3F1` pre-fills checker and scrolls to it
- [x] Deep link `?generate=2` pre-selects 2-byte and runs generator
- [x] Hash Issues tab shows "🔎 Check a prefix →" in the nav
- [x] FAQ link at bottom of generator opens correct MeshCore docs anchor

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 20:18:32 -07:00
efiten d34320fa6c fix: use _getColCount() in error-state row to match spacers (#406) (#597)
## Summary

The error-state `<tbody>` row (shown when packet loading fails)
hardcoded `colspan="10"`, while the virtual scroll spacers and the
empty-state row both use `_getColCount()` (which reads from the actual
`<thead>` and falls back to 11). One-line fix: replace the hardcoded
value with `_getColCount()`.

Fixes #406

## Test plan

- [x] Trigger the error state (e.g. kill the backend mid-load) — error
row should span all columns with no gap on the right
- [x] `node test-packets.js` — 72 passed, 0 failed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 19:41:55 -07:00
efiten 77b7c33d0f perf: incremental DOM diff in renderVisibleRows (#414) (#596)
## Summary

- Replace full \`tbody\` teardown+rebuild on every scroll frame with a
range-diff that only adds/removes the delta rows at the edges of the
visible window
- \`buildFlatRowHtml\` / \`buildGroupRowHtml\` now accept an
\`entryIdx\` parameter and emit \`data-entry-idx\` on every \`<tr>\` so
the diff can target rows precisely (including expanded group children)
- Full rebuild is retained for initial render and large scroll jumps
past the buffer (no range overlap)
- Also loads \`packet-helpers.js\` in the test sandbox, fixing 7
pre-existing test failures for the builder functions; adds 4 new tests
covering \`data-entry-idx\` output

Fixes #414

## Test plan

- [x] Open packets page with 500+ packets, scroll rapidly — DOM
inspector should show incremental \`<tr>\` adds/removes rather than full
\`tbody\` teardown
- [x] Expand a grouped packet, scroll away and back — expanded children
re-render correctly
- [x] Large scroll jump (jump to bottom via scrollbar) — full rebuild
fires, no visual glitch
- [x] \`node test-packets.js\` — 72 passed, 0 failed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: you <you@example.com>
2026-04-04 19:41:33 -07:00
you 0a55717283 docs: add PSK brute-force attack with timestamp oracle to security analysis
Weak passphrases with no KDF stretching are the #1 practical threat.
Timestamp in plaintext block 0 serves as known-plaintext oracle for
instant key verification from a single captured packet.

Key findings:
- decode_base64() output used directly as AES key, no KDF
- Short passphrases produce <16 byte keys (reduced key space)
- No salt means global precomputed attacks work
- 3-word passphrase crackable in ~2 min on commodity GPU

Reviewed by djb and Dijkstra personas. Corrections applied:
- GPU throughput upgraded from 10^9 to 10^10 AES/sec baseline
- Oracle strengthened: bytes 4+ (type byte, sender name) also predictable
- Dictionary size assumptions made explicit
- Zipf's law caveat added (humans don't choose uniformly)
- base64 short-passphrase key truncation issue documented
2026-04-05 00:58:57 +00:00
you bcab31bf72 docs: AES-128-ECB security analysis — block-level vulnerability assessment
Formal analysis of MeshCore's ECB encryption for channel and direct messages.
Reviewed by djb and Dijkstra expert personas through 3 revisions.

Key findings:
- Block 0 has accidental nonce (4-byte timestamp) preventing repetition
- Blocks 1+ are pure deterministic ECB with no nonce — vulnerable to
  frequency analysis for repeated message content
- Partial final block attack: zero-padding reduces search space
- HMAC key reuse: AES key is first 16 bytes of HMAC key (same material)
- Recommended fix: switch to AES-128-CTR mode
2026-04-05 00:44:21 +00:00
Kpa-clawbot 6ae62ce535 perf: make txToMap observations lazy via ExpandObservations flag (#595)
## Summary

`txToMap()` previously always allocated observation sub-maps for every
packet, even though the `/api/packets` handler immediately stripped them
via `delete(p, "observations")` unless `expand=observations` was
requested. A typical page of 50 packets with ~5 observations each caused
300+ unnecessary map allocations per request.

## Changes

- **`txToMap`**: Add variadic `includeObservations bool` parameter.
Observations are only built when `true` is passed, eliminating
allocations when they'd just be discarded.
- **`PacketQuery`**: Add `ExpandObservations bool` field to thread the
caller's intent through the query pipeline.
- **`routes.go`**: Set `ExpandObservations` based on
`expand=observations` query param. Removed the post-hoc `delete(p,
"observations")` loop — observations are simply never created when not
requested.
- **Single-packet lookups** (`GetPacketByID`, `GetPacketByHash`): Always
pass `true` since detail views need observations.
- **Multi-node/analytics queries**: Default (no flag) = no observations,
matching prior behavior.

## Testing

- Added `TestTxToMapLazyObservations` covering all three cases: no flag,
`false`, and `true`.
- All existing tests pass (`go test ./...`).

## Perf Impact

Eliminates ~250 observation map allocations per /api/packets request (at
default page size of 50 with ~5 observations each). This is a
constant-factor improvement per request — no algorithmic complexity
change.

Fixes #374

Co-authored-by: you <you@example.com>
2026-04-04 10:39:30 -07:00
Kpa-clawbot 6e2f79c0ad perf: optimize QueryGroupedPackets — cache observer count, defer map construction (#594)
## Summary

Optimizes `QueryGroupedPackets()` in `store.go` to eliminate two major
inefficiencies on every grouped packet list request:

### Changes

1. **Cache `UniqueObserverCount` on `StoreTx`** — Instead of iterating
all observations to count unique observers on every query
(O(total_observations) per request), we now track unique observers at
ingest time via an `observerSet` map and pre-computed
`UniqueObserverCount` field. This is updated incrementally as
observations arrive.

2. **Defer map construction until after pagination** — Previously,
`map[string]interface{}` was built for ALL 30K+ filtered results before
sorting and paginating. Now the grouped cache stores sorted `[]*StoreTx`
pointers (lightweight), and `groupedTxsToPage()` builds maps only for
the requested page (typically 50 items). This eliminates ~30K map
allocations per cache miss.

3. **Lighter cache footprint** — The grouped cache now stores
`[]*StoreTx` instead of `*PacketResult` with pre-built maps, reducing
memory pressure and GC work.

### Complexity

- Observer counting: O(1) per query (was O(total_observations))
- Map construction: O(page_size) per query (was O(n) where n = all
filtered results)
- Sort remains O(n log n) on cache miss, but the cache (3s TTL) absorbs
repeated requests

### Testing

- `cd cmd/server && go test ./...` — all tests pass
- `cd cmd/ingestor && go build ./...` — builds clean

Fixes #370

---------

Co-authored-by: you <you@example.com>
2026-04-04 10:39:04 -07:00
Kpa-clawbot b0862f7a41 fix: replace time.Tick with NewTicker in prune goroutine for graceful shutdown (#593)
## Summary

Replace `time.Tick()` with `time.NewTicker()` in the auto-prune
goroutine so it stops cleanly during graceful shutdown.

## Problem

`time.Tick` creates a ticker that can never be garbage collected or
stopped. While the prune goroutine runs for the process lifetime, it
won't stop during graceful shutdown — the goroutine leaks past the
shutdown sequence.

## Fix

- Create a `time.NewTicker` and a done channel
- Use `select` to listen on both the ticker and done channel
- Stop the ticker and close the done channel in the shutdown path (after
`poller.Stop()`)
- Pattern matches the existing `StartEvictionTicker()` approach

## Testing

- `go build ./...` — compiles cleanly
- `go test ./...` — all tests pass

Fixes #377

Co-authored-by: you <you@example.com>
2026-04-04 10:38:37 -07:00
Kpa-clawbot 45991eca09 perf: combine chained filterPackets passes into single scan (#592)
## Summary

Combines the chained `filterTxSlice` calls in `filterPackets()` into a
single pass over the packet slice.

## Problem

When multiple filter parameters are specified (e.g.,
`type=4&route=1&since=...&until=...`), each filter created a new
intermediate `[]*StoreTx` slice. With N filters, this meant N separate
scans and N-1 unnecessary allocations.

## Fix

All filter predicates (type, route, observer, hash, since, until,
region, node) are pre-computed before the loop, then evaluated in a
single `filterTxSlice` call. This eliminates all intermediate
allocations.

**Preserved behavior:**
- Fast-path index lookups for hash-only and observer-only queries remain
unchanged
- Node-only fast-path via `byNode` index preserved
- All existing filter semantics maintained (same comparison operators,
same null checks)

**Complexity:** Single `O(n)` pass regardless of how many filters are
active, vs previous `O(n * k)` where k = number of active filters (each
pass is O(n) but allocates).

## Testing

All existing tests pass (`cd cmd/server && go test ./...`).

Fixes #373

Co-authored-by: you <you@example.com>
2026-04-04 10:38:10 -07:00
Kpa-clawbot 76c42556a2 perf: sort snrVals/rssiVals once in computeAnalyticsRF (#591)
## Summary

Sort `snrVals` and `rssiVals` once upfront in `computeAnalyticsRF()` and
read min/max/median directly from the sorted slices, instead of copying
and sorting per stat call.

## Changes

- Sort both slices once before computing stats (2 sorts total instead of
4+ copy+sorts)
- Read `min` from `sorted[0]`, `max` from `sorted[len-1]`, `median` from
`sorted[len/2]`
- Remove the now-unused `sortedF64` and `medianF64` helper closures

## Performance impact

With 100K+ observations, this eliminates multiple O(n log n) copy+sort
operations. Previously each call to `medianF64` did a full copy + sort,
and `minF64`/`maxF64` did O(n) scans on the unsorted array. Now: 2
in-place sorts total, O(1) lookups for min/max/median.

Fixes #366

Co-authored-by: you <you@example.com>
2026-04-04 10:37:42 -07:00
Kpa-clawbot 6f8378a31c perf: batch-remove from secondary indexes in EvictStale (#590)
## Summary

`EvictStale()` was doing O(n) linear scans per evicted item to remove
from secondary indexes (`byObserver`, `byPayloadType`, `byNode`).
Evicting 1000 packets from an observer with 50K observations meant 1000
× 50K = 50M comparisons — all under a write lock.

## Fix

Replace per-item removal with batch single-pass filtering:

1. **Collect phase**: Walk evicted packets once, building sets of
evicted tx IDs, observation IDs, and affected index keys
2. **Filter phase**: For each affected index slice, do a single pass
keeping only non-evicted entries

**Before**: O(evicted_count × index_slice_size) per index — quadratic in
practice
**After**: O(evicted_count + index_slice_size) per affected key — linear

## Changes

- `cmd/server/store.go`: Restructured `EvictStale()` eviction loop into
collect + batch-filter pattern

## Testing

- All existing tests pass (`cd cmd/server && go test ./...`)

Fixes #368

Co-authored-by: you <you@example.com>
2026-04-04 10:37:27 -07:00
Kpa-clawbot 56115ee0a4 perf: use byNode index in QueryMultiNodePackets instead of full scan (#589)
## Summary

`QueryMultiNodePackets()` was scanning ALL packets with
`strings.Contains` on JSON blobs — O(packets × pubkeys × json_length).
With 30K+ packets and multiple pubkeys, this caused noticeable latency
on `/api/packets?nodes=...`.

## Fix

Replace the full scan with lookups into the existing `byNode` index,
which already maps pubkeys to their transmissions. Merge results with
hash-based deduplication, then apply time filters.

**Before:** O(N × P × J) where N=all packets, P=pubkeys, J=avg JSON
length
**After:** O(M × P) where M=packets per pubkey (typically small), plus
O(R log R) sort for pagination correctness

Results are sorted by `FirstSeen` after merging to maintain the
oldest-first ordering expected by the pagination logic.

Fixes #357

Co-authored-by: you <you@example.com>
2026-04-04 10:36:59 -07:00
Kpa-clawbot 321d1cf913 perf: apply time filter early in GetNodeAnalytics to avoid full packet scan (#588)
## Problem

`GetNodeAnalytics()` in `store.go` scans ALL 30K+ packets doing
`strings.Contains` on every JSON blob when the node has a name, then
filters by time range *after* the full scan. This is `O(packets ×
json_length)` on every `/api/nodes/{pubkey}/analytics` request.

## Fix

Move the `fromISO` time check inside the scan loop so old packets are
skipped **before** the expensive `strings.Contains` matching. For the
non-name path (indexed-only), the time filter is also applied inline,
eliminating the separate `allPkts` intermediate slice.

### Before
1. Scan all packets → collect matches (including old ones) → `allPkts`
2. Filter `allPkts` by time → `packets`

### After
1. Scan packets, skip `tx.FirstSeen <= fromISO` immediately → `packets`

This avoids `strings.Contains` calls on packets outside the requested
time window (typically 7 days out of months of data).

## Complexity
- **Before:** `O(total_packets × avg_json_length)` for name matching
- **After:** `O(recent_packets × avg_json_length)` — only packets within
the time window are string-matched

## Testing
- `cd cmd/server && go test ./...` — all tests pass

Fixes #367

Co-authored-by: you <you@example.com>
2026-04-04 10:36:49 -07:00
Kpa-clawbot 790a713ba9 perf: combine 4 subpath API calls into single bulk endpoint (#587)
## Summary

Consolidates the 4 parallel `/api/analytics/subpaths` calls in the Route
Patterns tab into a single `/api/analytics/subpaths-bulk` endpoint,
eliminating 3 redundant server-side scans of the subpath index on cache
miss.

## Changes

### Backend (`cmd/server/routes.go`, `cmd/server/store.go`)
- New `GET
/api/analytics/subpaths-bulk?groups=2-2:50,3-3:30,4-4:20,5-8:15`
endpoint
- Groups format: `minLen-maxLen:limit` comma-separated
- `GetAnalyticsSubpathsBulk()` iterates `spIndex` once, bucketing
entries into per-group accumulators by hop length
- Hop name resolution is done once per raw hop and shared across groups
- Results are cached per-group for compatibility with existing
single-key cache lookups
- Region-filtered queries fall back to individual
`GetAnalyticsSubpaths()` calls (region filtering requires
per-transmission observer checks)

### Frontend (`public/analytics.js`)
- `renderSubpaths()` now makes 1 API call instead of 4
- Response shape: `{ results: [{ subpaths, totalPaths }, ...] }` —
destructured into the same `[d2, d3, d4, d5]` variables

### Tests (`cmd/server/routes_test.go`)
- `TestAnalyticsSubpathsBulk`: validates 3-group response shape, missing
params error, invalid format error

## Performance

- **Before:** 4 API calls → 4 scans of `spIndex` + 4× hop resolution on
cache miss
- **After:** 1 API call → 1 scan of `spIndex` + 1× hop resolution
(shared cache)
- Cache miss cost reduced by ~75% for this tab
- No change on cache hit (individual group caching still works)

Fixes #398

Co-authored-by: you <you@example.com>
2026-04-04 10:19:18 -07:00
Kpa-clawbot cd470dffbe perf: batch observation fetching to eliminate N+1 API calls on sort change (#586)
## Summary

Fixes the N+1 API call pattern when changing observation sort mode on
the packets page. Previously, switching sort to Path or Time fired
individual `/api/packets/{hash}` requests for **every**
multi-observation group without cached children — potentially 100+
concurrent requests.

## Changes

### Backend: Batch observations endpoint
- **New endpoint:** `POST /api/packets/observations` accepts `{"hashes":
["h1", "h2", ...]}` and returns all observations keyed by hash in a
single response
- Capped at 200 hashes per request to prevent abuse
- 4 test cases covering empty input, invalid JSON, too-many-hashes, and
valid requests

### Frontend: Use batch endpoint
- `packets.js` sort change handler now collects all hashes needing
observation data and sends a single POST request instead of N individual
GETs
- Same behavior, single round-trip

## Performance

- **Before:** Changing sort with 100 visible groups → 100 concurrent API
requests, browser connection queueing (6 per host), several seconds of
lag
- **After:** Single POST request regardless of group count, response
time proportional to store lookup (sub-millisecond per hash in memory)

Fixes #389

---------

Co-authored-by: you <you@example.com>
2026-04-04 10:18:40 -07:00
Kpa-clawbot 7ff89d8607 perf(packets): coalesce WS-triggered renders with requestAnimationFrame (#585)
## Summary

Coalesce WS-triggered `renderTableRows()` calls using
`requestAnimationFrame` instead of `setTimeout` debouncing.

Fixes #396

## Problem

During high WebSocket throughput, multiple WS batches could each trigger
a `renderTableRows()` call via `setTimeout(..., 200)`. With rapid
batches, this caused the 50K-row table to be fully rebuilt every few
hundred milliseconds, causing UI jank.

## Solution

Replace the `setTimeout`-based debounce with a `requestAnimationFrame`
coalescing pattern:

1. **`scheduleWSRender()`** — sets a dirty flag and schedules a single
rAF callback
2. **Dirty flag** — multiple WS batches within the same frame just set
the flag; only one render fires
3. **Cleanup** — `destroy()` cancels any pending rAF and resets the
dirty flag

This ensures at most **one `renderTableRows()` per animation frame**
(~16ms), regardless of how many WS batches arrive.

## Performance justification

- **Before:** Each WS batch → `setTimeout(renderTableRows, 200)` — N
batches in <200ms = N renders
- **After:** N batches in one frame → 1 render on next rAF (~16ms)
- Worst case goes from O(N) renders per second to O(60) renders per
second (frame-capped)

## Changes

- `public/packets.js`: Add `scheduleWSRender()` with rAF + dirty flag;
replace setTimeout in WS handler; clean up in `destroy()`
- `test-frontend-helpers.js`: Update tests to verify rAF coalescing
pattern instead of setTimeout debounce

## Testing

- All existing tests pass (`npm test` — 0 failures)
- Updated 2 test cases to verify new rAF coalescing behavior

Co-authored-by: you <you@example.com>
2026-04-04 10:18:09 -07:00
Kpa-clawbot 493849f2e3 perf(frontend): compress og-image.png from 1.1MB to 235KB (#584)
## Summary

Compress `public/og-image.png` from **1,159,050 bytes (1.1MB)** to
**234,899 bytes (235KB)** — an **80% reduction**.

## What Changed

- Applied lossy PNG quantization via `pngquant` (quality 45-65, speed 1)
- Image dimensions unchanged: 1200×630px (standard OG image size)
- Visual quality remains suitable for social media previews

## Why

A 1.1MB OpenGraph image is excessive. Typical OG images are 50-200KB.
This reduces deployment size and Git repo bloat without affecting
functionality (browsers don't preload OG images).

## Testing

- Unit tests pass (`npm run test:unit`)
- No code changes — image-only commit
- `index.html` reference unchanged (`<meta property="og:image"
content="/og-image.png">`)

Fixes #397

Co-authored-by: you <you@example.com>
2026-04-04 10:17:21 -07:00
Kpa-clawbot 87ac61748c perf(analytics): compute network status client-side, eliminate redundant API call (#583)
## Summary

Reduces the analytics nodes tab from 3 parallel API calls to 2 by
computing network status (active/degraded/silent counts) client-side
instead of fetching from `/nodes/network-status`.

## What Changed

**`public/analytics.js` — `renderNodesTab()`:**
- Removed the `/nodes/network-status` API call from the `Promise.all`
batch
- Added client-side computation of active/degraded/silent counts using
the shared `getHealthThresholds()` function from `roles.js`
- Uses `nodesResp.total` and `nodesResp.counts` (already returned by
`/nodes` endpoint) for total node count and role breakdown

## Why This Works

The `/nodes` response already includes:
- `total` — count of all matching nodes (server-computed across full DB)
- `counts` — role counts across all nodes (from `GetAllRoleCounts()`)
- Per-node `last_seen`/`last_heard` timestamps

The `getHealthThresholds()` function in `roles.js` provides the same
degraded/silent thresholds used server-side, so client-side status
computation produces equivalent results for the loaded node set.

## Performance

- **Before:** 3 parallel API calls (`/nodes`, `/nodes/bulk-health`,
`/nodes/network-status`)
- **After:** 2 parallel API calls (`/nodes`, `/nodes/bulk-health`)
- Network status computation is O(n) over the 200 loaded nodes —
negligible client-side cost
- The `/nodes/network-status` endpoint scanned ALL nodes in the DB on
every call; this eliminates that server-side work entirely

## Testing

- All frontend helper tests pass (445/445)
- All packet filter tests pass (62/62)  
- All aging tests pass (29/29)
- All Go backend tests pass

Fixes #392

---------

Co-authored-by: you <you@example.com>
2026-04-04 10:17:05 -07:00
Kpa-clawbot 26de38f4b6 perf(map): reposition markers on zoom/resize instead of full rebuild (#582)
## Summary

Eliminates visible marker flicker on zoom/resize events in the map page
when displaying 500+ nodes.

## Problem

`renderMarkers()` was called on every `zoomend` and `resize` event,
which did `markerLayer.clearLayers()` followed by a full rebuild of all
markers. With many nodes, this caused a visible flash where all markers
disappeared briefly before being re-added.

## Solution

Instead of rebuilding all markers from scratch on zoom/resize:

1. **Store Leaflet layer references** on marker data objects
(`_leafletMarker`, `_leafletLine`, `_leafletDot`) during the initial
full render
2. **Add `_repositionMarkers()`** — re-runs `deconflictLabels()` at the
new zoom level and updates existing marker positions via
`setLatLng()`/`setLatLngs()` without clearing the layer group
3. **Debounce zoom/resize handlers** (150ms) to coalesce rapid events
during animated zooms
4. **Dynamically manage offset indicators** — adds/removes deconfliction
offset lines and dots as positions change at different zoom levels

Full `renderMarkers()` is still called for filter changes, data updates,
and theme changes — only zoom/resize uses the lightweight repositioning
path.

## Complexity

- `_repositionMarkers()`: O(n) — single pass over stored marker data
- `deconflictLabels()`: O(n × k) where k is max spiral offsets (48) —
unchanged
- No new API calls, no DOM rebuilds

Fixes #393

---------

Co-authored-by: you <you@example.com>
2026-04-04 17:16:48 +00:00