Commit Graph

12 Commits

Author SHA1 Message Date
Kpa-clawbot
dc5b5ce9a0 fix: reject weak/default API keys + startup warning (#532) (#628)
## Summary

Hardens API key security for write endpoints (fixes #532):

1. **Constant-time comparison** — uses
`crypto/subtle.ConstantTimeCompare` to prevent timing attacks on API key
validation
2. **Weak key blocklist** — rejects known default/example keys (`test`,
`password`, `change-me`, `your-secret-api-key-here`, etc.)
3. **Minimum length enforcement** — keys shorter than 16 characters are
rejected
4. **Startup warning** — logs a clear warning if the configured key is
weak or a known default
5. **Generic error messages** — HTTP 403 response uses opaque
"forbidden" message to prevent information leakage about why a key was
rejected

### Security Model
- **Empty key** → all write endpoints disabled (403)
- **Weak/default key** → all write endpoints disabled (403), startup
warning logged
- **Wrong key** → 401 unauthorized
- **Strong correct key** → request proceeds

### Files Changed
- `cmd/server/config.go` — `IsWeakAPIKey()` function + blocklist
- `cmd/server/routes.go` — constant-time comparison via
`constantTimeEqual()`, weak key rejection
- `cmd/server/main.go` — startup warning for weak keys
- `cmd/server/apikey_security_test.go` — comprehensive test coverage
- `cmd/server/routes_test.go` — existing tests updated to use strong
keys

### Reviews
-  Self-review: all security properties verified
-  djb Final Review: timing fix correct, blocklist pragmatic, error
messages opaque, tests comprehensive. **Verdict: Ship it.**

### Test Results
All existing + new tests pass. Coverage includes: weak key detection
(blocklist + length + case-insensitive), empty key handling, strong key
acceptance, wrong key rejection, and constant-time comparison.

---------

Co-authored-by: you <you@example.com>
2026-04-05 14:50:40 -07:00
Kpa-clawbot
767c8a5a3e perf: async chunked backfill — HTTP serves within 2 minutes (#612) (#614)
## Summary

Adds two config knobs for controlling backfill scope and neighbor graph
data retention, plus removes the dead synchronous backfill function.

## Changes

### Config knobs

#### `resolvedPath.backfillHours` (default: 24)
Controls how far back (in hours) the async backfill scans for
observations with NULL `resolved_path`. Transmissions with `first_seen`
older than this window are skipped, reducing startup time for instances
with large historical datasets.

#### `neighborGraph.maxAgeDays` (default: 30)
Controls the maximum age of `neighbor_edges` entries. Edges with
`last_seen` older than this are pruned from both SQLite and the
in-memory graph. Pruning runs on startup (after a 4-minute stagger) and
every 24 hours thereafter.

### Dead code removal
- Removed the synchronous `backfillResolvedPaths` function that was
replaced by the async version.

### Implementation details
- `backfillResolvedPathsAsync` now accepts a `backfillHours` parameter
and filters by `tx.FirstSeen`
- `NeighborGraph.PruneOlderThan(cutoff)` removes stale edges from the
in-memory graph
- `PruneNeighborEdges(conn, graph, maxAgeDays)` prunes both DB and
in-memory graph
- Periodic pruning ticker follows the same pattern as metrics pruning
(24h interval, staggered start)
- Graceful shutdown stops the edge prune ticker

### Config example
Both knobs added to `config.example.json` with `_comment` fields.

## Tests
- Config default/override tests for both knobs
- `TestGraphPruneOlderThan` — in-memory edge pruning
- `TestPruneNeighborEdgesDB` — SQLite + in-memory pruning together
- `TestBackfillRespectsHourWindow` — verifies old transmissions are
excluded by backfill window

---------

Co-authored-by: you <you@example.com>
2026-04-05 09:49:39 -07:00
Kpa-clawbot
6f35d4d417 feat: RF Health Dashboard M1 — observer metrics + small multiples grid (#604)
## RF Health Dashboard — M1: Observer Metrics Storage, API & Small
Multiples Grid

Implements M1 of #600.

### What this does

Adds a complete RF health monitoring pipeline: MQTT stats ingestion →
SQLite storage → REST API → interactive dashboard with small multiples
grid.

### Backend Changes

**Ingestor (`cmd/ingestor/`)**
- New `observer_metrics` table via migration system (`_migrations`
pattern)
- Parse `tx_air_secs`, `rx_air_secs`, `recv_errors` from MQTT status
messages (same pattern as existing `noise_floor` and `battery_mv`)
- `INSERT OR REPLACE` with timestamps rounded to nearest 5-min interval
boundary (using ingestor wall clock, not observer timestamps)
- Missing fields stored as NULLs — partial data is always better than no
data
- Configurable retention pruning: `retention.metricsDays` (default 30),
runs on startup + every 24h

**Server (`cmd/server/`)**
- `GET /api/observers/{id}/metrics?since=...&until=...` — per-observer
time-series data
- `GET /api/observers/metrics/summary?window=24h` — fleet summary with
current NF, avg/max NF, sample count
- `parseWindowDuration()` supports `1h`, `24h`, `3d`, `7d`, `30d` etc.
- Server-side metrics retention pruning (same config, staggered 2min
after packet prune)

### Frontend Changes

**RF Health tab (`public/analytics.js`, `public/style.css`)**
- Small multiples grid showing all observers simultaneously — anomalies
pop out visually
- Per-observer cell: name, current NF value, battery voltage, sparkline,
avg/max stats
- NF status coloring: warning (amber) at ≥-100 dBm, critical (red) at
≥-85 dBm — text color only, no background fills
- Click any cell → expanded detail view with full noise floor line chart
- Reference lines with direct text labels (`-100 warning`, `-85
critical`) — not color bands
- Min/max points labeled directly on the chart
- Time range selector: preset buttons (1h/3h/6h/12h/24h/3d/7d/30d) +
custom from/to datetime picker
- Deep linking: `#/analytics?tab=rf-health&observer=...&range=...`
- All charts use SVG, matching existing analytics.js patterns
- Responsive: 3-4 columns on desktop, 1 on mobile

### Design Decisions (from spec)
- Labels directly on data, not in legends
- Reference lines with text labels, not color bands
- Small multiples grid, not card+accordion (Tufte: instant visual fleet
comparison)
- Ingestor wall clock for all timestamps (observer clocks may drift)

### Tests Added

**Ingestor tests:**
- `TestRoundToInterval` — 5 cases for rounding to 5-min boundaries
- `TestInsertMetrics` — basic insertion with all fields
- `TestInsertMetricsIdempotent` — INSERT OR REPLACE deduplication
- `TestInsertMetricsNullFields` — partial data with NULLs
- `TestPruneOldMetrics` — retention pruning
- `TestExtractObserverMetaNewFields` — parsing tx_air_secs, rx_air_secs,
recv_errors

**Server tests:**
- `TestGetObserverMetrics` — time-series query with since/until filters,
NULL handling
- `TestGetMetricsSummary` — fleet summary aggregation
- `TestObserverMetricsAPIEndpoints` — DB query verification
- `TestMetricsAPIEndpoints` — HTTP endpoint response shape
- `TestParseWindowDuration` — duration parsing for h/d formats

### Test Results
```
cd cmd/ingestor && go test ./... → PASS (26s)
cd cmd/server && go test ./... → PASS (5s)
```

### What's NOT in this PR (deferred to M2+)
- Server-side delta computation for cumulative counters
- Airtime charts (TX/RX percentage lines)
- Channel quality chart (recv_error_rate)
- Battery voltage chart
- Reboot detection and chart annotations
- Resolution downsampling (1h, 1d aggregates)
- Pattern detection / automated diagnosis

---------

Co-authored-by: you <you@example.com>
2026-04-04 22:21:35 -07:00
Kpa-clawbot
58f791266d feat: affinity debugging tools (#482) — milestone 6 (#521)
## Summary

Milestone 6 of #482: Observability & Debugging tools for the neighbor
affinity system.

These tools exist because someone will need them at 3 AM when "Show
Neighbors is showing the wrong node for C0DE" and they have 5 minutes to
diagnose it.

## Changes

### 1. Debug API — `GET /api/debug/affinity`
- Full graph state dump: all edges with weights, observation counts,
last-seen timestamps
- Per-prefix resolution log with disambiguation reasoning (Jaccard
scores, ratios, thresholds)
- Query params: `?prefix=C0DE` filter to specific prefix,
`?node=<pubkey>` for specific node's edges
- Protected by API key (same auth as `/api/admin/prune`)
- Response includes: edge count, node count, cache age, last rebuild
time

### 2. Debug Overlay on Map
- Toggle-able checkbox "🔍 Affinity Debug" in map controls
- Draws lines between nodes showing affinity edges with color coding:
  - Green = high confidence (score ≥ 0.6)
  - Yellow = medium (0.3–0.6)
  - Red = ambiguous (< 0.3)
- Line thickness proportional to weight, dashed for ambiguous
- Unresolved prefixes shown as  markers
- Click edge → popup with observation count, last seen, score, observers
- Hidden behind `debugAffinity` config flag or
`localStorage.setItem('meshcore-affinity-debug', 'true')`

### 3. Per-Node Debug Panel
- Expandable "🔍 Affinity Debug" section in node detail page (collapsed
by default)
- Shows: neighbor edges table with scores, prefix resolutions with
reasoning trace
- Candidates table with Jaccard scores, highlighting the chosen
candidate
- Graph-level stats summary

### 4. Server-Side Structured Logging
- Integrated into `disambiguate()` — logs every resolution decision
during graph build
- Format: `[affinity] resolve C0DE: c0dedad4 score=47 Jaccard=0.82 vs
c0dedad9 score=3 Jaccard=0.11 → neighbor_affinity (ratio 15.7×)`
- Logs ambiguous decisions: `scores too close (12 vs 9, ratio 1.3×) →
ambiguous`
- Gated by `debugAffinity` config flag

### 5. Dashboard Stats Widget
- Added to analytics overview tab when debug mode is enabled
- Metrics: total edges/nodes, resolved/ambiguous counts (%), avg
confidence, cold-start coverage, cache age, last rebuild

## Files Changed
- `cmd/server/neighbor_debug.go` — new: debug API handler, resolution
builder, cold-start coverage
- `cmd/server/neighbor_debug_test.go` — new: 7 tests for debug API
- `cmd/server/neighbor_graph.go` — added structured logging to
disambiguate(), `logFn` field, `BuildFromStoreWithLog`
- `cmd/server/neighbor_api.go` — pass debug flag through
`BuildFromStoreWithLog`
- `cmd/server/config.go` — added `DebugAffinity` config field
- `cmd/server/routes.go` — registered `/api/debug/affinity` route,
exposed `debugAffinity` in client config
- `cmd/server/types.go` — added `DebugAffinity` to
`ClientConfigResponse`
- `public/map.js` — affinity debug overlay layer with edge visualization
- `public/nodes.js` — per-node affinity debug panel
- `public/analytics.js` — dashboard stats widget
- `test-e2e-playwright.js` — 3 Playwright tests for debug UI

## Tests
-  7 Go unit tests (API shape, prefix/node filters, auth, structured
logging, cold-start coverage)
-  3 Playwright E2E tests (overlay checkbox, toggle without crash,
panel expansion)
-  All existing tests pass (`go test ./cmd/server/... -count=1`)

Part of #482

---------

Co-authored-by: you <you@example.com>
2026-04-02 23:45:03 -07:00
efiten
fe314be3a8 feat: geo_filter enforcement, DB pruning, geofilter-builder tool, HB column (#215)
## Summary

Several features and fixes from a live deployment of the Go v3.0.0
backend.

### geo_filter — full enforcement

- **Go backend config** (`cmd/server/config.go`,
`cmd/ingestor/config.go`): added `GeoFilterConfig` struct so
`geo_filter.polygon` and `bufferKm` from `config.json` are parsed by
both the server and ingestor
- **Ingestor** (`cmd/ingestor/geo_filter.go`, `cmd/ingestor/main.go`):
ADVERT packets from nodes outside the configured polygon + buffer are
dropped *before* any DB write — no transmission, node, or observation
data is stored
- **Server API** (`cmd/server/geo_filter.go`, `cmd/server/routes.go`):
`GET /api/config/geo-filter` endpoint returns the polygon + bufferKm to
the frontend; `/api/nodes` responses filter out any out-of-area nodes
already in the DB
- **Frontend** (`public/map.js`, `public/live.js`): blue polygon overlay
(solid inner + dashed buffer zone) on Map and Live pages, toggled via
"Mesh live area" checkbox, state shared via localStorage

### Automatic DB pruning

- Add `retention.packetDays` to `config.json` to delete transmissions +
observations older than N days on a daily schedule (1 min after startup,
then every 24h). Nodes and observers are never pruned.
- `POST /api/admin/prune?days=N` for manual runs (requires `X-API-Key`
header if `apiKey` is set)

```json
"retention": {
  "nodeDays": 7,
  "packetDays": 30
}
```

### tools/geofilter-builder.html

Standalone HTML tool (no server needed) — open in browser, click to
place polygon points on a Leaflet map, set `bufferKm`, copy the
generated `geo_filter` JSON block into `config.json`.

### scripts/prune-nodes-outside-geo-filter.py

Utility script to clean existing out-of-area nodes from the database
(dry-run + confirm). Useful after first enabling geo_filter on a
populated DB.

### HB column in packets table

Shows the hop hash size in bytes (1–4) decoded from the path byte of
each packet's raw hex. Displayed as **HB** between Size and Type
columns, hidden on small screens.

## Test plan

- [x] ADVERT from node outside polygon is not stored (no new row in
nodes or transmissions)
- [x] `GET /api/config/geo-filter` returns polygon + bufferKm when
configured, `{polygon: null, bufferKm: 0}` when not
- [x] `/api/nodes` excludes nodes outside polygon even if present in DB
- [x] Map and Live pages show blue polygon overlay when configured;
checkbox toggles it
- [x] `retention.packetDays: 30` deletes old transmissions/observations
on startup and daily
- [x] `POST /api/admin/prune?days=30` returns `{deleted: N, days: 30}`
- [x] `tools/geofilter-builder.html` opens standalone, draws polygon,
copies valid JSON
- [x] HB column shows 1–4 for all packets in grouped and flat view

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 01:10:56 -07:00
Kpa-clawbot
5aa4fbb600 chore: normalize all files to LF line endings 2026-03-30 22:52:46 -07:00
Kpa-clawbot
1e1fb298c2 Backend: timestamp config for client defaults (#292)
## Backend: Timestamp Config for Client Defaults

Refs #286 — implements backend scope from the [final
spec](https://github.com/Kpa-clawbot/CoreScope/issues/286#issuecomment-4158891089).

### What changed

**Config struct (`cmd/server/config.go`)**
- Added `TimestampConfig` struct with `defaultMode`, `timezone`,
`formatPreset`, `customFormat`, `allowCustomFormat`
- Added `Timestamps *TimestampConfig` to main `Config` struct
- Normalization method: invalid values fall back to safe defaults
(`ago`/`local`/`iso`)

**Startup warnings (`cmd/server/main.go`)**
- Missing timestamps section: `[config] timestamps not configured —
using defaults (ago/local/iso)`
- Invalid values logged with what was normalized

**API endpoint (`cmd/server/routes.go`)**
- Timestamp config included in `GET /api/config/client` response via
`ClientConfigResponse`
- Frontend reads server defaults from this endpoint

**Config example (`config.example.json`)**
- Added `timestamps` section with documented defaults

### Tests (`cmd/server/`)
- Config loads with timestamps section
- Config loads without timestamps section (defaults applied)
- Invalid values are normalized
- `/api/config/client` returns timestamp config

### Validation
- `cd cmd/server && go test ./...` 

---------

Co-authored-by: Kpa-clawbot <259247574+Kpa-clawbot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-30 17:41:45 -07:00
efiten
999436d714 feat: geo_filter polygon overlay on map and live pages (Go backend) (#213)
## Summary

- Adds `GeoFilter` struct to `Config` in `cmd/server/config.go` so
`geo_filter.polygon` and `bufferKm` from `config.json` are parsed by the
Go backend
- Adds `GET /api/config/geo-filter` endpoint in `cmd/server/routes.go`
returning the polygon + bufferKm to the frontend
- Restores the blue polygon overlay (solid inner + dashed buffer zone)
on the **Map** page (`public/map.js`)
- Restores the same overlay on the **Live** page (`public/live.js`),
toggled via the "Mesh live area" checkbox

## Test plan

- [x] `GET /api/config/geo-filter` returns `{ polygon: [...], bufferKm:
N }` when configured
- [x] `GET /api/config/geo-filter` returns `{ polygon: null, bufferKm: 0
}` when not configured
- [x] Map page shows blue polygon overlay when `geo_filter.polygon` is
set in config
- [x] Live page shows same overlay, checkbox state shared via
localStorage
- [x] Checkbox is hidden when no polygon is configured

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 15:28:28 -07:00
Kpa-clawbot
77d8f35a04 feat: implement packet store eviction/aging to prevent OOM (#273)
## Summary

The in-memory `PacketStore` had **no eviction or aging** — it grew
unbounded until OOM killed the process. At ~3K packets/hour and ~5KB per
packet (not the 450 bytes previously estimated), an 8GB VM would OOM in
a few days.

## Changes

### Time-based eviction
- Configurable via `config.json`: `"packetStore": { "retentionHours": 24
}`
- Packets older than the retention window are evicted from the head of
the sorted slice

### Memory-based cap
- Configurable via `"packetStore": { "maxMemoryMB": 1024 }`
- Hard ceiling — evicts oldest packets when estimated memory exceeds the
cap

### Index cleanup
When a `StoreTx` is evicted, ALL associated data is removed from:
- `byHash`, `byTxID`, `byObsID`, `byObserver`, `byNode`, `byPayloadType`
- `nodeHashes`, `distHops`, `distPaths`, `spIndex`

### Periodic execution
- Background ticker runs eviction every 60 seconds
- Analytics caches and hash size cache are invalidated after eviction

### Stats fixes
- `estimatedMB` now uses ~5KB/packet + ~500B/observation (was 430B +
200B)
- `evicted` counter reflects actual evictions (was hardcoded to 0)
- Removed fake `maxPackets: 2386092` and `maxMB: 1024` from stats

### Config example
```json
{
  "packetStore": {
    "retentionHours": 24,
    "maxMemoryMB": 1024
  }
}
```

Both values default to 0 (unlimited) for backward compatibility.

## Tests
- 7 new tests in `eviction_test.go` covering time-based, memory-based,
index cleanup, thread safety, config parsing, and no-op when disabled
- All existing tests pass unchanged

Co-authored-by: Kpa-clawbot <kpabap+clawdbot@gmail.com>
2026-03-30 03:42:11 +00:00
you
0f70cd1ac0 feat: make health thresholds configurable in hours
Change healthThresholds config from milliseconds to hours for readability.
Config keys: infraDegradedHours, infraSilentHours, nodeDegradedHours, nodeSilentHours.
Defaults: infra degraded 24h, silent 72h; node degraded 1h, silent 24h.

- Config stored in hours, converted to ms at comparison time
- /api/config/client sends ms to frontend (backward compatible)
- Frontend tooltips use dynamic thresholds instead of hardcoded strings
- Added healthThresholds section to config.example.json
- Updated Go and Node.js servers, tests
2026-03-29 09:50:32 -07:00
Kpa-clawbot
520adcc6ab feat: move stale nodes to inactive_nodes table, fixes #202
- Create inactive_nodes table with identical schema to nodes
- Add retention.nodeDays config (default 7) in Node.js and Go
- On startup: move nodes not seen in N days to inactive_nodes
- Daily timer (24h setInterval / goroutine ticker) repeats the move
- Log 'Moved X nodes to inactive_nodes (not seen in N days)'
- All existing queries unchanged — they only read nodes table
- Add 14 new tests for moveStaleNodes in test-db.js
- Both Node (db.js/server.js) and Go (ingestor/server) implemented

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-27 22:43:53 -07:00
Kpa-clawbot
742ed86596 feat: add Go web server (cmd/server/) — full API + WebSocket + static files
35+ REST endpoints matching Node.js server, WebSocket broadcast,
static file serving with SPA fallback, config.json support.
Uses modernc.org/sqlite (pure Go, no CGO required).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-27 01:16:59 -07:00