meshcore-analyzer

mirror of https://github.com/Kpa-clawbot/meshcore-analyzer.git synced 2026-05-30 07:44:03 +00:00

Author	SHA1	Message	Date
Kpa-clawbot	4cd8445233	perf(#1265 ): wire /api/observers/clock-skew + /api/nodes/clock-skew into analytics recomputer (#1266 ) RED: `97f49a0c` · CI: https://github.com/Kpa-clawbot/CoreScope/actions/runs/26046530920 Fixes #1265. ## Problem On staging two clock-skew endpoints serve compute-on-request: - `/api/observers/clock-skew` — 3.3s - `/api/nodes/clock-skew` — 8.9s Both drive a full `clockSkew.Recompute` over 100k+ adverts while holding `s.mu.RLock`, blocking under concurrent reader load. ## Fix Wire both endpoints into the established `analytics_recomputer.go` pattern (PRs #1248 / #1259 / #1263). Two new slots: - `recompObserversClockSkew` — wraps `computeObserverCalibrations()` - `recompNodesClockSkew` — wraps `computeFleetClockSkew()` Accessors `GetObserverCalibrations` / `GetFleetClockSkew` now prefer the atomic-pointer snapshot; on-request compute is fallback-only for the brief window before initial sync compute lands (and for tests that skip the recomputer). Default interval 300s, overridable via: ```json "analytics": { "recomputeIntervalSeconds": { "observersClockSkew": 300, "nodesClockSkew": 300 } } ``` `config.example.json` + the `_comment_analytics` doc updated. ## TDD - RED `97f49a0c` — `TestClockSkewRecomputersRegistered` + `TestClockSkewHandlersSteadyStateLatency` (8 concurrent readers × 25 reqs per endpoint, p99 < 100ms gate). Fails on master: recomputer slots nil. - GREEN `19599375` — wire + accessor switch. p99 well under 5ms on the test fixture. ## Verification ``` cd cmd/server && go test ./... -count=1 # ok 42s bash ~/.openclaw/skills/pr-preflight/scripts/run-all.sh origin/master # all gates pass ``` --------- Co-authored-by: CoreScope Bot <bot@corescope.local>	2026-05-18 12:27:44 -07:00
Kpa-clawbot	f81ed5b3cf	perf(#1256 ): wire /api/analytics/roles into steady-state recomputer (#1259 ) RED commit: `0190466d` — failing CI: https://github.com/Kpa-clawbot/CoreScope/actions (will populate after PR creation) ## Problem On staging (commit `d69d9fb`, 78k tx, 2.3M obs), `curl http://localhost/api/analytics/roles` times out at 60s with 0 bytes — the Roles tab is unusable. Issue #1256. PR #1248's steady-state recomputer fan-out (topology / rf / distance / channels / hash-collisions / hash-sizes) didn't include roles. The legacy handler: 1. Holds `s.mu.RLock` for the entire compute. 2. Calls `GetFleetClockSkew()`, which drives `clockSkew.Recompute(s)` over all ADVERT transmissions — O(78k) per request. 3. Concurrent ingest writers compound the latency through writer-starvation. Result: every request hits the cold path; the response never comes back inside the 60 s HTTP budget. ## Fix Add `roles` as the 7th endpoint in the recomputer fan-out — same pattern as #1248: - `PacketStore.recompRoles` slot, registered in `StartAnalyticsRecomputers` with default 5-min interval. - `PacketStore.GetAnalyticsRoles()` → atomic-pointer load from the snapshot (sub-ms), with a `computeAnalyticsRoles()` fallback only for the brief startup window before the initial sync compute completes. - Handler is now a thin wrapper — no lock-held work on the request path. - New optional `roles` key under `analytics.recomputeIntervalSeconds` in config; `config.example.json` and `_comment_analytics` updated. ## Latency (unit-scope benchmark) - Worst-of-50 handler latency: <100 ms (test budget; well under the 2 s p99 acceptance). - Compute itself is bounded by the existing 5-min recompute window — it runs once in the background, never on the request path. ## Tests - RED `0190466d`: asserts `recompRoles` is registered and the handler returns under the latency budget. Fails on master with `recompRoles not registered`. - GREEN `d7784f76`: registers the recomputer + snapshot accessor — both tests pass. Fixes #1256 --------- Co-authored-by: openclaw-bot <bot@openclaw.local>	2026-05-18 07:36:28 -07:00
Kpa-clawbot	356f001027	perf(#1240 ): steady-state background recompute for analytics endpoints (#1248 ) RED commit: `27630f6a` — adds latency test that fails on master (p99=225ms > 50ms budget) and a stub `StartAnalyticsRecomputers` that returns a no-op so the assertion (not a build error) gates the change. GREEN commit: `20fbbceb` — wires real background recompute infrastructure. Test passes at p99=~1µs. ## What changed Replaces the on-request "compute-then-cache" pattern for the default-shape analytics queries with a steady-state background recompute loop. Reads always hit an `atomic.Value` snapshot in <1µs regardless of compute cost or writer contention. Operator principle: serving slightly stale data quickly beats real-time data slowly. ## Endpoints converted (default 5min interval each) \| Endpoint \| Cold compute \| Recomputer interval \| \|---\|---\|---\| \| `/api/analytics/topology` \| ~5s \| 5 min \| \| `/api/analytics/rf` \| ~4s \| 5 min \| \| `/api/analytics/distance` \| ~3s \| 5 min \| \| `/api/analytics/channels` \| ~0.5s \| 5 min \| \| `/api/analytics/hash-collisions` \| ~0.5s \| 5 min \| \| `/api/analytics/hash-sizes` \| ~22ms \| 5 min \| All intervals configurable per-endpoint via `analytics.recomputeIntervalSeconds.<name>` in `config.json`; documented in `config.example.json`. Default override via `analytics.defaultIntervalSeconds`. ## Scope: default query only Only the canonical shape `(region="", window=zero)` is precomputed. Region- or window-filtered requests fall back to the legacy TTL cache + on-request compute — keeps recomputer count bounded (6, not 6×N×M). ## Latency Test `TestAnalyticsRecomputerSteadyStateLatency`: 100 concurrent readers + 4 writers churning `s.mu.Lock` on 20k distHops. - Before: p50=188ms p99=225ms (assertion failed) - After: p50=240ns p99=1.1µs (atomic load + map return) ## Shutdown integration `StartAnalyticsRecomputers` returns a stop closure invoked from `main.go`'s SIGTERM handler BEFORE `dbClose()` so any in-flight SQLite compute drains cleanly. `TestAnalyticsRecomputerShutdownNoLeak` confirms all 6 goroutines are reaped (Δ=6 within 2s). ## Safety details - Initial compute is synchronous in `Start()` — first read after startup never sees nil. - `recover()` inside `runOnce` keeps a compute panic from killing the goroutine; previous snapshot remains valid. - `analyticsRecomputerMu` is a sync.RWMutex; recomputer pointers are read-locked in the hot path. The atomic.Value swap inside `runOnce` is lock-free. Fixes #1240. --------- Co-authored-by: OpenClaw Bot <bot@openclaw.local>	2026-05-17 17:33:30 +00:00
Kpa-clawbot	7179afcfde	feat(#1228 ): reject geo-implausible neighbor-graph edges at build time (#1230 ) Fixes #1228 — geo-implausible neighbor-graph edges are rejected at build time. Red commit: `5a6d9660` — failing tests for 4 cases (reject SF↔Berlin, accept local CA, accept no-GPS endpoint, counter increments). Live CI run (latest commit): https://github.com/Kpa-clawbot/CoreScope/actions?query=branch%3Afix%2Fissue-1228 ## Why The disambiguator's tier-1 affinity graph is built blindly from path co-occurrence. On wide-geo MQTT deployments, a single bad hop disambiguation seeds an edge across geographically impossible distances (e.g. Bay Area ↔ Berlin), which then reinforces the same wrong resolution next time. Self-poisoning spiral. ## What changed - `upsertEdge` now consults a per-graph GPS index. When both endpoints have known GPS and their haversine distance exceeds the threshold, the edge is dropped and `NeighborGraph.RejectedEdgesGeoFar` (atomic) is incremented. - Either endpoint missing GPS ⇒ accept (no signal to reject), per acceptance criteria. - Threshold is configurable via `neighborGraph.maxEdgeKm` (default 500 km — well above any plausible terrestrial LoRa hop, including satellite-assisted). 0 ⇒ use default; negative ⇒ disable the filter. Exposed via `Config.NeighborMaxEdgeKm()`. - New `BuildFromStoreWithOptions` carrying the threshold; `BuildFromStore` and `BuildFromStoreWithLog` are kept as thin wrappers. - Stats are surfaced under `GET /api/analytics/neighbor-graph` as `stats.rejected_edges_geo_far`. - All rejection logs PII-truncate pubkeys to 8 hex chars (public repo discipline). - `config.example.json` updated with the new field + comment. ## Follow-up #1229 (per-region scoped affinity graphs) depends on this landing first. --------- Co-authored-by: corescope-bot <bot@corescope.local>	2026-05-16 10:14:44 -07:00
efiten	11d2026bb1	feat(startup): hot startup — load hotStartupHours synchronously, fill retentionHours in background (#1187 ) Closes #1183 ## Summary - Adds `packetStore.hotStartupHours` config key (float64, default 0 = disabled). When set, `Load()` loads only that many hours of data synchronously, reducing startup time on large DBs. Background goroutine fills the remaining `retentionHours` window in daily chunks after startup completes. - A background goroutine (`loadBackgroundChunks`) fills the remaining `retentionHours` window in daily chunks after startup completes. Analytics indexes are rebuilt once at the end. - `QueryPackets` and `QueryGroupedPackets` check `oldestLoaded` and fall back to `db.QueryPackets()` for any query whose `Since`/`Until` predates the in-memory window — covering days 8–30 permanently (beyond `retentionHours`) and the background-fill gap during startup. - `/api/perf` gains `hotStartupHours`, `backgroundLoadComplete`, and `backgroundLoadProgress` fields inside `packetStore` so operators can monitor the fill. ### Drive-by fixes - E2E: added `gotoPackets` navigation helper used across packet-related tests - E2E: rewrote stripe assertion to check per-row stripe parity rather than a fragile computed-style comparison - E2E: theme test updated to use `#/home` as the initial route (was `#/`) - `db.go`: removed the RFC3339→unix-timestamp subquery path in `buildTransmissionWhere`; `t.first_seen` is now always compared directly as a string for both RFC3339 and non-RFC3339 inputs ## Configuration ```json "packetStore": { "retentionHours": 168, "hotStartupHours": 24 } ``` `hotStartupHours: 0` (default) preserves existing behavior exactly. Recommended for large DBs to reduce startup time; set to 0 to disable (loads full retentionHours at startup, legacy behavior). ## Test plan - [x] `TestHotStartupConfig_Clamp` — clamping when `hotStartupHours > retentionHours` - [x] `TestHotStartupConfig_ZeroIsDisabled` — zero leaves feature disabled - [x] `TestHotStartup_LoadsOnlyHotWindow` — only hot-window packets in memory after `Load()` - [x] `TestHotStartup_DisabledWhenZero` — all retention packets loaded when disabled - [x] `TestHotStartup_loadChunk_AddsOlderData` — chunk merges correctly, ASC order maintained - [x] `TestHotStartup_BackgroundFillsToRetention` — background goroutine fills to `retentionHours` - [x] `TestHotStartup_ChunkErrorRecovery` — chunk SQL failure logged and skipped, loop terminates - [x] `TestHotStartup_SQLFallback_TriggeredForOldDate` — query before `oldestLoaded` routes to SQL - [x] `TestHotStartup_SQLFallback_NotTriggeredForRecentDate` — recent query stays in-memory - [x] `TestHotStartup_PerfStats` — new fields present in `GetPerfStoreStats()` (backs the perf endpoint) - [x] `TestHotStartup_PerfStoreHTTP` — HTTP-level: GET /api/perf returns `hotStartupHours`, `backgroundLoadComplete`, `backgroundLoadProgress` in `packetStore` 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: openclaw-bot <bot@openclaw.local> Co-authored-by: CoreScope Bot <bot@corescope.local>	2026-05-15 22:46:25 -07:00
Kpa-clawbot	3ab404b545	feat(node-battery): voltage trend chart + /api/nodes/{pubkey}/battery (#663 ) (#1082 ) ## Summary Closes #663 (Phase 2 + 3 partial — time-series tracking + thresholds for nodes that are also observers). Adds a per-node battery voltage trend chart and `/api/nodes/{pubkey}/battery` endpoint, sourced from the existing `observer_metrics.battery_mv` samples populated by observer status messages. No new ingest or schema changes — purely surfaces data we were already collecting. ## Scope (TDD red→green) RED commit: test(node-battery) — DB query, endpoint shape (200/404/no-data), and config getters all asserted. GREEN commit: feat(node-battery) — implementation only. ## Changes ### Backend - `cmd/server/node_battery.go` (new): - `DB.GetNodeBatteryHistory(pubkey, since)` — pulls `(timestamp, battery_mv)` rows from `observer_metrics WHERE LOWER(observer_id) = LOWER(public_key) AND battery_mv IS NOT NULL`. Case-insensitive join tolerates historical pubkey casing variation (observers persist uppercase, nodes lowercase in this DB). - `Server.handleNodeBattery` — `GET /api/nodes/{pubkey}/battery?days=N` (default 7, max 365). Returns `{public_key, days, samples[], latest_mv, latest_ts, status, thresholds}`. - `Config.LowBatteryMv()` / `CriticalBatteryMv()` — defaults 3300 / 3000 mV. - `cmd/server/config.go` — `BatteryThresholds *BatteryThresholdsConfig` field. - `cmd/server/routes.go` — route registration alongside existing `/health`, `/analytics`. ### Frontend - `public/node-analytics.js` — new "Battery Voltage" chart card with status badge (🔋 OK / ⚠️ Low / 🪫 Critical / No data). Renders dashed threshold lines at `lowMv` and `criticalMv`. Empty-state message when no samples in window. ### Config - `config.example.json` — `batteryThresholds: { lowMv: 3300, criticalMv: 3000 }` with `_comment` per Config Documentation Rule. ## Status semantics \| latest_mv \| status \| \|-----------------------\|------------\| \| no samples in window \| `unknown` \| \| `>= lowMv` \| `ok` \| \| `< lowMv`, `>= critMv`\| `low` \| \| `< criticalMv` \| `critical` \| ## What this PR does NOT do (deferred) The issue's full Phase 1 (writing decoded sensor advert telemetry into `nodes.battery_mv` / `temperature_c` from server-side decoder) and Phase 4 (firmware/active polling for repeaters without observers) are out of scope here. This PR delivers the requested Phase 2/3 surfacing for the data path that already lands rows: `observer_metrics`. Repeaters that are also observers (i.e. publish status to MQTT) will get a voltage trend immediately; pure passive nodes won't until Phase 1 lands. ## Tests - `TestGetNodeBatteryHistory_FromObserverMetrics` — case-insensitive join, NULL skipping, ordering. - `TestNodeBatteryEndpoint` — full happy path with thresholds + status. - `TestNodeBatteryEndpoint_NoData` — 200 + status=unknown. - `TestNodeBatteryEndpoint_404` — unknown node. - `TestBatteryThresholds_ConfigOverride` — config getters + defaults. `cd cmd/server && go test ./...` — green. ## Performance Endpoint is per-pubkey (called once on analytics page open), indexed by `(observer_id, timestamp)` PK on `observer_metrics`. No hot-path impact. --------- Co-authored-by: bot <bot@corescope>	2026-05-05 01:41:00 -07:00
Kpa-clawbot	45f30fcadc	feat(repeater): liveness detection — distinguish actively relaying from advert-only (#662 ) (#1073 ) ## Summary Implements repeater liveness detection per #662 — distinguishes a repeater that is actively relaying traffic from one that is alive but idle (only sending its own adverts). ## Approach The backend already maintains a `byPathHop` index keyed by lowercase hop/pubkey for every transmission. Decode-window writes also key it by resolved pubkey for relay hops. We just weren't surfacing it. `GetRepeaterRelayInfo(pubkey, windowHours)`: - Reads `byPathHop[pubkey]`. - Skips packets whose `payload_type == 4` (advert) — a self-advert proves liveness, not relaying. - Returns the most recent `FirstSeen` as `lastRelayed`, plus `relayActive` (within window) and the `windowHours` actually used. ## Three states (per issue) \| State \| Indicator \| Condition \| \|---\|---\|---\| \| 🟢 Relaying \| green \| `last_relayed` within `relayActiveHours` \| \| 🟡 Alive (idle) \| yellow \| repeater is in the DB but `relay_active=false` (no recent path-hop appearance, or none ever) \| \| ⚪ Stale \| existing \| falls out of the existing `getNodeStatus` logic \| ## API - `GET /api/nodes` — repeater/room rows now include `last_relayed` (omitted if never observed) and `relay_active`. - `GET /api/nodes/{pubkey}` — same fields plus `relay_window_hours`. ## Config New optional field under `healthThresholds`: ```json "healthThresholds": { ..., "relayActiveHours": 24 } ``` Default 24h. Documented in `config.example.json`. ## Frontend Node detail page gains a Last Relayed row for repeaters/rooms with the 🟢/🟡 state badge. Tooltip explains the distinction from "Last Heard". ## TDD - Red commit `4445f91`: `repeater_liveness_test.go` + stub `GetRepeaterRelayInfo` returning zero. Active and Stale tests fail on assertion (LastRelayed empty / mismatched). Idle and IgnoresAdverts already match the desired behavior under the stub. Compiles, runs, fails on assertions — not on imports. - Green commit `5fcfb57`: Implementation. All four tests pass. Full `cmd/server` suite green (~22s). ## Performance `O(N)` over `byPathHop[pubkey]` per call. The index is bounded by store eviction; a single repeater has at most a few hundred entries on real data. The `/api/nodes` loop adds one map read + scan per repeater row — negligible against the existing enrichment work. ## Limitations (per issue body) 1. Observer coverage gaps — if no observer hears a repeater's relay, it'll show as idle even when actively relaying. This is inherent to passive observation. 2. Low-traffic networks — a repeater in a quiet area legitimately shows idle. The 🟡 indicator copy makes that explicit ("alive (idle)"). 3. Hash collisions are mitigated by the existing `resolveWithContext` path before pubkeys land in `byPathHop`. Fixes #662 --------- Co-authored-by: clawbot <bot@corescope.local>	2026-05-05 01:17:52 -07:00
Kpa-clawbot	dd2f044f2b	fix: cache RW SQLite connection + dedup DBConfig (closes #921 ) (#982 ) Closes #921 ## Summary Follow-up to #920 (incremental auto-vacuum). Addresses both items from the adversarial review: ### 1. RW connection caching Previously, every call to `openRW(dbPath)` opened a new SQLite RW connection and closed it after use. This happened in: - `runIncrementalVacuum` (~4x/hour) - `PruneOldPackets`, `PruneOldMetrics`, `RemoveStaleObservers` - `buildAndPersistEdges`, `PruneNeighborEdges` - All neighbor persist operations Now a single `*sql.DB` handle (with `MaxOpenConns(1)`) is cached process-wide via `cachedRW(dbPath)`. The underlying connection pool manages serialization. The original `openRW()` function is retained for one-shot test usage. ### 2. DBConfig dedup `DBConfig` was defined identically in both `cmd/server/config.go` and `cmd/ingestor/config.go`. Extracted to `internal/dbconfig/` as a shared package; both binaries now use a type alias (`type DBConfig = dbconfig.DBConfig`). ## Tests added \| Test \| File \| \|------\|------\| \| `TestCachedRW_ReturnsSameHandle` \| `cmd/server/rw_cache_test.go` \| \| `TestCachedRW_100Calls_SingleConnection` \| `cmd/server/rw_cache_test.go` \| \| `TestGetIncrementalVacuumPages_Default` \| `internal/dbconfig/dbconfig_test.go` \| \| `TestGetIncrementalVacuumPages_Configured` \| `internal/dbconfig/dbconfig_test.go` \| ## Verification ``` ok github.com/corescope/server 20.069s ok github.com/corescope/ingestor 47.117s ok github.com/meshcore-analyzer/dbconfig 0.003s ``` Both binaries build cleanly. 100 sequential `cachedRW()` calls return the same handle with exactly 1 entry in the cache map. --------- Co-authored-by: you <you@example.com>	2026-05-02 20:15:30 -07:00
Kpa-clawbot	4b8d8143f4	feat(server): explicit CORS policy with configurable origin allowlist (#883 ) (#971 ) ## Summary Adds explicit CORS policy support to the CoreScope API server, closing #883. ### Problem The API relied on browser same-origin defaults with no way for operators to configure cross-origin access. Operators running dashboards or third-party frontends on different origins had no supported way to make API calls. ### Solution New config option: `corsAllowedOrigins` (string array, default `[]`) Middleware behavior: \| Config \| Behavior \| \|--------\|----------\| \| `[]` (default) \| No `Access-Control-` headers added — browsers enforce same-origin. Preserves current behavior.* \| \| `["https://dashboard.example.com"]` \| Echoes matching `Origin`, sets `Allow-Methods`/`Allow-Headers` \| \| `[""]` \| Sets `Access-Control-Allow-Origin: ` (explicit opt-in only) \| Headers set when origin matches: - `Access-Control-Allow-Origin: <origin>` (or ``) - `Access-Control-Allow-Methods: GET, POST, OPTIONS` - `Access-Control-Allow-Headers: Content-Type, X-API-Key` - `Vary: Origin` (non-wildcard only) Preflight handling:* `OPTIONS` → `204 No Content` with CORS headers (or `403` if origin not in allowlist). ### Config example ```json { "corsAllowedOrigins": ["https://dashboard.example.com", "https://monitor.internal"] } ``` ### Files changed \| File \| Change \| \|------\|--------\| \| `cmd/server/cors.go` \| New CORS middleware \| \| `cmd/server/cors_test.go` \| 7 unit tests covering all branches \| \| `cmd/server/config.go` \| `CORSAllowedOrigins` field \| \| `cmd/server/routes.go` \| Wire middleware before all routes \| ### Testing Unit tests (7): - Default config → no CORS headers - Allowlist match → headers present with `Vary: Origin` - Allowlist miss → no CORS headers - Preflight allowed → 204 with headers - Preflight rejected → 403 - Wildcard → `` without `Vary` - No `Origin` header → pass-through Live verification (Rule 18):* ``` # Default (empty corsAllowedOrigins): $ curl -I -H "Origin: https://evil.example" localhost:19883/api/health HTTP/1.1 200 OK # No Access-Control-* headers ✓ # With corsAllowedOrigins: ["https://good.example"]: $ curl -I -H "Origin: https://good.example" localhost:19884/api/health Access-Control-Allow-Origin: https://good.example Access-Control-Allow-Methods: GET, POST, OPTIONS Access-Control-Allow-Headers: Content-Type, X-API-Key Vary: Origin ✓ $ curl -I -H "Origin: https://evil.example" localhost:19884/api/health # No Access-Control-* headers ✓ $ curl -I -X OPTIONS -H "Origin: https://good.example" localhost:19884/api/health HTTP/1.1 204 No Content Access-Control-Allow-Origin: https://good.example ✓ ``` Closes #883 Co-authored-by: you <you@example.com>	2026-05-02 12:04:37 -07:00
Kpa-clawbot	b3a9677c52	feat(ingestor + server): observerBlacklist config (#962 ) (#963 ) ## Summary Implements `observerBlacklist` config — mirrors the existing `nodeBlacklist` pattern for observers. Drop observers by pubkey at ingest, with defense-in-depth filtering on the server side. Closes #962 ## Changes ### Ingestor (`cmd/ingestor/`) - `config.go`: Added `ObserverBlacklist []string` field + `IsObserverBlacklisted()` method (case-insensitive, whitespace-trimmed) - `main.go`: Early return in `handleMessage` when `parts[2]` (observer ID from MQTT topic) matches blacklist — before status handling, before IATA filter. No UpsertObserver, no observations, no metrics insert. Log line: `observer <pubkey-short> blacklisted, dropping` ### Server (`cmd/server/`) - `config.go`: Same `ObserverBlacklist` field + `IsObserverBlacklisted()` with `sync.Once` cached set (same pattern as `nodeBlacklist`) - `routes.go`: Defense-in-depth filtering in `handleObservers` (skip blacklisted in list) and `handleObserverDetail` (404 for blacklisted ID) - `main.go`: Startup `softDeleteBlacklistedObservers()` marks matching rows `inactive=1` so historical data is hidden - `neighbor_persist.go`: `softDeleteBlacklistedObservers()` implementation ### Tests - `cmd/ingestor/observer_blacklist_test.go`: config method tests (case-insensitive, empty, nil) - `cmd/server/observer_blacklist_test.go`: config tests + HTTP handler tests (list excludes blacklisted, detail returns 404, no-blacklist passes all, concurrent safety) ## Config ```json { "observerBlacklist": [ "EE550DE547D7B94848A952C98F585881FCF946A128E72905E95517475F83CFB1" ] } ``` ## Verification (Rule 18 — actual server output) Before blacklist (no config): ``` Total: 31 DUBLIN in list: True ``` After blacklist (DUBLIN Observer pubkey in `observerBlacklist`): ``` [observer-blacklist] soft-deleted 1 blacklisted observer(s) Total: 30 DUBLIN in list: False ``` Detail endpoint for blacklisted observer returns 404. All existing tests pass (`go test ./...` for both server and ingestor). --------- Co-authored-by: you <you@example.com>	2026-05-01 23:11:27 -07:00
Kpa-clawbot	aeae7813bc	fix: enable SQLite incremental auto-vacuum so DB shrinks after retention (#919 ) (#920 ) Closes #919 ## Summary Enables SQLite incremental auto-vacuum so the database file actually shrinks after retention reaper deletes old data. Previously, `DELETE` operations freed pages internally but never returned disk space to the OS. ## Changes ### 1. Auto-vacuum on new databases - `PRAGMA auto_vacuum = INCREMENTAL` set via DSN pragma before `journal_mode(WAL)` in the ingestor's `OpenStoreWithInterval` - Must be set before any tables are created; DSN ordering ensures this ### 2. Post-reaper incremental vacuum - `PRAGMA incremental_vacuum(N)` runs after every retention reaper cycle (packets, metrics, observers, neighbor edges) - N defaults to 1024 pages, configurable via `db.incrementalVacuumPages` - Noop on `auto_vacuum=NONE` databases (safe before migration) - Added to both server and ingestor ### 3. Opt-in full VACUUM for existing databases - Startup check logs a clear warning if `auto_vacuum != INCREMENTAL` - `db.vacuumOnStartup: true` config triggers one-time `PRAGMA auto_vacuum = INCREMENTAL; VACUUM` - Logs start/end time for operator visibility ### 4. Documentation - `docs/user-guide/configuration.md`: retention section notes that lowering retention doesn't immediately shrink the DB - `docs/user-guide/database.md`: new guide covering WAL, auto-vacuum, migration, manual VACUUM ### 5. Tests - `TestNewDBHasIncrementalAutoVacuum` — fresh DB gets `auto_vacuum=2` - `TestExistingDBHasAutoVacuumNone` — old DB stays at `auto_vacuum=0` - `TestVacuumOnStartupMigratesDB` — full VACUUM sets `auto_vacuum=2` - `TestIncrementalVacuumReducesFreelist` — DELETE + vacuum shrinks freelist - `TestCheckAutoVacuumLogs` — handles both modes without panic - `TestConfigIncrementalVacuumPages` — config defaults and overrides ## Migration path for existing databases 1. On startup, CoreScope logs: `[db] auto_vacuum=NONE — DB needs one-time VACUUM...` 2. Set `db.vacuumOnStartup: true` in config.json 3. Restart — VACUUM runs (blocks startup, minutes on large DBs) 4. Remove `vacuumOnStartup` after migration ## Test results ``` ok github.com/corescope/server 19.448s ok github.com/corescope/ingestor 30.682s ``` --------- Co-authored-by: you <you@example.com>	2026-04-30 23:45:00 -07:00
Kpa-clawbot	9e90548637	perf(#800 ): remove per-StoreTx ResolvedPath, replace with membership index + on-demand decode (#806 ) ## Summary Remove `ResolvedPath []string` field from `StoreTx` and `StoreObs` structs, replacing it with a compact membership index + on-demand SQL decode. This eliminates the dominant heap cost identified in profiling (#791, #799). Spec:* #800 (consolidated from two rounds of expert + implementer review on #799) Closes #800 Closes #791 ## Design ### Removed - `StoreTx.ResolvedPath []string` - `StoreObs.ResolvedPath []string` - `TransmissionResp.ResolvedPath`, `ObservationResp.ResolvedPath` struct fields ### Added \| Structure \| Purpose \| Est. cost at 1M obs \| \|---\|---\|---:\| \| `resolvedPubkeyIndex map[uint64][]int` \| FNV-1a(pubkey) → []txID forward index \| 50–120 MB \| \| `resolvedPubkeyReverse map[int][]uint64` \| txID → []hashes for clean removal \| ~40 MB \| \| `apiResolvedPathLRU` (10K entries) \| FIFO cache for on-demand API decode \| ~2 MB \| ### Decode-window discipline `resolved_path` JSON decoded once per packet. Consumers fed in order, temp slice dropped — never stored on struct: 1. `addToByNode` — relay node indexing 2. `touchRelayLastSeen` — relay liveness DB updates 3. `byPathHop` resolved-key entries 4. `resolvedPubkeyIndex` + reverse insert 5. WebSocket broadcast map (raw JSON bytes) 6. Persist batch (raw JSON bytes for SQL UPDATE) ### Collision safety When the forward index returns candidates, a batched SQL query confirms exact pubkey presence using `LIKE '%"pubkey"%'` on the `resolved_path` column. ### Feature flag `useResolvedPathIndex` (default `true`). Off-path is conservative: all candidates kept, index not consulted. For one-release rollback safety. ## Files changed \| File \| Changes \| \|---\|---\| \| `resolved_index.go` \| New — index structures, LRU cache, on-demand SQL helpers, collision safety \| \| `store.go` \| Remove RP fields, decode-window discipline in Load/Ingest, on-demand txToMap/obsToMap/enrichObs, eviction cleanup via SQL, memory accounting update \| \| `types.go` \| Remove RP fields from TransmissionResp/ObservationResp \| \| `routes.go` \| Replace `nodeInResolvedPath` with `nodeInResolvedPathViaIndex`, remove RP from mapSlice helpers \| \| `neighbor_persist.go` \| Refactor backfill: reverse-map removal → forward+reverse insert → LRU invalidation \| ## Tests added (27 new) Unit: - `TestStoreTx_ResolvedPathFieldAbsent` — reflection guard - `TestResolvedPubkeyIndex_BuildFromLoad` — forward+reverse consistency - `TestResolvedPubkeyIndex_HashCollision` — SQL collision safety - `TestResolvedPubkeyIndex_IngestUpdate` — maps reflect new ingests - `TestResolvedPubkeyIndex_RemoveOnEvict` — clean removal via reverse map - `TestResolvedPubkeyIndex_PerObsCoverage` — non-best obs pubkeys indexed - `TestAddToByNode_WithoutResolvedPathField` - `TestTouchRelayLastSeen_WithoutResolvedPathField` - `TestWebSocketBroadcast_IncludesResolvedPath` - `TestBackfill_InvalidatesLRU` - `TestEviction_ByNodeCleanup_OnDemandSQL` - `TestExtractResolvedPubkeys`, `TestMergeResolvedPubkeys` - `TestResolvedPubkeyHash_Deterministic` - `TestLRU_EvictionOnFull` Endpoint: - `TestPathsThroughNode_NilResolvedPathFallback` - `TestPacketsAPI_OnDemandResolvedPath` - `TestPacketsAPI_OnDemandResolvedPath_LRUHit` - `TestPacketsAPI_OnDemandResolvedPath_Empty` Feature flag: - `TestFeatureFlag_OffPath_PreservesOldBehavior` - `TestFeatureFlag_Toggle_NoStateLeak` Concurrency: - `TestReverseMap_NoLeakOnPartialFailure` - `TestDecodeWindow_LockHoldTimeBounded` - `TestLivePolling_LRUUnderConcurrentIngest` Regression: - `TestRepeaterLiveness_StillAccurate` Benchmarks: - `BenchmarkLoad_BeforeAfter` - `BenchmarkResolvedPubkeyIndex_Memory` - `BenchmarkPathsThroughNode_Latency` - `BenchmarkLivePolling_UnderIngest` ## Benchmark results ``` BenchmarkResolvedPubkeyIndex_Memory/pubkeys=50K 429ms 103MB 777K allocs BenchmarkResolvedPubkeyIndex_Memory/pubkeys=500K 4205ms 896MB 7.67M allocs BenchmarkLoad_BeforeAfter 65ms 20MB 202K allocs BenchmarkPathsThroughNode_Latency 3.9µs 0B 0 allocs BenchmarkLivePolling_UnderIngest 5.4µs 545B 7 allocs ``` Key: per-obs `[]string` overhead completely eliminated. At 1M obs with 3 hops average, this saves ~72 bytes/obs × 1M = ~68 MB just from the slice headers + pointers, plus the JSON-decoded string data (~900 MB at scale per profiling). ## Design choices - FNV-1a instead of xxhash: stdlib availability, no external dependency. Performance is equivalent for this use case (pubkey strings are short). - FIFO LRU instead of true LRU: simpler implementation, adequate for the access pattern (mostly sequential obs IDs from live polling). - Grouped packets view omits resolved_path: cold path, not worth SQL round-trip per page render. - Backfill pending check uses reverse-map presence* instead of per-obs field: if a tx has any indexed pubkeys, its observations are considered resolved. Closes #807 --------- Co-authored-by: you <you@example.com>	2026-04-20 19:55:00 -07:00
Joel Claw	b9ba447046	feat: add nodeBlacklist config to hide abusive/troll nodes (#742 ) ## Problem Some mesh participants set offensive names, report deliberately false GPS positions, or otherwise troll the network. Instance operators currently have no way to hide these nodes from public-facing APIs without deleting the underlying data. ## Solution Add a `nodeBlacklist` array to `config.json` containing public keys of nodes to exclude from all API responses. ### Blacklisted nodes are filtered from: - `GET /api/nodes` — list endpoint - `GET /api/nodes/search` — search results - `GET /api/nodes/{pubkey}` — detail (returns 404) - `GET /api/nodes/{pubkey}/health` — returns 404 - `GET /api/nodes/{pubkey}/paths` — returns 404 - `GET /api/nodes/{pubkey}/analytics` — returns 404 - `GET /api/nodes/{pubkey}/neighbors` — returns 404 - `GET /api/nodes/bulk-health` — filtered from results ### Config example ```json { "nodeBlacklist": [ "aabbccdd...", "11223344..." ] } ``` ### Design decisions - Case-insensitive — public keys normalized to lowercase - Whitespace trimming — leading/trailing whitespace handled - Empty entries ignored — `""` or `" "` do not cause false positives - Nil-safe — `IsBlacklisted()` on nil Config returns false - Backward-compatible — empty/missing `nodeBlacklist` has zero effect - Lazy-cached set — blacklist converted to `map[string]bool` on first lookup ### What this does NOT do (intentionally) - Does not delete or modify database data — only filters API responses - Does not block packet ingestion — data still flows for analytics - Does not filter `/api/packets` — only node-facing endpoints are affected ## Testing - Unit tests for `Config.IsBlacklisted()` (case sensitivity, whitespace, empty entries, nil config) - Integration tests for `/api/nodes`, `/api/nodes/{pubkey}`, `/api/nodes/search` - Full test suite passes with no regressions	2026-04-17 23:43:05 +00:00
Joel Claw	fa3f623bd6	feat: add observer retention — remove stale observers after configurable days (#764 ) ## Summary Observers that stop actively sending data now get removed after a configurable retention period (default 14 days). Previously, observers remained in the `observers` table forever. This meant nodes that were once observers for an instance but are no longer connected (even if still active in the mesh elsewhere) would continue appearing in the observer list indefinitely. ## Key Design Decisions - Active data requirement: `last_seen` is only updated when the observer itself sends packets (via `stmtUpdateObserverLastSeen`). Being seen by another node does NOT update this field. So an observer must actively send data to stay listed. - Default: 14 days — observers not seen in 14 days are removed - `-1` = keep forever — for users who want observers to never be removed - `0` = use default (14 days) — same as not setting the field - Runs on startup + daily ticker — staggered 3 minutes after metrics prune to avoid DB contention ## Changes \| File \| Change \| \|------\|--------\| \| `cmd/ingestor/config.go` \| Add `ObserverDays` to `RetentionConfig`, add `ObserverDaysOrDefault()` \| \| `cmd/ingestor/db.go` \| Add `RemoveStaleObservers()` — deletes observers with `last_seen` before cutoff \| \| `cmd/ingestor/main.go` \| Wire up startup + daily ticker for observer retention \| \| `cmd/server/config.go` \| Add `ObserverDays` to `RetentionConfig`, add `ObserverDaysOrDefault()` \| \| `cmd/server/db.go` \| Add `RemoveStaleObservers()` (server-side, uses read-write connection) \| \| `cmd/server/main.go` \| Wire up startup + daily ticker, shutdown cleanup \| \| `cmd/server/routes.go` \| Admin prune API now also removes stale observers \| \| `config.example.json` \| Add `observerDays: 14` with documentation \| \| `cmd/ingestor/coverage_boost_test.go` \| 4 tests: basic removal, empty store, keep forever (-1), default (0→14) \| \| `cmd/server/config_test.go` \| 4 tests: `ObserverDaysOrDefault` edge cases \| ## Config Example ```json { "retention": { "nodeDays": 7, "observerDays": 14, "packetDays": 30, "_comment": "observerDays: -1 = keep forever, 0 = use default (14)" } } ``` ## Admin API The `/api/admin/prune` endpoint now also removes stale observers (using `observerDays` from config) and reports `observers_removed` in the response alongside `packets_deleted`. ## Test Plan - [x] `TestRemoveStaleObservers` — old observer removed, recent observer kept - [x] `TestRemoveStaleObserversNone` — empty store, no errors - [x] `TestRemoveStaleObserversKeepForever` — `-1` keeps even year-old observers - [x] `TestRemoveStaleObserversDefault` — `0` defaults to 14 days - [x] `TestObserverDaysOrDefault` (ingestor) — nil/zero/positive/keep-forever - [x] `TestObserverDaysOrDefault` (server) — nil/zero/positive/keep-forever - [x] Both binaries compile cleanly (`go build`) - [ ] Manual: verify observer count decreases after retention period on a live instance	2026-04-17 09:24:40 -07:00
Kpa-clawbot	dc5b5ce9a0	fix: reject weak/default API keys + startup warning (#532 ) (#628 ) ## Summary Hardens API key security for write endpoints (fixes #532): 1. Constant-time comparison — uses `crypto/subtle.ConstantTimeCompare` to prevent timing attacks on API key validation 2. Weak key blocklist — rejects known default/example keys (`test`, `password`, `change-me`, `your-secret-api-key-here`, etc.) 3. Minimum length enforcement — keys shorter than 16 characters are rejected 4. Startup warning — logs a clear warning if the configured key is weak or a known default 5. Generic error messages — HTTP 403 response uses opaque "forbidden" message to prevent information leakage about why a key was rejected ### Security Model - Empty key → all write endpoints disabled (403) - Weak/default key → all write endpoints disabled (403), startup warning logged - Wrong key → 401 unauthorized - Strong correct key → request proceeds ### Files Changed - `cmd/server/config.go` — `IsWeakAPIKey()` function + blocklist - `cmd/server/routes.go` — constant-time comparison via `constantTimeEqual()`, weak key rejection - `cmd/server/main.go` — startup warning for weak keys - `cmd/server/apikey_security_test.go` — comprehensive test coverage - `cmd/server/routes_test.go` — existing tests updated to use strong keys ### Reviews - ✅ Self-review: all security properties verified - ✅ djb Final Review: timing fix correct, blocklist pragmatic, error messages opaque, tests comprehensive. Verdict: Ship it. ### Test Results All existing + new tests pass. Coverage includes: weak key detection (blocklist + length + case-insensitive), empty key handling, strong key acceptance, wrong key rejection, and constant-time comparison. --------- Co-authored-by: you <you@example.com>	2026-04-05 14:50:40 -07:00
Kpa-clawbot	767c8a5a3e	perf: async chunked backfill — HTTP serves within 2 minutes (#612 ) (#614 ) ## Summary Adds two config knobs for controlling backfill scope and neighbor graph data retention, plus removes the dead synchronous backfill function. ## Changes ### Config knobs #### `resolvedPath.backfillHours` (default: 24) Controls how far back (in hours) the async backfill scans for observations with NULL `resolved_path`. Transmissions with `first_seen` older than this window are skipped, reducing startup time for instances with large historical datasets. #### `neighborGraph.maxAgeDays` (default: 30) Controls the maximum age of `neighbor_edges` entries. Edges with `last_seen` older than this are pruned from both SQLite and the in-memory graph. Pruning runs on startup (after a 4-minute stagger) and every 24 hours thereafter. ### Dead code removal - Removed the synchronous `backfillResolvedPaths` function that was replaced by the async version. ### Implementation details - `backfillResolvedPathsAsync` now accepts a `backfillHours` parameter and filters by `tx.FirstSeen` - `NeighborGraph.PruneOlderThan(cutoff)` removes stale edges from the in-memory graph - `PruneNeighborEdges(conn, graph, maxAgeDays)` prunes both DB and in-memory graph - Periodic pruning ticker follows the same pattern as metrics pruning (24h interval, staggered start) - Graceful shutdown stops the edge prune ticker ### Config example Both knobs added to `config.example.json` with `_comment` fields. ## Tests - Config default/override tests for both knobs - `TestGraphPruneOlderThan` — in-memory edge pruning - `TestPruneNeighborEdgesDB` — SQLite + in-memory pruning together - `TestBackfillRespectsHourWindow` — verifies old transmissions are excluded by backfill window --------- Co-authored-by: you <you@example.com>	2026-04-05 09:49:39 -07:00
Kpa-clawbot	6f35d4d417	feat: RF Health Dashboard M1 — observer metrics + small multiples grid (#604 ) ## RF Health Dashboard — M1: Observer Metrics Storage, API & Small Multiples Grid Implements M1 of #600. ### What this does Adds a complete RF health monitoring pipeline: MQTT stats ingestion → SQLite storage → REST API → interactive dashboard with small multiples grid. ### Backend Changes Ingestor (`cmd/ingestor/`) - New `observer_metrics` table via migration system (`_migrations` pattern) - Parse `tx_air_secs`, `rx_air_secs`, `recv_errors` from MQTT status messages (same pattern as existing `noise_floor` and `battery_mv`) - `INSERT OR REPLACE` with timestamps rounded to nearest 5-min interval boundary (using ingestor wall clock, not observer timestamps) - Missing fields stored as NULLs — partial data is always better than no data - Configurable retention pruning: `retention.metricsDays` (default 30), runs on startup + every 24h Server (`cmd/server/`) - `GET /api/observers/{id}/metrics?since=...&until=...` — per-observer time-series data - `GET /api/observers/metrics/summary?window=24h` — fleet summary with current NF, avg/max NF, sample count - `parseWindowDuration()` supports `1h`, `24h`, `3d`, `7d`, `30d` etc. - Server-side metrics retention pruning (same config, staggered 2min after packet prune) ### Frontend Changes RF Health tab (`public/analytics.js`, `public/style.css`) - Small multiples grid showing all observers simultaneously — anomalies pop out visually - Per-observer cell: name, current NF value, battery voltage, sparkline, avg/max stats - NF status coloring: warning (amber) at ≥-100 dBm, critical (red) at ≥-85 dBm — text color only, no background fills - Click any cell → expanded detail view with full noise floor line chart - Reference lines with direct text labels (`-100 warning`, `-85 critical`) — not color bands - Min/max points labeled directly on the chart - Time range selector: preset buttons (1h/3h/6h/12h/24h/3d/7d/30d) + custom from/to datetime picker - Deep linking: `#/analytics?tab=rf-health&observer=...&range=...` - All charts use SVG, matching existing analytics.js patterns - Responsive: 3-4 columns on desktop, 1 on mobile ### Design Decisions (from spec) - Labels directly on data, not in legends - Reference lines with text labels, not color bands - Small multiples grid, not card+accordion (Tufte: instant visual fleet comparison) - Ingestor wall clock for all timestamps (observer clocks may drift) ### Tests Added Ingestor tests: - `TestRoundToInterval` — 5 cases for rounding to 5-min boundaries - `TestInsertMetrics` — basic insertion with all fields - `TestInsertMetricsIdempotent` — INSERT OR REPLACE deduplication - `TestInsertMetricsNullFields` — partial data with NULLs - `TestPruneOldMetrics` — retention pruning - `TestExtractObserverMetaNewFields` — parsing tx_air_secs, rx_air_secs, recv_errors Server tests: - `TestGetObserverMetrics` — time-series query with since/until filters, NULL handling - `TestGetMetricsSummary` — fleet summary aggregation - `TestObserverMetricsAPIEndpoints` — DB query verification - `TestMetricsAPIEndpoints` — HTTP endpoint response shape - `TestParseWindowDuration` — duration parsing for h/d formats ### Test Results ``` cd cmd/ingestor && go test ./... → PASS (26s) cd cmd/server && go test ./... → PASS (5s) ``` ### What's NOT in this PR (deferred to M2+) - Server-side delta computation for cumulative counters - Airtime charts (TX/RX percentage lines) - Channel quality chart (recv_error_rate) - Battery voltage chart - Reboot detection and chart annotations - Resolution downsampling (1h, 1d aggregates) - Pattern detection / automated diagnosis --------- Co-authored-by: you <you@example.com>	2026-04-04 22:21:35 -07:00
Kpa-clawbot	58f791266d	feat: affinity debugging tools (#482 ) — milestone 6 (#521 ) ## Summary Milestone 6 of #482: Observability & Debugging tools for the neighbor affinity system. These tools exist because someone will need them at 3 AM when "Show Neighbors is showing the wrong node for C0DE" and they have 5 minutes to diagnose it. ## Changes ### 1. Debug API — `GET /api/debug/affinity` - Full graph state dump: all edges with weights, observation counts, last-seen timestamps - Per-prefix resolution log with disambiguation reasoning (Jaccard scores, ratios, thresholds) - Query params: `?prefix=C0DE` filter to specific prefix, `?node=<pubkey>` for specific node's edges - Protected by API key (same auth as `/api/admin/prune`) - Response includes: edge count, node count, cache age, last rebuild time ### 2. Debug Overlay on Map - Toggle-able checkbox "🔍 Affinity Debug" in map controls - Draws lines between nodes showing affinity edges with color coding: - Green = high confidence (score ≥ 0.6) - Yellow = medium (0.3–0.6) - Red = ambiguous (< 0.3) - Line thickness proportional to weight, dashed for ambiguous - Unresolved prefixes shown as ❓ markers - Click edge → popup with observation count, last seen, score, observers - Hidden behind `debugAffinity` config flag or `localStorage.setItem('meshcore-affinity-debug', 'true')` ### 3. Per-Node Debug Panel - Expandable "🔍 Affinity Debug" section in node detail page (collapsed by default) - Shows: neighbor edges table with scores, prefix resolutions with reasoning trace - Candidates table with Jaccard scores, highlighting the chosen candidate - Graph-level stats summary ### 4. Server-Side Structured Logging - Integrated into `disambiguate()` — logs every resolution decision during graph build - Format: `[affinity] resolve C0DE: c0dedad4 score=47 Jaccard=0.82 vs c0dedad9 score=3 Jaccard=0.11 → neighbor_affinity (ratio 15.7×)` - Logs ambiguous decisions: `scores too close (12 vs 9, ratio 1.3×) → ambiguous` - Gated by `debugAffinity` config flag ### 5. Dashboard Stats Widget - Added to analytics overview tab when debug mode is enabled - Metrics: total edges/nodes, resolved/ambiguous counts (%), avg confidence, cold-start coverage, cache age, last rebuild ## Files Changed - `cmd/server/neighbor_debug.go` — new: debug API handler, resolution builder, cold-start coverage - `cmd/server/neighbor_debug_test.go` — new: 7 tests for debug API - `cmd/server/neighbor_graph.go` — added structured logging to disambiguate(), `logFn` field, `BuildFromStoreWithLog` - `cmd/server/neighbor_api.go` — pass debug flag through `BuildFromStoreWithLog` - `cmd/server/config.go` — added `DebugAffinity` config field - `cmd/server/routes.go` — registered `/api/debug/affinity` route, exposed `debugAffinity` in client config - `cmd/server/types.go` — added `DebugAffinity` to `ClientConfigResponse` - `public/map.js` — affinity debug overlay layer with edge visualization - `public/nodes.js` — per-node affinity debug panel - `public/analytics.js` — dashboard stats widget - `test-e2e-playwright.js` — 3 Playwright tests for debug UI ## Tests - ✅ 7 Go unit tests (API shape, prefix/node filters, auth, structured logging, cold-start coverage) - ✅ 3 Playwright E2E tests (overlay checkbox, toggle without crash, panel expansion) - ✅ All existing tests pass (`go test ./cmd/server/... -count=1`) Part of #482 --------- Co-authored-by: you <you@example.com>	2026-04-02 23:45:03 -07:00
efiten	fe314be3a8	feat: geo_filter enforcement, DB pruning, geofilter-builder tool, HB column (#215 ) ## Summary Several features and fixes from a live deployment of the Go v3.0.0 backend. ### geo_filter — full enforcement - Go backend config (`cmd/server/config.go`, `cmd/ingestor/config.go`): added `GeoFilterConfig` struct so `geo_filter.polygon` and `bufferKm` from `config.json` are parsed by both the server and ingestor - Ingestor (`cmd/ingestor/geo_filter.go`, `cmd/ingestor/main.go`): ADVERT packets from nodes outside the configured polygon + buffer are dropped before any DB write — no transmission, node, or observation data is stored - Server API (`cmd/server/geo_filter.go`, `cmd/server/routes.go`): `GET /api/config/geo-filter` endpoint returns the polygon + bufferKm to the frontend; `/api/nodes` responses filter out any out-of-area nodes already in the DB - Frontend (`public/map.js`, `public/live.js`): blue polygon overlay (solid inner + dashed buffer zone) on Map and Live pages, toggled via "Mesh live area" checkbox, state shared via localStorage ### Automatic DB pruning - Add `retention.packetDays` to `config.json` to delete transmissions + observations older than N days on a daily schedule (1 min after startup, then every 24h). Nodes and observers are never pruned. - `POST /api/admin/prune?days=N` for manual runs (requires `X-API-Key` header if `apiKey` is set) ```json "retention": { "nodeDays": 7, "packetDays": 30 } ``` ### tools/geofilter-builder.html Standalone HTML tool (no server needed) — open in browser, click to place polygon points on a Leaflet map, set `bufferKm`, copy the generated `geo_filter` JSON block into `config.json`. ### scripts/prune-nodes-outside-geo-filter.py Utility script to clean existing out-of-area nodes from the database (dry-run + confirm). Useful after first enabling geo_filter on a populated DB. ### HB column in packets table Shows the hop hash size in bytes (1–4) decoded from the path byte of each packet's raw hex. Displayed as HB between Size and Type columns, hidden on small screens. ## Test plan - [x] ADVERT from node outside polygon is not stored (no new row in nodes or transmissions) - [x] `GET /api/config/geo-filter` returns polygon + bufferKm when configured, `{polygon: null, bufferKm: 0}` when not - [x] `/api/nodes` excludes nodes outside polygon even if present in DB - [x] Map and Live pages show blue polygon overlay when configured; checkbox toggles it - [x] `retention.packetDays: 30` deletes old transmissions/observations on startup and daily - [x] `POST /api/admin/prune?days=30` returns `{deleted: N, days: 30}` - [x] `tools/geofilter-builder.html` opens standalone, draws polygon, copies valid JSON - [x] HB column shows 1–4 for all packets in grouped and flat view 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-31 01:10:56 -07:00
Kpa-clawbot	5aa4fbb600	chore: normalize all files to LF line endings	2026-03-30 22:52:46 -07:00
Kpa-clawbot	1e1fb298c2	Backend: timestamp config for client defaults (#292 ) ## Backend: Timestamp Config for Client Defaults Refs #286 — implements backend scope from the [final spec](https://github.com/Kpa-clawbot/CoreScope/issues/286#issuecomment-4158891089). ### What changed Config struct (`cmd/server/config.go`) - Added `TimestampConfig` struct with `defaultMode`, `timezone`, `formatPreset`, `customFormat`, `allowCustomFormat` - Added `Timestamps TimestampConfig` to main `Config` struct - Normalization method: invalid values fall back to safe defaults (`ago`/`local`/`iso`) Startup warnings (`cmd/server/main.go`)* - Missing timestamps section: `[config] timestamps not configured — using defaults (ago/local/iso)` - Invalid values logged with what was normalized API endpoint (`cmd/server/routes.go`) - Timestamp config included in `GET /api/config/client` response via `ClientConfigResponse` - Frontend reads server defaults from this endpoint Config example (`config.example.json`) - Added `timestamps` section with documented defaults ### Tests (`cmd/server/`) - Config loads with timestamps section - Config loads without timestamps section (defaults applied) - Invalid values are normalized - `/api/config/client` returns timestamp config ### Validation - `cd cmd/server && go test ./...` ✅ --------- Co-authored-by: Kpa-clawbot <259247574+Kpa-clawbot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-30 17:41:45 -07:00
efiten	999436d714	feat: geo_filter polygon overlay on map and live pages (Go backend) (#213 ) ## Summary - Adds `GeoFilter` struct to `Config` in `cmd/server/config.go` so `geo_filter.polygon` and `bufferKm` from `config.json` are parsed by the Go backend - Adds `GET /api/config/geo-filter` endpoint in `cmd/server/routes.go` returning the polygon + bufferKm to the frontend - Restores the blue polygon overlay (solid inner + dashed buffer zone) on the Map page (`public/map.js`) - Restores the same overlay on the Live page (`public/live.js`), toggled via the "Mesh live area" checkbox ## Test plan - [x] `GET /api/config/geo-filter` returns `{ polygon: [...], bufferKm: N }` when configured - [x] `GET /api/config/geo-filter` returns `{ polygon: null, bufferKm: 0 }` when not configured - [x] Map page shows blue polygon overlay when `geo_filter.polygon` is set in config - [x] Live page shows same overlay, checkbox state shared via localStorage - [x] Checkbox is hidden when no polygon is configured 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-30 15:28:28 -07:00
Kpa-clawbot	77d8f35a04	feat: implement packet store eviction/aging to prevent OOM (#273 ) ## Summary The in-memory `PacketStore` had no eviction or aging — it grew unbounded until OOM killed the process. At ~3K packets/hour and ~5KB per packet (not the 450 bytes previously estimated), an 8GB VM would OOM in a few days. ## Changes ### Time-based eviction - Configurable via `config.json`: `"packetStore": { "retentionHours": 24 }` - Packets older than the retention window are evicted from the head of the sorted slice ### Memory-based cap - Configurable via `"packetStore": { "maxMemoryMB": 1024 }` - Hard ceiling — evicts oldest packets when estimated memory exceeds the cap ### Index cleanup When a `StoreTx` is evicted, ALL associated data is removed from: - `byHash`, `byTxID`, `byObsID`, `byObserver`, `byNode`, `byPayloadType` - `nodeHashes`, `distHops`, `distPaths`, `spIndex` ### Periodic execution - Background ticker runs eviction every 60 seconds - Analytics caches and hash size cache are invalidated after eviction ### Stats fixes - `estimatedMB` now uses ~5KB/packet + ~500B/observation (was 430B + 200B) - `evicted` counter reflects actual evictions (was hardcoded to 0) - Removed fake `maxPackets: 2386092` and `maxMB: 1024` from stats ### Config example ```json { "packetStore": { "retentionHours": 24, "maxMemoryMB": 1024 } } ``` Both values default to 0 (unlimited) for backward compatibility. ## Tests - 7 new tests in `eviction_test.go` covering time-based, memory-based, index cleanup, thread safety, config parsing, and no-op when disabled - All existing tests pass unchanged Co-authored-by: Kpa-clawbot <kpabap+clawdbot@gmail.com>	2026-03-30 03:42:11 +00:00
you	0f70cd1ac0	feat: make health thresholds configurable in hours Change healthThresholds config from milliseconds to hours for readability. Config keys: infraDegradedHours, infraSilentHours, nodeDegradedHours, nodeSilentHours. Defaults: infra degraded 24h, silent 72h; node degraded 1h, silent 24h. - Config stored in hours, converted to ms at comparison time - /api/config/client sends ms to frontend (backward compatible) - Frontend tooltips use dynamic thresholds instead of hardcoded strings - Added healthThresholds section to config.example.json - Updated Go and Node.js servers, tests	2026-03-29 09:50:32 -07:00
Kpa-clawbot	520adcc6ab	feat: move stale nodes to inactive_nodes table, fixes #202 - Create inactive_nodes table with identical schema to nodes - Add retention.nodeDays config (default 7) in Node.js and Go - On startup: move nodes not seen in N days to inactive_nodes - Daily timer (24h setInterval / goroutine ticker) repeats the move - Log 'Moved X nodes to inactive_nodes (not seen in N days)' - All existing queries unchanged — they only read nodes table - Add 14 new tests for moveStaleNodes in test-db.js - Both Node (db.js/server.js) and Go (ingestor/server) implemented Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-27 22:43:53 -07:00
Kpa-clawbot	742ed86596	feat: add Go web server (cmd/server/) — full API + WebSocket + static files 35+ REST endpoints matching Node.js server, WebSocket broadcast, static file serving with SPA fallback, config.json support. Uses modernc.org/sqlite (pure Go, no CGO required). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-27 01:16:59 -07:00

26 Commits