Mirror of https://github.com/Kpa-clawbot/meshcore-analyzer.git (synced 2026-05-12 13:54:43 +00:00)
45f2607f75
## Summary

Implements **M1 from #1115**: batches observation/transmission INSERTs into a single SQLite `BEGIN/COMMIT` window instead of fsyncing per packet. At ~250 obs/sec this drops the WAL fsync rate from ~20/s to ~1/s and eliminates the `obs-persist skipped` / `SQLITE_BUSY` log spam that the issue documents.

This is a **partial fix** — it ships the group-commit mechanism. Acceptance items 6–7 (measured fsync rate / measured `obs-persist skipped` rate at staging steady-state) require post-deploy observation, and M2 (per-`tx_hash` observation buffering) is intentionally deferred. The issue stays open for the user to verify on staging.

> Partial fix for #1115 — does not auto-close. Refs #1115.

## Mechanism

- `Store` gains an active `*sql.Tx`, a `pendingRows` counter, a `gcMu` mutex, and the `groupCommitMs` / `groupCommitMaxRows` knobs. `SetGroupCommit(ms, maxRows)` enables the mode; `FlushGroupTx()` commits the in-flight tx.
- `InsertTransmission` lazily opens a tx on the first call after each flush, then issues all writes through `tx.Stmt()` bindings of the existing prepared statements. With `MaxOpenConns(1)` the connection is already serialized; `gcMu` serializes group-commit state without contention.
- A goroutine in `cmd/ingestor/main.go` calls `FlushGroupTx()` every `groupCommitMs` ms. `pendingRows >= groupCommitMaxRows` triggers an eager flush. `Close()` flushes before the WAL checkpoint so no rows are lost on graceful shutdown.
- `groupCommitMs == 0` short-circuits to the legacy per-call auto-commit path (statements bound to `s.db`, no tx) — current behavior is preserved byte-for-byte for operators who opt out.

## Config

Two new optional fields (ingestor-only), both documented in `config.example.json`:

| Field | Default | Effect |
|---|---|---|
| `groupCommitMs` | `1000` | Flush window in ms. `0` disables batching (legacy per-packet auto-commit). |
| `groupCommitMaxRows` | `1000` | Safety cap; when exceeded the queue flushes immediately to bound memory and the crash-loss window. |

No DB schema change. No required config change on upgrade.

## Tests (TDD red → green visible in commits)

`cmd/ingestor/group_commit_test.go` — three assertions, written first as the red commit:

- `TestGroupCommit_BatchesInsertsIntoOneTx` — 50 `InsertTransmission` calls inside a wide window produce **0** commits until `FlushGroupTx`, then exactly **1**; all 50 rows are visible after flush. (This is the spec's "50 observations → 1 SQLite write transaction" assertion.)
- `TestGroupCommit_Disabled` — `groupCommitMs=0` keeps every insert immediately visible and `GroupCommitFlushes` never advances. (The spec's "groupCommitMs=0 reverts to per-packet behavior" assertion.)
- `TestGroupCommit_MaxRowsForcesEarlyFlush` — cap=3, 7 inserts → 2 auto-flushes from the cap + 1 final manual flush = 3 total.

Red commit: `e2b0370` (stubs `SetGroupCommit` / `FlushGroupTx` so the tests compile and fail on **assertions**, not import errors). Green commit: `73f3559`. The full ingestor suite (`go test ./...` in `cmd/ingestor`) stays green, ~49 s.

## Performance

This PR is the perf change itself. A local micro-test (the new `TestGroupCommit_BatchesInsertsIntoOneTx`) shows the structural property: 50 inserts → 1 commit. The fsync-rate measurement called out in the M1 acceptance criteria (`~20/s → ~1/s` at 250 obs/sec) requires staging deployment to confirm — that's the remaining open item that keeps #1115 open after this merges.
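For reviewers who want the shape without opening the diff, here is a minimal sketch of the mechanism described above. The names `Store`, `SetGroupCommit`, `FlushGroupTx`, `InsertTransmission`, `pendingRows`, `gcMu`, and the `tx.Stmt()` binding come from this PR; the field types and the simplified variadic insert signature are assumptions for illustration, not the repo's actual code:

```go
package store

import (
	"database/sql"
	"sync"
)

// Store: only the group-commit pieces sketched here; real struct has more.
type Store struct {
	db *sql.DB

	gcMu               sync.Mutex
	tx                 *sql.Tx // in-flight group-commit tx, nil between flushes
	pendingRows        int
	groupCommitMs      int
	groupCommitMaxRows int

	insertStmt *sql.Stmt // existing prepared INSERT, prepared on s.db
}

// SetGroupCommit enables batching; ms == 0 keeps the legacy
// per-call auto-commit path.
func (s *Store) SetGroupCommit(ms, maxRows int) {
	s.gcMu.Lock()
	defer s.gcMu.Unlock()
	s.groupCommitMs = ms
	s.groupCommitMaxRows = maxRows
}

// InsertTransmission lazily opens a tx on the first call after a flush
// and re-binds the prepared statement to it via tx.Stmt. Signature is
// simplified to variadic args for this sketch.
func (s *Store) InsertTransmission(args ...any) error {
	s.gcMu.Lock()
	defer s.gcMu.Unlock()

	if s.groupCommitMs == 0 {
		// Legacy path: statement bound to s.db, auto-commit per call.
		_, err := s.insertStmt.Exec(args...)
		return err
	}
	if s.tx == nil {
		tx, err := s.db.Begin()
		if err != nil {
			return err
		}
		s.tx = tx
	}
	if _, err := s.tx.Stmt(s.insertStmt).Exec(args...); err != nil {
		return err
	}
	s.pendingRows++
	if s.pendingRows >= s.groupCommitMaxRows {
		return s.flushLocked() // eager flush at the row cap
	}
	return nil
}

// FlushGroupTx commits the in-flight tx, if any.
func (s *Store) FlushGroupTx() error {
	s.gcMu.Lock()
	defer s.gcMu.Unlock()
	return s.flushLocked()
}

func (s *Store) flushLocked() error {
	if s.tx == nil {
		return nil
	}
	err := s.tx.Commit() // the one fsync for the whole batch
	s.tx = nil
	s.pendingRows = 0
	return err
}
```

The single mutex is cheap here because `MaxOpenConns(1)` already serializes the underlying connection; `gcMu` only guards the tx pointer and counter, so it is uncontended in steady state.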
No hot-path regressions: when `groupCommitMs > 0` we acquire one mutex per insert (uncontended in the steady state — the connection was already single-threaded via `MaxOpenConns(1)`). When `groupCommitMs == 0` the code path is identical to before, plus one nil-tx check.

## What this PR does NOT do (per spec)

- Does not collapse "30 observations of one packet" into 1 row write — that's M2.
- Does not eliminate dual-writer contention with `cmd/server`'s `resolved_path` writes.
- Does not change observation ordering or live broadcast latency.

---------

Co-authored-by: corescope-bot <bot@corescope.local>
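For completeness, a sketch of the periodic flush loop described under Mechanism. `startFlushLoop`, the `flusher` interface, and the `done` channel are hypothetical names; the PR only states that a goroutine in `cmd/ingestor/main.go` calls `FlushGroupTx()` every `groupCommitMs` ms:

```go
package ingestor

import (
	"log"
	"time"
)

// flusher is the one method the loop needs; *Store from the sketch
// above satisfies it.
type flusher interface {
	FlushGroupTx() error
}

// startFlushLoop commits the pending group-commit tx once per window
// until done is closed. Row-cap flushes happen inside the inserts
// themselves, so this loop only bounds latency, not memory.
func startFlushLoop(f flusher, groupCommitMs int, done <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(time.Duration(groupCommitMs) * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				if err := f.FlushGroupTx(); err != nil {
					log.Printf("group-commit flush: %v", err)
				}
			case <-done:
				return // Close() performs a final flush before the WAL checkpoint
			}
		}
	}()
}
```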
253 lines · 12 KiB · JSON
{
  "port": 3000,
  "apiKey": "your-secret-api-key-here",
  "nodeBlacklist": [],
  "_comment_nodeBlacklist": "Public keys of nodes to hide from all API responses. Use for trolls, offensive names, or nodes reporting false data that operators refuse to fix.",
  "observerIATAWhitelist": [],
  "_comment_observerIATAWhitelist": "Global IATA region whitelist. When non-empty, only observers whose IATA code (from MQTT topic) matches are processed. Case-insensitive. Empty = allow all. Unlike per-source iataFilter, this applies across all MQTT sources.",
  "groupCommitMs": 1000,
  "_comment_groupCommitMs": "Ingestor only (#1115 M1). Window in milliseconds for batching observation INSERTs into a single SQLite transaction. Default 1000 (1s). Set to 0 to disable batching and revert to per-packet auto-commit (legacy behavior). Trade-off: up to this many ms of delay before observations are queryable via SQL (live WebSocket broadcast is unaffected). Reduces WAL fsync rate from ~per-packet to ~1/window, eliminating SQLITE_BUSY/obs-persist-skipped log spam at high ingest rates.",
  "groupCommitMaxRows": 1000,
  "_comment_groupCommitMaxRows": "Ingestor only (#1115 M1). Safety cap on pending rows in the group-commit queue. When exceeded, the queue flushes immediately even if the time window has not elapsed. Bounds memory use and the crash-loss window. Default 1000.",
  "retention": {
    "nodeDays": 7,
    "observerDays": 14,
    "packetDays": 30,
    "_comment": "nodeDays: nodes not seen in N days moved to inactive_nodes (default 7). observerDays: observers not sending data in N days are removed (-1 = keep forever, default 14). packetDays: transmissions older than N days are deleted (0 = disabled)."
  },
  "db": {
    "vacuumOnStartup": false,
    "incrementalVacuumPages": 1024,
    "_comment": "vacuumOnStartup: run one-time full VACUUM to enable incremental auto-vacuum on existing DBs (blocks startup for minutes on large DBs; requires 2x DB file size in free disk space). incrementalVacuumPages: free pages returned to OS after each retention reaper cycle (default 1024). See #919."
  },
  "https": {
    "cert": "/path/to/cert.pem",
    "key": "/path/to/key.pem",
    "_comment": "TLS cert/key paths for direct HTTPS. Most deployments use Caddy (included in Docker) for auto-TLS instead."
  },
  "branding": {
    "siteName": "CoreScope",
    "tagline": "Real-time MeshCore LoRa mesh network analyzer",
    "logoUrl": null,
    "faviconUrl": null,
    "_comment": "Customize site name, tagline, logo, and favicon. logoUrl/faviconUrl can be absolute URLs or relative paths."
  },
  "theme": {
    "accent": "#4a9eff",
    "accentHover": "#6db3ff",
    "navBg": "#0f0f23",
    "navBg2": "#1a1a2e",
    "statusGreen": "#45644c",
    "statusYellow": "#b08b2d",
    "statusRed": "#b54a4a",
    "_comment": "CSS color overrides. Use the in-app Theme Customizer for live preview, then export values here."
  },
  "nodeColors": {
    "repeater": "#dc2626",
    "companion": "#2563eb",
    "room": "#16a34a",
    "sensor": "#d97706",
    "observer": "#8b5cf6",
    "_comment": "Marker/badge colors per node role. Used on map, nodes list, and live feed."
  },
  "home": {
    "heroTitle": "CoreScope",
    "heroSubtitle": "Find your nodes to start monitoring them.",
    "steps": [
      {
        "emoji": "\ud83d\udce1",
        "title": "Connect",
        "description": "Link your node to the mesh"
      },
      {
        "emoji": "\ud83d\udd0d",
        "title": "Monitor",
        "description": "Watch packets flow in real-time"
      },
      {
        "emoji": "\ud83d\udcca",
        "title": "Analyze",
        "description": "Understand your network's health"
      }
    ],
    "checklist": [
      {
        "question": "How do I add my node?",
        "answer": "Search for your node name or paste your public key."
      },
      {
        "question": "What regions are covered?",
        "answer": "Check the map page to see active observers and nodes."
      }
    ],
    "footerLinks": [
      {
        "label": "\ud83d\udce6 Packets",
        "url": "#/packets"
      },
      {
        "label": "\ud83d\uddfa\ufe0f Network Map",
        "url": "#/map"
      },
      {
        "label": "\ud83d\udd34 Live",
        "url": "#/live"
      },
      {
        "label": "\ud83d\udce1 All Nodes",
        "url": "#/nodes"
      },
      {
        "label": "\ud83d\udcac Channels",
        "url": "#/channels"
      }
    ],
    "_comment": "Customize the landing page hero, onboarding steps, FAQ, and footer links."
  },
  "mqtt": {
    "broker": "mqtt://localhost:1883",
    "topic": "meshcore/+/+/packets",
    "_comment": "Legacy single-broker config. Prefer mqttSources[] for multiple brokers."
  },
  "mqttSources": [
    {
      "name": "local",
      "broker": "mqtt://localhost:1883",
      "topics": [
        "meshcore/+/+/packets",
        "meshcore/#"
      ]
    },
    {
      "name": "lincomatic",
      "broker": "mqtts://mqtt.lincomatic.com:8883",
      "username": "your-username",
      "password": "your-password",
      "rejectUnauthorized": false,
      "topics": [
        "meshcore/SJC/#",
        "meshcore/SFO/#",
        "meshcore/OAK/#",
        "meshcore/MRY/#"
      ],
      "iataFilter": [
        "SJC",
        "SFO",
        "OAK",
        "MRY"
      ],
      "region": "SJC",
      "connectTimeoutSec": 45
    }
  ],
  "channelKeys": {
    "Public": "8b3387e9c5cdea6ac9e5edbaa115cd72"
  },
  "hashChannels": [
    "#LongFast",
    "#test",
    "#sf",
    "#wardrive",
    "#yo",
    "#bot",
    "#queer",
    "#bookclub",
    "#shtf"
  ],
  "healthThresholds": {
    "infraDegradedHours": 24,
    "infraSilentHours": 72,
    "nodeDegradedHours": 1,
    "nodeSilentHours": 24,
    "relayActiveHours": 24,
    "_comment": "How long (hours) before nodes show as degraded/silent. 'infra' = repeaters & rooms, 'node' = companions & others. relayActiveHours: a repeater is shown as 'actively relaying' if its pubkey appeared as a path hop in a non-advert packet within this window (issue #662)."
  },
  "defaultRegion": "SJC",
  "mapDefaults": {
    "center": [
      37.45,
      -122.0
    ],
    "zoom": 9
  },
  "geo_filter": {
    "polygon": [
      [37.80, -122.52],
      [37.80, -121.80],
      [37.20, -121.80],
      [37.20, -122.52]
    ],
    "bufferKm": 20,
    "_comment": "Optional. Restricts ingestion and API responses to nodes within the polygon + bufferKm. Polygon is an array of [lat, lon] pairs (minimum 3). Use the GeoFilter Builder (`/geofilter-builder.html`) to draw a polygon, save drafts to localStorage with Save Draft, and export a config snippet with Download — paste the snippet here as the `geo_filter` block. Remove this section to disable filtering. Nodes with no GPS fix are always allowed through."
  },
  "foreignAdverts": {
    "mode": "flag",
    "_comment": "Controls how the ingestor handles ADVERTs whose GPS is OUTSIDE the geo_filter polygon (#730). 'flag' (default): store the advert/node and tag it foreign_advert=1 so operators can see bridged/leaked nodes via the API ('foreign': true on /api/nodes). 'drop': legacy behavior — silently discard the advert (no log, no node row). Only applies when geo_filter is configured; otherwise has no effect."
  },
  "regions": {
    "SJC": "San Jose, US",
    "SFO": "San Francisco, US",
    "OAK": "Oakland, US",
    "MRY": "Monterey, US"
  },
  "cacheTTL": {
    "stats": 10,
    "nodeDetail": 300,
    "nodeHealth": 300,
    "nodeList": 90,
    "bulkHealth": 600,
    "networkStatus": 600,
    "observers": 300,
    "channels": 15,
    "channelMessages": 10,
    "analyticsRF": 1800,
    "analyticsTopology": 1800,
    "analyticsChannels": 1800,
    "analyticsHashSizes": 3600,
    "analyticsSubpaths": 3600,
    "analyticsSubpathDetail": 3600,
    "nodeAnalytics": 60,
    "nodeSearch": 10,
    "invalidationDebounce": 30,
    "_comment": "All values in seconds. Server uses these directly. Client fetches via /api/config/cache."
  },
  "liveMap": {
    "propagationBufferMs": 5000,
    "_comment": "How long (ms) to buffer incoming observations of the same packet before animating. Mesh packets propagate through multiple paths and arrive at different observers over several seconds. This window collects all observations of a single transmission so the live map can animate them simultaneously as one realistic propagation event. Set higher for wide meshes with many observers, lower for snappier animations. 5000ms captures ~95% of observations for a typical mesh."
  },
  "timestamps": {
    "defaultMode": "ago",
    "timezone": "local",
    "formatPreset": "iso",
    "customFormat": "",
    "allowCustomFormat": false,
    "_comment": "defaultMode: ago|local|iso. timezone: local|utc. formatPreset: iso|us|eu. customFormat: strftime-style (requires allowCustomFormat: true)."
  },
  "packetStore": {
    "maxMemoryMB": 1024,
    "estimatedPacketBytes": 450,
    "retentionHours": 168,
    "_comment": "In-memory packet store. maxMemoryMB caps RAM usage. retentionHours: only packets younger than this are loaded on startup and kept in memory (0 = unlimited, not recommended for large DBs — causes OOM on cold start). 168 = 7 days. Must be ≤ retention.packetDays * 24.",
    "_comment_gomemlimit": "On startup the server reads GOMEMLIMIT from the environment if set; otherwise it derives a Go runtime soft memory limit of maxMemoryMB * 1.5 and applies it via debug.SetMemoryLimit. This forces aggressive GC under cgroup pressure so the process self-throttles before the kernel SIGKILLs it. To override, set GOMEMLIMIT explicitly (e.g. GOMEMLIMIT=850MiB). See issue #836."
  },
  "resolvedPath": {
    "backfillHours": 24,
    "_comment": "How far back (hours) the async backfill scans for observations with NULL resolved_path. Default: 24. Set higher to backfill older data, lower to speed up startup."
  },
  "neighborGraph": {
    "maxAgeDays": 5,
    "_comment": "Neighbor edges older than this many days are pruned on startup and daily. Default: 5."
  },
  "batteryThresholds": {
    "lowMv": 3300,
    "criticalMv": 3000,
    "_comment": "Voltage cutoffs (millivolts) for the per-node battery trend chart on /node-analytics. Latest sample below lowMv shows the node as ⚠️ Low; below criticalMv shows 🪫 Critical. Both default to 3300 / 3000 if omitted. Source data: observer_metrics.battery_mv populated from observer status messages; only nodes that are themselves observers (matching pubkey ↔ observer id) yield a series. Issue #663."
  },
  "_comment_mqttSources": "Each source connects to an MQTT broker. topics: what to subscribe to. iataFilter: only ingest packets from these regions (optional). region: default IATA region for this source — used when packet/topic doesn't specify one (optional, priority: payload > topic > this field).",
  "_comment_channelKeys": "Hex keys for decrypting channel messages. Key name = channel display name. public channel key is well-known.",
  "_comment_hashChannels": "Channel names whose keys are derived via SHA256. Key = SHA256(name)[:16]. Listed here so the ingestor can auto-derive keys.",
  "_comment_defaultRegion": "IATA code shown by default in region filters.",
  "_comment_mapDefaults": "Initial map center [lat, lon] and zoom level.",
  "_comment_regions": "IATA code → display name mapping for the region filter UI. Each key is a 3-letter IATA code that an observer is tagged with (resolved priority: MQTT payload `region` field > topic-derived region > mqttSources.region). Observers without an IATA tag will not appear under any region filter — only under 'All Regions'. The region filter dropdown shows one entry per code listed here PLUS any extra IATA codes the server discovers from observers at runtime (so you can omit codes here and they will still be selectable, just labelled with the bare IATA code instead of a friendly name). Selecting 'All Regions' (or no region) returns results from every observer including those with no IATA tag; selecting one or more codes restricts results to packets observed by observers tagged with those codes. The reserved value 'All' (case-insensitive) is treated as 'no filter' on the server, so the URL ?region=All behaves identically to omitting the param. Issue #770."
}