RED commit: `1cd25f7b`. CI (failing on assertion): https://github.com/Kpa-clawbot/CoreScope/actions?query=sha%3A1cd25f7b1bdd0091f689dd64ce1bfec6d031191f

Fixes #1212

## Root cause

NOT that `AutoReconnect` was off: it was set, `MaxReconnectInterval=30s` was set (PR #949), and a `SetReconnectingHandler` was wired.

The defect was an **observability gap**: `SetReconnectingHandler` fires only INSIDE paho's reconnect goroutine. If that goroutine never iterates (status race after the recovered handler panic at 21:07:13, or an internal abort), operators see ONLY the `disconnected: pingresp not received` line and then total silence. They cannot distinguish "paho is patiently retrying" from "paho gave up and the goroutine is gone." That ambiguity is what turned a 30s blip into 6h of downtime.

## Changes

### `cmd/ingestor/main.go`: `SetConnectionAttemptHandler`

Fires on every TCP/TLS dial (the initial `Connect()` AND every reconnect), independent of paho's internal reconnect-loop state. Logs:

```
MQTT [staging] connection attempt #1 to tcp://broker:1883
MQTT [staging] connection attempt #2 to tcp://broker:1883
```

Per-source attempt counter via `atomic.AddInt64`.

### `cmd/ingestor/mqtt_watchdog.go` (new): per-source stall watchdog

Satisfies the watchdog acceptance criterion. Even when paho reports `connected`, if no MQTT messages have flowed for >5m, log a WARN line every 60s:

```
MQTT [staging] WATCHDOG: client reports connected to tcp://broker:1883 but no messages received for 7m30s (threshold 5m) — possible half-open socket or upstream stall
```

Catches half-open TCP and broker-accepted-but-not-forwarding scenarios that look "connected" to paho.

Hot-path cost: one `atomic.StoreInt64` per inbound message. Watchdog scans the registry once a minute.

### Tests (`cmd/ingestor/mqtt_reconnect_test.go`, new)

- `TestBuildMQTTOpts_InstrumentsConnectionAttempt`: asserts `OnConnectAttempt` is wired in `buildMQTTOpts`.
- `TestMQTTStallWatchdog_FiresOnSilentSource`: connected + 10m silent + 5m threshold → stall flagged.
- `TestMQTTStallWatchdog_QuietWhenRecent`: recent message → no stall.
- `TestMQTTStallWatchdog_QuietWhenDisconnected`: disconnected → no stall (paho's reconnect logging covers it).

## TDD

- RED `1cd25f7b`: 2 assertion failures (compile OK, stub returns no-stall, `OnConnectAttempt` nil).
- GREEN `2527be6f`: implementation; all ingestor tests pass.

## Out of scope

- Slice-bounds decode panic (#1211, separate PR).
- A full in-process MQTT broker integration test would require a new dep (mochi-mqtt). The observability and watchdog behaviors are independently verifiable by the unit tests above, and the reconnect path itself is paho's responsibility (we already test it's configured via `mqtt_opts_test.go`).

---------

Co-authored-by: bot <bot@example.com>
Co-authored-by: OpenClaw Bot <bot@openclaw.local>
Co-authored-by: corescope-bot <bot@corescope.local>
Co-authored-by: openclaw-bot <openclaw-bot@users.noreply.github.com>
MeshCore MQTT Ingestor (Go)
Standalone MQTT ingestion service for CoreScope. Connects to MQTT brokers, decodes raw MeshCore packets, and writes to the same SQLite database used by the Node.js web server.
This is the first step of a larger Go rewrite — separating MQTT ingestion from the web server.
Architecture
MQTT Broker(s) → Go Ingestor → SQLite DB ← Node.js Web Server
                 (this binary)  (shared)
- Single static binary: no runtime dependencies, no CGO
- SQLite via `modernc.org/sqlite` (pure Go)
- MQTT via `github.com/eclipse/paho.mqtt.golang`
- Runs alongside the Node.js server; they share the DB file
- Does NOT serve HTTP/WebSocket; that stays in Node.js
Build
Requires Go 1.22+.
cd cmd/ingestor
go build -o corescope-ingestor .
Cross-compile for Linux (e.g., for the production VM):
GOOS=linux GOARCH=amd64 go build -o corescope-ingestor .
Run
./corescope-ingestor -config /path/to/config.json
The config file uses the same format as the Node.js config.json. The ingestor reads the mqttSources array (or legacy mqtt object) and dbPath fields.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `DB_PATH` | SQLite database path | `data/meshcore.db` |
| `MQTT_BROKER` | Single MQTT broker URL (overrides config) | (none) |
| `MQTT_TOPIC` | MQTT topic (used with `MQTT_BROKER`) | `meshcore/#` |
| `CORESCOPE_INGESTOR_STATS` | Path to the per-second stats JSON file consumed by the server's `/api/perf/io` and `/api/perf/write-sources` endpoints (#1120) | `/tmp/corescope-ingestor-stats.json` |
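
One way the `MQTT_BROKER` / `MQTT_TOPIC` override could work is sketched below. The `source` type, the `applyEnvOverrides` helper, and the source name `env` are hypothetical. Treating the override as a full replacement of the configured sources is an assumption; the actual precedence rules live in the ingestor's config loader.

```go
package main

import (
	"fmt"
	"os"
)

// source is a hypothetical, trimmed-down view of one mqttSources entry.
type source struct {
	Name   string
	Broker string
	Topics []string
}

// applyEnvOverrides returns the sources to connect to. When MQTT_BROKER is
// set, it replaces whatever the config file listed (an assumption) and
// subscribes to MQTT_TOPIC, falling back to the documented default meshcore/#.
func applyEnvOverrides(fromConfig []source, getenv func(string) string) []source {
	broker := getenv("MQTT_BROKER")
	if broker == "" {
		return fromConfig
	}
	topic := getenv("MQTT_TOPIC")
	if topic == "" {
		topic = "meshcore/#"
	}
	return []source{{Name: "env", Broker: broker, Topics: []string{topic}}}
}

func main() {
	fromConfig := []source{{
		Name:   "local",
		Broker: "mqtt://localhost:1883",
		Topics: []string{"meshcore/#"},
	}}
	// Passing os.Getenv as a parameter keeps the helper testable without
	// touching the process environment.
	for _, s := range applyEnvOverrides(fromConfig, os.Getenv) {
		fmt.Printf("source %q -> %s %v\n", s.Name, s.Broker, s.Topics)
	}
}
```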
Stats file (CORESCOPE_INGESTOR_STATS)
Every second the ingestor publishes a JSON snapshot of its counters
(tx_inserted, obs_inserted, walCommits, backfillUpdates.*, etc.) plus
a procIO block sampled from /proc/self/io (read/write/cancelled bytes per
second + syscall counts). The server reads this file and surfaces the data on
the Perf page so operators can self-diagnose write-volume anomalies.
The writer opens the file with O_NOFOLLOW | O_CREAT | O_TRUNC and mode 0o600, so a
pre-planted symlink at the path cannot be used to clobber an arbitrary file.
Security note: the default lives in /tmp, which is world-writable on
most hosts (sticky bit only protects deletion, not creation). On
shared/multi-tenant hosts, override CORESCOPE_INGESTOR_STATS to point at a
private directory (e.g. /var/lib/corescope/ingestor-stats.json) that only
the corescope user can write to.
Minimal Config
{
"dbPath": "data/meshcore.db",
"mqttSources": [
{
"name": "local",
"broker": "mqtt://localhost:1883",
"topics": ["meshcore/#"]
}
]
}
Full Config (same as Node.js)
The ingestor reads these fields from the existing config.json:
- `mqttSources[]`: array of MQTT broker connections
  - `name`: display name for logging
  - `broker`: MQTT URL (`mqtt://`, `mqtts://`)
  - `username` / `password`: auth credentials
  - `topics`: array of topic patterns to subscribe
  - `iataFilter`: optional regional filter
- `mqtt`: legacy single-broker config (auto-converted to `mqttSources`)
- `dbPath`: SQLite DB path (default: `data/meshcore.db`)
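
A rough sketch of how these fields might be loaded, including the legacy conversion, follows. The struct and function names are hypothetical, and the shape of the legacy `mqtt` object (a `broker` and a single `topic`) is an assumption, as are the source name `default` and the `meshcore/#` fallback topic. `iataFilter` is kept as raw JSON because its type is not specified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MQTTSource mirrors one mqttSources[] entry.
type MQTTSource struct {
	Name       string          `json:"name"`
	Broker     string          `json:"broker"`
	Username   string          `json:"username,omitempty"`
	Password   string          `json:"password,omitempty"`
	Topics     []string        `json:"topics"`
	IATAFilter json.RawMessage `json:"iataFilter,omitempty"` // type not assumed
}

// legacyMQTT is an assumed shape for the legacy single-broker object.
type legacyMQTT struct {
	Broker string `json:"broker"`
	Topic  string `json:"topic"`
}

// Config holds only the fields the ingestor reads; other keys are ignored.
type Config struct {
	DBPath      string       `json:"dbPath"`
	MQTTSources []MQTTSource `json:"mqttSources"`
	MQTT        *legacyMQTT  `json:"mqtt"`
}

// parseConfig decodes the JSON, applies the documented dbPath default, and
// converts a legacy mqtt object into a one-element mqttSources slice.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	if cfg.DBPath == "" {
		cfg.DBPath = "data/meshcore.db"
	}
	if len(cfg.MQTTSources) == 0 && cfg.MQTT != nil && cfg.MQTT.Broker != "" {
		topic := cfg.MQTT.Topic
		if topic == "" {
			topic = "meshcore/#" // assumed fallback
		}
		cfg.MQTTSources = []MQTTSource{{
			Name:   "default",
			Broker: cfg.MQTT.Broker,
			Topics: []string{topic},
		}}
	}
	return &cfg, nil
}

func main() {
	legacy := []byte(`{"mqtt": {"broker": "mqtt://localhost:1883", "topic": "meshcore/#"}}`)
	cfg, err := parseConfig(legacy)
	if err != nil {
		panic(err)
	}
	fmt.Printf("dbPath=%s sources=%d broker=%s\n", cfg.DBPath, len(cfg.MQTTSources), cfg.MQTTSources[0].Broker)
}
```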
Test
cd cmd/ingestor
go test -v ./...
What It Does
- Connects to configured MQTT brokers with auto-reconnect
- Subscribes to mesh packet topics (e.g., `meshcore/+/+/packets`)
- Receives raw hex packets via JSON messages (`{ "raw": "...", "SNR": ..., "RSSI": ... }`)
- Decodes MeshCore packet headers, paths, and payloads (ported from `decoder.js`)
- Computes content hashes (path-independent, SHA-256-based)
- Writes to SQLite: `transmissions` + `observations` tables
- Upserts `nodes` from decoded ADVERT packets (with validation)
- Upserts `observers` from MQTT topic metadata
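
The receive step in the list above (a JSON envelope carrying a hex-encoded packet) might look roughly like this. The `packetMessage` type and `parsePacketMessage` function are hypothetical names, and treating `SNR` and `RSSI` as optional floats is an assumption. Decoding the MeshCore packet itself is out of scope for this sketch.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// packetMessage is the JSON envelope published on the packet topics.
// Pointers distinguish "field missing" from a genuine zero reading.
type packetMessage struct {
	Raw  string   `json:"raw"`
	SNR  *float64 `json:"SNR"`
	RSSI *float64 `json:"RSSI"`
}

// parsePacketMessage unmarshals the envelope and hex-decodes the raw packet.
func parsePacketMessage(payload []byte) (*packetMessage, []byte, error) {
	var msg packetMessage
	if err := json.Unmarshal(payload, &msg); err != nil {
		return nil, nil, fmt.Errorf("invalid JSON envelope: %w", err)
	}
	if msg.Raw == "" {
		return nil, nil, errors.New("envelope has no raw packet")
	}
	raw, err := hex.DecodeString(msg.Raw)
	if err != nil {
		return nil, nil, fmt.Errorf("raw is not valid hex: %w", err)
	}
	return &msg, raw, nil
}

func main() {
	payload := []byte(`{"raw": "0a1b2c", "SNR": 7.5, "RSSI": -92}`)
	msg, raw, err := parsePacketMessage(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("packet bytes=%d snr=%v rssi=%v\n", len(raw), *msg.SNR, *msg.RSSI)
}
```

Rejecting malformed envelopes before they reach the decoder keeps bad broker input from turning into decode errors further down the pipeline.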
Schema Compatibility
The Go ingestor creates the same v3 schema as the Node.js server:
- `transmissions`: deduplicated by content hash
- `observations`: per-observer sightings with `observer_idx` (rowid reference)
- `nodes`: mesh nodes discovered from adverts
- `observers`: MQTT feed sources
Both processes can write to the same DB concurrently (SQLite WAL mode).
What's Not Ported (Yet)
- Companion bridge format (Format 2: `meshcore/advertisement`, channel messages, etc.)
- Channel key decryption (GRP_TXT encrypted payload decryption)
- WebSocket broadcast to browsers
- In-memory packet store
- Cache invalidation
These stay in the Node.js server for now.
Files
cmd/ingestor/
main.go — entry point, MQTT connect, message handler
decoder.go — MeshCore packet decoder (ported from decoder.js)
decoder_test.go — decoder tests (25 tests, golden fixtures)
db.go — SQLite writer (schema-compatible with db.js)
db_test.go — DB tests (schema validation, insert/upsert, E2E)
config.go — config struct + loader
util.go — shared utilities
go.mod / go.sum — Go module definition