Kpa-clawbot f4cf2acbc0 perf: cancelled writes + ingestor I/O + threshold tests (#1120 follow-up) (#1167)
Red commit: e964ec9c46 (CI run: pending —
workflow only triggers on PR open)

Partial fix for #1120 — finishes the four follow-up items left open
after PR #1123 (cancelled writes, ingestor I/O, threshold-flag tests,
docs).

## What's done

- **`cancelledWriteBytesPerSec`** — server `/proc/self/io` parser
handles `cancelled_write_bytes`; `/api/perf/io` exposes the per-second
rate; Perf page renders it next to Read/Write with ⚠️ when sustained >1
MB/s.
- **Ingestor `/proc/<pid>/io`** — `cmd/ingestor/stats_file.go` samples
its own `/proc/self/io` each tick and includes `procIO` in the snapshot.
The server's `/api/perf/io` reads it and surfaces `.ingestor`. Frontend
renders an `Ingestor process` Disk I/O block alongside the existing
`server process` block (issue mockup: "Both ingestor and server").
- **Threshold + anomaly tests** — `test-perf-disk-io-1120.js` now
asserts ⚠️ fires/suppresses on WAL>100MB, cache_hit<90%, and the
backfill-rate-vs-tx-rate guard with the `tx_inserted >= 100` baseline
floor. Drops the tautological `|| ... === false` short-circuits flagged
in MINOR m4.
- **Docs (m8)** — `config.example.json` adds `_comment_ingestorStats`
(env var, default path, shared-tmp security note);
`cmd/ingestor/README.md` adds `CORESCOPE_INGESTOR_STATS` to the env-var
table plus a `Stats file` section.

## What's NOT done (deferred)

m1 sync.Map → map+RWMutex, m2 perfIOMu rate caching, m3 negative
cacheSize translation, m5 deterministic-write test, m7 ctx-aware
shutdown — pure polish; will file a follow-up issue if the operator
wants them tracked.

## TDD

- Red: `e964ec9` — adds failing tests + stub field/handler shape
(cancelled missing from struct, ingestor stub returns nil, ingestor
procIO absent).
- Green: `1240703` — wires up the parser case, ingestor sampler,
frontend rendering, docs.

E2E assertion added: test-perf-disk-io-1120.js:108

---------

Co-authored-by: clawbot <clawbot@users.noreply.github.com>
Co-authored-by: Kpa-clawbot <bot@kpa-clawbot.local>
Co-authored-by: Kpa-clawbot <bot@kpa-clawbot>
2026-05-08 16:29:23 -07:00


# MeshCore MQTT Ingestor (Go)
Standalone MQTT ingestion service for CoreScope. Connects to MQTT brokers, decodes raw MeshCore packets, and writes to the same SQLite database used by the Node.js web server.
This is the first step of a larger Go rewrite — separating MQTT ingestion from the web server.
## Architecture
```
MQTT Broker(s) → Go Ingestor → SQLite DB ← Node.js Web Server
                (this binary)  (shared)
```
- **Single static binary** — no runtime dependencies, no CGO
- **SQLite** via `modernc.org/sqlite` (pure Go)
- **MQTT** via `github.com/eclipse/paho.mqtt.golang`
- Runs **alongside** the Node.js server — they share the DB file
- Does NOT serve HTTP/WebSocket — that stays in Node.js
## Build
Requires Go 1.22+.
```bash
cd cmd/ingestor
go build -o corescope-ingestor .
```
Cross-compile for Linux (e.g., for the production VM):
```bash
GOOS=linux GOARCH=amd64 go build -o corescope-ingestor .
```
## Run
```bash
./corescope-ingestor -config /path/to/config.json
```
The config file uses the same format as the Node.js `config.json`. The ingestor reads the `mqttSources` array (or legacy `mqtt` object) and `dbPath` fields.
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `DB_PATH` | SQLite database path | `data/meshcore.db` |
| `MQTT_BROKER` | Single MQTT broker URL (overrides config) | — |
| `MQTT_TOPIC` | MQTT topic (used with `MQTT_BROKER`) | `meshcore/#` |
| `CORESCOPE_INGESTOR_STATS` | Path to the per-second stats JSON file consumed by the server's `/api/perf/io` and `/api/perf/write-sources` endpoints (#1120) | `/tmp/corescope-ingestor-stats.json` |
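
The lookup-with-default behaviour in the table can be sketched roughly as follows. This is a minimal illustration, not the ingestor's actual code: `envOr`, `settings`, and `settingsFromEnv` are hypothetical names, and treating an empty variable the same as an unset one is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of the environment variable key, or def when the
// variable is unset or empty (treating empty as unset is an assumption).
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// settings is a hypothetical holder for the values in the table above.
type settings struct {
	DBPath    string
	Broker    string // empty means "use the brokers from config.json"
	Topic     string
	StatsPath string
}

// settingsFromEnv applies the documented defaults when a variable is missing.
func settingsFromEnv() settings {
	return settings{
		DBPath:    envOr("DB_PATH", "data/meshcore.db"),
		Broker:    os.Getenv("MQTT_BROKER"),
		Topic:     envOr("MQTT_TOPIC", "meshcore/#"),
		StatsPath: envOr("CORESCOPE_INGESTOR_STATS", "/tmp/corescope-ingestor-stats.json"),
	}
}

func main() {
	fmt.Printf("%+v\n", settingsFromEnv())
}
```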
### Stats file (`CORESCOPE_INGESTOR_STATS`)
Every second the ingestor publishes a JSON snapshot of its counters
(`tx_inserted`, `obs_inserted`, `walCommits`, `backfillUpdates.*`, etc.) plus
a `procIO` block sampled from `/proc/self/io` (read/write/cancelled bytes per
second + syscall counts). The server reads this file and surfaces the data on
the Perf page so operators can self-diagnose write-volume anomalies.
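
As an illustration of the `procIO` sampling step, the sketch below parses the `key: value` lines that Linux exposes in `/proc/self/io` and turns two cumulative samples into a per-second rate. The type and function names (`procIO`, `parseProcIO`, `ratePerSec`) are hypothetical and the real `stats_file.go` may be structured differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// procIO holds the cumulative counters of interest from /proc/<pid>/io.
type procIO struct {
	ReadBytes           uint64
	WriteBytes          uint64
	CancelledWriteBytes uint64
	Syscr               uint64 // read syscalls
	Syscw               uint64 // write syscalls
}

// parseProcIO parses the text of /proc/<pid>/io ("key: value" per line).
// Keys it does not know (for example rchar, wchar) are ignored.
func parseProcIO(text string) (procIO, error) {
	var p procIO
	for _, line := range strings.Split(text, "\n") {
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue // blank or malformed line
		}
		n, err := strconv.ParseUint(strings.TrimSpace(val), 10, 64)
		if err != nil {
			return p, fmt.Errorf("parse %q: %w", key, err)
		}
		switch strings.TrimSpace(key) {
		case "read_bytes":
			p.ReadBytes = n
		case "write_bytes":
			p.WriteBytes = n
		case "cancelled_write_bytes":
			p.CancelledWriteBytes = n
		case "syscr":
			p.Syscr = n
		case "syscw":
			p.Syscw = n
		}
	}
	return p, nil
}

// ratePerSec converts two cumulative samples into a per-second rate.
// It returns 0 if the counter went backwards or no time has elapsed.
func ratePerSec(prev, cur uint64, elapsedSec float64) float64 {
	if elapsedSec <= 0 || cur < prev {
		return 0
	}
	return float64(cur-prev) / elapsedSec
}

func main() {
	data, err := os.ReadFile("/proc/self/io")
	if err != nil {
		fmt.Println("no /proc/self/io on this platform:", err)
		return
	}
	p, err := parseProcIO(string(data))
	fmt.Printf("%+v err=%v\n", p, err)
}
```

Because the kernel counters are cumulative, a sampler has to keep the previous sample and its timestamp to report bytes per second.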

The writer opens the file with `O_NOFOLLOW | O_CREAT | O_TRUNC` and mode `0o600`, so a
pre-planted symlink at the path cannot be used to clobber an arbitrary file.

**Security note:** the default lives in `/tmp`, which is world-writable on
most hosts (the sticky bit only protects against deletion, not creation). On
shared/multi-tenant hosts, override `CORESCOPE_INGESTOR_STATS` to point at a
private directory (e.g. `/var/lib/corescope/ingestor-stats.json`) that only
the corescope user can write to.
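
The symlink protection described above can be demonstrated with a short, Unix-only sketch. It assumes the flags are passed straight to `os.OpenFile`; `writeStats` and `symlinkIsRejected` are hypothetical names, and the real writer may differ (for example by writing to a temporary file first).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// writeStats writes data to path, refusing to follow a symlink at the final
// path component. syscall.O_NOFOLLOW is not available on Windows.
func writeStats(path string, data []byte) error {
	flags := os.O_WRONLY | os.O_CREATE | os.O_TRUNC | syscall.O_NOFOLLOW
	f, err := os.OpenFile(path, flags, 0o600)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// symlinkIsRejected plants a symlink where the stats file would go and
// reports whether writeStats refused to write through it while leaving the
// symlink's target untouched.
func symlinkIsRejected() (bool, error) {
	dir, err := os.MkdirTemp("", "stats-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "victim.txt")
	if err := os.WriteFile(target, []byte("keep me"), 0o600); err != nil {
		return false, err
	}
	link := filepath.Join(dir, "stats.json")
	if err := os.Symlink(target, link); err != nil {
		return false, err
	}

	writeErr := writeStats(link, []byte(`{"tx_inserted":0}`))
	got, err := os.ReadFile(target)
	if err != nil {
		return false, err
	}
	return writeErr != nil && string(got) == "keep me", nil
}

func main() {
	ok, err := symlinkIsRejected()
	fmt.Println("symlink rejected:", ok, "setup error:", err)
}
```

`O_NOFOLLOW` only covers the last path component, which is one more reason to point the stats path at a directory that only the corescope user can write to.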
### Minimal Config
```json
{
  "dbPath": "data/meshcore.db",
  "mqttSources": [
    {
      "name": "local",
      "broker": "mqtt://localhost:1883",
      "topics": ["meshcore/#"]
    }
  ]
}
```
### Full Config (same as Node.js)
The ingestor reads these fields from the existing `config.json`:
- `mqttSources[]` — array of MQTT broker connections
  - `name` — display name for logging
  - `broker` — MQTT URL (`mqtt://`, `mqtts://`)
  - `username` / `password` — auth credentials
  - `topics` — array of topic patterns to subscribe
  - `iataFilter` — optional regional filter
- `mqtt` — legacy single-broker config (auto-converted to `mqttSources`)
- `dbPath` — SQLite DB path (default: `data/meshcore.db`)
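
A loader for the fields listed above might look like the sketch below. It is only an illustration: the struct names, the shape of the legacy `mqtt` object (`broker` plus a single `topic`), and the `"default"` source name are assumptions, and `iataFilter` is kept as raw JSON because its shape is not specified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mqttSource mirrors one entry of mqttSources[].
type mqttSource struct {
	Name       string          `json:"name"`
	Broker     string          `json:"broker"`
	Username   string          `json:"username,omitempty"`
	Password   string          `json:"password,omitempty"`
	Topics     []string        `json:"topics"`
	IATAFilter json.RawMessage `json:"iataFilter,omitempty"` // shape not assumed
}

// legacyMQTT is an ASSUMED shape for the legacy single-broker object.
type legacyMQTT struct {
	Broker string `json:"broker"`
	Topic  string `json:"topic"`
}

type config struct {
	DBPath      string       `json:"dbPath"`
	MQTTSources []mqttSource `json:"mqttSources"`
	MQTT        *legacyMQTT  `json:"mqtt,omitempty"`
}

// parseConfig decodes config.json, applies the dbPath default, and converts
// a legacy mqtt object into a one-element mqttSources slice. Fields used
// only by the Node.js server are ignored by json.Unmarshal.
func parseConfig(data []byte) (config, error) {
	var c config
	if err := json.Unmarshal(data, &c); err != nil {
		return c, fmt.Errorf("parse config: %w", err)
	}
	if c.DBPath == "" {
		c.DBPath = "data/meshcore.db"
	}
	if len(c.MQTTSources) == 0 && c.MQTT != nil && c.MQTT.Broker != "" {
		topic := c.MQTT.Topic
		if topic == "" {
			topic = "meshcore/#"
		}
		c.MQTTSources = []mqttSource{{
			Name:   "default", // hypothetical name for the converted source
			Broker: c.MQTT.Broker,
			Topics: []string{topic},
		}}
	}
	return c, nil
}

func main() {
	c, err := parseConfig([]byte(`{"mqtt":{"broker":"mqtt://localhost:1883"}}`))
	fmt.Printf("%+v err=%v\n", c.MQTTSources, err)
}
```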
## Test
```bash
cd cmd/ingestor
go test -v ./...
```
## What It Does
1. Connects to configured MQTT brokers with auto-reconnect
2. Subscribes to mesh packet topics (e.g., `meshcore/+/+/packets`)
3. Receives raw hex packets via JSON messages (`{ "raw": "...", "SNR": ..., "RSSI": ... }`)
4. Decodes MeshCore packet headers, paths, and payloads (ported from `decoder.js`)
5. Computes content hashes (path-independent, SHA-256-based)
6. Writes to SQLite: `transmissions` + `observations` tables
7. Upserts `nodes` from decoded ADVERT packets (with validation)
8. Upserts `observers` from MQTT topic metadata
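
Steps 2, 3, and 5 can be sketched as follows, using only the standard library. This is not the ingestor's decoder: `topicMatches`, `decodeMessage`, and `contentHash` are hypothetical names, the numeric types of `SNR` and `RSSI` are assumed, and the exact bytes that feed the real content hash are not documented here, so the hash input (a type byte plus payload, with the path left out) is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// packetMessage mirrors the JSON message shape shown in step 3.
type packetMessage struct {
	Raw  string  `json:"raw"`
	SNR  float64 `json:"SNR"`
	RSSI float64 `json:"RSSI"`
}

// topicMatches applies standard MQTT wildcard rules: "+" matches exactly one
// level and "#" (last level only) matches any number of remaining levels.
func topicMatches(filter, topic string) bool {
	f := strings.Split(filter, "/")
	t := strings.Split(topic, "/")
	for i, part := range f {
		if part == "#" {
			return i == len(f)-1
		}
		if i >= len(t) {
			return false
		}
		if part != "+" && part != t[i] {
			return false
		}
	}
	return len(f) == len(t)
}

// decodeMessage parses the JSON envelope and hex-decodes the raw packet.
func decodeMessage(payload []byte) (packetMessage, []byte, error) {
	var m packetMessage
	if err := json.Unmarshal(payload, &m); err != nil {
		return m, nil, fmt.Errorf("bad JSON: %w", err)
	}
	raw, err := hex.DecodeString(strings.TrimSpace(m.Raw))
	if err != nil {
		return m, nil, fmt.Errorf("bad hex in raw: %w", err)
	}
	return m, raw, nil
}

// contentHash hashes only data that stays the same as a packet is relayed,
// so the same transmission heard over different paths gets the same hash.
// The choice of inputs here is an assumption for illustration.
func contentHash(payloadType byte, payload []byte) string {
	h := sha256.New()
	h.Write([]byte{payloadType})
	h.Write(payload)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	topic := "meshcore/SJC/obs1/packets" // example topic; level names are made up
	if !topicMatches("meshcore/+/+/packets", topic) {
		return
	}
	msg, raw, err := decodeMessage([]byte(`{"raw":"0a0b0c","SNR":7.5,"RSSI":-90}`))
	if err != nil {
		fmt.Println("skip:", err)
		return
	}
	// Pretend a decoder already split raw into a type byte and a payload.
	fmt.Println(msg.SNR, msg.RSSI, contentHash(raw[0], raw[1:]))
}
```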
## Schema Compatibility
The Go ingestor creates the same v3 schema as the Node.js server:
- `transmissions` — deduplicated by content hash
- `observations` — per-observer sightings with `observer_idx` (rowid reference)
- `nodes` — mesh nodes discovered from adverts
- `observers` — MQTT feed sources
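Both processes can keep the same DB open and write to it. SQLite WAL mode lets readers proceed while a write is in progress; the writes themselves are still serialized by SQLite, one writer at a time.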
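
For reference, one way to request WAL mode and a busy timeout with `modernc.org/sqlite` is through `_pragma` parameters in the DSN. The sketch below only builds that string. `walDSN` is a hypothetical helper, the 5000 ms timeout is an arbitrary example value, and whether the ingestor configures its connection this way is an assumption.

```go
package main

import "fmt"

// walDSN builds a DSN asking the driver to run
// PRAGMA journal_mode(WAL) and PRAGMA busy_timeout(<ms>) on each connection.
// The busy timeout makes a writer wait for the other process's write lock
// instead of failing immediately with SQLITE_BUSY.
func walDSN(path string, busyTimeoutMs int) string {
	return fmt.Sprintf("file:%s?_pragma=journal_mode(WAL)&_pragma=busy_timeout(%d)", path, busyTimeoutMs)
}

func main() {
	fmt.Println(walDSN("data/meshcore.db", 5000))
	// With the driver imported as `_ "modernc.org/sqlite"`, the DSN would be
	// passed to database/sql like this:
	//   db, err := sql.Open("sqlite", walDSN("data/meshcore.db", 5000))
}
```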
## What's Not Ported (Yet)
- Companion bridge format (Format 2 — `meshcore/advertisement`, channel messages, etc.)
- Channel key decryption (GRP_TXT encrypted payload decryption)
- WebSocket broadcast to browsers
- In-memory packet store
- Cache invalidation
These stay in the Node.js server for now.
## Files
```
cmd/ingestor/
  main.go          — entry point, MQTT connect, message handler
  decoder.go       — MeshCore packet decoder (ported from decoder.js)
  decoder_test.go  — decoder tests (25 tests, golden fixtures)
  db.go            — SQLite writer (schema-compatible with db.js)
  db_test.go       — DB tests (schema validation, insert/upsert, E2E)
  config.go        — config struct + loader
  util.go          — shared utilities
  go.mod / go.sum  — Go module definition
```